
[QUERY] How to use Linked Services with Cosmos DB Spark connector Throughput control in Synapse Analytics #42855

Open
golfalot opened this issue Nov 8, 2024 · 2 comments

Comments


golfalot commented Nov 8, 2024

Query/Question

What is the equivalent of spark.synapse.linkedService for use with spark.cosmos.throughputControl.globalControl.database?

I need to use private endpoints because of DEP (data exfiltration protection), and having the credentials stored in a Linked Service is very handy.

Why is this not a Bug or a Feature Request?

It feels like I'm missing something obvious, but the docs don't seem to cover what I'm looking for.

Setup (please complete the following information if applicable):

  • OS: Spark 3.4, Python 3.10, Scala
  • IDE: Synapse Analytics (web based)
  • Library/Libraries: default Spark Pool config

PySpark

# assumes df_floodre is already defined; functions import added for F.col
from pyspark.sql import functions as F

(
    df_floodre
    # Cosmos DB items need a string "id"; derive it from the UPRN column
    .withColumn("id", F.col("uprn").cast("string"))
    .write
    .format("cosmos.oltp")
    .mode("append")
    # Main account credentials are resolved through the Synapse linked service
    .option("spark.synapse.linkedService", "CosmosDB_GraphQL_databasets")
    .option("spark.cosmos.container", "floodRe")
    # Throughput control group for this write job
    .option("spark.cosmos.throughputControl.enabled", "true")
    .option("spark.cosmos.throughputControl.name", "floodRe_Loader_Jobs_Group")
    # Where does the connector find the global-control metadata container, and how
    # does it authenticate to that account? This is the part that is unclear.
    .option("spark.cosmos.throughputControl.globalControl.database", "CosmosDB_GraphQL_databasets_ThroughputControl")
    .option("spark.cosmos.throughputControl.globalControl.container", "ThroughputControl-floodre")
    .option("spark.cosmos.throughputControl.targetThroughput", 7000)
    .save()
)

I've tried a few (likely nonsensical) permutations and got 403 or 404 errors. Guidance would be much appreciated.
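For reference, a minimal sketch of the shape I would expect the answer to take, assuming the connector's spark.cosmos.throughputControl.accountEndpoint and spark.cosmos.throughputControl.accountKey options are honored in Synapse alongside spark.synapse.linkedService; the endpoint, key placeholder, and comments below are illustrative assumptions, not a confirmed configuration.

# Sketch only: point the throughput-control metadata at its account explicitly,
# while the main write still authenticates through the linked service.
from pyspark.sql import functions as F

tc_endpoint = "https://<throughput-control-account>.documents.azure.com:443/"  # placeholder
tc_key = "<throughput-control-account-key>"  # placeholder; ideally fetched from Key Vault, not hard-coded

(
    df_floodre
    .withColumn("id", F.col("uprn").cast("string"))
    .write
    .format("cosmos.oltp")
    .mode("append")
    .option("spark.synapse.linkedService", "CosmosDB_GraphQL_databasets")
    .option("spark.cosmos.container", "floodRe")
    .option("spark.cosmos.throughputControl.enabled", "true")
    .option("spark.cosmos.throughputControl.name", "floodRe_Loader_Jobs_Group")
    # Assumed: explicit credentials for the account that holds the control metadata container
    .option("spark.cosmos.throughputControl.accountEndpoint", tc_endpoint)
    .option("spark.cosmos.throughputControl.accountKey", tc_key)
    .option("spark.cosmos.throughputControl.globalControl.database", "CosmosDB_GraphQL_databasets_ThroughputControl")
    .option("spark.cosmos.throughputControl.globalControl.container", "ThroughputControl-floodre")
    .option("spark.cosmos.throughputControl.targetThroughput", 7000)
    .save()
)

If those options do apply, the linked service would continue to authenticate the target account while the control metadata account gets explicit credentials, which still leaves open how to source them via a Linked Service under DEP.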

github-actions bot commented Nov 8, 2024

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @kushagraThapar @pjohari-ms @TheovanKraay.

@kushagraThapar
Member

@tvaron3 please take a look at this, thanks!
