
I'm using Azure Synapse Link for Dataverse to synchronize 13 tables from a Dynamics instance to Azure Synapse, but I'm seeing far more transactions than expected in the storage account metrics and it's leading to excessive charges. It seems related to the snapshot/partition process, but I'm not 100% sure. Has anyone run across this before, and if so, how did you fix it?

Storage account transactions over a 4-hour period

AuthenticationErrors over a 4-hour period

Transactions by API Name

Background / Other Info

  • I first set up this link on Feb 22. When I did that, the tables initially synchronized fairly quickly, but when I went to query them, I'd get the error A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The specified network name is no longer available.)
  • I tried deleting the link and recreating it. That did not help.
  • I tried synchronizing just the data without connecting to an Azure Synapse workspace, which helped, but I need the workspace database.
  • At different times during testing, I would get the error Database 'dataverse_xxx_xxx' on server 'xxx-ondemand' is not currently available. Please retry the connection later. If the problem persists, contact customer support, and provide them the session tracing ID of '{XXXXXXXX-XXX-XXXX-XXXX-XXXXXXXXXXX}'. When this happened, the built-in serverless pool would show Temporarily Unavailable for some time, but would eventually come back.
  • I was not able to query any of the metadata tables until all of the snapshot/partition tables were created, which took multiple hours, a behavior I did not see in testing. Further, the lake database in Synapse would show all of the _partition tables, while the same database in SSMS would show only a subset of them.
  • If I hook the storage account up to a Log Analytics workspace, I can't find the AuthenticationError transactions, but that could be my lack of knowledge in that space.
  • Transactions by API Name shows a lot of ListFilesystemDir, GetFilesystemProperties, and Unknown. A similarly configured workspace only shows transaction peaks every hour for GetBlobProperties and ListBlobs.
  • The counts and shape of the Transaction Errors and Transactions by API Name graphs track each other.
  • The Synapse workspace was deployed using an ARM template and should match our dev and test environments.
  • I initially let the process run, thinking that things would settle down, but even after 4 days they had not.

Update - 2023-02-28

I hooked up the storage account to a Log Analytics workspace today and have confirmed that the issue seems to be related to the snapshot process. Every Snapshot directory is being queried once every 45 seconds. The query below returns 747 URIs, which, at 80 polls per hour (3,600 s / 45 s), works out to 59,760 transactions every hour. Multiplied by 3, since the same behavior exists for ListFilesystemDir, GetFilesystemProperties, and Unknown (though I couldn't find the Unknown calls in the logs), that's 179,280 transactions every hour. I'm not seeing any transactions with an AuthenticationError, which seems strange; everything has a status text of Success.

Snapshot transactions
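
For anyone trying to reproduce the numbers, a Log Analytics query along these lines should get close. This is only a sketch (run here through the azure-monitor-query Python SDK), not my exact query: it assumes the storage account's diagnostic logs land in the StorageBlobLogs table and that the snapshot folders can be picked out of the Uri column; the workspace ID is a placeholder.

```python
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

WORKSPACE_ID = "<log-analytics-workspace-id>"  # placeholder

# Count how often each Snapshot directory is hit by ListFilesystemDir calls
# in the last hour; at one poll every 45 seconds, each URI should show
# roughly 80 hits. StatusText is included to confirm everything reports
# Success rather than AuthenticationError.
KQL = """
StorageBlobLogs
| where TimeGenerated > ago(1h)
| where OperationName == "ListFilesystemDir"
| where Uri has "Snapshot"
| summarize Hits = count() by Uri, StatusText
| order by Hits desc
"""

client = LogsQueryClient(DefaultAzureCredential())
result = client.query_workspace(WORKSPACE_ID, KQL, timespan=timedelta(hours=1))

for table in result.tables:
    print(f"{len(table.rows)} Snapshot URI / status combinations")
    for row in table.rows[:10]:
        uri, status, hits = row[0], row[1], row[2]
        print(f"{hits:>5}  {status:<10}  {uri}")
```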

Update - 2023-03-02

  • I set up a new Synapse workspace using the same ARM template that was used to deploy the workspace with the issue. The new Synapse workspace is exhibiting the same behavior.
  • The behavior exists regardless of which Dataverse environment is linked.

Update - 2023-03-09

  • The transactions have settled down on the test Synapse workspaces that I set up on 2023-03-02, but the issue remains for my production workspace. Both test workspaces showed a decrease in transactions on 2023-03-05 between 9 AM and 9 PM. The test workspaces are empty except for the dataverse lake database.

Storage transactions plateau, then return to expected levels

Update - 2023-03-13

Transactions in my production workspace came back to normal levels on March 9, shortly before I spoke to Microsoft support. They suspect it was related to a Synapse Link for Dataverse issue that was resolved around the same time but aren't 100% sure. I sent additional log information to help troubleshoot.

Steve Platz

1 Answer


I'm also experiencing this issue on 3 new Dataverse Synapse Link setups.

I can connect to the serverless SQL pool and run queries against system views (sys.tables, sys.views, sys.dm_exec_requests_history, etc.). These return results just fine.

As soon as I query an external table, the serverless SQL pool appears to crash. I get the transport-level error since the service is no longer running. If I try to immediately connect back to the database, I see a message stating the DB is in recovery (I suspect the SQL service crashed and is now restarting and recovering the DB). After a short period it is running again, the DB is recovered, and I can connect.

The same issue then recurs: system view queries all work fine, but as soon as I query any one of the external tables, the SQL service crashes again.

This cycle repeats several times until, at some point, I am able to query external tables; then they all work and the service runs fine. If I don't run queries for a while and then come back, it's back into the service-crashing loop - I guess the SQL pool is paused or torn down when not in use.
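
To make the crash loop concrete, this is roughly the pattern that triggers it for me. It's only a sketch: the serverless endpoint and database follow the naming in the question, the external table name is hypothetical, and I'm using pyodbc here purely for illustration.

```python
import pyodbc

# Placeholder connection details for the built-in serverless SQL pool.
CONN_STR = (
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<workspace>-ondemand.sql.azuresynapse.net;"
    "Database=dataverse_xxx_xxx;"
    "Authentication=ActiveDirectoryInteractive;"
    "Encrypt=yes;"
)

conn = pyodbc.connect(CONN_STR)
cur = conn.cursor()

# Queries against system views succeed every time...
cur.execute("SELECT name FROM sys.tables")
print(len(cur.fetchall()), "tables visible")

# ...but selecting from any of the external (Dataverse) tables is what
# triggers the transport-level error and sends the database into recovery.
cur.execute("SELECT TOP (10) * FROM dbo.account")  # hypothetical external table
print(cur.fetchall())

conn.close()
```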

All of the linked tables from Dataverse were created with the default update mode in these 3 environments. They're all in the West Europe DC as well.

I'm not sure of the cause yet - I can't see any correlation between when it works and when it crashes. I don't know whether it is being caused by the heavy storage account traffic or not.

I've logged a support ticket with the Azure support team and sent them my findings so far.

  • Thanks for letting me know that I'm not alone with this. I've had a support ticket open for a little over a week now. As soon as I learn more, I'll post here. – Steve Platz Mar 07 '23 at 11:48
  • The MS Support team confirmed that the engineering team have fixed or mitigated the issue. All three of the Dataverse environments in my case are now working correctly. The transaction volume on the Production storage account has reduced quite significantly - although the dev and test environments never had high transaction volumes and were also displaying the same problem. Anyway - it's all working on my end now. – jdp_altron Mar 14 '23 at 12:35
  • Thanks for the update! I don't know if you saw my latest comment in the question, but that matches what I'm seeing. The support team hasn't confirmed the underlying cause yet, but said they suspect it had to do with a managed identity change. – Steve Platz Mar 15 '23 at 13:48