I have a web app running in Azure that worked fine for a couple of years. It has been connecting to a SQL Server instance that runs on a VM, mainly because part of it is a legacy app that needed features Azure SQL didn't support. We've removed some of those old dependencies, and Azure now offers SQL Managed Instance, so I decided to give that a try.
Since switching to the Managed Instance, though, many of the web requests that need to access the db have one or more 14-second pauses in the trace. For example:
Each of those SQL lines represents a separate SQL call to the db. For some of these traces, the whole thing takes around 250ms and the SQL calls are all back-to-back. For others, like this one, you see nothing for 14 seconds, then one of the SQL calls, then another 14-second gap, and then the rest happen in quick succession. Sometimes it's 14 seconds and then they all get called quickly. Sometimes it's 14 seconds, then the first call, then another 14 seconds, then the second call, then another 14 seconds, and then the rest back-to-back.
The histogram of timings is just very odd:
I cropped out the sub-300ms responses, which make up around half of the total. The rest are shown here. The chart doesn't line up completely, but the first tall bar is at 14.3 seconds, the second bar is 28.7 seconds, and the third is at 43.2 seconds.
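To sanity-check that those bars really are multiples of one fixed delay, here is a quick Python sketch of the arithmetic. The variable names are just mine, and the peak values are the approximate bar positions I read off the chart, so treat the result as rough:

```python
# Approximate bar positions read off the histogram, in seconds.
peaks = [14.3, 28.7, 43.2]

# Estimate the length of a single pause by dividing each peak by the
# number of pauses it would correspond to (1, 2, 3) and averaging.
unit = sum(p / n for n, p in enumerate(peaks, start=1)) / len(peaks)

# Compare each observed peak against n * unit.
for n, p in enumerate(peaks, start=1):
    residual = p - n * unit
    print(f"{n} pause(s): observed {p:.1f}s, "
          f"expected {n * unit:.2f}s, residual {residual:+.2f}s")
```

The residuals come out well under a second, which fits the traces: each slow request seems to contain one, two, or three pauses of the same length rather than a spread of random delays.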
Of course, if I switch the connection string back to the VM, everything goes back to running very quickly. When I switch back to the SQL Managed Instance, the pauses return. The code isn't changing at all. The other odd thing is that the first SQL line in that timing chart is a call to a completely different db, on Azure SQL, that isn't being changed, so somehow having the main db use the Managed Instance can also mess up the connection to Azure SQL? I don't think it's strictly caching or anything like that, since I can refresh the page many times in a row and sometimes it's quick, sometimes it's 14-28 seconds. I've also tried varying the connection string for the Managed Instance, using both the private and the public hostnames.
Bottom line: is there some reason why one or more calls to SQL Server would have 14-second gaps?