At approximately 2021-06-14 02:00 (UTC) my website began performing very poorly (some requests taking >10s). An analysis of the logs shows that the slowdown only affects routes which establish a DB connection. When I look at the DB, I can see that I am experiencing high Network I/O (ASYNC_NETWORK_IO) wait times.
My understanding is that this can point to a couple problems. Either there is an issue with the network, or more likely, a problem with the application consuming the DB results. https://www.sqlshack.com/reducing-sql-server-async_network_io-wait-type/
Unfortunately, when I investigate any further, both the network and the client seem to be fine.
Problem with the network?
- ping < 1ms
- Utilizing a small fraction of 10GbE bandwidth
- Auto-Negotiation is detecting the network bandwidth properly
Web server / Application processing results too slowly?
- CPU usage is fine
- No code changes when slowdown began
- Request queue emptied quickly
Is there any other explanation for why I might see a sudden increase in ASYNC_NETWORK_IO wait time?
Update
Most resources (like the one I linked above) say that a problem with SQL Server itself is unlikely. However, when I switched out my application's connection string to use my test db (located on a different server), the site performance went back to normal, and no ASYNC_NETWORK_IO wait times could be seen in my test db.
If the issue was with the web server or the client application, I would have expected to see the connection to the test db cause high ASYNC_NETWORK_IO wait times there as well.
Does anyone have any ideas about what could be causing this from the SQL Server side?