There are varying versions of this question on stackoverflow already, but none of them helped me to get to the bottom of my issue. So, here I go again with more specific details of my problem.
We've been randomly getting Transaction (Process ID xx) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
. Let me be clear, this is not row or table level locking. I've tried enough guessed/random things; I need exact step by step guide on how to troubleshoot deadlock on communication buffer.
If you interested in specific details then read on.
Specific Details of Scenario : We have a very simple Dapper ORM based C# .net core Web API that takes in requests and performs CRUD operations to a database hosted on this Microsoft Sql server. To do this, Connection manager (registered as a scoped service) opens a new IDbConnection
connection in request scope; this connection is used to execute deletes, inserts, updates or get. For insert/update/delete C# line looks like this await connection.ExecuteAsync("<Create update or delete statement>", entity);
For GET
requests we simply run await connection.QueryFirstOrDefaultAsync<TEntity>("<select statement>", entity);
; There are 5 types of entity (all presenting simple non relational tables). They all CRUD by ID.
What has been tried so far
- MAXDOP=1 query hint on SQL statements
- Ensuring only 1 entity CRUD at given point in time for the one kind of entity.
- Restarting SQL server/application instance
- Ensuring ports/RAM/CPU/network bandwidth are not exhausted
- Alter DATABASE XXXXXX SET READ_COMMITTED_SNAPSHOT ON/OFF
- Keeping transactions as small as possible
- Persistent retry policy as a workaround (to handle random transient nature of the issue)
- Single thread per entity type
Server Specifications: We have Microsoft Sql Server 2016 On Azure hosted in virtual machine with 64 cores and 400GB RAM. Usual workload on this server is 10% CPU and 30% RAM, occasionally it goes up to 80% CPU and 350GB RAM. At all the times when this issue occurred, CPU usage was noticed under 20% (mostly around 10%, only one occasion 20%, RAM was under 30% on all occasions).
Deadlock XML Event as per @Dan Guzman's request
File size was too large for this post so created this google drive file. Please click on the following link then in top right corner click download. It is a zip file.
https://drive.google.com/file/d/1oZ4dT8Yrd2uW2oBqBy9XK_laq7ftGzFJ/view?usp=sharing