.NET Core Difference between TransactionScope and DistributedLock

Question

Good morning everyone,

We are implementing a method that stores a document into SqlServer with EF Core.

This method called SAVE is called by multiple endpoints of the controller and is used both from a createDocument endpoint and another endpoint putDocument.

An extra requirement is that this method contains two calls that store the document and its properties into 2 different repositories, therefore we wanted to achieve that if one of the two calls to the repositories fails, it rolls back the changes.

Moreover, the whole code is hosted on multiple machines that's why we also implemented the library of distributed lock of macallon (GitHub) in order to avoid that the same document is accessed/modified by two different machines at the same time.

Is it correct to use a mixup of these solutions? Being a noob for me a transactionScope is already a lock, in a sense that during the transaction other machines cannot even access the same row in the Db but please help me to understand. Thanks

 public void Save(Document document){
using var scope = new TransactionScope(TransactionScopeOption.RequiresNew,
               TransactionScopeAsyncFlowOption.Enabled);

          
            try
            {
                using (var lockHandler = documentLock.CreateLockForGuid(document.Guid))
                {
                    repositoryOne.Save(document);

                    repositoryTwo.Save(document);
                
                    scope.Complete();
               }
            }
}

lockhandler is just a wrapper which calls the distributedLock, SqlDistributedReaderWriterLock with the document.Guid as its name

They have nothing in common?????? Transactions are meant to isolate changes. They *use* locks, but that's only how they're implemented. There's no "DistributedLock" in SQL either. There's an "app lock" that *really shouldn't* be used. Fix your code so it doesn't need locking — Panagiotis Kanavos, Jun 09 '21 at 07:39
What do *you* mean by `DistributedLock`? What are you trying to do and most importantly **why**? Why does your "repository" need an explicit database transaction? How is it implemented? Asking for locks suggests something's seriously wrong. Web applications are meant to handle *hundreds* of concurrent requests. Using locks means you can't handle more than *one* at a time — Panagiotis Kanavos, Jun 09 '21 at 07:44
Why not use EF Core? Transactions and disconnected operations is a problem solved in the late 1990s. There's no need for explicit transactions in most cases. ADO.NET's DataSets/DataTables and EF Core itself don't keep a connection open all the time. All changes are cached and saved in a single internal transaction at the end. Connections are opened and used *only* when loading data or storing changes. This way, rollback is trivial - just don't call `SaveChanges` — Panagiotis Kanavos, Jun 09 '21 at 07:45
`distributedLock of macallon (github) in order to avoid that the same document is accessed/modified by two different machines at the same time.` No, that's not how databases work at all. And you definitely have data access bugs. Databases use *transactions* to prevent concurrent modifications to the same row. Transactions block though, which is why since the late 1990s., optimistic concurrency is used - when saving, the code checks whether the row was modified since it was read. If they were, an error is raised and the code has to decide what to do. This works perfectly and 1000x faster — Panagiotis Kanavos, Jun 09 '21 at 07:50
@PanagiotisKanavos [link](https://github.com/madelson/DistributedLock#readme), we use this library in order to avoid that different machines could wrongly update the same document at the same time — TroubleSomeKubernetes, Jun 09 '21 at 07:50
People think they need locks when they try to implement versioning or checkout/checking logic over "documents" or business entities. Those are *business* concerns though, that can take days. They can't be solved with low-level locks that are meant to live for milliseconds. The solution there is to use versions or edit flags in the table — Panagiotis Kanavos, Jun 09 '21 at 07:51
Again and again, that's what transactions are for. You don't need any kind of explicit locking to do what transactions already do. I strongly suggest removing **all** of the code, and definitely everything even remotely related to "repository" and just use EF Core properly until you understand how databases work. — Panagiotis Kanavos, Jun 09 '21 at 07:53
Why do you think no tutorial shows such code? Why all tutorials show how to use either EF Core alone, or, in the case of ADO.NET, with short-lived transactions? Applications, especially web applications, handle 100s of concurrent connections. People are using databases for this for 30+ years, and nobody has to use such explicit locks. That `DistributedLock` isn't even meant for data access synchronization, it's meant to abstract cross-process synchronization — Panagiotis Kanavos, Jun 09 '21 at 07:54
Thanks @PanagiotisKanavos, so what you suggest, is that I perform the first repository call( which is already a transaction by itself, so no concurrent calls can be done), then the second, maybe return a result back from both of them (e.g check that they are correct) and if one of those is not correct then roll back in the same save method by calling delete(document) or something like this. Thanks — TroubleSomeKubernetes, Jun 09 '21 at 07:59
Which is another thing people don't have a great need for. Few people need a way to replace Windows named mutexes with Redis locks or queues. They latency difference between the two is so huge (from nanoseconds to milliseconds) that the entire application would have to be rewritten to handle this. — Panagiotis Kanavos, Jun 09 '21 at 07:59
`what you suggest` I suggest you write simple, clean data access code first, and *then* try to solve problems that may not even exist. **Don't use "repository" at all** if you can use EF Core. A DbSet is already a repository for a single domain entity, a DbContext is already a unit-of-work covering multiple domain entities. The problem is already *solved*. All you need to combine all changes is to use a *single* DbContext instance, make all changes and only call `SaveChanges` when you want to commit the changes. — Panagiotis Kanavos, Jun 09 '21 at 08:02
And once again, if you want to add versioning or checkout logic to the "documents", you can't use low-level "locks". There are hundreds of documents, and checkouts occur for minutes if not days. Using a lock to "checkout" is similar to a library closing its doors if someone borrows a single book. Instead, they keep track of who borrows what separately. That's what business applications do too — Panagiotis Kanavos, Jun 09 '21 at 08:05
BTW preventing concurrent `POST/PUT` operations is something solved at the HTTP level, using optimistic concurrency, too. This is done through the [ETag](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag) and [If-Match](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag#avoiding_mid-air_collisions) header. When you retrieve an HTTP Response you get an `ETag` response header that identifies the "version" of the object. When you make a `POST`, the `If-Match` header should include the ETag value, telling the server to reject changes if the object's ETag changed — Panagiotis Kanavos, Jun 09 '21 at 08:12
If you use SQL Server, you can add a [rowversion](https://learn.microsoft.com/en-us/sql/t-sql/data-types/rowversion-transact-sql?view=sql-server-ver15) column to a table to enable optimistic concurrency. The `rowversion` for a row always changes if the row is modified. This makes it a great `ETag` value — Panagiotis Kanavos, Jun 09 '21 at 08:15

.NET Core Difference between TransactionScope and DistributedLock

0 Answers0