
I have a distributed application; that is, I have the same process running on multiple computers, each talking to a central database and accessing a network file share.

This process picks up a collection of files from a network file share (via CIFS), runs a transformation algorithm on those files, and copies the output back onto the network file share.

I need to lock the input files so that other servers -- running the same process -- will not work on the same files. For the sake of argument, assume that my description is oversimplified and that the locks are an absolute must.

Here are my proposed solutions, and some thoughts.

1) Use opportunistic locks (oplocks). This solution uses only the file system to lock files. The problem here is that we have to try to acquire the lock to find out whether a lock already exists. This seems like it could be expensive, since the network redirectors have to negotiate the locks. The nice thing about this is that oplocks can be created in such a way that they self-delete when there is an error.

2) Use database app locks (via sp_getapplock). This seems like it would be much faster, but now we are using a database to lock a file system. Also, database app locks are scoped to a transaction or a session, which means that I must hold onto the connection if I want to hold onto -- and later release -- the app lock. Currently we are using connection pooling, which would have to change, and that may be a bigger conversation unto itself. The nice thing about this approach is that the locks get cleaned up if we lose our connection to the server. Of course, this also means that if we lose the connection to the database, but not to the network file share, the lock goes away while we are still processing the input files. (A rough sketch of this option appears after this list.)

3) Create a database table and stored procedures to represent the items I would like to lock. This approach is straightforward. The downside is, of course, potential network errors: if for some reason the database becomes unreachable, the lock remains in effect, and we would then need some algorithm to clean it up at a later date. (A sketch of the table and statements this might use also appears after this list.)
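
For concreteness, here is a minimal sketch of what option 2 might look like from native code through ODBC. The DSN, the resource name, and the (omitted) error handling are all placeholders, and because the lock is session-scoped the connection has to stay out of the pool until sp_releaseapplock is called:

```cpp
#include <windows.h>
#include <sql.h>
#include <sqlext.h>

int main(void)
{
    SQLHENV env = NULL;
    SQLHDBC dbc = NULL;
    SQLHSTMT stmt = NULL;

    SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env);
    SQLSetEnvAttr(env, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0);
    SQLAllocHandle(SQL_HANDLE_DBC, env, &dbc);

    // Placeholder connection string; this connection must be held open
    // (outside the pool) for as long as the lock is held.
    SQLDriverConnectW(dbc, NULL,
                      (SQLWCHAR*)L"DSN=CentralDb;Trusted_Connection=Yes;", SQL_NTS,
                      NULL, 0, NULL, SQL_DRIVER_NOPROMPT);

    SQLAllocHandle(SQL_HANDLE_STMT, dbc, &stmt);

    // Session-scoped app lock named after the input file. SET NOCOUNT ON
    // keeps the batch from producing extra row-count results ahead of the
    // SELECT that carries sp_getapplock's return code.
    wchar_t acquire[] =
        L"SET NOCOUNT ON; "
        L"DECLARE @r int; "
        L"EXEC @r = sp_getapplock @Resource = N'xyz.xml', "
        L"    @LockMode = 'Exclusive', @LockOwner = 'Session', @LockTimeout = 0; "
        L"SELECT @r;";
    SQLExecDirectW(stmt, (SQLWCHAR*)acquire, SQL_NTS);

    SQLINTEGER rc = -999;
    SQLLEN ind = 0;
    if (SQLFetch(stmt) == SQL_SUCCESS)
        SQLGetData(stmt, 1, SQL_C_SLONG, &rc, sizeof(rc), &ind);
    SQLCloseCursor(stmt);

    if (rc >= 0)   // 0 or 1 means the lock was granted; negative means it was not
    {
        // ... process the input files while the lock and connection are held ...

        SQLExecDirectW(stmt,
            (SQLWCHAR*)L"EXEC sp_releaseapplock @Resource = N'xyz.xml', @LockOwner = 'Session';",
            SQL_NTS);
    }

    SQLFreeHandle(SQL_HANDLE_STMT, stmt);
    SQLDisconnect(dbc);
    SQLFreeHandle(SQL_HANDLE_DBC, dbc);
    SQLFreeHandle(SQL_HANDLE_ENV, env);
    return 0;
}
```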
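
And here is a hedged sketch of the kind of table and statements option 3 might use. The table, column, and threshold values are all illustrative; acquisition relies on the primary key so that only one server can insert the row for a given file:

```cpp
// Illustrative schema and statements for the lock table (option 3).
// The '?' markers are ODBC-style parameters for the file name and server name.
const wchar_t* createTable =
    L"CREATE TABLE dbo.FileLocks ("
    L"    FileName    nvarchar(400) NOT NULL PRIMARY KEY,"   // one row per locked file
    L"    LockedBy    nvarchar(128) NOT NULL,"               // which server holds it
    L"    LockedAtUtc datetime2     NOT NULL DEFAULT sysutcdatetime());";

// Acquire: the INSERT fails with a key violation if another server already
// holds the lock, which is the signal to move on to a different file.
const wchar_t* acquireLock =
    L"INSERT INTO dbo.FileLocks (FileName, LockedBy) VALUES (?, ?);";

// Release: only delete the row if we are the ones who locked it.
const wchar_t* releaseLock =
    L"DELETE FROM dbo.FileLocks WHERE FileName = ? AND LockedBy = ?;";

// The 'clean up at a later date' part: drop rows older than some threshold
// (four hours here, purely as an example).
const wchar_t* cleanupStale =
    L"DELETE FROM dbo.FileLocks WHERE LockedAtUtc < dateadd(hour, -4, sysutcdatetime());";
```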

What is the best solution and why? Answers are not limited to those mentioned above.

Phillip Scott Givens

1 Answer


For your situation you should use share-mode locks. This is exactly what they were made for.

Oplocks won't do what you want - an oplock is not a lock, and it doesn't prevent anyone from doing anything. It's a notification mechanism to let the client machine know when anyone else accesses the file. This is communicated to the machine by "breaking" your oplock, but the break does not make its way up to the application layer (i.e. to your code) - it just generates a message to the client operating system telling it to invalidate its cached copy and fetch the file again from the server.

See MSDN here:

The explanation of what happens when another process opens a file on which you hold an oplock is here:

However, the important point is that oplocks do not prevent other processes from opening the file; they just allow coordination between the client computers. Therefore, oplocks do not lock the file at the application level - they are a feature of the network protocol used by the network file system stack to implement caching. They are not really for applications to use.


Since you are programming on Windows, the appropriate solution seems to be share-mode locks, i.e. opening the file while denying read, write, and delete sharing to everyone else (with CreateFile, that means passing a share mode of 0).
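
Roughly, and only as a sketch (the UNC path is made up):

```cpp
#include <windows.h>
#include <stdio.h>

int wmain(void)
{
    // Open the input file for exclusive access: a share mode of 0 denies
    // read, write, and delete sharing to every other handle, including
    // handles opened by other servers through the CIFS share.
    HANDLE h = CreateFileW(L"\\\\fileserver\\share\\input\\xyz.xml",
                           GENERIC_READ,
                           0,                       // dwShareMode = 0: no sharing
                           NULL,
                           OPEN_EXISTING,
                           FILE_ATTRIBUTE_NORMAL,
                           NULL);
    if (h == INVALID_HANDLE_VALUE)
    {
        DWORD err = GetLastError();
        if (err == ERROR_SHARING_VIOLATION)
            wprintf(L"Another server already has this file open; skip it.\n");
        else
            wprintf(L"Open failed, error %lu\n", err);
        return 1;
    }

    // ... run the transformation while the handle is open; the lock lasts
    // exactly as long as the handle, and the server drops it if this
    // client's session dies ...

    CloseHandle(h);   // releases the share-mode lock
    return 0;
}
```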

If share-mode locks are not supported on the CIFS server, you might consider flock() type locks. (Named after a traditional Unix technique).

If you are processing xyz.xml, create a file called xyz.xml.lock (with the CREATE_NEW disposition so you don't clobber an existing one). Once you are done, delete it. If you fail to create the file because it already exists, that means another process is working on it. It can be useful to write debugging information into the lock file, such as the server name and PID. You will also have to have some way of cleaning up abandoned lock files, since that won't happen automatically.
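
A hedged sketch of that lock-file pattern (the path and the "SERVER01:1234" owner string you would pass in are placeholders):

```cpp
#include <windows.h>
#include <string.h>

// Try to claim xyz.xml by creating xyz.xml.lock with CREATE_NEW, which
// fails atomically if the lock file already exists. The owner string
// (server name and PID) is only there to help debug abandoned locks.
bool TryLockFile(const wchar_t* lockPath, const char* owner)
{
    HANDLE h = CreateFileW(lockPath,
                           GENERIC_WRITE,
                           0,                    // keep the lock file itself exclusive
                           NULL,
                           CREATE_NEW,           // fail if another server got there first
                           FILE_ATTRIBUTE_NORMAL,
                           NULL);
    if (h == INVALID_HANDLE_VALUE)
    {
        // ERROR_FILE_EXISTS means another process owns the input file;
        // anything else is a real error the caller may want to retry.
        return false;
    }

    DWORD written = 0;
    WriteFile(h, owner, (DWORD)strlen(owner), &written, NULL);
    CloseHandle(h);
    return true;
}

// Releasing the lock is just DeleteFileW(lockPath). Abandoned .lock files
// still need a separate, age-based cleanup pass, since nothing removes
// them automatically.
```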

Database locks might be appropriate if the CIFS share is, for example, a replicated system, so that creating the lock file is not atomic across the system. Otherwise I would stick with the file system, as then there is only one thing that can go wrong.

Ben
  • Ben, thank you for the response. What you tell me here about oplocks seems to disagree with what is found under the "Network I/O" section of MSDN. Furthermore, it sounds to me like you are implying that the SHARE_... flags are in opposition to oplocks, whereas MSDN talks about how these flags direct how the network redirectors handle oplocks. Thank you for pointing me to flock; I had not considered that and will look into it. – Phillip Scott Givens Oct 07 '13 at 17:36
  • @PhillipScottGivens, Expanded answer. Oplocks are not locks from the point of view of the application - they are an implementation detail allowing the client machine network stack to implement caching. For your situation you should use share-mode locks if supported by the server. This is *exactly* what they were made for. – Ben Oct 07 '13 at 18:43
  • Thank you for the follow up. Again, you tell a different story from Microsoft. From what they have written, I am under the impression that oplocks are not seen from the application. They are created for you by the kernel level file system driver on the remote machine. Whether they are exclusive or not, handle caching or not, or a slew of other factors is determined by the file share flags, file access flags and file options flags. If you disagree, would you please provide me a link supporting your position? – Phillip Scott Givens Oct 07 '13 at 19:59
  • Sorry, I just saw that you updated your post with links. Give me a moment and I will get back to you with the links that go along with those which you have provided. – Phillip Scott Givens Oct 07 '13 at 20:04
  • "However, because the server checks the sharing state before it breaks the lock, in the case where the server would deny an open request due to a sharing conflict the server does not break the lock. For example, if you have opened a file, denied sharing for write operations, and obtained a level 1 lock, the server denies another client's request to open the file for writing before it even examines your lock on the file. In this instance, your opportunistic lock is not broken." from http://msdn.microsoft.com/en-us/library/windows/desktop/aa365713(v=vs.85).aspx – Phillip Scott Givens Oct 07 '13 at 20:12
  • @PhillipScottGivens, I think if you re-read carefully you will find that I am saying the same thing as Microsoft. But the takeaway is 1) use the share mode flags, that's what they are for. 2) Oplocks will not do what you want. – Ben Oct 07 '13 at 20:47
  • I have read it carefully and I think you are too focused on the local caching aspect of oplocks and overlooking the data coherency aspect. http://msdn.microsoft.com/en-us/library/windows/desktop/aa363880(v=vs.85).aspx – Phillip Scott Givens Oct 07 '13 at 21:31
  • I have no idea why you think that. I also have no idea why you are interested in oplocks or suggested them in the first place. *These are not the locks you are looking for.* (Waves hand. @PhillipScottGivens loses interest in oplocks which have nothing to do with his problem and uses SHARE_DENY flags instead) – Ben Oct 07 '13 at 21:42
  • Be nice. "An opportunistic lock (also called an oplock) is a lock placed by a client on a file residing on a server." [MSDN: Opportunistic Locks](http://msdn.microsoft.com/en-us/library/windows/desktop/aa365433(v=vs.85).aspx) - Very first sentence. – Phillip Scott Givens Oct 07 '13 at 22:08
  • Thank you for staying on this thread. I have re-read the MSDN Network I/O sections many times in trying to understand all the gotchas related to working with the network file system. Your persistence has made me look at them yet another time. I now see that oplocks only matter if you are actually writing to or reading from the file. – Phillip Scott Givens Oct 08 '13 at 00:43