2

How to write a thread-safe list using copy-on-write model in .NET?

Below is my current implementation, but after lots of reading about threading, memory barriers, etc, I know that I need to be cautious when multi-threading without locks is involved. Could someone comment if this is the correct implementation?

class CopyOnWriteList
{
    private List<string> list = new List<string>();

    private object listLock = new object();

    public void Add(string item)
    {
        lock (listLock)
        {
            list = new List<string>(list) { item };
        }
    }

    public void Remove(string item)
    {
        lock (listLock)
        {
            var tmpList = new List<string>(list);
            tmpList.Remove(item);
            list = tmpList;
        }
    }

    public bool Contains(string item)
    {
        return list.Contains(item);
    }

    public string Get(int index)
    {
        return list[index];
    }
}

EDIT

To be more specific: is above code thread safe, or should I add something more? Also, will all thread eventually see change in list reference? Or maybe I should add volatile keyword on list field or Thread.MemoryBarrier in Contains method between accessing reference and calling method on it?

Here is for example Java implementation, looks like my above code, but is such approach also thread-safe in .NET?

And here is the same question, but also in Java.

Here is another question related to this one.

Community
  • 1
  • 1
Rafał Kłys
  • 590
  • 6
  • 14
  • The correctness is to use [ConcurrentBag](http://msdn.microsoft.com/en-us/library/dd381779.aspx), needless to re-invent the wheel – cuongle Jun 28 '13 at 13:58
  • 1
    Consider checking out this question: [Which .NET library has copy-on-write collections?](http://stackoverflow.com/questions/8426737/which-net-library-has-copy-on-write-collections) – Alvin Wong Jun 28 '13 at 14:02
  • @Cuong Le: According to MSDN ConcurrentBag is "optimized for scenarios where the same thread will be both producing and consuming data". I need optimization for rare writes and frequent reads, copy-on-write could give me that. – Rafał Kłys Jun 28 '13 at 14:35
  • @Alvin Wong: It's not what I need. 1) I don't want to use F# libs 2) immutable collection is only part of "copy-on-write" – Rafał Kłys Jun 28 '13 at 14:38

3 Answers3

0

Implementation is correct because reference assignment is atomic in accordance to Atomicity of variable references. I would add volatile to list.

Vlad I.
  • 425
  • 2
  • 7
  • Thanks for your answer, but what I would like to achive is lock-free reading from list. I currently use ReaderWriterLockSlim, but hope to learn how to do it without it, using "copy-on-write" pattern. – Rafał Kłys Jun 28 '13 at 18:35
  • Thanks, so now the question is: is 'volatile' needed to make all threads see change in 'list' reference? – Rafał Kłys Jul 01 '13 at 08:21
  • Look at `volatile` in MSDN. It says `Fields that are declared volatile are not subject to compiler optimizations that assume access by a single thread. This ensures that the most up-to-date value is present in the field at all times.` i.e reads and writes are guaranteed to be in the order they appear, and read\write is guaranteed to be atomic. so do you need it ? - yes. – Vlad I. Jul 01 '13 at 10:08
0

Your approach looks correct, but I'd recommend using a string[] rather than a List<string> to hold your data. When you're adding an item, you know exactly how many items are going to be in the resulting collection, so you can create a new array of exactly the size required. When removing an item, you can grab a copy of the list reference and search it for your item before making a copy; if it turns out that the item doesn't exist, there's no need to remove it. If it does exist, you can create a new array of the exact required size, and copy to the new array all the items preceding or following the item to be removed.

Another thing you might want to consider would be to use a int[1] as your lock flag, and use a pattern something like:

static string[] withAddedItem(string[] oldList, string dat)
{
  string[] result = new string[oldList.Length+1];      
  Array.Copy(oldList, result, oldList.Length);
  return result;
}
int Add(string dat) // Returns index of newly-added item
{
  string[] oldList, newList;
  if (listLock[0] == 0)
  {
    oldList  = list;
    newList = withAddedItem(oldList, dat);
    if (System.Threading.Interlocked.CompareExchange(list, newList, oldList) == oldList)
      return newList.Length;
  }
  System.Threading.Interlocked.Increment(listLock[0]);
  lock (listLock)
  {
    do
    {
      oldList  = list;
      newList = withAddedItem(oldList, dat);
    } while (System.Threading.Interlocked.CompareExchange(list, newList, oldList) != oldList);
  }
  System.Threading.Interlocked.Decrement(listLock[0]);
  return newList.Length;
}

If there is no write contention, the CompareExchange will succeed without having to acquire a lock. If there is write contention, writes will be serialized by the lock. Note that the lock here is neither necessary nor sufficient to ensure correctness. Its purpose is to avoid thrashing in the event of write contention. It is possible that thread #1 might get past its first "if" test, and get task task-switched out while many other threads simultaneously try to write the list and start using the lock. If that occurs, thread #1 might then "surprise" the thread in the lock by performing its own CompareExchange. Such an action would result in the lock-holding thread having to waste time making a new array, but that situation should arise rarely enough that the occasional cost of an extra array copy shouldn't matter.

supercat
  • 77,689
  • 9
  • 166
  • 211
  • I looked at "copy-on-write", because in my use case adding and removing from list is rare (like once a month), but checking if item exists is very often (many times a second). I don't care about optimizations in locking in Add and Remove, I need performance in Contains method. "Copy-on-write" promises that, but I'm not sure if my implementation above is thread-save in .NET memory model. – Rafał Kłys Oct 18 '13 at 08:37
  • @RafałKłys: Your code fits in a really annoying gray area, since most implementations of .NET run on the x86 memory model which is a little stricter on the implementation side (more forgiving on the user side) than the .NET one. To be fully safe, it might be necessary to add an explicit memory barrier (I think `System.Threading.Thread.MemoryBarrier()`) between `tmpList.Remove` and `list = tmpList`; I'm pretty sure it's not needed under the x86 model, but if the program was run on some non-x86 implementation of .NET it might be needed. – supercat Oct 18 '13 at 15:13
0

Yes, it is thread-safe:

  1. Collection modifications in Add and Remove are done on separate collections, so it avoids concurrent access to the same collection from Add and Remove or from Add/Remove and Contains/Get.

  2. Assignment of the new collection is done inside lock, which is just pair of Monitor.Enter and Monitor.Exit, which both do a full memory barrier as noted here, which means that after the lock all threads should observe the new value of list field.

Rafał Kłys
  • 590
  • 6
  • 14