Possible Duplicate:
Why everyone states that SpinLock is faster?

This question concerns SpinLock, Monitor and Interlocked.

I wrote 2 tests that measure the performance of Monitor, SpinLock and Interlocked, and the results have left me confused.

My confusion is mainly about how fast SpinLock really is. According to my tests, SpinLock is slower than Monitor. But according to a number of documents and articles, SpinLock should provide performance gains.

So now I wonder: in which scenarios would SpinLock give a performance improvement?

Below are some details on the tests I performed:

In the first test I created a few threads (as many as I have hardware threads) that access the same shared lock object and perform a very short operation (or no operation at all: this is just a test).

In the second test I created an array of elements and a few threads that randomly access elements of this array. Each element contains its own locking object: a System.Object for the Monitor test and a SpinLock object for the SpinLock test. For the Interlocked.Increment test, each thread performs Interlocked.Increment on a public int field inside the array element.

In each test access to the shared region is performed in a loop. Each test consisted of 3 routines:

  • Testing SpinLock
  • Testing Monitor
  • Testing Interlocked.Increment

Each test showed that SpinLock was slower than Monitor. So, again, the question that has bothered me ever since I ran these tests: in which scenarios does SpinLock actually improve performance?


Here is the code of the tests, to give the full details:

(Both tests were compiled against .NET 4.5)

TEST 1. The threads try to gain exclusive access to the same shared locking object.

using System;
using System.Diagnostics;
using System.Threading;

class Program
{
    static int _loopsCount = 1000000;
    static int _threadsCount = -1;

    static ProcessPriorityClass _processPriority = ProcessPriorityClass.RealTime;
    static ThreadPriority _threadPriority = ThreadPriority.Highest;

    static long _testingVar = 0;


    static void Main(string[] args)
    {
        _threadsCount = Environment.ProcessorCount;
        _threadsCount = (_threadsCount == 0) ? 1 : _threadsCount;

        Console.WriteLine("Cores/processors count: {0}", Environment.ProcessorCount);
        Console.WriteLine("Threads count: {0}", _threadsCount);

        Process.GetCurrentProcess().PriorityClass = _processPriority;

        TimeSpan tsInterlocked = ExecuteInterlocked();
        TimeSpan tsSpinLock = ExecuteSpinLock();
        TimeSpan tsMonitor = ExecuteMonitor();

        Console.WriteLine("Test with interlocked: {0} ms\r\nTest with SpinLock: {1} ms\r\nTest with Monitor: {2} ms",
            tsInterlocked.TotalMilliseconds,
            tsSpinLock.TotalMilliseconds,
            tsMonitor.TotalMilliseconds);

        Console.ReadLine();
    }

    static TimeSpan ExecuteInterlocked()
    {
        _testingVar = 0;

        ManualResetEvent _startEvent = new ManualResetEvent(false);
        CountdownEvent _endCountdown = new CountdownEvent(_threadsCount);

        Thread[] threads = new Thread[_threadsCount];

        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
                {
                    _startEvent.WaitOne();

                    for (int j = 0; j < _loopsCount; j++)
                    {
                        Interlocked.Increment(ref _testingVar);
                    }

                    _endCountdown.Signal();
                });

            threads[i].Priority = _threadPriority;
            threads[i].Start();
        }

        Stopwatch sw = Stopwatch.StartNew();

        _startEvent.Set();
        _endCountdown.Wait();

        return sw.Elapsed;
    }

    static SpinLock _spinLock = new SpinLock();

    static TimeSpan ExecuteSpinLock()
    {
        _testingVar = 0;

        ManualResetEvent _startEvent = new ManualResetEvent(false);
        CountdownEvent _endCountdown = new CountdownEvent(_threadsCount);

        Thread[] threads = new Thread[_threadsCount];

        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                _startEvent.WaitOne();

                bool lockTaken;

                for (int j = 0; j < _loopsCount; j++)
                {
                    lockTaken = false;

                    try
                    {
                        _spinLock.Enter(ref lockTaken);

                        _testingVar++;
                    }
                    finally
                    {
                        if (lockTaken)
                        {
                            _spinLock.Exit();
                        }
                    }
                }

                _endCountdown.Signal();
            });

            threads[i].Priority = _threadPriority;
            threads[i].Start();
        }

        Stopwatch sw = Stopwatch.StartNew();

        _startEvent.Set();
        _endCountdown.Wait();

        return sw.Elapsed;
    }

    static object _locker = new object();

    static TimeSpan ExecuteMonitor()
    {
        _testingVar = 0;

        ManualResetEvent _startEvent = new ManualResetEvent(false);
        CountdownEvent _endCountdown = new CountdownEvent(_threadsCount);

        Thread[] threads = new Thread[_threadsCount];

        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                _startEvent.WaitOne();

                bool lockTaken;

                for (int j = 0; j < _loopsCount; j++)
                {
                    lockTaken = false;

                    try
                    {
                        Monitor.Enter(_locker, ref lockTaken);

                        _testingVar++;
                    }
                    finally
                    {
                        if (lockTaken)
                        {
                            Monitor.Exit(_locker);
                        }
                    }
                }

                _endCountdown.Signal();
            });

            threads[i].Priority = _threadPriority;
            threads[i].Start();
        }

        Stopwatch sw = Stopwatch.StartNew();

        _startEvent.Set();
        _endCountdown.Wait();

        return sw.Elapsed;
    }
}

TEST 2. The threads try to gain exclusive access to randomly picked elements of an array, i.e. a test with low contention.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

namespace TestConcurrency
{
    class Program
    {
        static int _loopsCount = 10000000;
        static int _threadsCount = -1;
        static int _arrayCount = 1000;

        static ProcessPriorityClass _processPriority = ProcessPriorityClass.RealTime;
        static ThreadPriority _threadPriority = ThreadPriority.Highest;

        static void Main(string[] args)
        {
            _threadsCount = Environment.ProcessorCount;
            _threadsCount = (_threadsCount == 0) ? 1 : _threadsCount;

            Console.WriteLine("Cores/processors count: {0}", Environment.ProcessorCount);
            Console.WriteLine("Threads count: {0}", _threadsCount);

            Process.GetCurrentProcess().PriorityClass = _processPriority;

            TimeSpan tsInterlocked = ExecuteInterlocked();
            TimeSpan tsSpinLock = ExecuteSpinLock();
            TimeSpan tsMonitor = ExecuteMonitor();

            Console.WriteLine("Test with interlocked: {0} ms\r\nTest with SpinLock: {1} ms\r\nTest with Monitor: {2} ms",
                tsInterlocked.TotalMilliseconds,
                tsSpinLock.TotalMilliseconds,
                tsMonitor.TotalMilliseconds);

            Console.ReadLine();
        }

        static IEnumerable<int> newList()
        {
            return Enumerable.Range(0, _arrayCount);
        }

        static TimeSpan ExecuteMonitor()
        {
            ManualResetEvent _startEvent = new ManualResetEvent(false);
            CountdownEvent _endCountdown = new CountdownEvent(_threadsCount);

            Thread[] threads = new Thread[_threadsCount];
            var array = newList().Select(i => new ArrayElementForMonitor()).ToArray();

            for (int i = 0; i < threads.Length; i++)
            {
                int localI = i;

                threads[i] = new Thread(() =>
                {
                    Random r = new Random(localI * localI * localI);

                    int index = 0;

                    _startEvent.WaitOne();

                    bool lockTaken;

                    for (int j = 0; j < _loopsCount; j++)
                    {
                        index = r.Next(0, _arrayCount);

                        lockTaken = false;

                        try
                        {
                            Monitor.Enter(array[index].Locker, ref lockTaken);
                        }
                        finally
                        {
                            if (lockTaken)
                            {
                                Monitor.Exit(array[index].Locker);
                            }
                        }
                    }

                    _endCountdown.Signal();
                });

                threads[i].Priority = _threadPriority;
                threads[i].Start();
            }

            GC.Collect();

            Stopwatch sw = Stopwatch.StartNew();

            _startEvent.Set();
            _endCountdown.Wait();

            return sw.Elapsed;
        }

        static TimeSpan ExecuteSpinLock()
        {
            ManualResetEvent _startEvent = new ManualResetEvent(false);
            CountdownEvent _endCountdown = new CountdownEvent(_threadsCount);

            Thread[] threads = new Thread[_threadsCount];
            var array = newList().Select(i => new ArrayElementForSpinLock()).ToArray();

            for (int i = 0; i < threads.Length; i++)
            {
                int localI = i;

                threads[i] = new Thread(() =>
                {
                    Random r = new Random(localI * localI * localI);

                    int index = 0;

                    _startEvent.WaitOne();

                    bool lockTaken;

                    for (int j = 0; j < _loopsCount; j++)
                    {
                        index = r.Next(0, _arrayCount);

                        lockTaken = false;

                        try
                        {
                            array[index].Locker.Enter(ref lockTaken);
                        }
                        finally
                        {
                            if (lockTaken)
                            {
                                array[index].Locker.Exit();
                            }
                        }
                    }

                    _endCountdown.Signal();
                });

                threads[i].Priority = _threadPriority;
                threads[i].Start();
            }

            GC.Collect();

            Stopwatch sw = Stopwatch.StartNew();

            _startEvent.Set();
            _endCountdown.Wait();

            return sw.Elapsed;
        }

        static TimeSpan ExecuteInterlocked()
        {
            ManualResetEvent _startEvent = new ManualResetEvent(false);
            CountdownEvent _endCountdown = new CountdownEvent(_threadsCount);

            Thread[] threads = new Thread[_threadsCount];
            var array = newList().Select(i => new ArrayElementInterlocked()).ToArray();

            for (int i = 0; i < threads.Length; i++)
            {
                int localI = i;

                threads[i] = new Thread(() =>
                {
                    Random r = new Random(localI * localI * localI);

                    int index = 0;

                    _startEvent.WaitOne();

                    for (int j = 0; j < _loopsCount; j++)
                    {
                        index = r.Next(0, _arrayCount);

                        Interlocked.Increment(ref array[index].Element);
                    }

                    _endCountdown.Signal();
                });

                threads[i].Priority = _threadPriority;
                threads[i].Start();
            }

            GC.Collect();

            Stopwatch sw = Stopwatch.StartNew();

            _startEvent.Set();
            _endCountdown.Wait();

            return sw.Elapsed;
        }
    }

    public class ArrayElementForMonitor
    {
        public object Locker = new object();
    }

    public class ArrayElementForSpinLock
    {
        public SpinLock Locker = new SpinLock();
    }

    public class ArrayElementInterlocked
    {
        public int Element;
    }
}

ADDITIONAL TEST 3. The test is executed in a single thread, so the thread has the highest possible chance of acquiring the lock.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

namespace TestSimpleLocking
{
    class Program
    {
        static int _loopsCount = 100000000;

        static ProcessPriorityClass _processPriority = ProcessPriorityClass.RealTime;
        static ThreadPriority _threadPriority = ThreadPriority.Highest;

        static void Main(string[] args)
        {
            Process.GetCurrentProcess().PriorityClass = _processPriority;
            Thread.CurrentThread.Priority = _threadPriority;

            TimeSpan tsInterlocked = ExecuteInterlocked();
            TimeSpan tsSpinLock = ExecuteSpinLock();
            TimeSpan tsMonitor = ExecuteMonitor();

            Console.WriteLine("Test with interlocked: {0} ms\r\nTest with SpinLock: {1} ms\r\nTest with Monitor: {2} ms",
                tsInterlocked.TotalMilliseconds,
                tsSpinLock.TotalMilliseconds,
                tsMonitor.TotalMilliseconds);

            Console.ReadLine();
        }

        static TimeSpan ExecuteMonitor()
        {
            object locker = new object();
            int variable = 0;

            Stopwatch sw = Stopwatch.StartNew();
            bool lockTaken = false;

            for (int i = 0; i < _loopsCount; i++)
            {
                lockTaken = false;

                try
                {
                    Monitor.Enter(locker, ref lockTaken);

                    variable++;
                }
                finally
                {
                    if (lockTaken)
                    {
                        Monitor.Exit(locker);
                    }
                }
            }

            sw.Stop();

            Console.WriteLine(variable);

            return sw.Elapsed;
        }

        static TimeSpan ExecuteSpinLock()
        {
            SpinLock spinLock = new SpinLock();
            int variable = 0;

            Stopwatch sw = Stopwatch.StartNew();

            bool lockTaken = false;

            for (int i = 0; i < _loopsCount; i++)
            {
                lockTaken = false;

                try
                {
                    spinLock.Enter(ref lockTaken);

                    variable++;
                }
                finally
                {
                    if (lockTaken)
                    {
                        spinLock.Exit();
                    }
                }
            }

            sw.Stop();

            Console.WriteLine(variable);

            return sw.Elapsed;
        }

        static TimeSpan ExecuteInterlocked()
        {
            int variable = 0;

            Stopwatch sw = Stopwatch.StartNew();

            for (int i = 0; i < _loopsCount; i++)
            {
                Interlocked.Increment(ref variable);
            }

            sw.Stop();

            Console.WriteLine(variable);

            return sw.Elapsed;
        }
    }
}

As far as I understand, the 3rd test is the best case for choosing SpinLock: no contention at all, a single thread, sequential execution. Why is SpinLock still far behind Monitor? Can anyone point me to some code that would prove to me that SpinLock is useful at all (outside of device driver development)?

Rauf
  • Out of interest, why do you think explaining your code in words will be more beneficial than posting your actual code? – Daniel Kelley Feb 01 '13 at 08:27
  • @DanielKelley I believe he posted the code in his previous question: http://stackoverflow.com/questions/14611320/why-everyone-states-that-spinlock-is-faster – Mike Zboray Feb 01 '13 at 08:38
  • @mikez oh, I see... Why two questions? – Lorenzo Dematté Feb 01 '13 at 08:45
  • He got very thorough explanations, but he just couldn't understand them, so he is still posting this. Just a troll with a big ego. – Boppity Bop Feb 02 '13 at 03:55
  • Thorough? This is just a theory. In documentation it is stated that SpinLock gives a good performance gain in low contention scenario. But test 2 is indeed a low contention scenario. And still SpinLocks turned out to be slow. – Rauf Feb 03 '13 at 21:14
  • You failed to use the correct SpinLock constructor. You have used the default constructor, which leaves enableThreadOwnerTracking = true. This makes SpinLock perform worse than Monitor.Enter. Use new SpinLock(false) instead. You should also use Locker.Exit(false). – Rolf Kristensen May 19 '16 at 19:26
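
For reference, the constructor and Exit overloads named in the last comment might be used roughly as follows. This is a minimal sketch, not a benchmark and not part of the original tests: the class name SpinLockNoTracking and the helper Run are hypothetical, and the thread and loop counts are arbitrary. It only shows the calls in context and makes no claim about the resulting timings.

```csharp
using System;
using System.Threading;

static class SpinLockNoTracking
{
    // Not readonly: SpinLock is a mutable struct, and a readonly field
    // would be copied on each call, so the lock would never really be held.
    private static SpinLock _lock = new SpinLock(enableThreadOwnerTracking: false);
    private static long _counter;

    // Hypothetical helper: increments a shared counter from several threads
    // and returns the final value.
    public static long Run(int threadCount, int loopsPerThread)
    {
        _counter = 0;
        var threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < loopsPerThread; j++)
                {
                    bool lockTaken = false;
                    try
                    {
                        _lock.Enter(ref lockTaken);
                        _counter++;
                    }
                    finally
                    {
                        // Exit(false) releases the lock without the extra memory barrier.
                        if (lockTaken) _lock.Exit(useMemoryBarrier: false);
                    }
                }
            });
            threads[i].Start();
        }

        foreach (var t in threads) t.Join();
        return _counter;
    }

    static void Main()
    {
        // Prints threadCount * loopsPerThread if no increments were lost.
        Console.WriteLine(Run(Environment.ProcessorCount, 100000));
    }
}
```

Whether this closes the gap with Monitor on a given machine would still have to be measured.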

1 Answer

SpinLock is very fast if contention on the resource is low (i.e. when getting the lock on the resource almost always succeeds). Reference: Joe Duffy's book and blog, http://www.bluebytesoftware.com/blog/

"In each test access to the shared region is performed in a loop"

could mean that contention is high. (BTW, can you post a complete code example? It would help and reduce the "guesswork" required.) Therefore, it is likely that the SpinLock spins, then waits, making it worse than a Monitor, which directly waits.
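
To make the "spins, then waits" behaviour concrete, here is a toy lock built on Interlocked.CompareExchange and SpinWait. It is only a sketch of the general idea and is not how System.Threading.SpinLock is actually implemented; the names ToySpinLock and ToySpinLockDemo are made up for this illustration.

```csharp
using System;
using System.Threading;

// Toy lock for illustration only; NOT the actual SpinLock implementation.
sealed class ToySpinLock
{
    private int _held; // 0 = free, 1 = held

    public void Enter()
    {
        var spinner = new SpinWait();

        // Try to flip 0 -> 1 atomically; if another thread holds the lock, back off.
        while (Interlocked.CompareExchange(ref _held, 1, 0) != 0)
        {
            // SpinOnce busy-waits on the first calls and starts yielding
            // the thread once it has spun for a while.
            spinner.SpinOnce();
        }
    }

    public void Exit()
    {
        // Release the lock; the volatile write publishes earlier writes first.
        Volatile.Write(ref _held, 0);
    }
}

static class ToySpinLockDemo
{
    // Hypothetical helper: all threads fight over one lock, as in TEST 1.
    public static long Run(int threadCount, int loopsPerThread)
    {
        var toyLock = new ToySpinLock();
        long counter = 0;
        var threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < loopsPerThread; j++)
                {
                    toyLock.Enter();
                    try { counter++; }
                    finally { toyLock.Exit(); }
                }
            });
            threads[i].Start();
        }

        foreach (var t in threads) t.Join();
        return counter;
    }

    static void Main()
    {
        Console.WriteLine(Run(4, 100000)); // 4 * 100000 increments in total
    }
}
```

Under heavy contention most calls to Enter go around the loop many times before they succeed, and every one of those iterations is wasted work.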

EDIT: after reading the details of your closed, related question, I totally agree with Hans Passant's answer:

"So basic requirements is that the lock is held for a very short time, which is true in your case. And that there are reasonable odds that the lock can be acquired. Which is not true in your case, the lock is heavily contested by no less than 24 threads."

Blindly using a SpinLock, without measuring and/or without understanding at least the principles behind its design, is a case of premature optimization that can quickly lead to code that is actually slower, or even incorrect. Remember: some synchronization structures guarantee fairness and/or progress, others do not; some work better when most accesses are read-only, some when contention is low, and so on. And fairness could be relevant in this case.

Just another quick, untested hypothesis: I was more surprised that Interlocked.Increment is slower than or equal to Monitor. That made me think about cache coherence issues; after all, Interlocked too works best when there's little write contention, because it is implemented using atomic CAS operations on the target variable. Under a write-heavy scenario like yours it will need a significant number of retries, and each retry could generate a significant amount of traffic on the inter-core bus to keep the caches coherent. Using Monitor could somehow "serialize" access better, reducing traffic on the inter-core/inter-processor bus. But all of this is just guesswork :)

Lorenzo Dematté
  • So basic requirements is that the lock is held for a very short time, – Rauf Feb 03 '13 at 21:19
  • So basic requirements is that the lock is held for a very short time: In the second test each element represents a single locking object. Each element is accessed randomly by a few threads, i.e. the chances of a short waiting period are very high. Isn't it a low contention scenario? – Rauf Feb 03 '13 at 21:27
  • @RaufIsmayilov aren't you working on a 24-core system? "the lock is heavily contested by no less than 24 threads" is not a few threads...all of which have a high likelihood of trying to get the lock at the same time. "Few threads" is when you have 1, sometimes 2, trying to access the lock at a given time – Lorenzo Dematté Feb 04 '13 at 08:50
  • Could you please take a look at the second test? I performed 2 tests for a reason. I ran the second test with only 8 hardware threads on a 4-core CPU; in that test each element is used as a lock, and the chances that two of the 8 threads randomly access the same element among 1000 elements are very low. I even increased the length of the array to 100000, and SpinLock still falls behind. – Rauf Feb 04 '13 at 09:55