3

Spinlocks may be effective only on systems with real parallelism, i.e. multicore/multiprocessor systems. That is not surprising, given their design.

Nonetheless, the threads sharing the resource must execute on different cores. Otherwise the situation is similar to a uniprocessor system.

Is it just a matter of probability, or does the scheduler try to put interlocking threads on different CPUs to provide real concurrency?

Pavel Voronin
  • 13,503
  • 7
  • 71
  • 137

2 Answers

5

The cost of a thread-context switch is the reason why spinlocks can be useful. The usual number quoted is between 2,000 and 10,000 CPU cycles for a switch on Windows, a cost that comes from having to store and reload the processor state and from the stalls caused by the guaranteed cache misses. This is pure overhead; nothing useful gets done.

So a program can perform better if it waits for a lock to become available, burning up several thousand cycles while periodically trying to enter the lock. That avoids the context switch if the lock is expected to become available quickly.

There is no timer available to wait for such short durations; it is implemented by actually burning cycles in a small loop: spinning. This is supported at the processor level, with the dedicated PAUSE machine-code instruction available to reduce the power consumed while spinning on Intel/AMD cores. Thread.Yield() and Thread.SpinWait() in .NET take advantage of it.
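A minimal sketch of the idea (illustrative only; the `BusyWait` class and the spin counts are made up for the example and are not how the .NET primitives are implemented internally):

```csharp
using System.Threading;

static class BusyWait
{
    // Set by another thread; volatile so the read isn't hoisted out of the loop.
    private static volatile bool _ready;

    public static void Signal() => _ready = true;

    public static void WaitUntilReady()
    {
        int spins = 0;
        while (!_ready)
        {
            if (spins++ < 20)
                Thread.SpinWait(50);   // burns a few dozen cycles, emitting PAUSE on x86
            else
                Thread.Yield();        // spun long enough, give the core to another thread
        }
    }
}
```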

Actually using a spinlock effectively in code can be tricky; it of course only works well if the odds of acquiring the lock quickly are high enough. If they are not, then spinning is actively harmful, since it delays a context switch that might be required to get another thread back onto the processor so it can release the lock. The .NET 4.0 SpinLock class is useful; it generates ETW events for failed spins. It is not otherwise well documented.
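For reference, the typical usage pattern of SpinLock looks like this (a sketch; the shared counter is just an example workload):

```csharp
using System.Threading;

class Counter
{
    // enableThreadOwnerTracking: false is the cheap production mode; true helps debugging.
    private SpinLock _lock = new SpinLock(enableThreadOwnerTracking: false);
    private int _value;

    public void Increment()
    {
        bool lockTaken = false;
        try
        {
            _lock.Enter(ref lockTaken);   // spins (and eventually yields) if contended
            _value++;                     // keep the protected region as short as possible
        }
        finally
        {
            if (lockTaken) _lock.Exit();
        }
    }
}
```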

Hans Passant
  • 922,412
  • 146
  • 1,693
  • 2,536
  • Still not clear whether some special mechanism exists to schedule interlocking threads onto different cores. Let two threads compete for the resource on a CPU with two cores. If both threads run on the same core then spinning is a complete waste in 50% of waits. So for each particular case there can be a threshold number of cores above which spinning is a win on average, and a waste otherwise. If my speculation is correct then the code should be clever enough to select a locking strategy depending on the hardware. Is there any fallback to usual locking if the CLR runs on a uniprocessor system? – Pavel Voronin Mar 14 '14 at 11:02
  • There is little to gain if you are getting it fundamentally wrong; you should never start more threads than you have cores, a strategy that the ThreadPool already actively pursues. Both Thread.SpinWait() and the SpinLock class will not spin on a uniprocessor system. – Hans Passant Mar 14 '14 at 11:08
  • Anyway, there are thousands of running threads which share a small number of cores. I see little difference between 100 and 101 threads. Do I miss something? – Pavel Voronin Mar 14 '14 at 11:13
  • If your program starts thousands of threads then you are doing it *very* wrong. The entire operating system typically manages a thousand threads. All of them blocked. – Hans Passant Mar 14 '14 at 11:17
  • If the machine has a lot of processes that are burning core then you'll certainly notice, your program slows down. Of course that's not a problem you can solve in software, you solve it by getting a better machine or moving an expensive process to another machine. – Hans Passant Mar 14 '14 at 11:26
  • +1 Trying to evaluate spinlock gain/loss is a difficult task, since it depends heavily on usually unpredictable and changing loading elsewhere. Bolting on extra cores just for the purpose of trying to eliminate spinlock inefficiencies sounds like a waste of resources in itself. I hate spinlocks - not only do they waste CPU, they use up memory bandwidth that the thread holding the lock probably needs to get its work done and release the lock. – Martin James Mar 14 '14 at 13:45
1

When two threads are involved, spinlocks may be effective because of the probability that, while one thread is waiting, the other thread releases the lock. You are thus correct that there are no guarantees and that a lot of probability is involved. As a result, you wait with a spinlock only on locks that are held for a very short period of time. Because the thread that acquired the lock was obviously executing when it acquired it, there is a high probability that it is still executing, and will soon release the lock, if it only holds it for a very short time.
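One way to act on that probability (a sketch with made-up names, not code from any particular library): spin only for as long as a quick release is still plausible, then fall back to a blocking wait.

```csharp
using System.Threading;

class TwoPhaseWait
{
    private readonly object _gate = new object();
    private volatile bool _released;

    public void Release()
    {
        _released = true;
        lock (_gate) Monitor.PulseAll(_gate);
    }

    public void Wait()
    {
        // Phase 1: bet that the owner releases quickly. SpinWait escalates on its own;
        // NextSpinWillYield turns true once further spinning stops making sense.
        var spinner = new SpinWait();
        while (!_released && !spinner.NextSpinWillYield)
            spinner.SpinOnce();

        if (_released) return;             // won the bet, no context switch paid

        // Phase 2: lost the bet, block instead of burning more cycles.
        lock (_gate)
            while (!_released)
                Monitor.Wait(_gate);
    }
}
```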

However, spinlocks can also be effective when IO is involved, i.e. when a thread is not waiting on another thread but on a hardware event signaling that data is coming in, especially if that data is expected very soon (e.g. waiting for a hardware function to complete).

Kris Vandermotten
  • 10,111
  • 38
  • 49
  • Not really, no. It is very uncommon for spinlocks to be used for I/O waiting. Such waits are usually much too long for anything but a blocking wait. Drivers conventionally use semaphores for signaling I/O completion (together with requesting a scheduling run by exiting via the kernel instead of a direct interrupt-return). Spinlocks only provide an effective gain on multicore systems where the lock is held for a very short time, so reducing the possibility of contention and the CPU/memory-bandwidth waste. – Martin James Mar 14 '14 at 09:56
  • @MartinJames That depends on the hardware and the scenario we are talking about. You are absolutely correct when waiting for a response from a remote web server of course. But when asking an on-board sensor for a value, often a spinlock is used to wait for the answer. Anyway, my main point was not the spinlocks being used for IO, but rather that when spinlocks are used for one thread waiting for another thread, OP was correct in stating that it is all a matter of probability that the thread that needs to release the lock is indeed executing on another core. – Kris Vandermotten Mar 14 '14 at 10:52