
This problem is known as lock convoying: all other threads have to wait if a thread holding a lock is descheduled due to a time-slice interrupt or page fault. https://en.wikipedia.org/wiki/Lock_(computer_science)#Disadvantages

For example, if thread-1 locks a std::mutex and is then switched out, and at this moment many other threads (2, 3, 4, ...) want to lock the same mutex, then all of these threads block and must wait until thread-1 is switched back in.

The usual solution is to use lock-free algorithms (a minimal sketch follows the questions below). But if a mutex is required, is there some way to avoid such a situation?

  1. How can I find out, 100 cycles in advance, that my thread is about to be switched out?

  2. Or how can I raise an exception 100 cycles before the thread is switched out, on Linux x86_64?

  3. Or how can I make the thread continue working for some extra time (100 cycles)?
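
For illustration, a minimal sketch of the lock-free direction, assuming the work on the shared resource can be collapsed into a single atomic operation (my real resource is more complex; the counter here is just a stand-in):

```cpp
#include <atomic>
#include <thread>
#include <vector>

// No mutex: each update is a single atomic read-modify-write, so a
// thread that is descheduled mid-update holds nothing that can block
// the other threads -- no convoy is possible.
std::atomic<long> shared_counter{0};

void worker() {
    for (int i = 0; i < 1000000; ++i)
        shared_counter.fetch_add(1, std::memory_order_relaxed);
}

int main() {
    std::vector<std::thread> threads;
    for (int i = 0; i < 40; ++i)
        threads.emplace_back(worker);
    for (auto& t : threads)
        t.join();
}
```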

UPDATE:

I have 20 CPU cores, and my program has 40 threads divided into 2 parts:

  • Part-1 - 20 threads use the 1st shared resource, protected by std::mutex mtx1
  • Part-2 - 20 threads use the 2nd shared resource, protected by std::mutex mtx2

The operating system gives each thread a certain time quantum to work; when it expires, the OS puts the thread to sleep and gives the vacated core to the next thread, which runs for the same time slot.

Part-1: Sometimes, not often, but in a case that is critical for me, it happens that 1 of the 20 threads does mtx1.lock(), starts working with the shared resource, and is then switched off (put to sleep) by the OS before it reaches mtx1.unlock(), because the time quantum allocated to it has expired. The OS switches this thread back in only after ~1-10 ms (30 000 000 cycles). During this time, each of the 19 other Part-1 threads tries to acquire the shared resource at least once every 10 µs (30 000 cycles), but mtx1 stays busy.

Then each of the 19 Part-1 threads falls asleep in turn, and the vacated CPU cores are occupied by threads from Part-2. The OS sees that all cores are busy and does not wake the Part-1 threads.

This case does not occur often, but when it does, Part-1 (20 threads) freezes for a whole 1-10 milliseconds (30 000 000 cycles), which is unacceptable for the task.

How can I ensure that there is never a situation where Part-1 is delayed by more than 10 microseconds (30 000 cycles)?
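
To make the setup concrete, here is a stripped-down sketch of the two parts; touch_resource() is a hypothetical placeholder for the real work, the iteration counts are arbitrary, and each thread records the worst time it spent blocked on its mutex:

```cpp
#include <chrono>
#include <cstdio>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

std::mutex mtx1, mtx2;  // protect shared resource 1 and resource 2

// Placeholder for the real work done on the shared resource.
void touch_resource() {
    volatile int sink = 0;
    for (int i = 0; i < 1000; ++i) sink += i;
}

void worker(std::mutex& mtx, std::chrono::nanoseconds& worst) {
    using clock = std::chrono::steady_clock;
    for (int iter = 0; iter < 100000; ++iter) {
        auto t0 = clock::now();
        std::lock_guard<std::mutex> lock(mtx);
        auto waited = clock::now() - t0;  // time spent blocked on the mutex
        if (waited > worst) worst = waited;
        touch_resource();
    }
}

int main() {
    std::vector<std::chrono::nanoseconds> worst(40, std::chrono::nanoseconds{0});
    std::vector<std::thread> threads;
    for (int i = 0; i < 20; ++i)   // Part-1
        threads.emplace_back(worker, std::ref(mtx1), std::ref(worst[i]));
    for (int i = 20; i < 40; ++i)  // Part-2
        threads.emplace_back(worker, std::ref(mtx2), std::ref(worst[i]));
    for (auto& t : threads) t.join();
    for (int i = 0; i < 40; ++i)
        std::printf("thread %2d worst wait: %lld ns\n", i,
                    (long long)worst[i].count());
}
```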

Alex
  • If the thread is waiting for a lock, then the OS should switch to the next thread, and keep looping until it reaches the thread that unlocks the mutex. – Ivan Rubinson Jul 01 '16 at 19:22
  • I am not sure I understand what the problem is, because having all the other threads block while the locking thread runs is exactly the point of using a mutex. – Galik Jul 01 '16 at 19:25
  • What is the relevance of "100 cycles"? Could you explain more about what you are thinking? – Soren Jul 01 '16 at 19:36
  • @Soren If I knew that the thread would be put to sleep within 100-1000 cycles, then I would know that it does not have time to lock the mutex (CAS of a flag), do the job, and unlock the mutex. In that case it is more profitable to `yield()` in this thread and go to sleep immediately. – Alex Jul 01 '16 at 19:49
  • @Galik The problem is that I have 40 threads on 20 cores: 1 thread locks the mutex and is put to sleep because the OS wants it, 19 threads wait for this mutex and sleep, and the 20 threads that don't use this mutex occupy all 20 cores. – Alex Jul 01 '16 at 19:50
  • Why does "the os want this"? the idea of a mutex is that you own it for a as short a time as possible. – Richard Hodges Jul 01 '16 at 20:10
  • Are you saying that, in the hypothetical example, the 20 threads that are not blocked are using the 20 cores without being productive? – Soren Jul 01 '16 at 20:13
  • Why is your thread sleeping if it holds the mutex? Looks like bad design. – stark Jul 01 '16 at 22:11
  • @stark Because the time quantum allocated to this thread by the OS expired, and the operating system decided to put the thread to sleep. – Alex Jul 01 '16 at 22:16
  • Research term: Hoare Monitor. – 2785528 Jul 01 '16 at 23:58

1 Answer


The point of lock-free or near-lock-free designs is that if you do need a mutex for something, then that something would be rare and hence you would have a low probability of any two threads hitting the same mutex at the same time.

Your explanation of your design and the countermeasures you are prepared to take sound like you think there is a high probability that all the threads will hit the mutex -- so either your thinking is wrong or your design is wrong.

There is nothing you can do to read the mind of the scheduler, but as discussed here, there is something you can do that may influence the way your thread is scheduled -- however, I would recommend against playing around with anything like that.
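
For completeness, this is the kind of knob I mean: on Linux you can move a thread to a real-time policy such as SCHED_FIFO, under which it is not time-sliced against ordinary SCHED_OTHER threads, so it is much less likely to be put to sleep while holding the mutex. A minimal sketch (make_realtime is a hypothetical helper name), assuming the process has the CAP_SYS_NICE capability or runs as root:

```cpp
#include <pthread.h>
#include <sched.h>
#include <cstdio>

// Move the calling thread to the SCHED_FIFO real-time policy.
// Such a thread runs until it blocks or yields rather than being
// preempted at the end of a time slice by normal threads, so it is
// unlikely to be descheduled in the middle of a critical section.
// Misused, it can starve the rest of the system -- which is why this
// is dangerous to play with.
bool make_realtime(int priority /* 1..99 on Linux */) {
    sched_param sp{};
    sp.sched_priority = priority;
    int rc = pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
    if (rc != 0) {
        std::fprintf(stderr, "pthread_setschedparam failed: %d\n", rc);
        return false;  // typically EPERM without CAP_SYS_NICE
    }
    return true;
}
```

Each of the 20 Part-1 threads would call make_realtime(1) once at startup; but again, measure carefully before relying on anything like this.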

Soren