Why does priority inversion happen in this case (Linux)?

Question

I have read many posts on priority inversion, but I still cannot clear up some parts of my understanding. I would be grateful if someone could shed some light on my question.

Let me describe the situation first. I have some pseudocode that should be self-explanatory.

I have a shared resource, int t. Here is the function that will be executed by three threads: low priority task p1, medium priority task p2, and high priority task p3.

/** Shared resource **/
int t;

/** Function that will be executed by the three threads **/
void func()
{
   printf("hello..world");             /** line number 11 **/
   mutex_lock(&lock);                  /** line number 12 **/
   {                                   /** line number 13 **/
      for (int i = 0; i <= 100; i++)   /** line number 14 **/
      {                                /** line number 15 **/
         t++;                          /** line number 16 **/
      }                                /** line number 17 **/
   }                                   /** line number 18 **/
   mutex_unlock(&lock);                /** line number 19 **/
}                                      /** line number 20 **/

Let's say p1 (low priority) starts executing func(). Now let's say it has reached line number 13, just after the mutex lock. Meanwhile, p3 starts running. p3 will be blocked because p1 is in the critical section, so p3 goes to the blocked state.

Scenario: p1 is inside the critical section, in the running state. p3 is in the blocked state.

Now, let's say p2 starts running. Once p2 is running, it will also be blocked by p1, since p1 is in the critical section. Then how does priority inversion happen here? This is where my understanding breaks down; please explain.

Is my understanding below correct? If not, please correct it. What does the situation look like when priority inversion is caused by the p2 task? I understand that priority inversion happens when p2 starts running. After p2 completes, p1 starts running, and p3 never gets a chance. Or it could be that after p2 is done, p3 runs. Either way p3 is delayed, and in such cases a mutex timeout can happen.

This was one of the scenarios behind a bug in our software: there was a crash due to a mutex timeout. Somebody said this was happening because of priority inversion. It was fixed by setting the mutex attribute to priority inheritance. I was trying to do a post-mortem of the fix, but I am stuck on the fundamentals of priority inversion. I have read many posts, including the Mars Pathfinder one, but I am still stuck with my questions. Please help me here.

Guntram Blohm · Accepted Answer · 2014-01-01T10:29:34.850

2

The priority inversion problem does not occur if p2 waits on the mutex as well. In that case, p1 runs until finished, then unlocks the mutex, and the scheduler can schedule p3 next.

Assume p2 is NOT waiting. Instead, it does something completely different, which takes a lot of CPU cycles. While p2 is running, p1 will get little or no CPU time, because p2 has the higher priority. So p1 will never (or only after a long time) finish and unlock the mutex. When p2 is finally finished and doesn't use the CPU anymore, p1 will get CPU time again, finish, and unlock the mutex. Now p3 can continue.

In that scenario, p3 had to wait until p2 was finished, even though the priority of p3 was higher than that of p2.

Priority inversion is not a problem when all threads are competing for the same resource. It's a problem when there are different resources involved (in my example, the mutex and CPU time): a low priority thread holds one resource, a high priority thread is waiting for that resource, but the low priority thread can't release it because a medium priority thread prevents the low priority thread from running.
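The scenario above can be reproduced with a small deterministic model. This is only a sketch under made-up assumptions: a single CPU, a strict fixed-priority scheduler that advances one tick at a time, and invented arrival times and work amounts. The names `struct task` and `simulate` are hypothetical and not part of any real API. The `inherit` flag models priority inheritance, which the answer turns to in its next paragraph.

```c
#include <assert.h>
#include <stdbool.h>

#define NTASKS 3

struct task {
    const char *name;
    int  prio;       /* larger value = higher priority */
    int  arrival;    /* tick at which the task becomes ready */
    int  work;       /* ticks of CPU time still needed */
    bool uses_lock;  /* true if all of its work happens while holding the mutex */
    int  finish;     /* tick at which it completed, -1 while unfinished */
};

/* Runs the toy schedule and returns the tick at which p3 finishes. */
int simulate(bool inherit)
{
    struct task tasks[NTASKS] = {
        { "p1", 1, 0, 4,  true,  -1 },   /* low priority, takes the mutex first */
        { "p2", 2, 2, 10, false, -1 },   /* medium priority, CPU-bound, no mutex */
        { "p3", 3, 1, 2,  true,  -1 },   /* high priority, needs the same mutex */
    };
    int owner = -1;  /* index of the task holding the mutex, -1 if free */

    for (int tick = 0; tick < 1000; tick++) {
        /* Highest priority among ready tasks that are blocked on the mutex. */
        int waiter_prio = -1;
        for (int i = 0; i < NTASKS; i++) {
            const struct task *t = &tasks[i];
            bool ready = t->arrival <= tick && t->work > 0;
            if (ready && t->uses_lock && owner != -1 && owner != i
                && t->prio > waiter_prio)
                waiter_prio = t->prio;
        }

        /* Pick the runnable task with the highest effective priority. */
        int best = -1, best_prio = -1;
        for (int i = 0; i < NTASKS; i++) {
            const struct task *t = &tasks[i];
            if (t->arrival > tick || t->work == 0)
                continue;  /* not ready yet, or already done */
            if (t->uses_lock && owner != -1 && owner != i)
                continue;  /* blocked on the mutex */
            int prio = t->prio;
            if (inherit && owner == i && waiter_prio > prio)
                prio = waiter_prio;  /* holder borrows the waiter's priority */
            if (prio > best_prio) {
                best = i;
                best_prio = prio;
            }
        }
        if (best == -1)
            break;  /* nothing left to run: with these numbers, every task is done */

        struct task *run = &tasks[best];
        assert(run->work > 0);
        if (run->uses_lock)
            owner = best;  /* take the mutex, or keep holding it */
        run->work--;
        if (run->work == 0) {
            run->finish = tick + 1;
            if (owner == best)
                owner = -1;  /* release the mutex on completion */
        }
    }
    return tasks[2].finish;
}
```

With these invented numbers, `simulate(false)` returns 16 and `simulate(true)` returns 6: without inheritance p3 finishes only after p2 has used all of its 10 ticks, even though p2 never touches the mutex.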

What priority inheritance does is this: while p3 is waiting for the mutex, p1 will "inherit" p3's priority. So p1 will get the CPU instead of p2, which means it can finish its critical section and release the mutex as fast as possible. Once p1 has released the mutex, it will return to its own priority, and the scheduler will allow p3 to run (because p1 is finished with the mutex and p2's priority is lower than p3's).
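On Linux with POSIX threads, one way to request this behaviour is the mutex protocol attribute. The following is a minimal sketch, assuming the code in question uses `pthread_mutex_t`; the question only shows pseudocode, so that is an assumption, and `init_pi_mutex` is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/* Initialise *m as a priority-inheritance mutex.
 * Returns 0 on success or an error number from the pthread calls. */
int init_pi_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(m != NULL);

    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    /* While a higher priority thread waits on the mutex, the owner
     * runs at that thread's priority. */
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

The mutex is then used with the usual `pthread_mutex_lock` and `pthread_mutex_unlock` calls. Note that the attribute only has a visible effect when the threads run under a priority-based scheduling policy with different priorities.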

edited Jan 01 '14 at 10:29

answered Jan 01 '14 at 10:22

Guntram Blohm


Thanks, I liked the answer. But I am confused about one part of it. You say: "Assume p2 is NOT waiting...which takes a lot of CPU cycles." How can p1's execution be interrupted? The code between mutex_lock and mutex_unlock is an atomic operation. The cpu will not schedule anything in between. The other task p2 will get the CPU only when p1 has finished the critical section. This way p2 cannot disrupt the execution of p1. When p1 is done, p3 will get a chance. Are you saying this problem occurs when two tasks are referring to the common resource, but the third task is doing something independent? – dexterous Jan 01 '14 at 10:46
Yes. That's why I said it is *not* a problem when they compete for the *same* resource; it's a problem when they compete for a *different* resource. P2 has to do something different that has nothing to do with the mutex. – Guntram Blohm Jan 01 '14 at 10:48
So you say that two of the tasks should compete for the common resource, whereas the third task should be after something else. One more question: how can p1's execution be interrupted? The code between mutex_lock and mutex_unlock is an atomic operation. The cpu will not schedule anything in between. The other task p2 will get the CPU only when p1 has completed the critical section. This way p2 cannot disrupt the execution of p1. When p1 is completed, p3 will get a chance. Then how can a problem occur? – dexterous Jan 01 '14 at 11:01
"The cpu will not schedule anything in between" is wrong. The cpu will schedule whatever it wants. If the thread that's scheduled waits on the same mutex, then it's stopped until the mutex is released, that's what makes the operations between lock and unlock atomic. – Guntram Blohm Jan 01 '14 at 11:39
There are several ways to implement mutexes though, and one way (not a very good one) is to disable all interrupts on mutex_lock and re-enable them on mutex_unlock. If they are implemented this way, you're right about the CPU not scheduling anything else. However, this implementation allows only one central mutex for the whole system (you can't disable interrupts twice), and in a multicore environment, you'd have to halt all other cores as well. So in the generic case, the `mutex_lock` and `mutex_unlock` operations *themselves* allow no context switch, but the code in between does. – Guntram Blohm Jan 01 '14 at 11:41
I recently saw a fix for a mutex timeout issue. There were two threads, A and B. Thread A was holding one mutex, say 'x', whereas the other thread B was holding another mutex, say 'y'. Thread A received SIGABRT because of the mutex timeout. The developer fixed this issue by making priority inheritance the default mutex attribute. The rest is continued in the next comment. – dexterous Jan 08 '14 at 11:50
(Continued.) However, I am not able to understand how this was related to priority inversion. I understand priority inversion, but I cannot relate it to this case. Does it mean that thread A was holding a mutex, but it never got a chance to run because thread B and a middle-priority thread were always executing? Thread A never got a chance to execute and it held the mutex for a very long time. This triggered the mutex timeout and eventually a crash. Please let me know your comments. I really don't see how, in such a situation, one concludes that we should just switch to priority inheritance. – dexterous Jan 08 '14 at 11:50
It's hard to tell anything without knowing details, seeing code and so on. Did A *hold* the mutex X, or did it *wait for* mutex X? If it *holds* the mutex X, there's no reason for it to "time out", except if you have some kind of watchdog thread that kills other threads if they hold a mutex too long. One scenario could be: A acquires X, B acquires Y. A has low prio, B medium, C high. C executes the same code that A does, waits on the mutex, and starts the watchdog. B keeps running, and prevents A from doing anything. After a certain amount of time, the watchdog kills A. – Guntram Blohm Jan 08 '14 at 14:20
When you set priority inheritance on X, A will run with C's high priority as soon as C starts waiting. So A will release the mutex quickly and drop back to its standard prio; B and C can continue, while A gets preempted (the CPU is busy with B and C), but at a non-critical point in the code where it doesn't hold any resources the other threads need. But unless you send me all the code and pay me to review it for a few days, I can only guess and cannot say anything definite. – Guntram Blohm Jan 08 '14 at 14:24
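The "mutex timeout" discussed in these comments could, for example, come from a timed lock attempt. The thread does not say which mechanism the asker's software uses, so the following is only a sketch of one possibility, assuming POSIX threads; `lock_with_timeout` is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Try to take *m, giving up after timeout_ms milliseconds.
 * Returns 0 on success, or ETIMEDOUT if the owner did not release
 * the mutex before the deadline. */
int lock_with_timeout(pthread_mutex_t *m, long timeout_ms)
{
    struct timespec deadline;

    assert(m != NULL && timeout_ms >= 0);

    /* pthread_mutex_timedlock expects an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }
    return pthread_mutex_timedlock(m, &deadline);
}
```

If a low priority owner is starved by a medium priority thread, a high priority caller of such a helper would see `ETIMEDOUT` even though nobody is deadlocked, and code that treats the timeout as fatal would then abort.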

score 0 · Answer 2 · edited May 23 '17 at 11:51

To understand this, it is better to define the priorities of the tasks p1, p2 and p3: the priorities are p1 < p2 < p3.

In the normal scenario without priority inversion, i.e. without any mutex lock, when a higher priority task wants to run, it just preempts the lower priority task.

Acquiring the mutex results in disabling preemption. Priority inversion can be explained with just p1 and p3. As in your example, p1 has acquired the lock.

Now when p3, the highest priority task, tries to acquire the lock, it is blocked.

This is priority inversion: a higher priority task is blocked from running by a lower priority task that holds the resource the higher priority task requires.

For further understanding, refer to: What is priority inversion?
