
At first glance, my question might look a bit trivial. Please bear with me and read it through.

I have identified a busy loop in my Linux kernel module. Because of it, other processes (e.g. sshd) get no CPU time for long spans (around 20 seconds). This is understandable, as my machine has only a single CPU and the busy loop gives the scheduler no chance to run other processes.

As an experiment, I added a schedule() call after each iteration of the busy loop. Even though this still keeps the CPU busy, it should let other processes run, since I am calling schedule(). But this doesn't seem to happen: my user-level processes still hang for long spans (20 seconds).

In this case, the kernel thread has a nice value of -5 and the user-level threads have a nice value of 0. Even with the lower priority of the user-level threads, I think 20 seconds is too long to go without CPU.

Can someone please explain why this could be happening?

Note: I know how to remove the busy loop completely, but I want to understand the kernel's behaviour here. The kernel version is 2.6.18 and kernel pre-emption is disabled.

APKar
  • Have you set the state `set_current_state(TASK_INTERRUPTIBLE);` before entering your loop? – ott-- Dec 21 '12 at 13:27
  • No, I didn't do that. I want the task to stay in the TASK_RUNNING state, so I don't need to wake it up externally. I am hoping the scheduler will pick it again later, as it is already on the run queue. – APKar Dec 21 '12 at 14:29
  • Where is your kernel module looping? Is it in the interrupt service routine or in a kernel thread you create? – Maxim Egorushkin Dec 21 '12 at 14:44
  • It's in a kernel thread, not in interrupt context. – APKar Dec 21 '12 at 14:46
  • So, as @ott mentioned, is that an interruptible thread? – Maxim Egorushkin Dec 21 '12 at 15:01
  • If I understand correctly, you are asking about the state of the thread. It is not in the TASK_INTERRUPTIBLE state and the kernel is non-preemptible, so nothing can take the CPU away from the thread (other than interrupts) unless it voluntarily gives it up. I am doing exactly that by calling schedule(), so the scheduler gets a chance to schedule some other process. But that doesn't seem to be happening. Hope I was clear. – APKar Dec 21 '12 at 15:09
  • I don't think that kernel modules can be preempted in the same way as user code, especially depending on what they're doing. The kernel just doesn't have as much power over itself as it does over userland code. You could try calling sleep if it is available to your code, or sched_yield. – Linuxios Dec 21 '12 at 15:39
  • Why not run with kernel pre-emption enabled? – manav m-n Dec 24 '12 at 08:16
  • Sorry guys for my late response. I was on holiday :). Linuxios, @Manav My question was to understand the behaviour. As I mentioned before I know how to fix the issue and I already fixed it. But, I still don't understand the exact behaviour. Thanks for the reply. – APKar Jan 02 '13 at 23:39
  • Just trying to get more context. What's being done in the busy thread? If you can separate it into critical and non-critical sections then perhaps you can give the kernel an opportunity to run other processes. However, disabling pre-emption means no kernel thread can be interrupted. – engineerC Jan 06 '13 at 04:16
  • I think you need to post a basic code sketch of what you are doing. First, the sketch of the module, and a description of how you are determining that a thread is not being able to run. – Noah Watkins Jan 24 '13 at 16:06
  • To me, it seems like you try to give up the CPU for other tasks while still holding a spinlock. Anyway, I think you should show us the structure of your loop (where do you hold the lock, where do you release it, where is the schedule call etc.). Your question is interesting to me and I am also eager to see the answer ! – Rerito Jan 26 '13 at 09:57

1 Answer


The schedule() function simply invokes the scheduler - it doesn't take any special measures to arrange that the calling thread will be replaced by a different one. If the current thread is still the highest priority one on the run queue then it will be selected by the scheduler once again.
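To make that concrete, here is a minimal user-space toy model of "pick the best runnable task". It is only a sketch: `toy_task`, `toy_pick_next` and the priority numbers 115 and 120 are hypothetical names and values chosen for illustration, not the kernel's actual data structures, and the real scheduler also accounts for timeslices and interactivity bonuses.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: as in the kernel, a LOWER prio value means HIGHER priority. */
struct toy_task {
    const char *name;
    int prio;      /* hypothetical dynamic priority */
    int runnable;  /* 1 = on the run queue (think TASK_RUNNING), 0 = asleep */
};

/*
 * Return the index of the runnable task with the best (lowest) prio,
 * or -1 if nothing is runnable. Note that the caller of "schedule()"
 * gets no penalty here: if it is still runnable and still has the best
 * priority, it is simply chosen again.
 */
static int toy_pick_next(const struct toy_task *tasks, int n)
{
    int best = -1;

    assert(tasks != NULL);
    for (int i = 0; i < n; i++) {
        if (!tasks[i].runnable)
            continue;
        if (best < 0 || tasks[i].prio < tasks[best].prio)
            best = i;
    }
    return best;
}
```

With a kernel thread at prio 115 and sshd at prio 120 both runnable, `toy_pick_next` returns the kernel thread's index on every call. Only once the kernel thread's `runnable` flag is cleared (i.e. it has actually gone to sleep) does sshd get selected.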

It sounds as if your kernel thread is doing very little work in its busy loop and it's calling schedule() every time round. Therefore, it's probably not using much CPU time itself and hence doesn't have its priority reduced much. Negative nice values carry heavier weight than positives, so the difference between a -5 and a 0 is quite pronounced. The combination of these two effects means I'm not too surprised that user space processes miss out.
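To give a feel for how lopsided the nice scale is, the sketch below models how the O(1) scheduler scaled timeslices with nice value, as far as I remember it. Treat every constant here as an assumption: the `TOY_` names and the millisecond values are my recollection of that era's scaling (roughly 800 ms at nice -20, 100 ms at nice 0, 5 ms at nice 19), not something read from the 2.6.18 source, and the real code works in jiffies rather than milliseconds.

```c
#include <assert.h>

/* All constants below are assumptions for illustration. */
#define TOY_MAX_PRIO       140  /* priorities run 0..139 */
#define TOY_MAX_USER_PRIO  40   /* nice range -20..19 */
#define TOY_DEF_TIMESLICE  100  /* ms, assumed default slice at nice 0 */
#define TOY_MIN_TIMESLICE  5    /* ms, assumed floor */

/* Map a nice value (-20..19) to a static priority (100..139). */
static int toy_nice_to_prio(int nice)
{
    return 120 + nice;
}

/* Scale a base slice linearly with priority, never going below the floor. */
static int toy_scale_prio(int base_ms, int prio)
{
    int ms = base_ms * (TOY_MAX_PRIO - prio) / (TOY_MAX_USER_PRIO / 2);

    return ms > TOY_MIN_TIMESLICE ? ms : TOY_MIN_TIMESLICE;
}

/* Negative nice values start from a base slice four times larger. */
static int toy_timeslice_ms(int nice)
{
    assert(nice >= -20 && nice <= 19);
    int prio = toy_nice_to_prio(nice);

    if (prio < toy_nice_to_prio(0))
        return toy_scale_prio(TOY_DEF_TIMESLICE * 4, prio);
    return toy_scale_prio(TOY_DEF_TIMESLICE, prio);
}
```

Under these assumptions, `toy_timeslice_ms(-5)` gives 500 ms against 100 ms for `toy_timeslice_ms(0)`, so a nice -5 thread gets five times the slice of a nice 0 process before the priority comparison is even considered.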

As an experiment you could try calling the scheduler every Nth iteration of the loop (you'll have to experiment to find a good value of N for your platform) and see if the situation is better - calling schedule() too often will just waste lots of CPU time in the scheduler. Of course, this is just an experiment - as you have already pointed out, avoiding busy loops is the correct option in production code. If you want to be sure your thread is replaced by another, set it to TASK_INTERRUPTIBLE before calling schedule() so that it is removed from the run queue (as has already been mentioned in comments). Bear in mind that a thread which sleeps this way needs something to wake it up again, for example a timeout via schedule_timeout().

Note that your kernel (2.6.18) is using the O(1) scheduler, which existed until the Completely Fair Scheduler was added in 2.6.23 (the O(1) scheduler having been added in 2.6 to replace the even older O(n) scheduler). CFS doesn't use the O(1) scheduler's per-priority run queue arrays (as far as I know it keeps runnable tasks in a tree ordered by how much CPU time they have had) and works in a different way, so you might well see different behaviour - I'm less familiar with it, however, so I wouldn't like to predict exactly what differences you'd see. I've seen enough of it to know that "completely fair" isn't the term I'd use on heavily loaded SMP systems with a large number of both cores and processes. I also accept that writing a scheduler is a very tricky task, though: it's far from the worst I've seen, and I've never had a significant problem with it on a 4-8 core desktop machine.

Cartroo