
I always thought a deadlock meant two threads (CPUs) each trying to acquire the lock the other one is holding.

But lockdep in the Linux kernel tells me otherwise:

Here is the first one:

```
[  340.052197]  [<ffffffff81405448>] lock_irq_serial+0x14/0x16
[  340.058529]  [<ffffffff8136cb7e>] tell_me_store+0x178/0x60a
[  340.064858]  [<ffffffff8136a9be>] kobj_attr_store+0xf/0x19
[  340.070641]  [<ffffffff811e0d55>] sysfs_kf_write+0x39/0x3b
[  340.076423]  [<ffffffff811e01ee>] kernfs_fop_write+0xd5/0x11e
[  340.082475]  [<ffffffff81188c0d>] vfs_write+0xb7/0x18f
[  340.087890]  [<ffffffff81189470>] SyS_write+0x42/0x86
[  340.093213]  [<ffffffff816eff79>] ia32_do_call+0x13/0x13
```

where `lock_irq_serial` is a spinlock. The same lock is also used inside the irq_work infrastructure.

The other part is:

```
[  344.135856]  [<ffffffff8110be77>] generic_exec_single+0x108/0x120
[  344.142277]  [<ffffffff8109071e>] ? leave_mm+0xbc/0xbc
[  344.147691]  [<ffffffff8109071e>] ? leave_mm+0xbc/0xbc
[  344.153104]  [<ffffffff8109071e>] ? leave_mm+0xbc/0xbc
[  344.158525]  [<ffffffff8110bf46>] smp_call_function_single+0x88/0xa4
[  344.165225]  [<ffffffff8110c0ff>] smp_call_function_many+0xf7/0x21a
[  344.171829]  [<ffffffff8109071e>] ? leave_mm+0xbc/0xbc
[  344.177249]  [<ffffffff810908a2>] native_flush_tlb_others+0x29/0x2b
[  344.183853]  [<ffffffff81090a4a>] flush_tlb_mm_range+0xed/0x146
[  344.190094]  [<ffffffff811769fc>] change_protection+0x126/0x581
[  344.196336]  [<ffffffff81176fa9>] mprotect_fixup+0x152/0x1cb
[  344.202299]  [<ffffffff811771a1>] SyS_mprotect+0x17f/0x20e
[  344.208078]  [<ffffffff816eff79>] ia32_do_call+0x13/0x13
```

where I don't take that lock anywhere myself. I suspect the problem is taking the spinlock inside irq_work while also taking it somewhere else (such as in the sysfs write path). Could anybody explain in more detail why this is a deadlock scenario?

jaeyong
  • Show the entire message. – CL. Mar 04 '15 at 08:40
  • What is `tell_me_store` and `lock_irq_serial`? There is no such thing in Linux Kernel. Also, regarding "acquiring two different locks" - not necessary. Deadlock may include N threads and M locks, 2 threads and 2 locks is just a simplest case of it. – myaut Mar 04 '15 at 08:56
  • Try to enable [lockdep](http://stackoverflow.com/questions/20892822/how-to-use-lockdep-feature-in-linux-kernel-for-deadlock-detection) feature in your kernel. It should show you more specific message (in runtime). You can also run `coccicheck` on suspicious sources to detect deadlocks. – Sam Protsenko Mar 04 '15 at 19:01

1 Answer


Deadlock can happen on a single core:

1) normal kernel context takes a spin_lock;

2) an interrupt handler takes the same spin_lock.

Now, while the kernel context is holding the lock, what happens if the interrupt comes in? The handler spins on a lock that can never be released, because the context holding it is preempted on that same CPU. So a deadlock does not necessarily involve multiple locks and multiple threads (although multiple contexts are involved).

It's hard to tell what's going on from the backtrace alone; a backtrace taken at the point of deadlock typically doesn't show who is holding the lock.

My suggestion is to enable the kernel's deadlock-detection config options (lockdep) and see what it reports.
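For reference, these are the usual config symbols (real kernel options; the exact set can vary by kernel version):

```
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCK_STAT=y
```

With `CONFIG_PROVE_LOCKING` enabled, lockdep reports the full lock dependency chain at the moment the inconsistent usage is first observed, including which context acquired the lock with interrupts enabled.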

user2526111