Our Linux system has a spinlock lockup/deadlock problem, and I don't have a good idea of how to solve it. The spinlock can be taken both in an IRQ handler and in a data transmit function. The lockup happens when:
The app is about to transmit data, so it acquires the spinlock with spin_lock_irqsave but has not yet called spin_unlock_irqrestore. This runs on CPU1/CPU2/CPU3.
An IRQ fires on CPU0, and the handler tries to acquire the same spinlock by calling spin_lock_irqsave. This causes the system lockup, because spin_lock_irqsave disables preemption (and local interrupts), so CPU0 can do nothing but spin until the lock is released.
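To make the scenario concrete, here is a minimal userspace sketch of what I think is happening. It is only a model, not our driver code, and it does not use the kernel API: model_lock, model_trylock, model_unlock and irq_spins_whole_transfer are hypothetical names I made up for illustration, and the "IRQ" is simulated as one acquire attempt per transmitted chunk.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel spinlock (assumption: NOT the kernel API). */
typedef struct {
    atomic_flag held;
} model_lock;

static void model_lock_init(model_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Returns true if the lock was free and is now owned by the caller. */
static bool model_trylock(model_lock *l)
{
    /* test_and_set returns the previous value: false means we got the lock. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void model_unlock(model_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Model of the transmit path holding the lock for the WHOLE transfer of
 * `chunks` chunks, while the simulated IRQ side tries to take the same
 * lock once per chunk. Returns how many of those attempts failed, i.e.
 * how long the IRQ side would have been stuck spinning.
 */
static size_t irq_spins_whole_transfer(model_lock *l, size_t chunks)
{
    size_t spins = 0;
    bool got = model_trylock(l);   /* transmit path takes the lock */

    assert(got);
    (void)got;

    for (size_t i = 0; i < chunks; i++) {
        /* ... one chunk is transmitted here, lock still held ... */
        if (!model_trylock(l))     /* simulated IRQ on CPU0 */
            spins++;
        else
            model_unlock(l);       /* not reached while the lock is held */
    }

    model_unlock(l);               /* transmit path finally releases */
    return spins;
}
```

In this model, calling irq_spins_whole_transfer on a freshly initialised lock with 1000 chunks gives 1000 failed attempts: the IRQ side is blocked for the entire transfer, so the stall grows with the amount of data.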
Disabling the IRQ on CPU0 is one solution, but the amount of data is large and transmitting all of it takes a long time. And once we use IRQ affinity, we would have to disable all IRQs on all CPUs, which is not acceptable.
Is there any other way to solve this problem? Does anyone have experience with it? I guess the kernel already has a mechanism for this, but I don't know what it is.
Thanks in advance!