
Our Linux system has a spinlock lockup/deadlock problem, and I don't have a good idea for how to solve it. The spinlock can be acquired both in an IRQ handler and in a data transmit function. The lockup happens when:

  1. The app starts transmitting data and acquires the spinlock through spin_lock_irqsave, but has not yet called spin_unlock_irqrestore. This runs on CPU1/CPU2/CPU3.

  2. An IRQ fires on CPU0, and its handler tries to acquire the same spinlock by calling spin_lock_irqsave. This locks up the system, because preemption is disabled by spin_lock_irqsave.

Disabling the IRQ on CPU0 is one solution, but the data is huge and transmitting all of it takes a long time. And once we use IRQ affinity, we would have to disable all IRQs on all CPUs, which is not acceptable.

Is there any other way to solve this problem? Any experience? I guess the kernel already has a mechanism for this, but I don't know of it.

Thanks in advance!

TonyHo

1 Answer


You said that the transmission can take a long time; spinlocks should be avoided in that situation. Spinlocks should be used mainly for low-level concurrency and held for the shortest time possible.
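
To illustrate what "the least time possible" can look like, here is a minimal userspace sketch, not kernel code. It assumes the transmit path can copy one chunk out of a shared queue while holding the lock and then do the slow send with the lock released. The names tx_lock, queue_push, queue_pop, transmit_all and send_chunk are all hypothetical, and the C11 atomic_flag only stands in for a kernel spinlock; in a driver the acquire/release pair would presumably be spin_lock_irqsave/spin_unlock_irqrestore.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for a kernel spinlock (assumption, illustration only). */
static atomic_flag tx_lock = ATOMIC_FLAG_INIT;

static void tx_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&tx_lock, memory_order_acquire))
        ; /* busy-wait, as a spinlock does */
}

static void tx_lock_release(void)
{
    atomic_flag_clear_explicit(&tx_lock, memory_order_release);
}

#define QUEUE_SLOTS 8
#define CHUNK_SIZE  64

struct chunk {
    char data[CHUNK_SIZE];
    size_t len;
};

static struct chunk queue[QUEUE_SLOTS];
static size_t head, tail; /* both protected by tx_lock */

/* Producer side (the IRQ handler in the scenario above).
 * Returns 1 on success, 0 if the chunk is too large or the queue is full. */
int queue_push(const char *buf, size_t len)
{
    int ok = 0;

    if (len > CHUNK_SIZE)
        return 0;
    tx_lock_acquire();
    if (tail - head < QUEUE_SLOTS) {
        struct chunk *c = &queue[tail % QUEUE_SLOTS];
        memcpy(c->data, buf, len);
        c->len = len;
        tail++;
        ok = 1;
    }
    tx_lock_release();
    return ok;
}

/* Copies the oldest chunk into out (at least CHUNK_SIZE bytes).
 * Returns its length, or -1 if the queue is empty.
 * The lock is held only for the duration of one small memcpy. */
int queue_pop(char *out)
{
    int len = -1;

    assert(out != NULL);
    tx_lock_acquire();
    if (head != tail) {
        struct chunk *c = &queue[head % QUEUE_SLOTS];
        memcpy(out, c->data, c->len);
        len = (int)c->len;
        head++;
    }
    tx_lock_release();
    return len;
}

/* Transmit path: the slow send_chunk() call runs with the lock released,
 * so the other side never waits for a whole transmission. */
void transmit_all(void (*send_chunk)(const char *, size_t))
{
    char local[CHUNK_SIZE];
    int len;

    while ((len = queue_pop(local)) >= 0)
        send_chunk(local, (size_t)len);
}
```

With this shape the other side waits for at most one short copy, whatever the total amount of data to transmit. Whether the real data path can be split into chunks like this is an assumption the original question does not confirm.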

You can use a mutex or a semaphore instead. For example, the mutex API has

/**
 * mutex_is_locked - is the mutex locked
 * @lock: the mutex to be queried
 *
 * Returns 1 if the mutex is locked, 0 if unlocked.
 */
static inline int mutex_is_locked(struct mutex *lock)
{
        return atomic_read(&lock->count) != 1;
}

You can use this to check whether the mutex is already taken, and so avoid the deadlock.
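
One caveat on the check-then-lock idea: the state can change between the check and the lock attempt, so a try-lock that tests and acquires in one step is usually the safer shape. Below is a minimal userspace sketch of that pattern, not kernel code. The names dev_lock, on_event, finish_transmit and deferred_events are hypothetical, and the C11 atomic_flag only stands in for the real lock. In the kernel the closest calls would probably be spin_trylock() or mutex_trylock(); note that a mutex can sleep and, as far as I know, must not be taken from a hard IRQ handler, so the mutex variant assumes the work has been moved to process context (for example a threaded IRQ or a workqueue).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the device lock (assumption, illustration only). */
static atomic_flag dev_lock = ATOMIC_FLAG_INIT;
static atomic_int deferred_events; /* events that arrived while the lock was busy */
static int handled_events;         /* protected by dev_lock */

/* Test and acquire in one atomic step; never blocks or spins. */
static bool dev_trylock(void)
{
    return !atomic_flag_test_and_set_explicit(&dev_lock, memory_order_acquire);
}

static void dev_unlock(void)
{
    atomic_flag_clear_explicit(&dev_lock, memory_order_release);
}

/* Event path (the IRQ side in the scenario above).
 * If the lock is busy, record the event and return instead of waiting.
 * Returns true if the event was handled immediately. */
bool on_event(void)
{
    if (!dev_trylock()) {
        atomic_fetch_add(&deferred_events, 1);
        return false;
    }
    handled_events++;
    dev_unlock();
    return true;
}

/* Called by the lock holder when its transmission is done.
 * Handles the events deferred in the meantime, then releases the lock.
 * Sketch limitation: an event deferred in the small window after the
 * final check stays queued until the next release. */
void finish_transmit(void)
{
    for (;;) {
        handled_events += atomic_exchange(&deferred_events, 0);
        dev_unlock();
        if (atomic_load(&deferred_events) == 0 || !dev_trylock())
            return;
    }
}
```

The design choice here is that the event side never waits for the lock, so it cannot lock up while a long transmission holds it; the price is that events may be handled late, by whoever releases the lock.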

Federico