2

I have a question regarding the implementation of printk() in the 3.10 kernel. I see it calls local_irq_save at the beginning. I see it then calls raw_spin_lock(&logbuf_lock). What's the purpose of logbuf_lock if interrupts have been already disabled prior to this? Is it because even though interrupts are disabled on the current CPU other CPUs can still call printk so they need to be stopped from writing to the log buffer?

Basically I have three questions:

  1. I see printk acquires logbuf_lock and writes to the log buffer and then tries to grab the console semaphore and releases logbuf_lock. Then inside console_unlock inside a loop it acquires logbuf_lock and disables interrupts, then releases logbuf_lock and calls the console drivers, and then restores interrupts. What's the purpose of this locking/disabling interrupt sequence?

  2. I see comments in printk() about the log buffer possibly being filled up again so the buffer may have to get flushed to the console again. How would this situation occur given all the locking I asked about in #1 above?

  3. If code on only 1 CPU were to be calling printk() at any given moment, could interrupts still be handled on other cores in an SMP system? I'm trying to also understand the impact of printk on interrupt latency.

Thanks.

A few follow-ups:

Can you clarify this:

local_irq_save()protects against interrupts on the local CPU (and also avoids being rescheduled on another CPU while using a cpu variable to access per-CPU data)

Do you mean calling local_irq_save() will prevent the current thread from being rescheduled on another CPU only if it accesses per-CPU data or it prevents the current thread from being rescheduled on another CPU period? What is the purpose of local_irq_save() in the case of printk() here? I remember reading a thread on LMKL that said the disabling of interrupts was to ensure the order of entries in the log buffer reflected the actual order in which the printk() calls occurred.

Mike Sweeney
  • 1,896
  • 2
  • 18
  • 20
  • Please consider asking in a linux forum, as your question is not actually a programming issue that can be solved here. – fast Sep 16 '14 at 16:22
  • @fast: How is it not a programming question? It could be relevant to Linux kernel programming. – Keith Thompson Sep 18 '14 at 01:05

1 Answers1

8

What's the purpose of logbuf_lock if interrupts have been already disabled prior to this? Is it because even though interrupts are disabled on the current CPU other CPUs can still call printk so they need to be stopped from writing to the log buffer?

Yes. local_irq_save() protects against interrupts on the local CPU (and also avoids being rescheduled on another CPU while using a cpu variable to access per-CPU data), while a spinlock protects against other CPUs.

If code on only 1 CPU were to be calling printk() at any given moment, could interrupts still be handled on other cores in an SMP system?

Yes.

I see printk acquires logbuf_lock and writes to the log buffer and then tries to grab the console semaphore and releases logbuf_lock. Then inside console_unlock inside a loop it acquires logbuf_lock and disables interrupts, then releases logbuf_lock and calls the console drivers, and then restores interrupts. What's the purpose of this locking/disabling interrupt sequence?

There are two things to protect: the log buffer and the console drivers. logbuf_lock protects the log buffer while console_sem protects access to the console drivers list and to the actual consoles themselves.

Printing a kernel message is a two step process. First the message is placed in the log buffer, and then the log buffer is sent to the consoles in console_unlock(). These two steps don't need to happen in the same call to printk(). Moreover, flushing the log buffer to the consoles can happen while registering/starting/resuming/... consoles.

After having placed the message in the log buffer, printk() tries to acquire the console_sem. It can do this even on interrupt context, since down_trylock() doesn't sleep. If it acquires the semaphore, it can proceed to send the log buffer content to the consoles. If it doesn't acquire the semaphore, it is the responsability of the holder of the console semaphore to send the log buffer content to the consoles, on console_unlock(). While console_unlock() is sending to the consoles, there can be other CPUs calling printk(). So console_unlock() loops until there is nothing more to send to the consoles. In every loop it gets pointers to the part of the log buffer to send to the consoles, under the logbuf_lock, and then, no longer under the logbuf_lock, emits the output to the consoles. While sending stuff to the consoles, since the logbuf_lock is not taken, other CPUs can keep adding stuff to the log buffer.

I see comments in printk() about the log buffer possibly being filled up again so the buffer may have to get flushed to the console again. How would this situation occur given all the locking I asked about in #1 above?

The buffer could have been filled up after releasing logbuf_lock but before up()ing console_sem. And logbuf_lock is released before up()ing console_sem because up() can cause wakeups, which need to take the runqueue lock, which could generate priority inversion problems against printk() called with the runqueue lock taken (commit 0b5e1c5255).

There are proposed patches to change this locking scheme: Jan Kara [PATCH 0/8 v4] printk: Cleanups and softlockup avoidance, which, among other things, try to avoid a CPU looping indefinitely on console_unlock() (it has happened on big systems logging lots of startup events over slow serial consoles), handing over that work to other CPUs; and try to minimize the time interrupts are disabled on printk().

Do you mean calling local_irq_save() will prevent the current thread from being rescheduled on another CPU only if it accesses per-CPU data or it prevents the current thread from being rescheduled on another CPU period?

The latter. local_irq_save() prevents interrupts from being handled on the local CPU, which could end up calling schedule() when returning from the ISR. schedule() can also be called from other places, but since printk() should be able to be used from interrupt context, nothing it calls should end up calling schedule() (e.g: down_trylock() is used instead of down() for this very reason).

What is the purpose of local_irq_save() in the case of printk() here?

Jan Kara's [PATCH 3/8] printk: Enable interrupts before calling console_trylock_for_printk() tries to minimize the time interrupts are disabled. Before that patch, interrupts are disabled at least for the following reasons:

  • To protect logbuf_lock-protected data from interrupts. This case is typically solved with IRQ variants of the spinlocks (e.g: raw_spin_lock_irqsave()), but here there are more things to run with interrupts disabled.
  • To prevent the current CPU from suddenly changing. can_use_console(), called from console_trylock_for_printk(), asks: Can we actually use the console at this time on this cpu?, noting that: Console drivers may assume that per-cpu resources have been allocated. Moving to another CPU could prove confusing.
  • While console_sem is taken with a down_trylock() and so shouldn't be a problem should an interrupt also try to printk(), being preempted while holding the console_sem() would prevent anyone else from printing to the console.

So the above patch no longer has interrupts disabled for the latter two cases, and instead wraps them in preempt_disable()/preempt_enable(), which allow interrupts, but not preemption.

I remember reading a thread on LMKL that said the disabling of interrupts was to ensure the order of entries in the log buffer reflected the actual order in which the printk() calls occurred.

Can you share a reference to that thread? You may notice in 3.10 there is a cont buffer, which holds all the characters from the last newline. It gets flushed to the "real" log buffer either when a newline arrives, or when a different task printk()s.

On that thread, I don't find any valid concern about the ordering of log entries, apart from this excerpt:

Well, I believe someone got DDetetccctted ed 113223 HHzz CPUCPU

which, AFAIK, is totally bogus. Ordering of the log buffer is guaranteed by logbuf_lock, which is taken with interrupts already disabled when writing to the log buffer in vprintk_emit(), and is taken with a lock variant that disables interrupts (raw_spin_lock_irqsave()) when reading from it on console_unlock(). So, access to the log buffer is safe from interference by either other CPUs, or interrupts.

There remains the case where a log line is split into several printk() calls, in newer kernels, that case is covered by the cont buffer, which holds partial lines, and flushes them when another CPU/interrupt interferes, so a log line may be split in several lines and have unrelated log lines between them, but no log line should have mixed output.

The other possible cause of corruption that remains, is that since the log buffer is a ring buffer, it could theoretically overflow, which would mean overwriting previous messages.

A valid concern in that thread is timely output of the log buffer. This is achieved by trying to call console_unlock() (which calls the console drivers) in every vprintk_emit() call. If the console semaphore cannot be acquired, the message is already in the log buffer, and the current semaphore owner will output it to the consoles.

An interesting thing mentioned in that thread, is that before commit a0f1ccfd8d: "lockdep: do not recurse in printk", (printk.c before and after) interrupts were reenabled before calling release_console_sem() (which is the previous incarnation of console_unlock()). Apparently, when lockdep (a lock validator, able to detect possible deadlocks and other locking problems, and print diagnostics) was enabled, it could cause lockups when attempting to printk from printk(). So, calls to spin_{,un}lock_irq{save,restore}() were split into disabling/enabling interrupts and acquiring/releasing the lock, lockdep_on/off() calls were added in between, and the disabling of lockdep and interrupts was extended to cover the whole function.

Going back to:

I see printk acquires logbuf_lock and writes to the log buffer and then tries to grab the console semaphore and releases logbuf_lock. Then inside console_unlock inside a loop it acquires logbuf_lock and disables interrupts, then releases logbuf_lock and calls the console drivers, and then restores interrupts. What's the purpose of this locking/disabling interrupt sequence?

console_unlock() is not only called from vprintk_emit(), it's also called when registering a new console, when resuming a console, when hotplugging a CPU with pending output to console, ... Those places typically have interrupts enabled. So, console_unlock() has to take that into account.

You seem to have noticed that while interrupts are disabled when taking the logbuf_lock on console_unlock() (it calls raw_spin_lock_irqsave()), they are not (potentially) reenabled when releasing the lock (raw_spin_unlock()), they are only potentially reenabled (local_irq_console()) after call_console_drivers(). The only reason I see to call call_console_drivers() with interrupts disabled is to avoid the CPU from changing under us (console drivers may access per-CPU variables).

An interesting data point is that the -rt (realtime) patchset reenables interrupts before calling console_unlock(), and also before call_console_drivers() (which in -rt is protected by migrate_disable()/migrate_enable(), which disallows CPU migration) in console_unlock(). This is done to minimize interrupt latencies during printk(). PREEMPT_RT_FULL favors low interrupt latencies. You can see it at printk.c in the linux-stable-rt tree at git.kernel.org, the relevant patch is printk-rt-aware.

ninjalj
  • 42,493
  • 9
  • 106
  • 148
  • Thanks for the very helpful and thorough response. I have a few follow-ups I added to the original question. – Mike Sweeney Sep 18 '14 at 06:03
  • [Here's](https://lkml.org/lkml/2007/10/25/377) the thread that mentioned the reason for disabling interrupts in printk is to allow for deterministic behavior. – Mike Sweeney Sep 19 '14 at 08:49
  • I'm confused by your last 2 paras. In the 3.10 [kernel](https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/kernel/printk.c?id=refs/tags/v3.10.55#n2098) at the end of the for loop I see a call to `raw_spin_unlock(&logbuf_lock);`, a call to the console drivers, and then a call to `local_irq_restore(flags);`. Since from my reading of the code interrupts are still disabled when `console_unlock()` is called my understanding is the `local_irq_restore()` will keep them disabled. I'm also not sure why releasing the logbuf_lock is done earlier than `local_irq_restore`. – Mike Sweeney Sep 23 '14 at 09:18
  • Doesn't most code that accesses per-CPU variables do so using `get_cpu_var()` which disables preemption? I though that would prevent any rescheduling. If so, I don't see any good reason for disabling interrupts and increasing interrupt latency versus disabling preemption before calling the console drivers. – Mike Sweeney Sep 24 '14 at 07:16
  • 1
    And the -rt developers seem to agree with that assessment. Perhaps some mainline developers are more concerned about an interrupt storm delaying console output indefinitely (and possibly even overflowing the ring buffer)? – ninjalj Sep 25 '14 at 10:14
  • Thanks for all the answers. For future reference, is SO the right place for these kinds of questions? – Mike Sweeney Sep 25 '14 at 16:23
  • Please, could you clarify your notation about possible lost of messages in log_buf. Does it mean, that printk could discard messages (in case of aggressive printing) because of slow UART? I'am talking about kernel 4.1 http://lxr.free-electrons.com/source/kernel/printk/printk.c#L1613 . As i understand Jan Kara's idea was implemented but in slightly different way.. please refer my question if it possible http://stackoverflow.com/questions/31088266/linux-kernel-printk-can-skip-messages – Alex Hoppus Jun 27 '15 at 12:11
  • 1
    @AlexHoppus: if you look at `log_store()`, who calls `log_make_free_space()` and possibly `truncate_msg()`, it seems that now are new messages the ones that are potentially lost or truncated, and `printk` returns the number of bytes actually logged. This is different than what happened historically with `emit_log_char()`. – ninjalj Jun 27 '15 at 12:45