I run an embedded Linux system built with Buildroot 2021.02.2. It has BusyBox v1.33.0, Linux 5.4.8, and glibc 2.32.

When I run

# reboot

nothing happens, and there is no error or warning in syslog.

However, when I run it with the -f (force) option, the device reboots.

# reboot --help
BusyBox v1.33.0 () multi-call binary.

Usage: reboot [-d DELAY] [-nf]

Reboot the system

    -d SEC  Delay interval
    -n  Do not sync
    -f  Force (don't go through init)

# reboot -f
< reboots OK >

I have debug messages in /etc/init.d/rcK and in inittab, but after a plain reboot, execution does not seem to reach either the inittab shutdown entries or rcK. The "shutdown" part of /etc/inittab:

# Stuff to do before rebooting
::shutdown:/bin/echo "MY DEBUG" | /usr/bin/logger
::shutdown:/etc/init.d/rcK
::shutdown:/sbin/swapoff -a
::shutdown:/bin/umount -a -r
::respawn:/usr/bin/monit -Ic /etc/monitrc
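
To double-check which commands init is expected to run for a given action, the inittab can be filtered by its third field. This is only a rough sketch: `inittab_actions` is a made-up helper name, and it assumes the BusyBox `id:runlevels:action:process` line format shown above.

```shell
#!/bin/sh
# inittab_actions ACTION [FILE]
# Print the process field of every inittab entry with the given action,
# in file order. Hypothetical helper, not a BusyBox applet.
inittab_actions() {
    action=$1
    file=${2:-/etc/inittab}
    # Fields are id:runlevels:action:process; the process itself may
    # contain ':' characters, so rejoin everything after the third colon.
    awk -F: -v want="$action" '
        /^[ \t]*#/ || NF < 4 { next }
        $3 == want {
            cmd = $4
            for (i = 5; i <= NF; i++) cmd = cmd ":" $i
            print cmd
        }
    ' "$file"
}
```

Calling `inittab_actions shutdown` on the target lists the shutdown commands in the order they appear in /etc/inittab, and `inittab_actions sysinit` does the same for the boot-time entries.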

I have confirmed that libc.so exports the reboot function:

nm build/glibc-2.32-37-g760e1d287825fa91d4d5a0cc921340c740d803e2/build/libc.so.6 | grep reboot
000cb2e0 T reboot

This makes me think the problem is in the communication with init. The system uses BusyBox init: BR2_INIT_BUSYBOX=y.

Signals accepted by init:

# init --help
BusyBox v1.33.0 () multi-call binary.

Usage: init

Init is the first process started during boot. It never exits.
It (re)spawns children according to /etc/inittab.
Signals:
HUP: reload /etc/inittab
TSTP: stop respawning until CONT
QUIT: re-exec another init
USR1/TERM/USR2/INT: run halt/reboot/poweroff/Ctrl-Alt-Del script

Do you have any ideas on how to debug this? I want reboot to perform a proper shutdown through init, without having to bypass it.

EDIT:

The BusyBox reboot command sometimes has an effect and sometimes does not, which is why this case is especially confusing.

When it works, it gives the following log in syslog:

Jan  1 00:20:26 : starting pid 494, tty '': '/etc/init.d/rcK'
Jan  1 00:20:26 [my_service: signal_handler] MyService terminated
Jan  1 00:20:26 bluetoothd[368]: Terminating
Jan  1 00:20:26 bluetoothd[368]: Stopping SDP server
Jan  1 00:20:26 bluetoothd[368]: Exit
Jan  1 00:20:27 dropbear[346]: Early exit: Terminated by signal
Jan  1 00:20:27 ntpd[318]: ntpd exiting on signal 15 (Terminated)
Jan  1 00:20:28 ModemManager[312]: <info>  caught signal, shutting down...
Jan  1 00:20:28 ModemManager[312]: <info>  ModemManager is shut down
Jan  1 00:20:29 syslog-ng[236]: syslog-ng shutting down; version='3.30.1'
Jan  1 00:20:29 : starting pid 575, tty '': '/sbin/swapoff -a'
Jan  1 00:20:29 : starting pid 576, tty '': '/bin/umount -a -r'

When it has no effect, there is nothing in the logs, not even a sign that rcK was called. Do you have any idea what could cause this behavior? As a last resort, I can move to SysVinit if that would solve the problem.
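
One way to narrow this down, as a rough sketch: list the direct children of init (PID 1) at the moment `reboot` has no effect, and check whether a boot-time script such as /etc/init.d/rcS is still among them. `list_children` is a hypothetical helper name, not a BusyBox applet. It assumes a Linux /proc layout, and the optional second argument (an alternate proc root) exists mainly so the function can be tried against a fake tree.

```shell
#!/bin/sh
# list_children PID [PROC_ROOT]
# Print "pid<TAB>command line" for every process whose parent is PID.
# Hypothetical helper; reads PPid from /proc/<pid>/status.
list_children() {
    parent=$1
    proc_root=${2:-/proc}
    for dir in "$proc_root"/[0-9]*; do
        # Processes can vanish while we iterate, so skip unreadable entries.
        [ -r "$dir/status" ] || continue
        [ -r "$dir/cmdline" ] || continue
        ppid=$(awk '$1 == "PPid:" { print $2 }' "$dir/status")
        [ "$ppid" = "$parent" ] || continue
        # cmdline is NUL-separated; turn the NULs into spaces for display.
        cmd=$(tr '\0' ' ' < "$dir/cmdline")
        printf '%s\t%s\n' "${dir##*/}" "${cmd% }"
    done
}
```

Running `list_children 1` shows what init itself started. Passing one of the printed PIDs back in, for example the PID of a shell still running rcS, shows which daemon that script is waiting on.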

EDIT: SOLUTION

Posting the solution here since the question was closed.

When writing init scripts, always use start-stop-daemon with either -b -m -p, or with just -p.

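Using -m without -b can cause problems like the one described in the question. If the program does not daemonize itself, start-stop-daemon stays in the foreground, so the init script that called it (here /etc/init.d/rcS) never finishes, and rcK cannot run for as long as rcS is still active.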
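
To audit existing init scripts for this pattern, something like the following could help. It is a rough sketch under simplifying assumptions: each start-stop-daemon call sits on a single line, and flags are written separately (`-m -b`, not `-mb`). `find_foreground_pidfile_starts` is a made-up helper name. Lines it reports are only candidates for review, since a program that daemonizes on its own would not block the script.

```shell
#!/bin/sh
# find_foreground_pidfile_starts FILE...
# Print "file:lineno:line" for start-stop-daemon calls that pass
# -m/--make-pidfile but not -b/--background. Hypothetical helper.
find_foreground_pidfile_starts() {
    for script in "$@"; do
        [ -f "$script" ] || continue
        grep -n 'start-stop-daemon' "$script" | while IFS= read -r line; do
            # Keep only calls that ask start-stop-daemon to write the pidfile.
            case " $line " in
                *" -m "*|*" --make-pidfile "*) ;;
                *) continue ;;
            esac
            # Calls that also background the process are fine.
            case " $line " in
                *" -b "*|*" --background "*) continue ;;
            esac
            printf '%s:%s\n' "$script" "$line"
        done
    done
}
```

On a Buildroot target this might be run as `find_foreground_pidfile_starts /etc/init.d/S*`.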

Examples of good usage:

start-stop-daemon -S -q -m -b -p /var/run/logrotate.pid --exec /usr/sbin/logrotate.sh

or

start-stop-daemon -S -q -p "$PIDFILE" -x "/usr/sbin/$DAEMON"

More info:

Filip Kubicz
  • If init is at PID 1, you can also try halting with `init 0` – koyaanisqatsi Aug 23 '21 at 20:41
  • `# init 0 init: must be run as PID 1` What would the `init 0` command achieve? Do you mean running it after the system is fully booted, and the init is already running? – Filip Kubicz Aug 24 '21 at 07:06
  • init is running at PID 1, as shown by `ps aux | grep init`. – Filip Kubicz Aug 24 '21 at 07:09
  • On my systems (Raspbian/KNOPPIX) init executes scripts in `/etc/rc[0-6].d`. `init 0` does a halt/shutdown with `/etc/rc0.d`, and `init 6` does a reboot and executes all `/etc/rc6.d` scripts. For example, I restart `X` as root on a terminal with `init 2` (multiuser without X) and a few seconds later `init 5` (multiuser with X) – koyaanisqatsi Aug 24 '21 at 09:31
  • Your Raspbian does not use BusyBox init. You mention using `init 6`, which under the hood uses `telinit` to pass the argument to init. On my device there is no telinit, and no runlevels (rc[0-6].d) either. – Filip Kubicz Aug 24 '21 at 10:11
  • Then maybe send `TERM` to init with busybox? Like: `busybox kill -s TERM $(busybox pidof init)`, as in the usage of `init` you posted. – koyaanisqatsi Aug 24 '21 at 10:34
  • Thank you. `kill -s TERM 1` or equivalent `busybox kill -s TERM $(busybox pidof init)` has no effect. – Filip Kubicz Aug 24 '21 at 11:36
  • Read this: https://busybox.busybox.narkive.com/TjHrJu1r/how-to-reboot-the-system-if-busybox-run-as-init. It explains what happens in your case. – 0andriy Aug 24 '21 at 11:39
  • @0andriy this is definitely not that case. I did more testing, and in my case `reboot` sometimes works and sometimes does not. Sometimes the system reboots long after the `reboot` command was issued. The problem is finding out why it behaves this way. I updated the question with more information. – Filip Kubicz Aug 27 '21 at 09:18
  • Sounds like a race condition between the scripts. When it works, another script may already have finished; otherwise it has not, and it prevents rebooting (note: just speculation, you have to read the source code). – 0andriy Aug 27 '21 at 13:50
  • I added more debugging to init.c, and found that one of the init scripts was being run by `start-stop-daemon` with `-m (--make-pidfile)` option, but without `-b (--background)`. This caused the `/etc/init.d/rcS` to be active all the time as a child of `/bin/sh`. It was the reason why `rcK` was not able to run, and could not terminate the system. Problem solved. See https://man7.org/linux/man-pages/man8/start-stop-daemon.8.html – Filip Kubicz Aug 30 '21 at 14:33
  • Actually, your last suggestion @0andriy was right: after the faulty daemon finished, the reboot worked. While it was still running, `rcS` was also running, and reboot had no effect. – Filip Kubicz Aug 30 '21 at 14:34
  • I get why this question was closed - it belongs on a different stackexchange forum - but it's a valuable question, and one which just solved my problem. Suggest admins migrate it to the correct place. – Sod Almighty Aug 11 '23 at 21:07
