0

If I manually stop my service and then execute echo V > /dev/watchdog1, the watchdog stops properly.

If I do the same echo command in my systemd service, I get:

watchdog did not stop!

ExecStopPost=echo V > /dev/watchdog1

Why is the behavior not the same?

Peter Mortensen
David

3 Answers

1

This does not work for the same reason mentioned in this post: Execute multiple commands with && in systemd service ExecStart on RedHat 7.9

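systemd does not run the commands in a unit file through a shell. It splits each Exec line into an executable and its arguments and executes that directly, so shell syntax such as output redirection (>), pipes (|), or && is not interpreted. In the line from the question, > and /dev/watchdog1 are passed to echo as literal arguments, and nothing is written to the device. The systemd.service man page says as much in its section on command lines: redirection, pipes, and other shell elements are not supported.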

Just like in the referenced post, the solution could be writing it as follows:

ExecStopPost=/bin/bash -c 'echo V > /dev/watchdog1'
Peter Mortensen
MiEbe
  • I tried your solution, but got "watchdog: watchdog1: watchdog did not stop!" on the serial port. After the delay I configured in my app, the OS rebooted. – David May 09 '22 at 14:21
  • Just wanted to add: don't forget that the watchdog was started by the code under ExecStart. I think that code has exclusive access to the watchdog1 file, so nothing else can use it until the process is really "killed", because the same "echo" command works after systemctl stop myservice returns. – David May 09 '22 at 14:39
  • My bad... it is working. It looks like it failed because I was also redirecting the output and the errors to a file (echo V > /dev/watchdog1 >> myfile 2>> myfile). – David May 09 '22 at 15:09
0

You can interact with your watchdog through echoes, but I would strongly advise against it.

Every echo opens and closes the watchdog device, so the watchdog has to be configured as non-stoppable (nowayout) for this to work. Each open/close cycle also logs a warning to kmsg, which fills the log with needless spam.

Do it properly: write your own application that keeps the file descriptor open and manages the watchdog through it. See the example below:

#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>

// Read more:
// https://www.kernel.org/doc/Documentation/watchdog/watchdog-api.txt
#include <linux/watchdog.h>

#define WATCHDOG_DEV "/dev/watchdog"

int main(int argc, char** argv) {

  /* Open your watchdog */
  int fd = open(WATCHDOG_DEV, O_RDWR);
  if (fd < 0) {
    fprintf(stderr, "Error: %s\n", strerror(errno));
    exit(EXIT_FAILURE);
  }

  /* Query timeout */
  int timeout = 0;
  if (ioctl(fd, WDIOC_GETTIMEOUT, &timeout) < 0) {
    fprintf(stderr, "Error: Cannot read watchdog timeout: %s\n", strerror(errno));
    exit(EXIT_FAILURE);
  }
  fprintf(stdout, "The timeout is %d seconds\n", timeout);

  /* Query timeleft (optional: not every driver implements it, so do not
     exit here and leave the watchdog armed) */
  int timeleft = 0;
  if (ioctl(fd, WDIOC_GETTIMELEFT, &timeleft) < 0) {
    fprintf(stderr, "Warning: Cannot read watchdog timeleft: %s\n", strerror(errno));
  } else {
    fprintf(stdout, "The timeleft is %d seconds\n", timeleft);
  }

  /* Touch your watchdog */
  if (ioctl(fd, WDIOC_KEEPALIVE, NULL) < 0) {
    fprintf(stderr, "Error: Cannot write watchdog keepalive: %s\n", strerror(errno));
    exit(EXIT_FAILURE);
  }
  fprintf(stdout, "Keepalive written successfully\n");

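  /* Stop your watchdog by writing the magic close character */
  if (write(fd, "V", 1) != 1) {
    fprintf(stderr, "Error: Cannot write magic close: %s\n", strerror(errno));
  }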

  /* Close your watchdog */
  close(fd);

  return 0;
}

Another (and easier) option could be to set up a ready-made watchdog service. See the watchdog package for Debian/Ubuntu.

Peter Mortensen
BeardOverflow
  • Thanks for the advice. Looking at your code, I see that you write "V" to /dev/watchdog. In my case, should I change it to /dev/watchdog1 instead, or does it not matter? – David May 05 '22 at 16:40
  • @david Change `WATCHDOG_DEV` to suit your needs; it can be `/dev/watchdog1` too. Writing a `V` character, also known as 'magic close', allows you to disable your watchdog when `NOWAYOUT=Y` is configured in the watchdog's driver. In other words, if you don't write a magic close and NOWAYOUT=Y, you won't be able to stop the watchdog after closing its file descriptor, and your machine will be rebooted. Read more about the magic close/nowayout feature in `Documentation/watchdog/watchdog-api.txt` – BeardOverflow May 05 '22 at 18:05
0

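I know this deviates slightly from the OP's question, but you could also delegate watchdog management to systemd and use its notification API (sd_notify) instead. Note that WatchdogSec= is a per-service software watchdog supervised by systemd; feeding the hardware watchdog from systemd is configured separately, through RuntimeWatchdogSec= in system.conf.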

[Unit]
Description=My Unit

[Service]
ExecStart=/my/app args
# 30s, but you can specify whatever you want
# (unit files do not support trailing comments, so keep them on their own lines)
WatchdogSec=30
# Optional: restart the app on watchdog failure
# Restart=on-watchdog
# Optional: change the signal sent to kill the app
# WatchdogSignal=SIGABRT

Then, you must periodically reset the watchdog from your app:

sd_notify(0, "WATCHDOG=1");

There are also options to ask systemd to reboot the machine if a service fails, such as FailureAction= and StartLimitAction=, both of which accept reboot as a value.

If you need more information, you can see a comprehensive guide here: https://0pointer.de/blog/projects/watchdog.html

Antoine Viallon