
I use PM2 to run my node.js app. It works fine; however, sometimes my web hosting provider, WebFaction (hello!), kills all the processes on my part of the shared server when my app goes over the memory limit (it's actually a Java-based Neo4j graph database doing that, and I'm working on resolving this issue).

When this happens, PM2, for some reason, thinks my app is still running, even though it is not online (the database, meanwhile, gets restarted by a cron job).

How do I make sure PM2 "knows" when something like this happens and restarts my node.js app even if it thinks it's still running?

Aerodynamika

3 Answers


Using supervisor or monit is the recommended, more robust way.

However, a quick-and-dirty hack that works for me is to use the pm2 save and pm2 resurrect commands in combination with cron.

I ran pm2 save to save a snapshot of all the Node processes I wanted to keep running. Then, while logged in as the Linux user that my pm2 and Node apps run as (e.g. ec2-user on Amazon Linux in my case), I put this in my crontab (using crontab -e):

* * * * * pm2 resurrect

This tells the cron scheduler to run pm2 resurrect every minute of every day. It ensures that even if the Linux kernel kills pm2 itself (which happens often when my tiny t2.micro goes OOM), there will be at most about 60 seconds of downtime: cron will bring pm2 and my NodeJS apps back up (if necessary) automatically every minute.

pm2 resurrect is really useful here because it is idempotent: you can safely run it over and over again, and it won't restart running processes or cause disruptions on reruns. It just checks whether the required Node apps are running; if they are, it does nothing. If they are not, it uses the process list recorded earlier by pm2 save to start them back up.
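Putting it all together, the setup looks roughly like this (a sketch only: the app.js entry point, the my-app process name and the /usr/local/bin/pm2 path are made-up examples; cron usually runs with a minimal PATH, so pointing the crontab entry at the absolute path reported by which pm2 is safer than the bare command):

# start the app under pm2 and snapshot the current process list
pm2 start app.js --name my-app
pm2 save

# find the absolute path to pm2, since cron's PATH is minimal
which pm2                     # e.g. /usr/local/bin/pm2

# crontab -e, then add:
* * * * * /usr/local/bin/pm2 resurrect > /dev/null 2>&1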

nonbeing

I generally use something like monit to keep the process running. The basic idea is to have a cron job that runs every minute and starts PM2 if it isn't running.

You will want to configure PM2 (or whatever you use for this) either to manage the process running in the foreground (the way Supervisord or systemd do), or to monitor a PID file, so that it can confirm the process is actually running and start it if not. I don't know the specifics of how to do this in PM2 (I've never used it), but you can do it in Supervisord or Monit.
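For illustration, a cron-driven PID file check could look roughly like this (a sketch only: the script name, the /usr/local/bin/pm2 path and the choice of pm2 resurrect as the restart command are my assumptions; the pidfile location is the one mentioned in the last answer below):

#!/bin/sh
# check-pm2.sh - hypothetical script, run from cron every minute
PIDFILE="$HOME/.pm2/pm2.pid"    # pm2 daemon pidfile (see the last answer)

# if the pidfile is missing, or the PID it contains is no longer alive,
# bring the pm2 daemon and its saved process list back up
if [ ! -f "$PIDFILE" ] || ! kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    /usr/local/bin/pm2 resurrect    # adjust to your pm2 install path
fi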

will_in_wi

I think a comment on this GitHub issue probably explains why: https://github.com/Supervisor/supervisor/issues/147

In order to implement a stop command, one would need to keep track of the process PID. This is usually done by storing a PID file somewhere in the file system. This creates at least two problems: 1) the PID file can be deleted (intentionally, unintentionally, or through filesystem corruption); 2) the PID file is not deleted after a system crash or ungraceful shutdown (i.e. a pull-the-plug event), and another process may be started with the same PID upon reboot.

These are rare circumstances, but I have run into all of them more times than I ever expected. The only reliable way to monitor the process state is to fork it from a parent (supervisor) process and use kernel facilities to control and monitor the child.

The start/stop/pid approach works in 99% of cases and is suitable for generic applications where failures are a common occurrence and not a big deal. However, there are applications where 99% is not good enough. Think of industrial automation, robots, car controls, flight controls, etc.

There are very few tools that do process control in this way (daemontools, runit, systemd, supervisor, inittab, pm2); even fewer do it right, and supervisor is one of the best.

PM2 stores its own pidfile at $HOME/.pm2/pm2.pid and its monitored processes' pidfiles at $HOME/.pm2/pid/app-pm_id.pid, so I guess one possible fix is to delete those pid files when the system restarts. I haven't tried that, though.
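Untested, but that idea would amount to something like this in the crontab of the user that runs pm2 (a sketch; the pidfile locations are the ones quoted above, the pm2 path is an example, and @reboot is a standard cron extension):

# clear stale pm2 pidfiles after a reboot, then restore the saved process list
@reboot rm -f "$HOME/.pm2/pm2.pid" "$HOME/.pm2/pid/"*.pid; /usr/local/bin/pm2 resurrect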

Qiulang