
Puma: 3.8.2
capistrano3-puma: 3.1.1
Rails: 5.0.1
Ruby: 2.3.3

My setup has 3 servers, and after every deployment there are always one or two of them on which Puma won't restart, even though the deployment log says Command restart sent success.

So I logged in to each server and ran ps aux | grep puma. The output was odd.

Server 1: Puma works fine; it restarted successfully at 10:06 with release 20170726095950

deploy   21379  0.0  3.5 2119668 274772 ?      Sl   10:06   0:01 puma 3.8.2 (unix:///var/www/fbdash/shared/tmp/sockets/puma.sock) [20170726095950]                                                        
deploy   21384  0.7  8.0 2266296 618124 ?      Sl   10:06   0:35 puma: cluster worker 0: 21379 [20170726095950]                                                                                           
deploy   21387  0.9  7.9 2269380 607296 ?      Sl   10:06   0:46 puma: cluster worker 1: 21379 [20170726095950]                                                                                           
deploy   21390  1.0  7.4 2271436 569760 ?      Sl   10:06   0:48 puma: cluster worker 2: 21379 [20170726095950]                                                                                           
deploy   21394  0.8  7.8 2203948 603500 ?      Sl   10:06   0:41 puma: cluster worker 3: 21379 [20170726095950]                                                                                           
deploy   25204  0.0  0.0  10484  2260 pts/2    S+   11:26   0:00 grep --color=auto puma

Server 2: same as server 1; Puma works fine and restarted successfully at 10:06 with release 20170726095950

deploy   21379  0.0  3.5 2119668 274772 ?      Sl   10:06   0:01 puma 3.8.2 (unix:///var/www/fbdash/shared/tmp/sockets/puma.sock) [20170726095950]                                                        
deploy   21384  0.7  8.0 2266296 618124 ?      Sl   10:06   0:35 puma: cluster worker 0: 21379 [20170726095950]                                                                                           
deploy   21387  0.9  7.9 2269380 607296 ?      Sl   10:06   0:46 puma: cluster worker 1: 21379 [20170726095950]                                                                                           
deploy   21390  1.0  7.4 2271436 569760 ?      Sl   10:06   0:48 puma: cluster worker 2: 21379 [20170726095950]                                                                                           
deploy   21394  0.8  7.8 2203948 603500 ?      Sl   10:06   0:41 puma: cluster worker 3: 21379 [20170726095950]                                                                                           
deploy   25204  0.0  0.0  10484  2260 pts/2    S+   11:26   0:00 grep --color=auto puma

Server 3: two of the cluster workers stopped (Puma didn't restart at all), and the remaining processes still point at the old release 20170726080157

deploy   22364  0.0  3.6 2120888 279128 ?      Sl   09:37   0:00 puma 3.8.2 (unix:///var/www/fbdash/shared/tmp/sockets/puma.sock) [20170726080157]                                                                                                                             
deploy   22369  0.1  9.3 2405448 719184 ?      Sl   09:37   0:11 puma: cluster worker 0: 22364 [20170726080157]                                                                                                                                                                
deploy   22378  0.2 10.9 2538956 838948 ?      Sl   09:37   0:15 puma: cluster worker 3: 22364 [20170726080157]                                                                                                                                                                
deploy   30422  0.0  0.0  10484  2160 pts/2    S+   11:26   0:00 grep --color=auto puma

This causes all requests routed to this server to time out with HTTP 408.


I've also tried cap production puma:stop followed by cap production puma:start, but that doesn't work either. I had to explicitly kill all the old processes with kill -9 PIDs and then run cap production puma:start.
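
For reference, the manual cleanup boils down to something like this rough sketch (the release tag is the stale one from server 3 and differs per server); it tries SIGTERM first and only falls back to SIGKILL:

# Sketch: target only processes whose command line carries the stale release tag.
pkill -TERM -f 'puma.*\[20170726080157\]'   # ask the old master and workers to exit cleanly
sleep 10                                    # give them a moment to drain requests
pkill -KILL -f 'puma.*\[20170726080157\]'   # force-kill anything that ignored SIGTERM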

So, effectively,

~/.rvm/bin/rvm default do bundle exec pumactl -S /var/www/fbdash/shared/tmp/pids/puma.state -F /var/www/fbdash/shared/puma.rb stop/restart

is doing nothing.

Any idea why this is happening?
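
One check that might help narrow this down (a sketch using the paths from above; whether the state file is actually the problem here is only a guess): pumactl -S locates the master through the state file, so it's worth confirming that the PID recorded there matches the master that is really serving traffic.

cat /var/www/fbdash/shared/tmp/pids/puma.state   # YAML; contains the master PID that pumactl will talk to
# Compare that PID with the master that is actually running:
ps -o pid,lstart,cmd -p "$(awk '/^pid:/ {print $2}' /var/www/fbdash/shared/tmp/pids/puma.state)"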


I also found out that the Puma workers are not shutting down gracefully on restart. This results in the old processes still running, while the socket nginx listens on sits in the new release folder (which hasn't been started yet).

[24498] PumaWorkerKiller: Consuming 3081.06640625 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 3081.06640625 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 3081.06640625 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 3081.06640625 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 3081.06640625 mb with master and 4 workers.
[24498] - Gracefully shutting down workers...
[24498] PumaWorkerKiller: Consuming 1250.91796875 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 1250.91796875 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 1250.91796875 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 1250.91796875 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 1250.91796875 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 1250.91796875 mb with master and 4 workers.

Notice that it says Gracefully shutting down workers, but the very next lines show the workers still running under the same old PID.
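
Since the master in that log keeps its PID, it may also help to see which process actually holds the shared socket at any given moment. A sketch (the socket path is the one from the ps output above; run as root or as the deploy user):

# List listening unix sockets together with the processes that own them.
ss -xlp | grep /var/www/fbdash/shared/tmp/sockets/puma.sock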

  • I've no idea what the reason is, but I know an approach that has worked in production for years without trouble: just use the system's own way to start/stop/restart/monitor daemons, like SysV, Upstart and systemd on Debian. – Pavel Mikhailyuk Jul 26 '17 at 13:57
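
A minimal sketch of what that comment suggests, using systemd (the unit name puma-fbdash, the deploy user's home directory and the paths are assumptions pieced together from the question, not a tested setup):

sudo tee /etc/systemd/system/puma-fbdash.service >/dev/null <<'EOF'
[Unit]
Description=Puma for fbdash
After=network.target

[Service]
Type=simple
User=deploy
WorkingDirectory=/var/www/fbdash/current
# ~ does not expand in unit files, so the rvm path is spelled out (assumed location).
ExecStart=/home/deploy/.rvm/bin/rvm default do bundle exec puma -C /var/www/fbdash/shared/puma.rb
Restart=always

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now puma-fbdash

With something like this in place, the Capistrano restart task would run sudo systemctl restart puma-fbdash instead of pumactl, and systemd takes care of reaping the old master.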
