Puma: 3.8.2
capistrano3-puma: 3.1.1
Rails: 5.0.1
Ruby: 2.3.3
My setup has 3 servers, and after each deployment there are always one or two of them on which Puma won't restart, even though the deployment log says Command restart sent success.
I then logged in to all three servers and ran ps aux | grep puma. The output was odd.
Server 1: puma works fine, restarted successfully at 10:06 with release 20170726095950
deploy 21379 0.0 3.5 2119668 274772 ? Sl 10:06 0:01 puma 3.8.2 (unix:///var/www/fbdash/shared/tmp/sockets/puma.sock) [20170726095950]
deploy 21384 0.7 8.0 2266296 618124 ? Sl 10:06 0:35 puma: cluster worker 0: 21379 [20170726095950]
deploy 21387 0.9 7.9 2269380 607296 ? Sl 10:06 0:46 puma: cluster worker 1: 21379 [20170726095950]
deploy 21390 1.0 7.4 2271436 569760 ? Sl 10:06 0:48 puma: cluster worker 2: 21379 [20170726095950]
deploy 21394 0.8 7.8 2203948 603500 ? Sl 10:06 0:41 puma: cluster worker 3: 21379 [20170726095950]
deploy 25204 0.0 0.0 10484 2260 pts/2 S+ 11:26 0:00 grep --color=auto puma
Server 2: Same as server 1, puma works fine, restarted successfully at 10:06 with release 20170726095950
deploy 21379 0.0 3.5 2119668 274772 ? Sl 10:06 0:01 puma 3.8.2 (unix:///var/www/fbdash/shared/tmp/sockets/puma.sock) [20170726095950]
deploy 21384 0.7 8.0 2266296 618124 ? Sl 10:06 0:35 puma: cluster worker 0: 21379 [20170726095950]
deploy 21387 0.9 7.9 2269380 607296 ? Sl 10:06 0:46 puma: cluster worker 1: 21379 [20170726095950]
deploy 21390 1.0 7.4 2271436 569760 ? Sl 10:06 0:48 puma: cluster worker 2: 21379 [20170726095950]
deploy 21394 0.8 7.8 2203948 603500 ? Sl 10:06 0:41 puma: cluster worker 3: 21379 [20170726095950]
deploy 25204 0.0 0.0 10484 2260 pts/2 S+ 11:26 0:00 grep --color=auto puma
Server 3: Two of the cluster workers stopped (didn't restart at all), and the remaining processes are still pointing at the old release 20170726080157
deploy 22364 0.0 3.6 2120888 279128 ? Sl 09:37 0:00 puma 3.8.2 (unix:///var/www/fbdash/shared/tmp/sockets/puma.sock) [20170726080157]
deploy 22369 0.1 9.3 2405448 719184 ? Sl 09:37 0:11 puma: cluster worker 0: 22364 [20170726080157]
deploy 22378 0.2 10.9 2538956 838948 ? Sl 09:37 0:15 puma: cluster worker 3: 22364 [20170726080157]
deploy 30422 0.0 0.0 10484 2160 pts/2 S+ 11:26 0:00 grep --color=auto puma
This causes all requests directed to this server to time out with HTTP 408.
I've also tried cap production puma:stop followed by cap production puma:start, but that didn't work either. I had to explicitly kill all the old processes with kill -9 PIDs, and then run cap production puma:start.
So, effectively,
~/.rvm/bin/rvm default do bundle exec pumactl -S /var/www/fbdash/shared/tmp/pids/puma.state -F /var/www/fbdash/shared/puma.rb stop/restart
is doing nothing.
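As far as I understand it, pumactl -S signals whatever master PID is recorded in the state file, so if that PID no longer matches a live Puma master, stop/restart silently becomes a no-op. A minimal sketch of checking this, using a sample state file in place of the real /var/www/fbdash/shared/tmp/pids/puma.state:

```shell
# Sample state file mimicking Puma's YAML format; on the server the real
# file lives at /var/www/fbdash/shared/tmp/pids/puma.state.
STATE=$(mktemp)
printf -- '---\npid: 22364\ncontrol_url: unix:///tmp/pumactl.sock\n' > "$STATE"

# pumactl -S reads this pid and signals it; extract it the same way.
PID=$(awk '/^pid:/ {print $2}' "$STATE")
echo "state file pid: $PID"

# kill -0 tests liveness without sending a signal. A dead PID here means
# pumactl's stop/restart is signalling a process that no longer exists.
if kill -0 "$PID" 2>/dev/null; then
  echo "pid $PID is alive"
else
  echo "pid $PID is stale"
fi

rm -f "$STATE"
```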
Any idea why this is happening?
I also found out that Puma workers are not shutting down gracefully when restarted. This results in the old processes still running, with the nginx listening socket sitting in the new release folder (which hasn't been started yet).
[24498] PumaWorkerKiller: Consuming 3081.06640625 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 3081.06640625 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 3081.06640625 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 3081.06640625 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 3081.06640625 mb with master and 4 workers.
[24498] - Gracefully shutting down workers...
[24498] PumaWorkerKiller: Consuming 1250.91796875 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 1250.91796875 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 1250.91796875 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 1250.91796875 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 1250.91796875 mb with master and 4 workers.
[24498] PumaWorkerKiller: Consuming 1250.91796875 mb with master and 4 workers.
Notice that it says Gracefully shutting down workers, but on the next line the workers continue running under the old PID.
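For reference, here is a sketch of what the relevant parts of shared/puma.rb presumably look like in a setup like this (the socket and state paths are taken from the ps output above; the worker count, threads, directory, and prune_bundler lines are my assumptions). Pointing directory at the current symlink and enabling prune_bundler is what lets a restart pick up the new release instead of holding onto the old one:

```ruby
# shared/puma.rb -- sketch only; socket/state paths match the ps output
# above, everything else is an assumption about this setup.
directory '/var/www/fbdash/current'   # symlink, so restarts see the new release
workers 4                             # matches the 4 cluster workers above
threads 1, 16
bind 'unix:///var/www/fbdash/shared/tmp/sockets/puma.sock'
state_path '/var/www/fbdash/shared/tmp/pids/puma.state'
pidfile '/var/www/fbdash/shared/tmp/pids/puma.pid'
prune_bundler                         # detach workers from the old release's Gemfile
```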