
From time to time on my Ruby on Rails API service, I can see that requests are being interrupted in the middle of processing. The Puma server simply stops processing the request, and right after that I see a message in the logs saying:

[10] - Worker 0 (PID: 811) booted in 0.01s, phase: 0

I'm trying to figure out the root cause. The API service runs as a pod in Kubernetes, with an AWS ELB in front of it.

Just to be clear, the K8S pod doesn't restart; only one of Puma's workers gets "killed" and is started again right after.
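A common cause of this pattern is the kernel OOM killer terminating a single process inside the container: if the killed process isn't the container's PID 1, the container (and therefore the pod) keeps running, and Puma just reboots the worker. A few standard kubectl diagnostics can help confirm or rule that out; the pod name below is a placeholder, and `dmesg` may require root or be absent from slim images:

```shell
# "my-api-pod" is a placeholder for the actual pod name.

# Pod-level events and container status. Note this only shows
# "OOMKilled" when the whole container was killed, not a single
# child process inside it.
kubectl describe pod my-api-pod

# Last terminated state of the container, if any.
kubectl get pod my-api-pod \
  -o jsonpath='{.status.containerStatuses[0].lastState}'

# The kernel log records per-process OOM kills that do NOT restart
# the container. This may require root or a dmesg binary that slim
# base images lack.
kubectl exec -it my-api-pod -- dmesg | grep -i oom
```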

    Are there k8s events related to the pod? Is the pod restarting or just the process inside of the pod? – jordanm Dec 13 '21 at 15:59
  • The pod isn't being restarted, just one of the Puma workers gets killed while the other worker runs without a problem. – OndrejK Dec 13 '21 at 17:19
  • What Ruby version and Puma version are you using? – Robert Nubel Dec 13 '21 at 17:23
  • Ruby 2.4, Rails 5.0.5, and Puma 5.5.2. – OndrejK Dec 13 '21 at 17:25
  • Are you running Puma in threaded or clustered mode? I think your post implies there are multiple worker processes running within the pod, and only one is restarting. You might try adding some logging into Puma's `on_worker_shutdown` hook to rule out if the shutdown is under Puma's control. If it's a process crash outside of Puma's control, there might be logs elsewhere, like /var/log/syslog, with more info. Might also be worth updating Ruby to 2.7. – Robert Nubel Dec 13 '21 at 17:50
  • Yes Robert, I have 2 workers in each pod and they both run 3 threads. And randomly one of the worker processes gets killed. Currently on one branch I'm upgrading to Ruby 2.6 and Rails 5.2.5. I'm not sure if it will help, but the upgrade is needed. – OndrejK Dec 14 '21 at 12:39
  • I have observed the same behavior with Rails 7 and Ruby 2.7, and having trouble figuring out the best way to troubleshoot this on AWS. – Kostas Aug 24 '22 at 11:16
  • can you describe the pod and share the output – P Ekambaram Aug 26 '22 at 04:56
  • I'm also struggling with this issue right now. I see errors in my nginx logs where a request is interrupted, and a "worker booted" message from Puma, but no further information. I'm also running k8s, maybe this could be somehow related to running Puma in a k8s pod? Has anyone figured out why this might be happening, or how to get more information? I've posted here on the Puma repo as well: https://github.com/puma/puma/discussions/3193 – ndbroadbent Jul 13 '23 at 13:14
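One of the comments suggests logging from Puma's lifecycle hooks to distinguish a controlled shutdown from an external kill. A minimal sketch of that idea in `config/puma.rb` follows; the hook names (`on_worker_boot`, `on_worker_shutdown`) are part of Puma's configuration DSL, while the log destination and message format here are illustrative assumptions:

```ruby
# config/puma.rb -- sketch of worker lifecycle logging (assumed format).
workers 2
threads 3, 3

on_worker_boot do
  STDOUT.puts "[puma] worker pid=#{Process.pid} booted at #{Time.now.utc}"
end

on_worker_shutdown do
  # If this line appears before an unexpected restart, the shutdown
  # happened under Puma's control (e.g. SIGTERM, worker_timeout).
  # If a worker vanishes WITHOUT this line, look for an external
  # cause such as the kernel OOM killer or a segfault.
  STDOUT.puts "[puma] worker pid=#{Process.pid} shutting down at #{Time.now.utc}"
end
```

Pairing these log lines with the pod's memory metrics around the restart timestamps should make an OOM kill fairly easy to spot.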

0 Answers