I've been using Heroku to host my application for several years and just started running into issues with the worker queue getting backlogged. I was hoping I could fix this by increasing the number of workers running so queued jobs could be completed in parallel, but whenever I scale up my number of workers, all but one crash.
Here's my Procfile:
web: vendor/bin/heroku-php-apache2 public
worker: php /app/artisan queue:restart && php /app/artisan queue:work redis --tries=3 --timeout=30
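In case it's relevant, the scaling itself is just the standard Heroku CLI command (run against this app's Heroku remote):

```shell
# Scale the worker formation to 2 dynos -- this is what triggers the crash
heroku ps:scale worker=2

# Scaling back down leaves the single surviving worker running
heroku ps:scale worker=1
```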
Here's the output from my server logs when I scale my workers to anything greater than 1 (in this example, scaling to 2 workers):
Mar 16 06:04:51 heroku/worker.1 Starting process with command `php /app/artisan queue:restart && php /app/artisan queue:work redis --tries=3 --timeout=30`
Mar 16 06:04:52 heroku/worker.1 State changed from starting to up
Mar 16 06:04:54 app/worker.1 Broadcasting queue restart signal.
Mar 16 06:04:58 heroku/worker.2 Process exited with status 0
Mar 16 06:04:58 heroku/worker.2 State changed from up to crashed
Mar 16 06:04:58 heroku/worker.2 State changed from crashed to starting
Mar 16 06:05:09 heroku/worker.2 Starting process with command `php /app/artisan queue:restart && php /app/artisan queue:work redis --tries=3 --timeout=30`
Mar 16 06:05:10 heroku/worker.2 State changed from starting to up
Mar 16 06:05:14 app/worker.2 Broadcasting queue restart signal.
Mar 16 06:05:19 heroku/worker.1 Process exited with status 0
Mar 16 06:05:19 heroku/worker.1 State changed from up to crashed
As you can see, both workers try starting, but only worker.2 stays in the up status.
The crashed workers try restarting every 10 minutes, with the same result as above.
When I run heroku ps, here's what I see:
=== worker (Standard-1X): php /app/artisan queue:restart && php /app/artisan queue:work redis --tries=3 --timeout=30 (2)
worker.1: crashed 2021/03/16 06:05:19 -0600 (~ 20m ago)
worker.2: up 2021/03/16 06:05:10 -0600 (~ 20m ago)
(My normal web dynos scale up and down just fine, so I'm not showing them here.)
Any thoughts as to what could be happening? My first thought was that Heroku itself was having an issue, but I ruled that out. My second thought is that my Procfile entry for the worker could be the problem, but I don't know enough about what that entry does to pinpoint the cause.
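For reference on what that entry does, as I understand it: queue:restart broadcasts a restart signal (stored in the cache) that every running queue:work process picks up and gracefully exits on, and queue:work then starts the long-running worker loop. If that chained restart turns out to be the problem, the variation I'd try is a worker entry without it (untested on my end):

```text
worker: php /app/artisan queue:work redis --tries=3 --timeout=30
```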
Again, this has worked fine with a single worker for a long time; the crashing only happens when I scale to more than one. Regardless of how many workers I scale to, exactly one stays up and keeps receiving and processing jobs while the rest crash.
Misc info:
- Heroku stack: Heroku-18
- Laravel version: 8.*
- Queue driver: Redis
Update - I scaled up the workers on my staging environment and was able to scale them up and down without any crashes. Now I'm wondering whether there's an add-on conflict or something else environment-specific going on. I'll update this if I learn anything more (I've already reached out to Heroku support).
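In the meantime, I'm diffing the two environments from the CLI to look for add-on or config differences (app names below are placeholders for my actual apps):

```shell
# List add-ons in each environment and compare
heroku addons --app my-production-app
heroku addons --app my-staging-app

# Compare config vars between the two environments
heroku config --app my-production-app
heroku config --app my-staging-app
```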