I have a Rails 6 app that uses Shoryuken (5.2.3) with SQS to asynchronously handle user-initiated image processing and uploading. As far as the side effects of the job (which includes Shoryuken::Worker) are concerned, everything appears to work as expected. However, the Shoryuken logs (from running bundle exec shoryuken -q my-queue-name -R) contain some confusing output. The job has some puts statements for debugging. While the initial run prints the expected sequence of puts output, I start seeing output implying that the same job (the one still in flight) has been started again before it has finished or errored out. This continues until the pool of threads has been exhausted, and then this error is raised:

ERROR: Processor failed: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use

My concern is that this could block connections for users and negatively impact user experience, or have unexpected consequences from partial job runs (despite the job being idempotent). I assume I might need to configure the retry settings, but that's just a guess, and any input on this would be helpful.

humbledev7000
  • Not sure this is the answer, but it's starting to look like this might be at the root of my issue, given how long-running my job is: https://github.com/ruby-shoryuken/shoryuken/issues/264 – humbledev7000 Dec 12 '22 at 01:48
  • Any chance you have a short visibility timeout and are not using a dead-letter queue? That would cause the messages to keep coming back as available in case of failures. You also need to make sure your concurrency (overall, across all Shoryuken instances) is compatible with your connection pool size. Shoryuken can't read a message twice or read in-flight messages; SQS does not support that. – Pablo Cantero Apr 13 '23 at 16:35
  • So I figured out it was a number of things. Yes, my retry time was too low, my concurrency count was too high for the computational burden of the job and the available system resources, and I also needed to throttle how quickly jobs were being queued. – humbledev7000 Apr 14 '23 at 03:57
  • Good that you figured it out. Theoretically, you should not worry about how quickly the messages are enqueued. If throughput is a problem, it's recommended to limit how fast the messages are consumed instead. You can reduce the concurrency limit or use a FIFO queue to throttle consumption. – Pablo Cantero Apr 15 '23 at 05:13
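
The two checks discussed in the comments (concurrency versus connection pool size, and job duration versus visibility timeout) can be sketched in plain Ruby. This is only a rough illustration, not Shoryuken or SQS API: the method names and the numbers are hypothetical, it assumes each Shoryuken process has its own ActiveRecord connection pool, and it assumes the visibility timeout is never extended while the job runs.

```ruby
# Hypothetical helpers for sanity-checking a worker setup.
# None of these names come from Shoryuken or the AWS SDK.

# True when every worker thread in one process can hold a database
# connection at the same time. Assumes one connection per running job
# and one pool per process.
def pool_large_enough?(concurrency_per_process:, pool_size:)
  concurrency_per_process <= pool_size
end

# Upper bound on how many times one message can be handed to a worker
# while the first run is still in progress, assuming a free worker picks
# the message up as soon as it becomes visible again.
def deliveries_during_run(job_seconds:, visibility_timeout_seconds:)
  (job_seconds.to_f / visibility_timeout_seconds).ceil
end

# Example with made-up numbers: 25 worker threads, a pool of 5 connections,
# and a 100 second job on a queue with a 30 second visibility timeout.
puts pool_large_enough?(concurrency_per_process: 25, pool_size: 5) # => false
puts deliveries_during_run(job_seconds: 100, visibility_timeout_seconds: 30) # => 4
```

With numbers like these, the same message would be picked up again several times before the first run finishes, and the extra runs would compete for a pool that is already too small, which would match the "could not obtain a connection from the pool" error in the question. Lengthening the visibility timeout past the job duration, or lowering concurrency to fit the pool, makes both checks pass.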

0 Answers