
I have a queue for photos that people upload to my site, since at peak times there can be 10,000 or more photos uploaded in an hour.

My PHP script is called via cron every minute, and picks photos from the database with the WHERE clause FLOOR(photoId / 100) % 10 = $digit, where $digit = $counter % 10.

So on the first minute photos with Ids matching xxxx0xx get processed, then the next minute photos with Ids xxxx1xx, then xxxx2xx, etc. I LIMIT 0, 1000 the query and let it try to do as many as it can before I time it out.
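To make the bucketing concrete, here is a minimal Python sketch of the selection rule (not the actual PHP script). The function names `bucket_digit` and `ids_for_run` and the sample IDs are made up for illustration; only the FLOOR(photoId / 100) % 10 rule and the 1000-row limit come from the description above.

```python
def bucket_digit(photo_id: int) -> int:
    """Hundreds digit of the ID, i.e. FLOOR(photoId / 100) % 10."""
    return (photo_id // 100) % 10


def ids_for_run(pending_ids, counter, limit=1000):
    """IDs a run with the given counter would pick (hypothetical helper)."""
    digit = counter % 10
    matching = [pid for pid in sorted(pending_ids) if bucket_digit(pid) == digit]
    return matching[:limit]


# Sample IDs: a burst that straddles a hundreds boundary.
pending = [123456, 123457, 123499, 123500, 123501]

for counter in (4, 5):
    print(counter, ids_for_run(pending, counter))
# 4 [123456, 123457, 123499]
# 5 [123500, 123501]
```

Consecutive IDs stay in the same run until the hundreds digit rolls over, which is why a burst usually (but not always) gets processed together.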

I have a 6-core HT Xeon, and typically leave each of these processes running for 6 minutes, so up to six of them overlap at any time.

When I launch the process I run it through ionice and nice, and if the 1-minute load average for the server is above 6 I abort. If not, I increment $counter and carry on.
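The load check can be sketched roughly like this in Python. The threshold of 6 is from the description above; `should_abort` and `next_counter` are hypothetical names, and I'm assuming the counter is left untouched when a run aborts. `os.getloadavg()` is the standard-library call for the 1, 5 and 15 minute load averages on Unix.

```python
import os

MAX_LOAD = 6.0  # abort threshold for the 1-minute load average


def should_abort(one_minute_load: float, max_load: float = MAX_LOAD) -> bool:
    """True when the server is too busy to start another worker."""
    return one_minute_load > max_load


def next_counter(counter: int, one_minute_load: float) -> int:
    """Counter to store after this run: unchanged on abort, else incremented."""
    if should_abort(one_minute_load):
        return counter
    return counter + 1


if __name__ == "__main__":
    load1, _, _ = os.getloadavg()  # Unix only
    print("abort" if should_abort(load1) else "run")
```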

This seems to be a fairly good balance of getting lots done, without holding up website performance too much (everything is on the same dedicated server).

Why am I doing FLOOR(photoId / 100) rather than just photoId? Well, people upload a lot of batch photos (i.e. a rapid burst of shots), and it works better on the site for those to all appear at the same time. This won't always accomplish that, since a burst can straddle a boundary between hundreds, but it'll be pretty close and much better than just modding photoId.

When the queue is busy and lots of people are uploading, this works very well.

But at a quiet time, when just one photographer uploads just a few dozen photos, they might get "unlucky" and have the queue take up to 10 minutes to come around to their Ids.

What'd be the best way to mitigate this? Is my whole idea of the queue up to this point rubbish and there's something better I should be doing?

Instead of the whole % technique, I could just pick the first 1000 Ids that aren't marked as "being processed".

I'd then mark them as "being processed" in the database so the next process won't pick them, do as many as possible, and then unmark the rest so they can be picked again. But then if there are fewer than 1000 in the queue, the next 5 processes would be unable to select any to help until that initial process has timed out...

Suggestions please!

Thank you

Codemonkey
  • Sometimes a trade off is expected (some scenarios work well, others don't) and you code for the 'best' case. Unfortunately without knowing what the processing involves, and any code, it's too broad a question (IMHO). – Nigel Ren Jun 15 '18 at 06:45
  • Just resizing and applying watermarks etc. Takes ~2 seconds per photo on average, but I can do about 8,000 an hour running these multiple processes. – Codemonkey Jun 15 '18 at 06:47
