I have an APNS notification server set up, which would in theory send a processed notification to about 50,000 to 100,000 users every day (based on the number of users of our web app that ties in with the iOS app).

The notification would go out around 2, but it must be sent to each user individually (using Urban Airship), and the job is triggered by curl from a cron job.

It iterates through each user and has to use an HTML scraper (simple_html_dom, to be exact), which takes about 5-10s per user and is obviously very memory intensive. A simple GET request can't be the right way to go about this; in fact, I'm positive it will fail. What is the best way to handle this long, memory-intensive task on a cron job?

roozbubu
  • Why is it memory intensive? – Martinsos Apr 13 '13 at 01:19
  • I'm assuming it would be, no? Going through 50,000 users, each processing for 10s? Or would it just take a long time? – roozbubu Apr 13 '13 at 01:20
  • Well, if you are not loading all users at once but loading them one by one (which makes much more sense), then it would not be memory intensive; it would only take a long time. I guess the users are stored in a database? – Martinsos Apr 13 '13 at 01:23
  • Also, why are you processing each user for 10s? Could you explain why it takes so long? – Martinsos Apr 13 '13 at 01:23
  • Oh yes, it is one by one, and that's just the speed of the server in creating the objects and loading in the HTML (from the scraper) – roozbubu Apr 13 '13 at 01:24
  • OK, then you don't have a memory problem, only a time problem. Why do you need to use web scraping? – Martinsos Apr 13 '13 at 01:28
  • Creating objects and scraping some HTML *should* not take 5-10 seconds. It should be quite fast, ideally counted in hundreds of milliseconds. Now, if you make a call to an external service for every single user, then that would be your bottleneck. – Sverri M. Olsen Apr 13 '13 at 01:35
  • Interesting. I am really just scraping a single page off Yahoo's web services, but that page itself does take quite a while to load (maybe 3s). Not sure where the other 2s are coming from. – roozbubu Apr 13 '13 at 02:24
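
The comments point at the external page load as the likely bottleneck. If several users end up needing the same page (an assumption; the thread does not say so), fetching each URL once per run and reusing the result avoids paying those seconds again. A minimal sketch in Python, where `PageCache` and the `fetch` callable are hypothetical names and not part of simple_html_dom or Urban Airship:

```python
from typing import Callable, Dict


class PageCache:
    """Remember fetched pages by URL for the duration of one cron run.

    `fetch` is a hypothetical callable that downloads a URL and returns
    its body as text; swap in whatever HTTP client the job already uses.
    """

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch
        self._pages: Dict[str, str] = {}
        self.fetch_count = 0  # number of real downloads performed

    def get(self, url: str) -> str:
        # Only hit the network the first time a URL is requested.
        if url not in self._pages:
            self._pages[url] = self._fetch(url)
            self.fetch_count += 1
        return self._pages[url]
```

If every user really does need a different page, a cache like this will not help, and the slow remote request itself is what needs attention.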

1 Answer

If you reuse the same variables, or set the ones you no longer need to null, you won't run out of memory. Just don't load all the data at once, and free each piece (set it to null) or replace it with new data once you have processed it.

Also check whether you can speed up the task; 5-10s per user sounds really long.
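
As a rough illustration of the "one at a time" idea, here is a sketch in Python (the same pattern carries over to PHP). It assumes the users live in a SQL table called `users` with an `id` column; that table, along with the `build_message` and `send` callables, is hypothetical and stands in for the scraping step and the Urban Airship call:

```python
import sqlite3
from typing import Callable, Iterator


def iter_user_ids(conn: sqlite3.Connection, batch_size: int = 500) -> Iterator[int]:
    """Yield user ids in small batches so the full list is never in memory."""
    cursor = conn.execute("SELECT id FROM users ORDER BY id")
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for (user_id,) in rows:
            yield user_id


def notify_all(
    conn: sqlite3.Connection,
    build_message: Callable[[int], str],
    send: Callable[[int, str], None],
) -> int:
    """Process users one by one and return how many were handled."""
    sent = 0
    for user_id in iter_user_ids(conn):
        # The per-user data only lives for this iteration; the name is
        # rebound on the next pass, so the old value can be reclaimed.
        message = build_message(user_id)
        send(user_id, message)
        sent += 1
    return sent
```

Memory use stays roughly flat regardless of the number of users; total run time is still the per-user cost multiplied by the user count, so this addresses memory only, not speed.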

Gustek