0

I am struggling with a problem that occurs only on very long-running requests. I have a web application that can export a large amount of data on demand as an XML file. Depending on the data, the export takes anywhere from a second to several minutes. I tested the export with a small amount of data, and the XML generation and file download worked well. Then I tested it with a large amount of data, and the XML generation ran for about 30 minutes (due to lots of database queries). The problem with this long-running request is that after generating the XML file, it does not download the file but starts generating it again. While the second XML generation is running, the browser gets a site loading error. The same happened with even more data and an XML generation time of more than one hour... So why is this happening? The problem occurs only for requests running about 30 minutes or longer. Is it possible that some timeout restarts the request? But the second file generation starts exactly after the first one finishes, not after a fixed duration.

sandro1111
  • 63
  • 1
  • 8
  • Are you using sessions? – Marek Oct 10 '13 at 11:00
  • Yes, I am using sessions. Could the problem be caused by sessions? – sandro1111 Oct 14 '13 at 11:55
  • No, session is important for the reason @OddEssay already pointed out. – Marek Oct 14 '13 at 12:08
  • Ah okay, but as I mentioned in the comment to @OddEssay's answer this is probably not the problem because it behaves differently for short and for long running requests... Do you have any other idea what could cause this problem? – sandro1111 Oct 14 '13 at 12:25
  • 1
    Have you checked if the browser makes the request twice? It might be retrying. – Marek Oct 14 '13 at 12:53
  • I checked the access log and yes, it has 2 entries. Does that mean the browser retried it? Is there any way to change that? – sandro1111 Oct 15 '13 at 15:09
  • 1
    It shouldn't do that, maybe a JS library you use? But definitely you can't get a response back to the user for large files; I suggest you go for offline processing. – Marek Oct 16 '13 at 06:22
  • Thank you @Marek, I'll do offline processing... I would have liked to find the root cause, but it is enough to know that such long-running requests don't work; they should be avoided for usability reasons anyway. – sandro1111 Oct 16 '13 at 07:49

1 Answer

2

But the second file generation starts exactly after finishing the first and not after a fixed duration.

If you use file-based sessions, the lock on the session file allows only one running PHP script to access that session at a time, which could account for the blocking and why the next request starts as soon as the current script finishes.
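A minimal sketch of working around that lock: read what you need from the session, then release the lock with `session_write_close()` before the long-running work starts, so a second request is not queued behind the export. (Variable names here are illustrative, not from the original post.)

```php
<?php
// Read the session data you need, then release the file lock immediately.
session_start();
$userId = isset($_SESSION['user_id']) ? $_SESSION['user_id'] : null;
session_write_close(); // other requests can now acquire the session

// ...long-running XML generation goes here, no longer holding the lock...
// Note: writes to $_SESSION after this point are not persisted unless
// session_start() is called again.
```

This does not explain the duplicated request by itself, but it removes the serialization that makes the second request wait for the first to finish.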

For big exports, one option is to process them "offline" in the backend and check for the finished file. E.g.: the browser requests a download and immediately gets a "key" back, while PHP spawns the export in the background. The browser can then keep checking whether the export for that key has finished, and download it when it is ready. This lets the user start the export without worrying about the browser closing midway, and download the same export multiple times without regenerating it.
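A hypothetical sketch of that key-and-poll pattern (script names, paths, and the `exec`-based spawning are assumptions for illustration, not part of the original post):

```php
<?php
// export_start.php: hand back a key right away, run the export detached.
$key = uniqid('export_', true);
// One simple way to detach a worker on Linux: shell out with '&'.
exec(sprintf('php export_worker.php %s > /dev/null 2>&1 &',
    escapeshellarg($key)));
header('Content-Type: application/json');
echo json_encode(array('key' => $key));
```

```php
<?php
// export_status.php: the browser polls this URL with the key until
// the worker has written the finished file.
$key  = basename($_GET['key']); // basename() guards against path traversal
$file = '/tmp/exports/' . $key . '.xml';
header('Content-Type: application/json');
if (is_file($file)) {
    echo json_encode(array('ready' => true,
                           'url'   => '/download.php?key=' . urlencode($key)));
} else {
    echo json_encode(array('ready' => false));
}
```

In production you would likely use a proper job queue instead of `exec`, but the shape is the same: one cheap request to start, cheap polls to check, one download when ready.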

Alternatively, if the export is needed regularly, just preprocess it from a cron job so the end user can just download the fresh data quickly when they want it without waiting.
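For the cron route, the entry could look like this (paths and schedule are illustrative assumptions):

```shell
# Regenerate the export every night at 02:00 so users download a
# prebuilt file instead of waiting for generation.
0 2 * * * php /var/www/app/export_worker.php nightly > /dev/null 2>&1
```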

OddEssay
  • 1,334
  • 11
  • 19
  • 2
    Or instead of having the browser pinging away at the server, get an email address, and have the report or a link to the report emailed to the requester once the report has been generated. – jmarkmurphy Oct 10 '13 at 18:25
  • Okay, that sounds plausible, but actually I'm not making a second request. The only request I trigger is the export, and it completes correctly when the export is small and lasts only a few seconds; but if the export takes several minutes (~30 min), the request is made twice. So the behavior differs depending on the execution time. Some timeout seems to be involved... – sandro1111 Oct 14 '13 at 08:24