
I need to receive a large amount of data from an external source. The problem is that the external source sends data very slowly. The workflow is like this:

  1. The user initiates some process from the app interface (commonly, fetching data from a local XML file). This is a fast process.
  2. After that we need to load information connected with the fetched data from an external source (basically, external statistics for the data from the XML). This is very slow. But the user needs this additional information to continue working. For example, he may perform filtering according to the external data.

So, we need to do it asynchronously. The main idea is to show the external data as it becomes available. The question is: how could we organise this async process? Maybe some queues or something else? We're using PHP + MySQL on the backend and jQuery on the front end. Thanks a lot!

hakre
Maxim
  • is there any way to use web services – COLD TOLD Apr 16 '12 at 17:44
  • Does caching the data help in part 2? – Larry Battle Apr 16 '12 at 17:47
  • If you're receiving data in chunks, [jQuery Deferred objects](http://api.jquery.com/category/deferred-object/) might help here. – wzub Apr 16 '12 at 17:52
  • please specify - do you store the fetched data in a database or do you use it for real-time display? how do you get the data from the external source? do you use cURL or similar? – Michal Apr 16 '12 at 17:54
  • @LarryBattle, caching might help a bit, but it couldn't solve the problem, because the data is different each time (repetition of previously cached data is ~20% at most). Of course we will first check whether the item is in the cache, but it will almost always be a new item. – Maxim Apr 16 '12 at 17:55
  • @COLDTOLD, could you explain more about web services? What do you mean? – Maxim Apr 16 '12 at 17:55
  • @Michal, we're going to store the data in the DB for caching. But the main purpose is to deliver this data to the user as fast as possible. We fetch the data with cURL; the problem isn't the data transfer speed, but the fact that the external data provider answers very slowly. – Maxim Apr 16 '12 at 18:01
  • Maxim, in general you want to abstract heavy operations behind a set of endpoints that allow you to set off a job and then query it for completion and for new data. This is probably what @COLDTOLD means by web services. An example would be two scripts, begin_statistics.php and statistics_status.php. You would post the XML in to begin_statistics, which would launch an async call to the external service, store a session token, and return a job number, and statistics_status could receive that job number and then reply with recent data. With a polling loop in JavaScript, you've got a basic solve. – zetlen Apr 16 '12 at 18:03
  • I'd use SlickGrid in case you need to work with tabular data – cristi _b Apr 16 '12 at 18:13
  • Also you might want to consider WebSockets: you connect to the server, it starts outputting the data, the output is immediately sent to the client via the socket, and JS renders it on the client side. However, it will only work in modern browsers and you have to get the server-side technology ready, ideally an additional module for Apache. – cyber-guard Apr 16 '12 at 18:17

2 Answers


Your two possible strategies are:

  1. Do the streaming on the backend, using a PHP script that cURLs the large external resource into a database or memcache and responds to periodic requests for new data by flushing that DB row or cache into the response.

  2. Do the streaming on the frontend, using a cross-browser JavaScript technique explained in this answer. In Gecko and WebKit, the XMLHttpRequest onreadystatechange event fires every time new data is received, making it possible to stream data slowly into the JavaScript runtime. In IE, you need an iframe workaround, also explained in the Ajax Patterns article linked in the above SO post.
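The first strategy boils down to a simple polling contract: the client sends the highest sequence number it has seen, and the server replies with only the newer rows plus a completion flag. A minimal sketch in PHP (the `rows_since` helper, the row shape, and the `done` flag are illustrative assumptions; in a real endpoint the rows would come out of MySQL or memcache, filled in incrementally by the background cURL worker):

```php
<?php
// Pure helper: given all rows gathered so far and the client's cursor,
// return only the slice the client has not seen yet.
function rows_since(array $rows, int $since): array
{
    return array_values(array_filter(
        $rows,
        fn ($r) => $r['seq'] > $since
    ));
}

// Stand-in for data the background worker has written so far.
$rows = [
    ['seq' => 1, 'payload' => 'stat-a'],
    ['seq' => 2, 'payload' => 'stat-b'],
    ['seq' => 3, 'payload' => 'stat-c'],
];

// The polling endpoint would read ?since=N from the request and reply:
echo json_encode(['rows' => rows_since($rows, 1), 'done' => false]);
```

The frontend loop then repeats the request with an updated `since` cursor until `done` is true, appending each batch to the UI as it arrives.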

zetlen
  • We're inclined toward the "classical" variant with two PHP scripts and long-polling requests from the front end (your first variant). Thanks a lot! – Maxim Apr 17 '12 at 08:03

One possible solution would be to make the cURL call using system() with the output redirected to a file. That way PHP would not hang until the call is finished. From the PHP manual for system():

If a program is started with this function, in order for it to continue running in the background, the output of the program must be redirected to a file or another output stream. Failing to do so will cause PHP to hang until the execution of the program ends.
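A minimal sketch of such a non-blocking call; the URL and output path are placeholders, and `escapeshellarg()` guards the interpolated values:

```php
<?php
// Launch the slow cURL transfer in the background so this request
// returns immediately. URL and path below are hypothetical.
$url     = 'https://stats.example.com/export';   // slow external source
$outFile = '/tmp/stats_job_42.json';             // file the data accumulates in

// Redirect stdout/stderr to the file and background the process with `&`;
// without the redirection, system() blocks until curl finishes.
$cmd = sprintf(
    'curl -s %s > %s 2>&1 &',
    escapeshellarg($url),
    escapeshellarg($outFile)
);
system($cmd);
// The frontend can now poll for the growing contents of $outFile
// (or for rows a worker writes into MySQL).
```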

This would split the data gathering from the user interface. You could then work with the gathered local data by several means, for example:

  • employ an iframe in the GUI that refreshes itself at intervals and fetches data from the locally stored file (and possibly stores it in the database),
  • use jQuery to make AJAX calls to get the data and manipulate it,
  • use some CGI script running in the background that also handles the database writes, and display the data straight from the DB using one of the above,
  • dozens more I can't think of now...
Michal