
I am trying to get the contents of a webpage, but the page loads in two stages.

There seems to be some kind of timer: at first it loads some content, and then after about 10 seconds it loads the rest, which contains the content I am trying to get.

Is there a way to fetch the page only after the second part has loaded?

Ty ;)

  • I am trying to get content from this URL: http://www.onlinegames.net/games/4725/armyswat.html At first it pauses, and then after some time it loads the whole page. Because of that I can't use file_get_html('http://www.onlinegames.net/games/4725/armyswat.html'); since it does not load the part of the content that I want to get :( Any ideas? – christine thompson Feb 27 '11 at 07:57

1 Answer


You need a headless browser engine to do this. cURL and wget are HTTP clients: they speak HTTP and download documents as text. They have no DOM and no JavaScript engine, so they cannot tell that a page loads more content via AJAX or a JavaScript timer. To get the final HTML, you need something that acts more like a browser, parsing the DOM and executing the JavaScript. I recommend http://simile.mit.edu/wiki/Crowbar, which uses a Mozilla engine.
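To see why a plain HTTP fetch is not enough, here is a minimal sketch in Python (not PHP, and using only the standard library). Everything in it is an assumption made for illustration: the page content, the `early` and `late` div ids, and the helper names `Handler`, `DivText` and `fetch_div_text` are all hypothetical. It serves a small page locally whose second part is filled in by a JavaScript timer, fetches it the way cURL or `file_get_html` would, and inspects what actually came back.

```python
import threading
import urllib.request
from html.parser import HTMLParser
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page: the "late" content only exists once a JS timer has run.
PAGE = b"""<html><body>
<div id="early">first part</div>
<div id="late"></div>
<script>
setTimeout(function () {
  document.getElementById('late').innerHTML = 'second part';
}, 10000);
</script>
</body></html>"""


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, *args):
        pass  # keep the demo quiet


class DivText(HTMLParser):
    """Collects the text found inside each <div id=...> of the raw HTML."""

    def __init__(self):
        super().__init__()
        self.current = None
        self.text = {}

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.current = dict(attrs).get("id")
            if self.current:
                self.text[self.current] = ""

    def handle_endtag(self, tag):
        if tag == "div":
            self.current = None

    def handle_data(self, data):
        if self.current:
            self.text[self.current] += data


def fetch_div_text():
    """Fetch the page over plain HTTP and return the text of each div."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_port
        # Bypass any configured proxy so the request stays on localhost.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        html = opener.open(url).read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()
    parser = DivText()
    parser.feed(html)
    return parser.text


if __name__ == "__main__":
    print(fetch_div_text())
```

The `late` div comes back empty no matter how long you wait before fetching, because the timer lives in the script text and nothing on the client side executes it. A headless browser closes that gap by running the script and letting you read the DOM afterwards.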

Shailendra Sharma