Why is PHP Simple HTML DOM Parser unable to capture the contents of a tag from certain URLs?

Question

I'm using PHP Simple HTML DOM Parser to get the contents of the first <h1> tag on different webpages. The script works most of the time, but on some webpages it just 'hangs up': it stops without running the code that comes after the snippet below. I looked at the source of the pages that don't work, but there is nothing noticeably different about the <h1> or its contents. Is there a way to get this to work for all URLs, and if not, how can I fix my script so it won't hang on the URLs that don't work?

include_once( 'simple_html_dom.php');
$html = file_get_html($webpage);
$element = $html->find('h1', 0);
$element = strip_tags($element);

For example, this URL will not work with the code above: "http://www.foxnews.com/sports/2012/07/28/australia-beach-volleyball-uniforms-are-interesting/?intcmp=obnetwork". It has an `<h1>` tag that appears to be 'normal'. Why won't it work with my code from above? — Ajay Mohite, Aug 02 '12 at 07:02

score 0 · Accepted Answer · answered Aug 03 '12 at 00:45

I never got a response, but it appears the answer is to use cURL to fetch the contents of the URL, then pass the returned HTML to PHP Simple HTML DOM Parser to get the tag info.
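
A minimal sketch of that approach, under a few assumptions: the function names `fetch_html` and `first_h1_text` are made up for illustration, and the timeout and User-Agent values are arbitrary choices, not something the thread specifies. It uses `str_get_html()` from the same library to parse a string instead of a URL. Every step is checked before the next one runs, because calling `->find()` on a failed (non-object) result is a fatal error that stops the script. That would match the 'hang' described in the question, though the thread does not confirm it as the cause.

```php
<?php
// Assumes simple_html_dom.php sits next to this script, as in the question.
include_once 'simple_html_dom.php';

/**
 * Fetch a page with cURL. Returns the body as a string, or null on failure.
 * Hypothetical helper; timeouts and User-Agent are illustrative values.
 */
function fetch_html(string $url): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_CONNECTTIMEOUT => 10,   // seconds allowed for connecting
        CURLOPT_TIMEOUT        => 20,   // seconds allowed for the whole request
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)',
    ]);
    $body = curl_exec($ch);

    // curl_exec() returns false on error; treat an empty body as a failure too.
    return ($body === false || $body === '') ? null : $body;
}

/**
 * Return the text of the first <h1>, or null if the page could not be
 * fetched or parsed, or contains no <h1>. Hypothetical helper.
 */
function first_h1_text(string $url): ?string
{
    $body = fetch_html($url);
    if ($body === null) {
        return null;
    }

    $html = str_get_html($body); // parse a string rather than a URL
    if (!$html) {
        return null;             // parsing failed; do not call find() on it
    }

    $element = $html->find('h1', 0); // null when the page has no <h1>
    return $element ? trim($element->plaintext) : null;
}
```

With these helpers, the original four lines reduce to `$title = first_h1_text($webpage);`, and the code that follows can test `$title === null` and skip that URL instead of stopping.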

Ajay Mohite