1

What's wrong with my code? I want to get all the data from the myLocContainer div, but my array is empty.

<?php
$url = "http://weather.yahoo.com/";
$page_all = file_get_contents($url); 

preg_match_all('#<div id="myLocContainer">(.*)</div>#', $page_all, $div_array);

echo "<pre>";
print_r($div_array);
echo "</pre>";
?>

Thanks

Samuel Liew
user319854
  • 2
    It might be your code, it might be not. Please provide more debug otherwise you'll waste others time in wild goose chases. – zaf Jul 04 '10 at 16:33

6 Answers

0

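The content you want spans multiple lines, but your pattern does not let . match newlines. Add the s (dot-all) modifier; the lazy quantifier .*? also makes the match stop at the first closing </div>. Try this: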

preg_match_all('#<div id="myLocContainer">(.*?)</div>#sim', $page_all, $div_array);
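
To see the effect of the s modifier in isolation, here is a minimal sketch that runs against a hard-coded string instead of the live page. The sample markup in $html is invented for illustration; the real Yahoo page will look different.

```php
<?php
// Hypothetical sample markup; the container's content spans several lines.
$html = "<div id=\"myLocContainer\">\n  <span>Vilnius</span>\n</div>";

// Without the s modifier, "." cannot cross a newline, so nothing matches.
$without = preg_match_all('#<div id="myLocContainer">(.*?)</div>#', $html, $m1);

// With the s modifier (PCRE_DOTALL), "." also matches newlines.
$with = preg_match_all('#<div id="myLocContainer">(.*?)</div>#s', $html, $m2);

var_dump($without);          // int(0)
var_dump($with);             // int(1)
echo trim($m2[1][0]), "\n";  // <span>Vilnius</span>
```

An empty result like the one in the question is what the first call produces: the pattern is valid, but it can never get past the first line break.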

Please note that regular expressions are not well suited to parsing HTML, because of the hierarchical nature of HTML documents.
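
A small sketch of why nesting is a problem, assuming the target div contains another div. The markup is made up for the example. Neither the lazy nor the greedy form of the pattern returns the container's real content:

```php
<?php
// Hypothetical markup: the target div contains a nested div and is followed by another div.
$html = '<div id="myLocContainer"><div class="city">Vilnius</div><p>18 C</p></div>'
      . '<div id="footer">x</div>';

// Lazy match: stops at the first </div>, which closes the inner div.
preg_match('#<div id="myLocContainer">(.*?)</div>#s', $html, $lazy);

// Greedy match: runs on to the last </div> in the input, past the container.
preg_match('#<div id="myLocContainer">(.*)</div>#s', $html, $greedy);

echo $lazy[1], "\n";   // <div class="city">Vilnius
echo $greedy[1], "\n"; // <div class="city">Vilnius</div><p>18 C</p></div><div id="footer">x
```

The regex has no notion of which closing tag belongs to which opening tag, which is the job of an HTML parser.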

Emre Yazici
0

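Try adding the "m" and "s" modifiers, since there may be newlines inside the div you need. The "s" modifier is the one that lets . match newlines; "m" only changes how ^ and $ behave. Like this: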

preg_match_all('#<div id="myLocContainer">(.*)</div>#ms', $page_all, $div_array);
Sergey Eremin
0

Before messing around with regex, try HTML scraping. The question "HTML Scraping in PHP" might give you some ideas on how to do it in a more elegant and (possibly) faster way.

Community
DrColossos
  • There is a recent implementation of such a library (allowing to access elements via CSS etc) built on PHP 5.3, using some components of the upcoming Symfony 2. Note: It's still kind of unstable. http://www.phparch.com/2010/04/22/four-new-php-5-3-components-and-goutte-a-simple-web-scraper/ – igorw Jul 05 '10 at 10:32
0
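$doc = new DOMDocument;
// loadHTMLFile() parses HTML; load() expects well-formed XML. The @ silences warnings about sloppy markup.
@$doc->loadHTMLFile('http://weather.yahoo.com/');
$div = $doc->getElementById('myLocContainer');
// getElementById() returns null if the element is not in the fetched page.
echo $div ? $div->textContent : 'myLocContainer not found';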
Ben Shelock
0

You may need to escape special characters in your regular expression, like this:

~\<div id\=\"myLocContainer\"\>(.*)\<\/div\>~

Also check whether there is a newline problem, as mentioned by @eyazici and @kgb.

Neel Basu
-2

Test your response before running the regex search. Then you'll know which part isn't working.
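
One way to do that, as a rough sketch: check the fetched body first, then the match count. The diagnose() helper and its messages are hypothetical names made up for this example, and the sample strings stand in for real responses.

```php
<?php
// Hypothetical helper: reports which stage failed, the fetch or the regex.
function diagnose($page, string $pattern): string
{
    // file_get_contents() returns false on failure; an empty body is just as useless.
    if ($page === false || $page === '') {
        return 'fetch failed or returned an empty body';
    }
    $count = preg_match_all($pattern, $page, $matches);
    if ($count === false) {
        return 'regex error';
    }
    return $count === 0
        ? 'page fetched, but the pattern did not match'
        : "matched $count time(s)";
}

$pattern = '#<div id="myLocContainer">(.*?)</div>#s';
echo diagnose(false, $pattern), "\n";
echo diagnose('<html><body>no such div</body></html>', $pattern), "\n";
echo diagnose("<div id=\"myLocContainer\">\nVilnius\n</div>", $pattern), "\n";
```

Against the live page you would call it as diagnose(@file_get_contents($url), $pattern). If the page is fetched but the pattern does not match, dump part of $page and compare it with what the pattern expects.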

Peter Anselmo