I have a PHP scraper that fetches a list of URLs and echoes the contents of a given div. I want to modify it to search each page's HTML for occurrences of a string, and then echo the entire word that each occurrence appears in.

My current scraper is this:

<?php
$urls = array(
    "http://www.sample1.html",
    "http://www.sample2.html",
    "http://www.sample3.html",
);
foreach ($urls as $url) {
    $content = file_get_contents($url);
    $first_step = explode('<div class="div1">', $content);
    $second_step = explode("</div>", $first_step[1]);
    echo $second_step[0] . "<br>";
}
?>

I want it to look more like this, except working:

$first_step = explode( 'eac' , $content );

With the results being:

  1. teacher
  2. preacher
  3. each, etc.
Tim

1 Answer


You can use the following regex with preg_match instead of explode:

(\w*eac\w*)

Code:

preg_match('(\w*eac\w*)', $content , $first_step , PREG_OFFSET_CAPTURE);
echo $first_step[1];
karthik manchala
  • `$first_step = preg_split( '(\w*eac\w*)' , $content );` excludes the matched words themselves, @Tim... – Kirs Sudh Jun 13 '15 at 20:21
  • I tried it, and it's echoing out huge chunks of data rather than just the word, and the data doesn't have the word in it. This is what I'm trying: `$first_step = preg_split( '(\w*eac\w*)' , $content ); echo $first_step[1]."<br>";` – Tim Jun 13 '15 at 20:29
  • Using your advice I came up with this, which works. $first_step = preg_match_all( '(\w*eac\w*)' , $content, $answer ); print_r($answer); Yours didn't work, not sure why. – Tim Jun 13 '15 at 20:55
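
A likely explanation for the difference, as a minimal sketch rather than a definitive diagnosis: in '(\w*eac\w*)' PHP reads the outer parentheses as the pattern delimiters, so there is no capture group 1 to echo, and PREG_OFFSET_CAPTURE turns each entry into a [text, offset] pair. The sample $content string below is made up so the snippet runs without fetching anything; in the scraper it would come from file_get_contents($url).

```php
<?php
// Hypothetical sample standing in for the HTML returned by file_get_contents($url).
$content = '<div class="div1">The teacher and the preacher greet each student.</div>';

// The outer parentheses here are delimiters, not a group, so the whole match
// lands at index 0 and index 1 is never set. With PREG_OFFSET_CAPTURE each
// entry is an array of [matched text, byte offset].
preg_match('(\w*eac\w*)', $content, $first_step, PREG_OFFSET_CAPTURE);
$first_word = $first_step[0][0];
$has_group_one = isset($first_step[1]);

// preg_match_all collects every match; $answer[0] holds the full matches.
preg_match_all('/\w*eac\w*/', $content, $answer);
$words = $answer[0];

// Print the words as a numbered list, one per line.
foreach ($words as $i => $word) {
    echo ($i + 1) . ". " . $word . "<br>";
}
```

Because the pattern runs over raw HTML, it can also match inside tag or attribute names; passing the page through strip_tags() first is one way to restrict it to visible text.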