
I am trying to extract image URLs using preg_match_all but can't get it right. Here is my code. My problem is that some images have an .img extension and some end in .jpg?w=655&h=357. I don't need the .img ones, but I do need all other valid images; their URLs may end with a query string such as ?w=655&h=357 or with just .jpg or .png.

$post ='
<img width="1" height="1" src="http://pi.feedsportal.com/r/180265248066/u/49/f/648326/c/35070/s/34410e29/a2t.img" border="0"/></br>
    <img width="1" height="1" src="http://9to5mac.files.wordpress.com/2013/11/screen-shot-2013-11-29-at-5-17-15-pm.png?w=655&#038;h=357" border="0"/></br>
    <img src="http://images.macrumors.com/article-new/2013/11/mlb.png" alt="MLB" title="mlb.png" width="175" height="175" class="alignright"/></br>
 ';

function catch_that_image($post) {
  global $post, $posts;
  $first_img = '';
  ob_start();
  ob_end_clean();
  $output = preg_match_all("<img.+?src=[\"']([^\"]*\.(gif|jpg|jpeg|png).*)[\"'].+?>", $post, $matches);
  $first_img = $matches [1] [0];

  return $first_img;

}
echo catch_that_image($post);

Output is

http://images.macrumors.com/article-new/2013/11/mlb.png" alt="MLB" title="mlb.png" width="175" height="175" class="alignright

I just need the URL up to .png (plus the query string, if there is one).

Thanks

Harinder

1 Answer


Don't use regex for parsing HTML. Use a DOM Parser instead:

$dom = new DOMDocument;
$dom->loadHTML($html);

foreach ($dom->getElementsByTagName('img') as $image) {
    $src =  $image->getAttribute('src');
    $extension = pathinfo($src, PATHINFO_EXTENSION);
    if ($extension !== 'img') {
        echo $src . PHP_EOL;
    }
}

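If you would rather accept only a fixed set of extensions and ignore anything after a "?", one possible variation is to strip the query string with parse_url() before checking the extension. The following is only a sketch: the function name extract_image_urls, its $allowed parameter, and the example.com URLs are made up for illustration and are not part of the code above.

```php
<?php
// Hypothetical helper (not from the original answer): returns the src of every
// <img> whose URL path ends in an allowed extension, ignoring any query string.
function extract_image_urls(string $html, array $allowed = ['gif', 'jpg', 'jpeg', 'png']): array
{
    // Collect libxml warnings internally instead of printing them;
    // markup such as </br> is not valid HTML and would otherwise warn.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();

    $urls = [];
    foreach ($dom->getElementsByTagName('img') as $image) {
        $src = $image->getAttribute('src');

        // parse_url() separates the path from a trailing "?w=655&h=357".
        $path = parse_url($src, PHP_URL_PATH);
        if (!is_string($path)) {
            continue; // no usable path in this src
        }

        // Compare the extension case-insensitively against the whitelist.
        $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        if (in_array($extension, $allowed, true)) {
            $urls[] = $src;
        }
    }

    return $urls;
}

// Usage with made-up sample markup (assumed input, not the asker's data):
$sample = '<img src="http://example.com/tracker.img"/>'
        . '<img src="http://example.com/shot.png?w=655&#038;h=357"/>'
        . '<img src="http://example.com/logo.png"/>';

foreach (extract_image_urls($sample) as $url) {
    echo $url . PHP_EOL;
}
```

With the sample above, this should skip the .img URL and keep both .png URLs, including the one that carries a query string. The first element of the returned array would be the equivalent of $first_img in the question.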

Amal Murali
Thanks, but as I mentioned, there can be anything after .gif|jpg|jpeg|png, for example .gif?w=**, and I don't want .img. I think in this case it will take .img as well. – Harinder Nov 30 '13 at 13:48