
Tricky preg_replace_callback question here. I am admittedly not great at PCRE expressions.

I am trying to extract all img src values from a string of HTML, save them to an array, and additionally replace each remote src path with a local path. For example, I might have, surrounded by a lot of other HTML:

img src='http://www.mysite.com/folder/subfolder/images/myimage.png'

And I would want to extract myimage.png to an array, and additionally change the src to:

src='images/myimage.png'

Can that be done?

Thanks

hotstuff
  • possible duplicate of [Regex to change format of all img src attributes](http://stackoverflow.com/questions/3131691/regex-to-change-format-of-all-img-src-attributes) – Gordon Mar 29 '11 at 15:34

2 Answers


Does it need to use regular expressions? Handling HTML is normally easier with DOM functions:

<?php

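$domd = new DOMDocument();
// Collect libxml parse warnings internally instead of emitting them; real-world HTML is rarely valid.
libxml_use_internal_errors(true);
$domd->loadHTML(file_get_contents("http://stackoverflow.com"));
libxml_clear_errors(); // discard the buffered warnings
libxml_use_internal_errors(false);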

$items = $domd->getElementsByTagName("img");
$data = array();

foreach($items as $item) {
  $data[] = array(
    "src" => $item->getAttribute("src"),
    "alt" => $item->getAttribute("alt"),
    "title" => $item->getAttribute("title"),
  );
}

print_r($data);
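
The snippet above only reads attributes. To also cover the rewriting part of the question, a minimal sketch along the same DOM lines might look like this. The sample markup and the `images/` prefix are taken from the question; the variable names (`$html`, `$files`, `$rewritten`) are made up for illustration.

```php
<?php
// Sketch: collect image file names and point each src at a local folder.
// Assumption: the local folder is "images/", as in the question.
$html = "<p>blah <img src='http://www.mysite.com/folder/subfolder/images/myimage.png'> blah</p>";

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parse warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors(false);

$files = array();
foreach ($doc->getElementsByTagName("img") as $img) {
    // Take only the path part of the URL, then its last segment.
    $path = parse_url($img->getAttribute("src"), PHP_URL_PATH);
    if (!is_string($path) || $path === "") {
        continue; // no usable src on this tag
    }
    $file = basename($path);
    $files[] = $file;
    $img->setAttribute("src", "images/" . $file);
}

$rewritten = $doc->saveHTML();

print_r($files);
echo $rewritten;
```

Note that `saveHTML()` serializes the whole document, so a fragment comes back wrapped in a doctype plus `html` and `body` tags, and attribute values are written with double quotes.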
Álvaro González

Do you need regex for this? No. Is regex the most readable solution? Probably not, at least unless you are fluent in regex. Is regex more efficient when scanning large amounts of data? Usually, since PHP compiles each pattern on first use and caches it. Does regex win the "least lines of code" trophy?

$string = <<<EOS
<html>
<body>
blahblah<br>
<img src='http://www.mysite.com/folder/subfolder/images/myimage.png'>blah<br>
blah<img src='http://www.mysite.com/folder/subfolder/images/another.png' />blah<br>
</body>
</html>
EOS;

preg_match_all("%<img [^>]*?src=['\"](.*?)['\"]%s", $string, $matches);
$images = array_map(function ($element) { return preg_replace("%^.*/(.*)$%", 'images/$1', $element); }, $matches[1]);

print_r($images);

Two lines of code, that's hard to undercut in PHP. It results in the following $images array:

Array
(
  [0] => images/myimage.png
  [1] => images/another.png
)

Please note that this won't work with PHP versions prior to 5.3 unless you replace the anonymous function with a proper one.
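
The two lines above build the array but leave the HTML itself untouched. Since the question asks about preg_replace_callback, here is a rough sketch of doing both in one pass. The pattern, the `$files` and `$rewritten` names, and the second sample tag with an `alt` attribute are assumptions for illustration, and like any regex approach it will miss unquoted attributes and other unusual markup.

```php
<?php
$html = <<<EOS
<img src='http://www.mysite.com/folder/subfolder/images/myimage.png'>blah<br>
blah<img alt="x" src="http://www.mysite.com/folder/subfolder/images/another.png" />
EOS;

$files = array();

// Group 1: everything from "<img" up to and including "src=".
// Group 2: the opening quote, reused as \2 so the closing quote must match.
// Group 3: the URL itself.
$pattern = '%(<img\b[^>]*?\ssrc\s*=\s*)([\'"])(.*?)\2%is';

$rewritten = preg_replace_callback($pattern, function ($m) use (&$files) {
    // Prefer the URL's path component; fall back to the raw value.
    $path = parse_url($m[3], PHP_URL_PATH);
    $file = basename(is_string($path) ? $path : $m[3]);
    $files[] = $file;                       // remember the file name
    return $m[1] . $m[2] . 'images/' . $file . $m[2];
}, $html);

print_r($files);
echo $rewritten;
```

The closure captures `$files` by reference, so the same PHP 5.3 requirement applies.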

svoop