
This might be a funky question, but can anyone think of a way to take a chunk of HTML, scan it for <img> tags and, if a tag has no width and height values, add them using the result of getimagesize(), as in list($width, $height, $type, $attr) = getimagesize($file);?

In more detail: I have a PHP page that includes another page containing only HTML. I want that HTML to be modified before it is sent to the browser.

This is a simplified version of what I am looking at:

<!DOCTYPE HTML>
<html>
<head>
</head>
<body>
<div id="content">
<?php 
include_once("client-contributed-text-and-images.php");
?>
</div>
</body>
</html>

After some input from the answers below, I came up with the following:

<!DOCTYPE HTML>
<html>
<head>
</head>
<body>
<div id="content">
<?php
$dom = new DOMDocument();
$dom->loadHTMLFile("client-contributed-text-and-images.php");

foreach ($dom->getElementsByTagName('img') as $item) {

    $item->setAttribute('width', '100');
    echo $dom->saveHTML();
    exit;
}
?>
</div>
</body>
</html>

The problem is that it generates a complete HTML 4 document in the middle of the page, changes only the first img tag, and seemingly outputs nothing that follows the PHP block:

<!DOCTYPE HTML>
<html>
<head>
</head>
<body>
<div id="content">
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><img src="img1.jpg" width="100"><h1>header</h1>
<p>some text</p>
<a href="http://google.com">some link</a>
<img src="img2.jpg"></body></html>

So I switched gears and tried fopen() instead, and got it to work partly:

<!DOCTYPE HTML>
<html>
<head>
</head>
<body>
<div id="content">
<?php
$root = realpath($_SERVER['DOCUMENT_ROOT']);
$file = $root."/client-contributed-text-and-images.php";
$f = fopen($file, 'r');
$contents = fread($f, filesize($file));
fclose($f);

$new_contents = str_replace("<img ", "<img width='100' height='100' ", $contents); 
echo $new_contents;
?>
</div>
</body>
</html>

This gave:

<!DOCTYPE HTML>
<html>
<head>
</head>
<body>
<div id="content">
<img width='100' height='100' src="img1.jpg">
<h1>header</h1>
<p>some text</p>
<a href="http://google.com">some link</a>
<img width='100' height='100' src="img2.jpg"></div>
</body>
</html>

Now I just need some help figuring out how to use list($width, $height, $type, $attr) to insert the correct width and height (and, obviously, only when they are not already set).

Paul

2 Answers


Yes, this is entirely possible.

  1. Use a DOM parser to load your HTML and find your image tags.
  2. Use cURL to download the images (if you don't already have them locally).
  3. Use getimagesize() (listed among PHP's GD and image functions) to get the image sizes.
  4. Use that DOMDocument to modify the HTML
  5. Output modified HTML.

Note that all of this adds considerable processing time to every request. It probably isn't worth it. At the very least, cache the results.
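
As a rough illustration of step 3 combined with the caching advice, here is a minimal sketch. It assumes PHP 7.1 or later and images stored locally (the cURL download in step 2 is not covered). The function name `cachedImageSize` and the JSON cache file are hypothetical choices made for this example, not part of any library.

```php
<?php
// Hypothetical helper: returns [width, height] for a local image, or null.
// Results are stored in a JSON file keyed by path and modification time,
// so an unchanged image is only measured once.
function cachedImageSize(string $path, string $cacheFile): ?array
{
    if (!is_file($path)) {
        return null;
    }

    // Load the existing cache, falling back to an empty array
    $cache = is_file($cacheFile)
        ? (json_decode(file_get_contents($cacheFile), true) ?: [])
        : [];

    $key = $path . '|' . filemtime($path);
    if (isset($cache[$key])) {
        return $cache[$key];
    }

    // getimagesize() returns false when the file is not a readable image
    $info = @getimagesize($path);
    if ($info === false) {
        return null;
    }

    list($width, $height) = $info;
    $cache[$key] = [$width, $height];
    file_put_contents($cacheFile, json_encode($cache), LOCK_EX);

    return $cache[$key];
}
```

Keying on the modification time means a replaced image is measured again automatically, although stale entries are never pruned in this sketch.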

Brad
  • So something like this? `<?php $doc = new DOMDocument(); $doc->loadHTMLFile("filename.html"); /*change img tags here somehow...*/ echo $doc->saveHTML(); ?>` – Paul Mar 19 '13 at 19:34
  • It's the /*change img tags here somehow...*/ part I'm unsure about: how to get the new code to stick :) – Paul Mar 19 '13 at 19:58
  • @Paul, see this question for an example: http://stackoverflow.com/a/11387770/362536 – Brad Mar 19 '13 at 20:02

You can try:

$url = 'http://yahoo.com';
$dom = new DOMDocument();
// Suppress the warnings libxml raises on real-world, malformed HTML
@$dom->loadHTMLFile($url);

$imgs = $dom->getElementsByTagName("img");

foreach ($imgs as $img) {
    // Only keep large images; a missing height casts to 0 and is skipped too
    if ((int) $img->getAttribute("height") < 80) {
        continue;
    }

    // Collect every attribute of the tag as name => value
    $attrs = array();
    foreach ($img->attributes as $node) {
        $attrs[$node->nodeName] = $node->nodeValue;
    }
    print_r($attrs);
}

Output

Array
(
    [src] => http://l3.yimg.com/nn/fp/rsz/031913/images/smush/ucf-thwarted_635x250_1363714489.jpg
    [class] => fptoday-img
    [alt] => Quick thinking helped thwart UCF massacre plan (AP)
    [title] => Quick thinking helped thwart UCF massacre plan (AP)
    [width] => 635
    [height] => 250
)
Array
(
    [src] => http://l.yimg.com/os/mit/media/m/base/images/transparent-95031.png
    [style] => background-image:url('http://l2.yimg.com/ts/api/res/1.2/8hS1Q3v9rmaW8yI0eXEPHw--/YXBwaWQ9eWhvbWVydW47cT04NTtzbT0xO3c9MjUwO2g9MTU5/http://media.zenfs.com/en_us/News/Reuters/2013-03-19T120925Z_1_CBRE92I0XS100_RTROPTP_2_USA-SHOOTING-OHIO.JPG');
    [width] => 129
    [height] => 82
    [alt] => 
    [title] => 
    [class] => lzbg
)
Array
(
    [src] => http://l.yimg.com/os/mit/media/m/base/images/transparent-95031.png
    [style] => background-image:url('http://l3.yimg.com/ts/api/res/1.2/wcwLlp6sGVdOT7WXfkGEkQ--/YXBwaWQ9eWhvbWVydW47cT04NTtzbT0xO3c9MTg2O2g9MjUw/http://l.yimg.com/os/publish-images/lifestyles/2013-03-19/d9f10733-ee09-4e1f-a363-e3b9cd66078f_garygoldsmith.jpg');
    [width] => 82
    [height] => 110
    [alt] => 
    [title] => 
    [class] => lzbg
)


 .......... 
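
The loop above only reads attributes. Writing them back and emitting just the fragment, rather than a whole document, might look like the following sketch. The function name `addMissingDimensions` and the `$sizeOf` callback (which maps a src value to [width, height] or null, for example a wrapper around getimagesize()) are assumptions made for illustration. The input is assumed to be ASCII or entity-encoded; other encodings would need an explicit charset hint.

```php
<?php
// Hypothetical helper: adds width/height to <img> tags that lack them.
// $sizeOf is an assumed callback: src string => [width, height] or null.
function addMissingDimensions(string $html, callable $sizeOf): string
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a <div> so its nodes can be found again after parsing
    $dom->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('img') as $img) {
        // Respect any size the author already set
        if ($img->hasAttribute('width') || $img->hasAttribute('height')) {
            continue;
        }
        $size = $sizeOf($img->getAttribute('src'));
        if ($size !== null) {
            $img->setAttribute('width', (string) $size[0]);
            $img->setAttribute('height', (string) $size[1]);
        }
    }

    // Serialize only the wrapper's children, and only after the loop has
    // finished, so no DOCTYPE/<html>/<body> shell ends up in the output.
    $out = '';
    $wrapper = $dom->getElementsByTagName('div')->item(0);
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}
```

In the page from the question, this could replace the include with something like `echo addMissingDimensions(file_get_contents($file), $lookup);`, where `$lookup` is whatever size function is available.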
Baba
  • Do you have an example of how I can get it to output the edited img tags, fitting the model in my initial (modified) post? – Paul Mar 19 '13 at 19:50