1

I am new to regular expressions. I have tried many times to get the src value of an image tag nested inside an anchor tag. This is my HTML:

<div class="smallSku" id="ctl00_ContentPlaceHolder1_smallImages">
                                <a title="" name="http://www.playg.in/productImages/med/PNC000051_PNC000051.jpg" href="http://www.playg.in/productImages/lrg/PNC000051_PNC000051.jpg" onclick="return showPic(this)" onmouseover="return showPic(this)">
    <img border="0" alt="" src="http://www.playg.in/productImages/thmb/PNC000051_PNC000051.jpg"></a>    <a title="PNC000051_PNC000051_1.jpg" name="http://www.playg.in/productImages/med/PNC000051_PNC000051_1.jpg" href="http://www.playg.in/productImages/lrg/PNC000051_PNC000051_1.jpg" onclick="return showPic(this)" onmouseover="return showPic(this)">
    <img border="0" alt="PNC000051_PNC000051_1.jpg" src="http://www.playg.in/productImages/thmb/PNC000051_PNC000051_1.jpg"></a>
                        </div>

I want to return only the src value of each image tag. I tried a matching pattern with preg_match_all(), and the pattern was:

"@<div[\s\S]class="smallSku"[\s\S]id="ctl00_ContentPlaceHolder1_smallImages"\><a title=\"\" name="[\w\W]" href="[\w\W]" onclick=\"[\w\W]" onmouseover="[\w\W]"\><img[\s\S]src="(.*)"[\s\S]></a><\/div>@"

Please help. I have tried this many times, and I also tried the approach from this question: Match image tag not nested in an anchor tag using regular expression

Sunith Saga
  • 1
    A proper HTML parser might serve you better than a regex. – mu is too short Apr 23 '13 at 05:10
  • Yes, but I need a regex for this rather than a parser. – Sunith Saga Apr 23 '13 at 05:27
  • 1
    @SunithSaga: Why do you *need* regex instead of a DOM parser? A DOM parser will do a better job than regex 100% of the time. – Madara's Ghost Apr 23 '13 at 05:30
  • 1
    **Don't use regular expressions to parse HTML**. You cannot reliably parse HTML with regular expressions, and you will face sorrow and frustration down the road. As soon as the HTML changes from your expectations, your code will be broken. See http://htmlparsing.com/php for examples of how to properly parse HTML with PHP modules that have already been written, tested and debugged. – Andy Lester Apr 23 '13 at 05:39

2 Answers

5

Regular expressions are not the right tool for parsing HTML. See this FAQ: How to parse and process HTML/XML?

Here is an example of how to get the src attribute using your example:

$doc = new DOMDocument();
$doc->loadHTML($your_html_string);
$xpath = new DOMXPath($doc);

foreach ($xpath->query('//div[@class="smallSku"]/a/img/@src') as $attr) {
    $src = $attr->value;
    print $src;
}
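
For a version that can be run as-is, here is a minimal sketch of the same DOMDocument/DOMXPath approach. The helper name `extractThumbSrcs` is hypothetical, and the sample markup is the question's HTML shortened to the relevant attributes. It assumes the div can be located by its id, which keeps working if more classes are later added to the element, and it uses `libxml_use_internal_errors()` so that imperfect real-world HTML does not emit PHP warnings.

```php
<?php
// Sample markup based on the question (anchor attributes shortened).
$html = <<<'HTML'
<div class="smallSku" id="ctl00_ContentPlaceHolder1_smallImages">
    <a href="http://www.playg.in/productImages/lrg/PNC000051_PNC000051.jpg">
    <img border="0" alt="" src="http://www.playg.in/productImages/thmb/PNC000051_PNC000051.jpg"></a>
    <a href="http://www.playg.in/productImages/lrg/PNC000051_PNC000051_1.jpg">
    <img border="0" alt="PNC000051_PNC000051_1.jpg" src="http://www.playg.in/productImages/thmb/PNC000051_PNC000051_1.jpg"></a>
</div>
HTML;

// Hypothetical helper, not part of the original answer.
function extractThumbSrcs(string $html): array
{
    // Collect libxml parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected errors and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $srcs = [];
    // Select the src attribute of every <img> directly inside an <a> in the target div.
    foreach ($xpath->query('//div[@id="ctl00_ContentPlaceHolder1_smallImages"]/a/img/@src') as $attr) {
        $srcs[] = $attr->value;
    }
    return $srcs;
}

$srcs = extractThumbSrcs($html);
print_r($srcs);
```

Matching on the id is only one option; the `@class="smallSku"` test above works just as well as long as the class attribute contains exactly that value.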
Randle392
1

Try this, Sunith:

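    $content = file_get_contents('your url');
    // Isolate the smallSku block; the "s" modifier lets "." match newlines.
    preg_match_all('|<div class="smallSku"[^>]*>.*?</div>|s', $content, $arr, PREG_PATTERN_ORDER);
    // Collect every double-quoted src value in that block (empty string if no block matched).
    preg_match_all('/src="([^"]+)"/', $arr[0][0] ?? '', $arrr, PREG_PATTERN_ORDER);
    echo '<pre>';
    print_r($arrr); // $arrr[1] holds the captured src values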
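
If a regex really is required, the two-step idea can be sketched as a self-contained function. The name `extractSrcsWithRegex` and the shortened sample markup are illustrative assumptions, and the patterns assume double-quoted attributes and no nested div inside the block, so this stays fragile compared with a DOM parser.

```php
<?php
// Hypothetical helper; the name and the sample markup below are illustrative.
function extractSrcsWithRegex(string $content): array
{
    // Step 1: isolate the smallSku block. The "s" modifier lets "." match
    // newlines, and the lazy ".*?" stops at the first closing </div>.
    if (!preg_match('~<div\s+class="smallSku"[^>]*>(.*?)</div>~s', $content, $block)) {
        return [];
    }

    // Step 2: collect every double-quoted src value on <img> tags in that block.
    preg_match_all('~<img\b[^>]*\bsrc="([^"]+)"~', $block[1], $matches);
    return $matches[1];
}

$sample = '<div class="smallSku" id="x">' . "\n"
    . '<a href="lrg/a.jpg"><img border="0" alt="" src="thmb/a.jpg"></a>' . "\n"
    . '<a href="lrg/b.jpg"><img border="0" alt="b" src="thmb/b.jpg"></a>' . "\n"
    . '</div>';

print_r(extractSrcsWithRegex($sample));
```

Returning an empty array when the block is missing avoids the undefined-index notice that indexing the match array directly would raise.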
Kapil gopinath