
I have the following mixed content in a PHP string.

<div class="biz-website">
    <span class="offscreen">Business website</span>
    <a target="_blank" href="/biz_redir?url=http%3A%2F%2Fwww.example.com&amp;src_bizid=LihgJyPNjlUB3euiFvfEgw&amp;cachebuster=1419609400&amp;s=112daf4cc534d37cbf02a548cb8cb1d15bbeba6fab83b74b1195640dc44c040e">example.com</a>
</div>

I need to extract the text example.com from the PHP string above. Any ideas on what I am doing wrong?

Lalit Sharma

1 Answer

Regex is not the right tool here. The correct tool is a DOM parser. I like PHP's DOMDocument.

$html = <<<END
<div class="biz-website">
    <span class="offscreen">Business website</span>
    <a target="_blank" href="/biz_redir?url=http%3A%2F%2Fwww.example.com&amp;src_bizid=LihgJyPNjlUB3euiFvfEgw&amp;cachebuster=1419609400&amp;s=112daf4cc534d37cbf02a548cb8cb1d15bbeba6fab83b74b1195640dc44c040e">example.com</a>
</div>
END;

$DOM = new DOMDocument;
$DOM->loadHTML($html);

$aTags = $DOM->getElementsByTagName('a');

$value = $aTags->item(0)->nodeValue;
echo $value;

UPDATE: If you want to see whether the href contains "biz_redir", you can check that directly:

$html = <<<END
<div class="biz-website">
    <span class="offscreen">Business website</span>
    <a target="_blank" href="/biz_redir?url=http%3A%2F%2Fwww.example.com&amp;src_bizid=LihgJyPNjlUB3euiFvfEgw&amp;cachebuster=1419609400&amp;s=112daf4cc534d37cbf02a548cb8cb1d15bbeba6fab83b74b1195640dc44c040e">example.com</a>
</div>
END;

$DOM = new DOMDocument;
$DOM->loadHTML($html);

$aTags = $DOM->getElementsByTagName('a');
$aTag = $aTags->item(0);

if(strpos($aTag->getAttribute('href'), 'biz_redir') !== FALSE){
    $value = $aTag->nodeValue;
    echo $value;
}
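
If the markup contains more than one link, the same check can be applied to every <a> instead of only the first. This is a minimal sketch, assuming made-up sample markup (the /about link and the shortened src_bizid value are not from the question):

```php
<?php
// Hypothetical sample markup with more than one link.
$html = <<<END
<div class="biz-website">
    <a href="/about">About</a>
    <a target="_blank" href="/biz_redir?url=http%3A%2F%2Fwww.example.com&amp;src_bizid=abc">example.com</a>
</div>
END;

$DOM = new DOMDocument;
$DOM->loadHTML($html);

// Collect the text of every <a> whose href contains "biz_redir".
$matches = [];
foreach ($DOM->getElementsByTagName('a') as $aTag) {
    if (strpos($aTag->getAttribute('href'), 'biz_redir') !== false) {
        $matches[] = trim($aTag->nodeValue);
    }
}

print_r($matches);
```

Iterating avoids relying on the redirect link being the first <a> in the document.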

UPDATE 2: If you have the entire web page rather than just that snippet, you can use XPath to find the link inside the <div> you want:

$DOM = new DOMDocument;
$DOM->loadHTML($html);
$xPath = new DOMXPath($DOM);

$biz = $xPath->query('//div[@class="biz-website"]/a[contains(@href, "biz_redir")]');

$value = $biz->item(0)->nodeValue;
echo $value;
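
On a real page two things can go wrong with the snippet above: loadHTML() may emit warnings for markup that libxml does not accept, and item(0) returns null when the query matches nothing. A rough sketch of a more defensive version follows; the function name extractBizWebsite and the sample $page string are assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns the link text, or null when nothing matches.
function extractBizWebsite(string $html): ?string
{
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $DOM = new DOMDocument;
    $DOM->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xPath = new DOMXPath($DOM);
    $biz = $xPath->query('//div[@class="biz-website"]/a[contains(@href, "biz_redir")]');

    // item(0) is null when the query matched no nodes.
    $node = $biz->item(0);
    return $node === null ? null : trim($node->nodeValue);
}

// Stand-in for a full page; HTML5 tags such as <nav> may trigger libxml warnings.
$page = '<html><body><nav>Menu</nav><div class="biz-website">'
      . '<a href="/biz_redir?url=http%3A%2F%2Fwww.example.com">example.com</a>'
      . '</div></body></html>';

var_dump(extractBizWebsite($page));
var_dump(extractBizWebsite('<p>No link here</p>'));
```

Returning null lets the caller decide what to do when the page has no matching link, rather than failing on a property access of a missing node.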
gen_Eric