How do I extract every TITLE and LINK value from an HTML file?

Sample content of the file:

<tr class="odd">
 <td align="left" valign="top" class="text_cont_normal"> TITLE </td>
 <td align="left" valign="top" class="normal_text_link">
  <img border="0" onclick="javascript:window.location.href='LINK'" style="cursor: pointer;" alt="Download"  src="btn.jpg"/></td>
</tr>
<tr class="even">
 <td align="left" valign="top" class="text_cont_normal"> TITLE2 </td>
 <td align="left" valign="top" class="normal_text_link">
  <img border="0" onclick="javascript:window.location.href='LINK2'" style="cursor: pointer;" alt="Download"  src="btn.jpg"/></td>
</tr>

I tried:

$tags = $doc->getElementsByTagName('img');
foreach ($tags as $tag) {
 if ($tag->hasAttribute('onclick'))
    echo $tag->getAttribute('onclick').'<br>';
}

But this does not give me the data I actually want.

hakre
Sourav

2 Answers

For example, with DOMXPath:

$doc = new DOMDocument();
$doc->loadHTMLFile($filename);
$xpath = new DOMXPath($doc);

// Titles: text of every <td class="text_cont_normal">
$nodes = $xpath->query('//td[@class="text_cont_normal"]');
foreach ($nodes as $node)
{
    echo $node->nodeValue.'<br>';   // title
}

// Links: onclick attribute of every download image
$nodes = $xpath->query('//td[@class="normal_text_link"]/img[@alt="Download"]');
foreach ($nodes as $node)
{
    if ($node->hasAttribute('onclick'))
        echo $node->getAttribute('onclick').'<br>';  // click
}

If you need only the LINK value itself, replace the body of the second loop with:

  if ($node->hasAttribute('onclick'))
  {
      echo $node->getAttribute('onclick').'<br>';  //click
      preg_match('/location\.href=(\'|")(.*?)\\1/i', 
                 $node->getAttribute('onclick'), $matches);
      if (isset($matches[2])) echo $matches[2].'<br>'; // the value
  }

Or do you need each title grouped with its link?
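
If the titles and links are needed as pairs, one option is to loop over the rows and run relative XPath queries inside each one. This is only a minimal sketch, assuming the real file follows the sample markup from the question. The helper name `extractTitleLinkPairs` and the inline `$html` sample (wrapped in a `<table>` here) are illustrative and not part of the original answer.

```php
<?php
// Hypothetical helper: returns one ['title' => ..., 'link' => ...] entry per matching row.
function extractTitleLinkPairs(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $pairs = [];
    foreach ($xpath->query('//tr') as $row) {
        // A leading "." makes the query relative to the current row.
        $title   = trim($xpath->evaluate('string(./td[@class="text_cont_normal"])', $row));
        $onclick = $xpath->evaluate('string(./td[@class="normal_text_link"]/img/@onclick)', $row);
        // Capture whatever sits between the quotes after location.href=
        if (preg_match('/location\.href=([\'"])(.*?)\1/i', $onclick, $m)) {
            $pairs[] = ['title' => $title, 'link' => $m[2]];
        }
    }
    return $pairs;
}

// Usage with an inline sample (assumed markup, mirrors the question).
$html = <<<'HTML'
<table>
<tr class="odd">
 <td class="text_cont_normal"> TITLE </td>
 <td class="normal_text_link">
  <img onclick="javascript:window.location.href='LINK'" alt="Download" src="btn.jpg"/></td>
</tr>
</table>
HTML;

foreach (extractTitleLinkPairs($html) as $pair) {
    echo $pair['title'], ' => ', $pair['link'], "\n";
}
```

Rows without a matching `onclick` are skipped, so a title is never paired with the wrong link.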

Cheery

One possible way:

$nodes = $doc->getElementsByTagName('tr');
$max = $nodes->length;
for ($i = 0; $i < $max; $i++)
{
    echo $nodes->item($i)->firstChild->nodeValue . '<br>';  // TITLE
    $onclick = $nodes->item($i)->childNodes->item(2)->childNodes->item(1)->getAttribute('onclick');
    $parts = explode("'", $onclick);
    echo $parts[1] . '<br>';  // LINK
}
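
The fixed child indexes above depend on where whitespace text nodes fall between the cells, so they can shift if the file's formatting changes. A minimal sketch of the same loop that looks up cells and images by tag name instead is shown here. It assumes each row keeps its title in the first `td` and has one `img` carrying the `onclick` handler. The helper name `rowsToTitleLinks` and the inline markup are hypothetical.

```php
<?php
// Hypothetical helper: same idea as the loop above, without positional child indexes.
function rowsToTitleLinks(DOMDocument $doc): array
{
    $result = [];
    foreach ($doc->getElementsByTagName('tr') as $row) {
        $cells = $row->getElementsByTagName('td');
        $imgs  = $row->getElementsByTagName('img');
        if ($cells->length === 0 || $imgs->length === 0) {
            continue; // skip rows that do not match the expected layout
        }
        // The link is the text between the first pair of single quotes in onclick.
        $parts = explode("'", $imgs->item(0)->getAttribute('onclick'));
        if (isset($parts[1])) {
            $result[] = [trim($cells->item(0)->nodeValue), $parts[1]];
        }
    }
    return $result;
}

// Usage with an inline stand-in for the file's content.
$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML('<table><tr><td> TITLE </td><td><img onclick="javascript:window.location.href=\'LINK\'" src="btn.jpg"/></td></tr></table>');
libxml_clear_errors();

foreach (rowsToTitleLinks($doc) as [$title, $link]) {
    echo $title, ' => ', $link, "\n";
}
```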
J. Bruni