
I have written a PHP script using PHP DOM that scrapes multiple HTML files, looking for all p tags that have a specific class.

I then want to get the values inside those p tags and build an unordered list with PHP DOM.

My problem is that, while I can get the values and echo all of them onto a page, when I use createElement to append each value in its own li tag, the result only contains the LAST item in the list. I hope that makes sense. Here is the code:

    $dom = new DOMDocument();
    $dom->formatOutput = true;
    $dom->preserveWhiteSpace = false;

    // looping through an array
    foreach ($pages as $page) {
        foreach ($page['pageContent'] as $listlinks) {
            $dom->loadHTMLFile($theurl . 'content_id_' . $listlinks['content'] . '.html');

            // create the xPath object after loading the html source, otherwise the query won't work:
            $xPath = new DOMXPath($dom);

            // get the p nodes in a DOMNodeList that have class "content_header_type_2":
            $nodeList = $xPath->query("//p[@class='content_header_type_2']");

            // create a new DOMDocument and add a ul element:
            $newDom = new DOMDocument();
            $ul = $newDom->createElement('ul');
            $newDom->appendChild($ul);

            // append all nodes from $nodeList to the new dom, as children of $ul:
            foreach ($nodeList as $domElement) {
                $domNode = $newDom->importNode($domElement, true);
                echo $domNode->nodeValue . '<br>'; // This gives the entire list

                $li = $newDom->createElement('li', $domNode->nodeValue); // This gives the last value in the list
                $ul->appendChild($li);
            }
        }
    }

    $output = $newDom->saveHTML();
    echo $output;
  • I think you don't have to use importNode here at all. All you need is the nodeValue of the extracted P tags. It should work inside your foreach loop without importing the node into $newDom – Dmitri Snytkine Dec 13 '11 at 19:58
  • Hi Dmitri, yes, I have tried it both with and without importNode, and it still only returns the last item in the nodeList. – Gerry Bunker Dec 13 '11 at 20:15
  • Maybe it's because your foreach($nodeList as $domElement) is itself inside another foreach $pages as $page, so the last iteration of $pages will override the previously set value of $output. Is that possible? – Dmitri Snytkine Dec 13 '11 at 20:30
  • I can't think of any other way to iterate through the nodes without using a foreach loop. I am open to suggestions. – Gerry Bunker Dec 13 '11 at 20:41
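
Following the suggestion in the comments, one likely cause is that $newDom and $ul are recreated on every pass of the inner foreach, so only the list built from the last file survives until saveHTML() is called. Below is a minimal sketch of that idea, not a confirmed fix for the original script: the function name buildHeaderList and the inline HTML strings are hypothetical stand-ins (the original loads files with loadHTMLFile), chosen so the example runs on its own. The output document and the ul are created once, before any loop, and every matching p tag is appended to that single ul.

```php
<?php
// Hypothetical helper: collects the text of every
// <p class="content_header_type_2"> found in the given HTML strings
// into ONE <ul>, created once before the loops start.
function buildHeaderList(array $htmlSources): string
{
    // Output document and <ul> are created a single time, outside all loops.
    $newDom = new DOMDocument();
    $ul = $newDom->createElement('ul');
    $newDom->appendChild($ul);

    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);

    foreach ($htmlSources as $html) {
        // A fresh source document per input (the original calls loadHTMLFile here).
        $dom = new DOMDocument();
        $dom->loadHTML($html);

        // Create the XPath object after loading the HTML source.
        $xPath = new DOMXPath($dom);
        $nodeList = $xPath->query("//p[@class='content_header_type_2']");

        foreach ($nodeList as $domElement) {
            // Only the text is needed, so importNode is skipped.
            // createTextNode is used because the value passed to
            // createElement is not escaped.
            $li = $newDom->createElement('li');
            $li->appendChild($newDom->createTextNode(trim($domElement->textContent)));
            $ul->appendChild($li);
        }
    }

    libxml_clear_errors();

    // Serialize only the <ul> subtree.
    return $newDom->saveHTML($ul);
}

// Usage with made-up sample markup standing in for the scraped files:
echo buildHeaderList([
    '<html><body><p class="content_header_type_2">First</p><p>ignored</p></body></html>',
    '<html><body><p class="content_header_type_2">Second</p>'
        . '<p class="content_header_type_2">Third</p></body></html>',
]);
```

With the sample input above, the returned markup should contain one li per matching p tag across both sources, rather than only the items from the last source. In the original script the same change would amount to moving the three lines that create $newDom and $ul above the outer foreach.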
