I'd like to import an HTML document into a MySQL database using PHP.

The structure of the document looks like this:

<p class="word">
<span class="word-text">word1</span>
<span class="grammatical-type">noun</span>
</p>
...
<p class="word">
<span class="word-text">word128</span>
<span class="grammatical-type">adjective</span>
</p>

For each word, I only have one word-text and one grammatical-type.

I'm able to find each word node, but I'd like to run a MySQL query using the values of its word-text and grammatical-type children:

$dom = new DOMDocument();
$dom->loadHTMLFile($location);
$xpath = new DOMXPath($dom);
$res = $xpath->query("//p[@class='word']");
foreach ($res as $textNode) {
    // do something here for each *word-text*->nodeValue
    // do something here for each *grammatical-type*->nodeValue
}

Inside the foreach loop, I tried passing $textNode, which is a DOMNode, as the $contextNode:

$wordText = $xpath->query("span[@class='word-text']", $textNode);
$myWord = $wordText->nodeValue;

But $wordText is a DOMNodeList, so its nodeValue is NULL.

How can I, starting from the word node, access its child nodes?

Thanks

Fafanellu

2 Answers

Solved.

query() always returns a DOMNodeList, even when only one node matches. Since each word node contains a single word-text element, select it from the list with item(0):

$dom = new DOMDocument();
$dom->loadHTMLFile($location);
$xpath = new DOMXPath($dom);
$res = $xpath->query("//p[@class='word']");
foreach ($res as $textNode) {
    // the relative path is resolved against $textNode, the context node
    $wordTextNode = $xpath->query("span[@class='word-text']", $textNode);
    $word = $wordTextNode->item(0)->nodeValue;

    // do the same thing here for each *grammatical-type*
}
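
For completeness, here is a minimal sketch that applies the item(0) approach to both children and collects the pairs in an array. The helper name extractWords and the 'word'/'type' array keys are hypothetical, and the HTML is passed as a string so the example is self-contained. It skips any word node that is missing one of the two spans, which is an assumption about how incomplete entries should be handled.

```php
<?php
// Hypothetical helper: returns a list of ['word' => ..., 'type' => ...] rows.
function extractWords(string $html): array
{
    // Collect parser warnings internally instead of emitting them.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $rows = [];
    foreach ($xpath->query("//p[@class='word']") as $wordNode) {
        // Relative paths are resolved against $wordNode, the context node.
        $text = $xpath->query("span[@class='word-text']", $wordNode)->item(0);
        $type = $xpath->query("span[@class='grammatical-type']", $wordNode)->item(0);
        if ($text === null || $type === null) {
            continue; // assumption: skip entries missing either span
        }
        $rows[] = [
            'word' => trim($text->nodeValue),
            'type' => trim($type->nodeValue),
        ];
    }
    return $rows;
}

$html = '<p class="word"><span class="word-text">word1</span>'
      . '<span class="grammatical-type">noun</span></p>'
      . '<p class="word"><span class="word-text">word128</span>'
      . '<span class="grammatical-type">adjective</span></p>';

foreach (extractWords($html) as $row) {
    echo $row['word'], ' => ', $row['type'], PHP_EOL;
}
```

Each row could then be bound to a prepared statement for the MySQL insert, rather than concatenating the values into the query string.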
Fafanellu

You can pass a different node as the context in your $xpath->query calls:

<?php

$location = 'so-dom.html';
$dom = new DOMDocument();
$dom->loadHTMLFile($location);
$xpath = new DOMXPath($dom);
$res = $xpath->query("//p[@class='word']");
foreach ($res as $textNode) {
    // the second argument, $textNode, is the context node for the relative path
    echo $xpath->query('./a/text()', $textNode)[0]->nodeValue;
}

?>

Where the document (so-dom.html) is:

<head></head>
<body>
  <p class="word"><a>one</a></p>
  <p class="word"><a>two</a></p>
</body>

This prints "onetwo".
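
As a variation, and only a sketch, DOMXPath::evaluate() with the XPath string() function returns a plain PHP string for the first matching node, so no list indexing is needed. The inline HTML string and the $words array are assumptions made to keep the example self-contained.

```php
<?php
$html = '<body><p class="word"><a>one</a></p><p class="word"><a>two</a></p></body>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$words = [];
foreach ($xpath->query("//p[@class='word']") as $wordNode) {
    // string(...) yields the text of the first matching node relative to
    // $wordNode, or an empty string when nothing matches.
    $words[] = $xpath->evaluate('string(./a)', $wordNode);
}

echo implode(',', $words), PHP_EOL;
```

Because string() returns an empty string when nothing matches, this form avoids calling nodeValue on a missing node.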

Granitosaurus