I have some HTML that contains this:

<div class="test">
  Outer
  <div class="test">Inner 1</div>
  <div class="test">Inner 2</div>
</div>

I'm doing str_replace() on the contents of these elements:

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXpath($dom);

foreach($xpath->query("//div[@class='test']") as $node) {
    $node->nodeValue = str_replace(" ", "X", $node->nodeValue);
}

That should replace any spaces with an "X".

But it results in this error:

Warning: Couldn't fetch DOMElement. Node no longer exists in /path/to/my/file.php on line 63

It works if there's only one nested div:

<div class="test">
  Outer
  <div class="test">Inner 1</div>
</div>

Why does this happen, and how can I get it working?

Phil Gyford
  • Which spaces are you trying to replace? As a bit of debugging, add `echo $node->nodeValue.PHP_EOL;` into the loop to check what text it's working with. – Nigel Ren Mar 20 '20 at 18:33
  • I believe the issue is that you need to query the elements only once, then loop through the object. Currently you are performing the same query over and over. `$elements = $xpath->query("//div[@class='test']");` then `foreach($elements as $node) {...}` – EternalHour Mar 20 '20 at 18:36
  • @NigelRen It first outputs "Outer Inner 1 Inner 2" then "Inner 1", and then the error happens. It then outputs "" and the error happens again. – Phil Gyford Mar 20 '20 at 18:49
  • @EternalHour Thanks, but that doesn't make any difference. – Phil Gyford Mar 20 '20 at 18:50

1 Answer

Try changing

foreach($xpath->query("//div[@class='test']") as $node) 

to

foreach($xpath->query('//div[@class="test"]//div[@class="test"]') as $node)

Edit per comments:

Assuming there's a space in the outer element (i.e., it's "Outer 1"):

<?php
$string = <<<XML
<div class="test">
  Outer 1
  <div class="test">Inner 1</div>
  <div class="test">Inner 2</div>
</div>
XML;
$dom = new DOMDocument();
$dom->loadHTML($string);
$xpath = new DOMXPath($dom);

// Select the text nodes inside the divs rather than the div elements themselves.
foreach ($xpath->query('//div[@class="test"]//text()') as $node) {
    // Trim the surrounding whitespace, then replace the remaining spaces.
    $nnode = str_replace(" ", "X", trim($node->nodeValue));
    echo $nnode;
}
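
The loop above only echoes the replaced strings. If the goal is to change the document itself, a minimal sketch along the same lines could write each result back to its text node. As far as I can tell, assigning to `nodeValue` on an element replaces all of its children with a single text node, which would explain why the inner divs "no longer exist" once the outer one has been processed. Text nodes have no children, so they can be updated safely. The function name `replaceSpacesInTextNodes` is made up for illustration, and trimming each text node is my assumption about the desired output:

```php
<?php
// Hypothetical helper: replaces spaces with "X" in every text node
// found inside <div class="test"> elements and returns the new HTML.
function replaceSpacesInTextNodes(string $html): string
{
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    foreach ($xpath->query('//div[@class="test"]//text()') as $textNode) {
        $trimmed = trim($textNode->nodeValue);
        // Skip the whitespace-only text nodes that sit between tags.
        if ($trimmed === '') {
            continue;
        }
        // Updating a text node leaves the surrounding elements in place.
        $textNode->nodeValue = str_replace(' ', 'X', $trimmed);
    }

    return $dom->saveHTML();
}

$html = <<<HTML
<div class="test">
  Outer 1
  <div class="test">Inner 1</div>
  <div class="test">Inner 2</div>
</div>
HTML;

echo replaceSpacesInTextNodes($html);
```

Note that `saveHTML()` returns the whole document, including the doctype and the `html` and `body` wrappers that `loadHTML()` adds.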
Jack Fleeting
  • Thanks - that does process both of the *Inner n* elements successfully, but it skips the *Outer* element (which in the real world might also have a space in; I should have clarified that in my example). – Phil Gyford Mar 21 '20 at 19:03
  • @PhilGyford - Indeed it does skip the outer element - and on purpose, as per the original question. So are you saying that the outer element may also have a space in it and you want to replace that with an X as well? – Jack Fleeting Mar 21 '20 at 19:43
  • Yes, sorry, when I said "I'm doing `str_replace()` on the contents of these elements" I should have been more explicit and provided a better example – I meant *all* of the elements, not just the inner two. – Phil Gyford Mar 22 '20 at 11:00
  • That's great, thanks @JackFleeting! One odd thing - the `foreach` loop has an extra iteration after each of the Inner elements. They're easy enough to ignore because they end up as empty strings, but can you explain why? – Phil Gyford Mar 23 '20 at 15:20
  • @PhilGyford - It's one of the artifacts of xpath; if you try `count(//div[@class="test"]//text())`, you'll see it returns `5`, not `3` as one would expect intuitively. The whitespace between each closing `</div>` and the opening `<div>` that immediately follows it counts as 1. There are two of these pairs, hence the extra iterations... – Jack Fleeting Mar 23 '20 at 16:57
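
To see the extra text nodes mentioned in the last comment, a small sketch like the following compares the raw count with a count that ignores whitespace-only nodes. The helper name `countTestTextNodes` is hypothetical, and the `[normalize-space()]` predicate is one possible way to filter the blank nodes, not something taken from the answer:

```php
<?php
// Hypothetical helper: counts the text nodes inside <div class="test">,
// optionally ignoring nodes that contain only whitespace.
function countTestTextNodes(string $html, bool $skipBlank = false): int
{
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    // normalize-space() yields an empty string for whitespace-only nodes,
    // and an empty string is false when used as an XPath predicate.
    $filter = $skipBlank ? '[normalize-space()]' : '';

    // XPath count() returns a float in PHP, so cast it to int.
    return (int) $xpath->evaluate('count(//div[@class="test"]//text()' . $filter . ')');
}

$html = <<<HTML
<div class="test">
  Outer 1
  <div class="test">Inner 1</div>
  <div class="test">Inner 2</div>
</div>
HTML;

echo countTestTextNodes($html), PHP_EOL;       // all text nodes
echo countTestTextNodes($html, true), PHP_EOL; // non-blank text nodes only
```

With the filter in place, the `foreach` loop from the answer should no longer produce the empty iterations.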