I have an XML document with a nested structure:

<?xml version="1.0" encoding="UTF-8"?>
<main>
  <products>
    <product id="87" state="active">
      <options>
        <option id="99" state="active">
          <item id="33" value="somevalue" />
          <item id="35" value="somevalue2" />
        </option>
        <option id="12" state="deleted">
          <item id="56" value="somevalue" />
          <item id="34" value="somevalue2" />
        </option>
      </options>
      <reports>
        <report type="json">
          <field id="123" state="active" />
          <field id="234" state="deleted" />
          <field id="238" state="active" />
          <field id="568" state="deleted" />
        </report>
      </reports>
    </product>
  </products>
</main>

In the PHP backend I've written methods to detect items with the "deleted" state and remove them. Here is the PHP part:

public function loadAndModify() {
    $xml = simplexml_load_file($this->request->file('import_xml'));

    $this->processXml($xml);
}

/**
 * @param $element
 *
 * @return bool
 */
private function shouldRemove($element): bool
{
    return ($element['state'] == SomeClass::STATE_DELETED);
}

/**
 * @param $xml
 *
 * @return void
 */
private function processXml(&$xml): void
{
    if ($xml->children()->count() > 0) {
        foreach ($xml->children() as $child) {
            if ($this->shouldRemove($child)) {
                // this code works as expected with or without xdebug
                //$node = dom_import_simplexml($child);
                //$node->parentNode->removeChild($node);

                // this code will work only with xdebug when breakpoint is set
                unset($child[0]);
                continue;
                // end
            } else {
                $this->processXml($child);
            }
        }
    }
}

I solved my problem by converting the SimpleXMLElement to a DOMElement.

However, it seems that PHP has a bug when I use unset with Xdebug. When I add a breakpoint on the line with unset, step to the next line in the debugger, and then resume the application, there is no problem. But when the breakpoint is active and I just click resume, it causes this error:

Uncaught ErrorException: Trying to get property of non-object in \project\vendor\symfony\var-dumper\Cloner\AbstractCloner.php

If someone else has had this error, please explain why it happens in this case. Thanks.

Alex Slipknot

1 Answer

As discussed in the comments under this previous answer, the problem you are encountering is that you are manipulating an object (in this case, the return value of $xml->children()) which you are iterating over (with a foreach loop).

Internally, the SimpleXMLElement object has a list of child items it is going to present, in turn, to the iterator code in foreach. When you delete the current child item, you necessarily change the shape of that internal list, so "next item" is not well defined. Deleting other items in the list can also have odd behaviour - for instance, deleting item 1 while inspecting item 2 may cause the iterator to "skip ahead" since item 4 has now moved into the place where item 3 was.
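The "skip ahead" effect can be shown with a rough analogy. The sketch below uses a plain PHP list and an index-based loop; it is only an illustration of the general hazard, not a description of how SimpleXML's iterator is actually implemented, and the variable names are made up for the example.

```php
<?php
// Analogy only: a plain list walked by position, not SimpleXML internals.
$items = ['a', 'b', 'c', 'd'];
$visited = [];

for ($i = 0; $i < count($items); $i++) {
    $visited[] = $items[$i];

    if ($items[$i] === 'b') {
        // Delete the first item while the loop is inspecting the second one.
        // Everything after it shifts down by one position.
        array_splice($items, 0, 1);
    }
}

// 'c' moved into the position the loop had already passed, so it is never visited.
echo implode(', ', $visited), PHP_EOL;
```

The loop never reports an error; it simply misses an element, which is why this kind of bug is easy to overlook.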

As hakre suggests in the comments linked above, the most robust solution is to copy the original list of items into an array, which can be achieved using iterator_to_array. Passing false as the second argument throws away the keys, which is important with SimpleXML because it uses the tag name as the key, and an array can hold only one value per key, so sibling elements with the same name would otherwise overwrite each other.

foreach ( iterator_to_array($xml->children(), false) as $child) {
    // Carry on as you were
}

The only thing to be aware of with this is that iterator_to_array will go through the whole list before returning, so if you have a large list and want to break out of the loop early, or stream output, this may be problematic.

IMSoP
  • Thanks for the answer, but I already know how to avoid this problem. As you can see in the question, I'm using DOM conversion to achieve the goal. The question is **why doesn't the error happen when I'm using the step-by-step debugger?** What puzzles me is that code which might throw an error, and **must** throw an error, didn't. So I think Xdebug is doing some manipulation with the iterator, and that **could be the cause**, but I'm not sure. Any idea? – Alex Slipknot May 17 '17 at 11:06
  • @AlexSlipknot Removing an item from a collection you are iterating over isn't something which causes a predictable error message because the problem has been detected; rather, it leads to what specs generally call "undefined behaviour", because the problem *hasn't* been detected, and the internal implementation may behave in ways its designers never intended. Adding a debugger necessarily changes the internal implementation - for instance, memory cleanup might be delayed so that you can inspect data - so the undefined behaviour will be *different*. – IMSoP May 17 '17 at 11:32
  • @AlexSlipknot As for "knowing how to solve the problem", reaching for DOM conversion suggested to me that you *didn't* understand the problem, because this is not a problem with SimpleXML as such, it's a general rule that you should never manipulate a collection that you are looping over. It may be that your new DOM code is subject to the same undefined behaviour, and you've just got lucky (i.e. it will break when you upgrade PHP / change some minor detail); or it may be that you've stumbled on the actual fix (looping over a separate copy of the collection) without realising. – IMSoP May 17 '17 at 11:37
  • Aha, that sounds very much like delayed memory cleanup. Thanks! Can you tell me where I can find detailed information about this? It is very important to me. Let's say: in the real world there is a problem, but when you try to reproduce it, there is no problem :) And about the lucky code: I get it. The problem appears when we remove an element inside the iteration. Removing the element via DOM conversion avoids unsetting it directly, so the iteration continues without any trouble. – Alex Slipknot May 17 '17 at 11:43
  • @AlexSlipknot I'm afraid I don't know the exact details of what's going on "under the hood", but I think more useful than learning every detail of the debugger, and every possible side effect it could have, is to take the lesson of "if I see this kind of thing, consider if there might be some undefined behaviour or interaction with internal implementation involved". Particularly when working with extensions like SimpleXML/DOM, which are PHP wrappers around non-PHP data structures in memory (in the XML case, the libxml2 library). – IMSoP May 17 '17 at 12:11
  • Alright. Thank you so much. Even though I didn't get an answer about Xdebug and iteration, you helped me move forward in (I hope) the right direction regarding delayed operations in Xdebug and external resources. – Alex Slipknot May 17 '17 at 12:20