I'm being given XML in the following format, and am parsing it with PHP's SimpleXML.

<?xml version="1.0" encoding="UTF-8"?>
<ManageMyBooking>
    <BookingInfo>
        <PartyDetails>
            <Passenger>
                <PaxNo>1</PaxNo>
                <Title>Mrs</Title>
                <Surname>Murphy</Surname>
            </Passenger>
            <Passenger>
                <PaxNo>2</PaxNo>
                <Title>Mr</Title>
                <Surname>Murphy</Surname>
            </Passenger>
            <Passenger>
                <PaxNo>3</PaxNo>
                <Title>Miss</Title>
                <Surname>Murphy</Surname>
            </Passenger>
        </PartyDetails>
        <Accommodation>
            <Units>
                <Unit>
                    <UnitNo>1</UnitNo>
                    <UnitDesc>...</UnitDesc>
                    <PaxAssociated>1|2</PaxAssociated>
                </Unit>
                <Unit>
                    <UnitNo>2</UnitNo>
                    <UnitDesc>...</UnitDesc>
                    <PaxAssociated>3</PaxAssociated>
                </Unit>
            </Units>
        </Accommodation>
    </BookingInfo>
</ManageMyBooking>

I'm looping through the Units (Rooms) thus:

// $Accommodation is a SimpleXML Object defined earlier, and able to provide relevant info
<?  foreach ($Accommodation->Units as $Units) {
        foreach ($Units->Unit as $Unit) {
        // (room/unit details echoed out here)
            foreach ($Unit->xpath('//Passenger[contains(PaxAssociated,./PaxNo)]') as $RoomPax) { ?>
<?= "$RoomPax->Title $RoomPax->Surname" ?><br />
<?          }
        }
    } ?>

in an attempt to show the names of the Passengers (Pax) in each room.

But this xpath finds no-one, and the following gets everyone.

//Passenger[contains(PaxNo,./PaxAssociated)]

What's especially frustrating is that I've successfully used XPath elsewhere in the same PHP for a very similar purpose, with no problems.

Any help/advice/suggestions will be much appreciated.


Edit: for completeness, and to answer a question from multiple people: the following works elsewhere in the code (though not 100% correctly, given the possible matching on '22' vs '2').

//Flight[contains(PaxAssociated,./PaxNo)]
jezmck
  • I'm confused about how an xpath on a Unit element could return any result for '//Passenger' at all, given that in your example XML structure, units do not contain passengers. Could you post how the $Accommodation object gets defined? – Henrik Opel Nov 13 '09 at 12:52
  • It appears to be a built-in feature of SimpleXML objects: the Unit element is created directly as a SimpleXML object that contains the whole tree. – jezmck Nov 13 '09 at 13:40
  • Interesting feature - good to know about that, thanks! – Henrik Opel Nov 13 '09 at 13:51
  • @jezmck: No, that's not it. See my comment below. ;-) – Tomalak Nov 13 '09 at 14:33

1 Answer

This:

//Passenger[contains(PaxNo,./PaxAssociated)]

means: find any <Passenger> with a child <PaxNo> whose value contains the value of the child <PaxAssociated>. It would only work with a data structure like this (which you clearly don't have):

<ManageMyBooking>
  <BookingInfo>
    <PartyDetails>
      <Passenger>
        <PaxNo>1|2</PaxNo>  <!-- note the exchanged value! -->
        <PaxAssociated>1</PaxAssociated>
      </Passenger>
    </PartyDetails>
  </BookingInfo>
</ManageMyBooking>

So this is wrong on multiple counts. What you mean is probably a dynamic XPath expression, like this:

foreach ($Units->Unit as $Unit) {
  $XPath = "//Passenger[contains('". $Unit->PaxAssociated . "', PaxNo)]";
  foreach ($Unit->xpath($XPath) as $RoomPax) {
    // ...
  }
}

This works at first glance, but it is not fail-safe, because "22" contains "2" as well, so contains() alone is not enough. The correct version is:

$XPath = "//Passenger[contains('|". $Unit->PaxAssociated ."|', concat('|', PaxNo, '|'))]";

This way you check "|22|" against "|2|", which would return false.
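To see the difference in practice, here is a minimal, self-contained sketch. The sample data is hypothetical (a passenger numbered 22 is added purely to trigger the false match), the helper name paxForUnit() is made up for illustration, and the code assumes PaxAssociated never contains a single quote, since the value is spliced into the XPath string.

```php
<?php
// Hypothetical sample data: passenger 22 exists only to show the false match.
$xml = <<<XML
<ManageMyBooking>
  <BookingInfo>
    <PartyDetails>
      <Passenger><PaxNo>1</PaxNo><Title>Mrs</Title><Surname>Murphy</Surname></Passenger>
      <Passenger><PaxNo>2</PaxNo><Title>Mr</Title><Surname>Murphy</Surname></Passenger>
      <Passenger><PaxNo>22</PaxNo><Title>Miss</Title><Surname>Murphy</Surname></Passenger>
    </PartyDetails>
    <Accommodation>
      <Units>
        <Unit><UnitNo>1</UnitNo><PaxAssociated>1|22</PaxAssociated></Unit>
        <Unit><UnitNo>2</UnitNo><PaxAssociated>2</PaxAssociated></Unit>
      </Units>
    </Accommodation>
  </BookingInfo>
</ManageMyBooking>
XML;

$booking = simplexml_load_string($xml);

// Illustrative helper (not part of SimpleXML): returns the PaxNo values
// matched for one unit, using either the naive or the delimited expression.
// Assumes PaxAssociated contains no single quotes.
function paxForUnit(SimpleXMLElement $unit, bool $delimited): array
{
    $assoc = (string) $unit->PaxAssociated;
    $xpath = $delimited
        ? "//Passenger[contains('|{$assoc}|', concat('|', PaxNo, '|'))]"
        : "//Passenger[contains('{$assoc}', PaxNo)]";

    $paxNos = [];
    foreach ($unit->xpath($xpath) as $pax) {
        $paxNos[] = (string) $pax->PaxNo;
    }
    return $paxNos;
}

$naive = [];
$safe  = [];
foreach ($booking->BookingInfo->Accommodation->Units->Unit as $unit) {
    $unitNo = (string) $unit->UnitNo;
    $naive[$unitNo] = paxForUnit($unit, false);
    $safe[$unitNo]  = paxForUnit($unit, true);
}

// Unit 1 is associated with "1|22":
// the naive form also picks up passenger 2, the delimited form does not.
print_r($naive['1']); // 1, 2, 22
print_r($safe['1']);  // 1, 22
```

Wrapping both sides in the delimiter turns the substring test into a whole-token test, which is why "|2|" no longer matches inside "|1|22|".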

Tomalak
  • There's a small typo in that last code snippet, should be `contains('|` (note the inversion in the last two characters) – Josh Davis Nov 13 '09 at 13:08
  • Same question here: How could an xpath for '//Passenger...' return anything when run on a $Unit object (unless $Unit does *not* correspond to a <Unit> from the markup)? – Henrik Opel Nov 13 '09 at 13:28
  • $Unit has been directly defined as a subsubsub-element of the variable that contains the whole tree. – jezmck Nov 13 '09 at 13:42
  • @Henrik Opel: I suspect you confuse `'//'` with `'.//'`. The former *always* starts at the root node, regardless of context. The latter takes context into account. – Tomalak Nov 13 '09 at 14:32
  • @Tomalak: Nope, I'm well aware of that difference - I just did not expect that a new SimpleXMLElement, extracted as a child element from another SimpleXMLElement, would still contain the whole tree. I expected it to contain only the subtree the element represents. – Henrik Opel Nov 14 '09 at 13:01
  • @Henrik Opel: It does not *contain* the whole tree, but it still can refer to it - it is still part of the DOM, after all. And for an XPath query with `//` it makes no difference where in the DOM you currently are. – Tomalak Nov 16 '09 at 17:12
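
The points raised in this thread can be checked with a short sketch. It uses a trimmed-down, hypothetical copy of the booking XML and shows two things: that `//` searches from the document root even when called on a Unit element, while `.//` only searches below it; and why the two expressions from the question match no one and everyone respectively. In XPath 1.0, a missing child node converts to an empty string, and contains() with an empty second argument is true.

```php
<?php
// Trimmed-down, hypothetical copy of the booking XML.
$xml = <<<XML
<ManageMyBooking>
  <BookingInfo>
    <PartyDetails>
      <Passenger><PaxNo>1</PaxNo></Passenger>
      <Passenger><PaxNo>2</PaxNo></Passenger>
    </PartyDetails>
    <Accommodation>
      <Units>
        <Unit><UnitNo>1</UnitNo><PaxAssociated>1|2</PaxAssociated></Unit>
      </Units>
    </Accommodation>
  </BookingInfo>
</ManageMyBooking>
XML;

$booking = simplexml_load_string($xml);
$unit    = $booking->BookingInfo->Accommodation->Units->Unit[0];

// '//' starts at the document root, regardless of the context node.
$fromRoot = count($unit->xpath('//Passenger'));      // 2

// './/' starts at the context node; a Unit has no Passenger descendants.
$fromContext = count($unit->xpath('.//Passenger'));  // 0

// Inside the predicate, the context node is each Passenger, which has no
// PaxAssociated child, so that operand becomes the empty string.
// contains('', PaxNo) is false for every passenger:
$none = count($unit->xpath('//Passenger[contains(PaxAssociated, ./PaxNo)]')); // 0

// contains(PaxNo, '') is true for every passenger:
$all = count($unit->xpath('//Passenger[contains(PaxNo, ./PaxAssociated)]'));  // 2
```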