39

I've been trying to use SimpleXML, but it doesn't seem to like XML that looks like this:

<xhtml:div>sample <xhtml:em>italic</xhtml:em> text</xhtml:div>

So what library will handle tags that look like that (have a colon in them)?

Josh Davis
mpen

4 Answers

88

Say you have some XML like this, where $xml is the parent element that contains the fragment:

<xhtml:div>
  <xhtml:em>italic</xhtml:em>
  <date>2010-02-01 06:00</date>
</xhtml:div>

You can access 'em' like this: $xml->children('xhtml', true)->div->em;

However, if you want the date field, $xml->children('xhtml', true)->div->date; won't work, because you are still in the xhtml namespace.

You must call children() again to get back to the default namespace:

$xml->children('xhtml', true)->div->children()->date;
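
Here is a self-contained sketch of the steps above. The `<root>` wrapper and the `xmlns:xhtml` declaration are assumptions, added so the fragment parses cleanly. They are not part of the original snippet.

```php
<?php
// Assumed wrapper: a <root> element that declares the xhtml prefix.
$source = <<<XML
<root xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <xhtml:div>
    <xhtml:em>italic</xhtml:em>
    <date>2010-02-01 06:00</date>
  </xhtml:div>
</root>
XML;

$xml = simplexml_load_string($source);

// The second argument (true) says 'xhtml' is a prefix, not a namespace URI.
$div = $xml->children('xhtml', true)->div;

$em      = (string) $div->em;                // "italic": still looking in the xhtml namespace
$missing = (string) $div->date;              // "": <date> is not in the xhtml namespace
$date    = (string) $div->children()->date;  // "2010-02-01 06:00": un-namespaced children

echo $em, "\n", $date, "\n";
```

Keeping `$div` in a variable avoids repeating the `children('xhtml', true)` call for every lookup.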
Nathan
  • not sure why this isn't the selected answer. But for anyone in the future this is the one that solved my questions/problem! :) – daveomcd May 18 '11 at 16:22
  • What about multiple tags with ':'? This is how I was trying to access it: `$array['body']['Order']['Extensions']['data:AdditionalReferences']['data:YourRef']` – user2924019 Mar 10 '20 at 17:45
28

If you want to fix it quickly do this (I do when I feel lazy):

// Replaces ":" in tag and attribute names with "_" so they can be accessed as plain properties
$xml = preg_replace('~(</?|\s)([a-z0-9_]+):~is', '$1$2_', $xml);

This will convert <xhtml: to <xhtml_ and </xhtml: to </xhtml_. It is hacky, and it can fail when CDATA sections contain namespaced markup or when tag names use non-ASCII characters, but I'd say you are usually safe using it (it hasn't failed me yet).
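
A quick sketch of the replacement in use, on a made-up input string. It also shows a side effect worth checking for: the pattern matches after any whitespace, so text content such as " 06:00" is rewritten along with the tag names.

```php
<?php
// Hypothetical input; no namespace declaration is needed once the colons are gone.
$source = '<xhtml:div><xhtml:em>italic</xhtml:em><date>2010-02-01 06:00</date></xhtml:div>';

// Same pattern as above: "prefix:" becomes "prefix_" after "<", "</" or whitespace.
$flat = preg_replace('~(</?|\s)([a-z0-9_]+):~is', '$1$2_', $source);

$xml = simplexml_load_string($flat);

$em   = (string) $xml->xhtml_em;  // "italic"
$date = (string) $xml->date;      // "2010-02-01 06_00": the time was altered too

echo $em, "\n", $date, "\n";
```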

EarnestoDev
6

The colon separates an XML namespace prefix from the element's local name. PHP's DOM extension has good support for namespaces.
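
A minimal sketch of the DOM route, using the fragment from the question. The namespace URI and its declaration on the root element are assumptions, since the question does not show where the prefix is declared.

```php
<?php
// Assumed namespace URI for the xhtml prefix.
$ns = 'http://www.w3.org/1999/xhtml';
$source = '<xhtml:div xmlns:xhtml="' . $ns . '">sample <xhtml:em>italic</xhtml:em> text</xhtml:div>';

$doc = new DOMDocument();
$doc->loadXML($source);

// Look elements up by namespace URI and local name; the prefix itself does not matter.
$em = $doc->getElementsByTagNameNS($ns, 'em')->item(0);

$emText  = $em->textContent;                    // "italic"
$divText = $doc->documentElement->textContent;  // "sample italic text"
$local   = $doc->documentElement->localName;    // "div"
$prefix  = $doc->documentElement->prefix;       // "xhtml"

echo $emText, "\n", $divText, "\n";
```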

Ollie Saunders
2

I don't think it's a good idea to strip the colon or replace it with something else, as some people have suggested. You can easily access elements that have a namespace prefix: either pass the URI that identifies the namespace to the children() method, or pass the namespace prefix and true as the second argument. The second approach requires PHP 5.2 or later.

SimpleXMLElement::children
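
A short sketch comparing the two call styles. The `<feed>` wrapper and the `x` prefix are made up for illustration. They show that a lookup by URI keeps working when a document binds the namespace to a different prefix.

```php
<?php
// Hypothetical document that binds the XHTML namespace to the prefix "x".
$source = <<<XML
<feed xmlns:x="http://www.w3.org/1999/xhtml">
  <x:div>sample <x:em>italic</x:em> text</x:div>
</feed>
XML;

$xml = simplexml_load_string($source);

// By namespace URI: independent of whatever prefix the document chose.
$byUri = (string) $xml->children('http://www.w3.org/1999/xhtml')->div->em;

// By prefix (second argument true): tied to the prefix used in this document.
$byPrefix = (string) $xml->children('x', true)->div->em;

echo $byUri, "\n", $byPrefix, "\n";
```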

Patryk K
  • Isn't this exactly what Nathan Reed suggested in the answer I accepted? I agree that regex-fu is a dirty hack, but having to go through the `children()` selector isn't very fun either. – mpen Feb 22 '13 at 17:06
  • Yes, the same. I just wanted to indicate that you can also pass the URL that identifies the namespace to the children() method, which works with PHP 5 and up. IMHO, there is no need to do a dirty hack when there is a core method available. – Patryk K Feb 22 '13 at 17:22