I am writing a parser for an XML file that will contain special characters, for example

<name>You &amp; me &#174;</name>

The DOM parser will, by default, parse this value to "You & me ®". However, the string I want is the original, unexpanded text:

You &amp; me &#174;

Is there any way I can do this? Thanks

Jim Garrison
jasonfungsing
  • Which parser are you using? And do you need the exact string or can you just re-encode the strings after you get them from the parser? – takteek Jan 18 '12 at 04:08
  • I am using DOM. The reason is that I need to return those ® references to my client, who will display them to the customer. – jasonfungsing Jan 18 '12 at 04:12

1 Answer

If you are using DOM for parsing, see the DocumentBuilderFactory.setExpandEntityReferences() method.

By default, this setting is true, meaning that entity references are expanded automatically. If you turn it off, you will be able to read the entity references from the DOM. In that case you won't get just one big text node under the parent element; instead you will get text nodes interleaved with entity reference nodes.
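
A minimal sketch of how this could be tried out, using the example from the question. One caveat worth checking first: as far as I know, the DOM specification treats character references such as &#174; and the predefined entities such as &amp; as always expanded, so this setting may only make a difference for entities declared in a DTD. For that reason the sketch also includes the re-encoding approach suggested in the comments. The class name EntityDemo and the helpers parseRootText and reencode are hypothetical names made up for this illustration, not part of any library.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EntityDemo {

    // Hypothetical helper: parse an XML string with the given
    // expand-entity-references setting and return the root element's text.
    static String parseRootText(String xml, boolean expandEntityReferences) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setExpandEntityReferences(expandEntityReferences);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    // Hypothetical helper: re-encode parsed text for output.
    // Assumption: markup characters become named entities and every
    // non-ASCII code point becomes a decimal character reference.
    static String reencode(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == '&') {
                out.append("&amp;");
            } else if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                out.append("&gt;");
            } else if (cp > 127) {
                out.append("&#").append(cp).append(';');
            } else {
                out.append((char) cp);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<name>You &amp; me &#174;</name>";
        // Even with expansion turned off, the character reference and the
        // predefined entity arrive as plain characters in the text.
        String parsed = parseRootText(xml, false);
        System.out.println(reencode(parsed)); // prints: You &amp; me &#174;
    }
}
```

Note that re-encoding is not a true round trip: the parsed text no longer records whether the source used &#174;, a hexadecimal reference, or a literal ® character, so the output follows whatever convention the helper picks.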

prunge