I have the following XML file:

<xml name="Places">
    <data>
        <row Code="1" Name="#X1.A&B(City)" />
    </data>
</xml>

When I call unmarshal, I get the exception The reference to entity "B" must end with the ';' delimiter, caused by the ampersand (&) inside the Name attribute.

JAXBContext jaxbContext = JAXBContext.newInstance(Root.class);
Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();
File xml = new File("test2.xml");
Object obj = unmarshaller.unmarshal(xml);

How can I escape those characters?

I've already tried adding a CharacterEscapeHandler, but it doesn't work for the Unmarshaller:

private static class EscapeHandler implements CharacterEscapeHandler {
    @Override
    public void escape(char[] buf, int start, int len, boolean isAttValue,
            Writer out) throws IOException {
           ...
    }

}

unmarshaller.setProperty("com.sun.xml.bind.marshaller.CharacterEscapeHandler", 
new EscapeHandler());

This throws javax.xml.bind.PropertyException: name: com.sun.xml.bind.marshaller.CharacterEscapeHandler

Orest
  • Might be a duplicate: http://stackoverflow.com/a/18017130/506855 Though this is about marshalling not unmarshalling. – Puce Jan 15 '15 at 17:33
  • A JAXB implementation is going to depend on a lower level (SAX or StAX) parser. If you can find one that is tolerant of your invalid XML, then you will be able to get JAXB to work with it. – bdoughan Jan 15 '15 at 18:23

1 Answer

I think the issue is that the JAXB unmarshaller expects well-formed XML, and your sample is not well-formed: a bare & is read as the start of an entity reference. To make the XML well-formed, you would have to replace & with &amp;.

Make sure you always have XML documents that are well-formed (syntactically correct) and valid (conforming to an XSD).
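If the provider cannot be fixed, one possible workaround is to repair the text before it reaches JAXB. The following is only a minimal sketch, assuming the sole defect is bare ampersands and that the document fits in memory as a String; the class and method names (AmpersandFixer, escapeBareAmpersands) are hypothetical and not part of JAXB. It replaces every & that does not already begin something shaped like an entity or character reference.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of JAXB: repairs bare '&' before parsing.
final class AmpersandFixer {

    // Matches '&' unless it is followed by "name;", "#digits;" or "#xHEX;",
    // i.e. unless it already looks like an entity or character reference.
    private static final Pattern BARE_AMP = Pattern.compile(
            "&(?!(?:[A-Za-z_][A-Za-z0-9_.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    static String escapeBareAmpersands(String xml) {
        // '&' has no special meaning in a Java replacement string.
        return BARE_AMP.matcher(xml).replaceAll("&amp;");
    }

    private AmpersandFixer() {
    }
}
```

The repaired string could then be handed to the unmarshaller through a reader, for example unmarshaller.unmarshal(new StringReader(fixed)). Note the limits of this approach: text such as A&B; would be mistaken for an entity reference and left alone, and ampersands inside CDATA sections would be escaped even though they are legal there.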

Puce
  • Yes, I know that it's not valid XML, but is there any way to escape those characters? I found CharacterEscapeHandler, but how can I add it to my unmarshaller? – Orest Jan 15 '15 at 17:12
  • @Orest You should make sure the XML is well-formed and valid before you unmarshall it. Where do you get your XML documents from? Make sure the provider only provides well-formed and valid XML documents. – Puce Jan 15 '15 at 17:15
  • I'm getting it from a third-party service, so I can't change it at the source. I have to handle it on my side. – Orest Jan 15 '15 at 17:18
  • @Orest Try to talk to the provider. Basically, if the provider offers an XML-based interface, then it should provide well-formed and valid XML. Non-well-formed XML is not really XML and thus does not meet the specification. If you can't talk to the provider, then you do have to fix it yourself, e.g. by pre-processing the XML document with regular expressions before it is used by XML tools (e.g. JAXB). – Puce Jan 15 '15 at 17:23
  • I thought there was some mechanism in JAXB to pre-process XML. – Orest Jan 15 '15 at 17:25
  • There might be some vendor specific properties, but I haven't tried those: https://jaxb.java.net/2.1.2/docs/vendorProperties.html#charescape http://stackoverflow.com/a/18017130/506855 – Puce Jan 15 '15 at 17:31
  • @Orest Please show in your question what you have tried. – Puce Jan 15 '15 at 17:39