I'm using JAXB 2.2.5 to output XML from a JAXB model. The data is populated from a database, and occasionally the database contains non-displayable characters that it should not, such as

0x1a 

If it does, JAXB outputs invalid XML by writing the character as is. Shouldn't it escape it or something?

Update

I wonder whether any implementations fix this problem; maybe EclipseLink MOXy does?

EDIT

I tried the workaround, which fixes the illegal character issue, but it changes the output in an undesirable way, from

<?xml version="1.0" encoding="UTF-8" standalone="yes"?><metadata created="2013-02-27T11:40:04.009Z" xmlns="http://musicbrainz.org/ns/mmd-2.0#" xmlns:ext="http://musicbrainz.org/ns/ext#-2.0"><cdstub-list count="1" offset="0"><cdstub id="w237dKURKperVfmckD5b_xo8BO8-" ext:score="100"><title>fred</title><artist></artist><track-list count="5"/></cdstub></cdstub-list></metadata>

to

<?xml version="1.0" ?><metadata xmlns:ext="http://musicbrainz.org/ns/ext#-2.0" xmlns="http://musicbrainz.org/ns/mmd-2.0#" created="2013-02-27T11:39:15.394Z"><cdstub-list count="1" offset="0"><cdstub id="w237dKURKperVfmckD5b_xo8BO8-" ext:score="100"><title>fred</title><artist></artist><track-list count="5"></track-list></cdstub></cdstub-list></metadata>

i.e. <track-list count="5"/> has become <track-list count="5"></track-list>, which is undesirable. I'm not sure why it is doing this.

Paul Taylor

3 Answers

This is apparently a common problem, and it is filed as a bug: "JAXB generates illegal XML characters".

You can find a workaround at "Escape illegal characters".
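
For context on why the output counts as invalid: XML 1.0 does not allow U+001A at all, neither as a raw character nor as a character reference such as &#x1A;, so escaping alone would not make the document well-formed. The sketch below checks this with the JDK's built-in parser. The class and method names (WellFormedCheck, isWellFormed) are made up for illustration and are not part of JAXB or the linked workaround.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of JAXB): reports whether the JDK's
// XML parser accepts a string as a well-formed document.
final class WellFormedCheck {

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the document (for example, an illegal character).
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Under these assumptions, isWellFormed("<title>fred</title>") should return true, while the same element containing a raw 0x1a character, or the reference &#x1A;, should return false. That suggests the character has to be removed or replaced before marshalling rather than escaped.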

thedayofcondor
  • this works but unfortunately it has changed the output in another way that I do not want, please see update to question – Paul Taylor Feb 27 '13 at 11:48
  • Thanks for the answer. Unfortunately the workaround that you link to has some deficiencies, like not indenting the generated XML :( – Kaitsu Nov 26 '15 at 14:49
  • The updated link (I guess) is https://github.com/javaee/jaxb-v2/issues/614 - also related seems https://github.com/javaee/jaxb-v2/issues/960 – Philip Helger Jul 20 '17 at 15:44

Another solution is to use Apache Commons Lang to remove the invalid XML characters:

import org.apache.commons.lang3.StringEscapeUtils;

String xml = "<root>content with some invalid characters...</root>";
xml = StringEscapeUtils.unescapeXml(StringEscapeUtils.escapeXml10(xml));

The escapeXml10 method escapes the string and removes the invalid characters; the unescapeXml method then undoes the escaping. The end result is the same XML with the invalid XML characters removed.

TampaHaze

Simply replace the character with another character or a space in the message content. If you don't want to add an extra JAR or third-party library, you can try the method below:

String msgContent = "......"; // string with some illegal characters
msgContent = msgContent.replaceAll("\\P{Print}", "_");

In this example, the replaceAll method replaces unprintable characters with an underscore, so msgContent contains only printable characters, which prevents JAXB from emitting illegal characters.
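
One caveat: by default, java.util.regex treats \p{Print} as a US-ASCII class, so \P{Print} also matches tabs, newlines, and non-ASCII letters such as ö. If that is too aggressive, a narrower variant is to replace only the characters that fall outside the XML 1.0 "Char" production. The following is a minimal sketch using only the JDK; the names XmlCharFilter and replaceInvalidXmlChars are invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces only characters that XML 1.0 forbids.
final class XmlCharFilter {

    // Negation of the XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    private static final Pattern INVALID_XML_10 = Pattern.compile(
            "[^\\x09\\x0A\\x0D\\x20-\\uD7FF\\uE000-\\uFFFD\\x{10000}-\\x{10FFFF}]");

    static String replaceInvalidXmlChars(String text, String replacement) {
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        return INVALID_XML_10.matcher(text)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }
}
```

For example, XmlCharFilter.replaceInvalidXmlChars(msgContent, "_") would turn a 0x1a character into an underscore while leaving accented letters, tabs, and line breaks untouched. Passing an empty string as the replacement drops the illegal characters instead. The call could be applied to each value before it is set on the JAXB model.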

Mustafa Kemal