Using the following code, I'm successfully reading XML files. However, when a comment appears in the middle of a node, the reader discards the rest of the node's text. For example:
<text>thisismy<!--comment-->document</text>
results in a returned string of "thisismy" and nothing else.
I had a similar problem earlier when I encountered special characters such as &, and setting the javax.xml.stream.isCoalescing property to true on the XMLInputFactory fixed that. I'm guessing I've run into a related feature.
I need to be able to process such documents elegantly. Can anyone suggest how I might work around such interruptions?
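For reference, here is a minimal sketch of one possible workaround, not a confirmed fix for the code below. It assumes the cause is that the comment arrives as its own event, so the text on either side of it is delivered as two separate Characters events, and coalescing only merges adjacent character data. The class name CommentSafeText and the helper readText are hypothetical names made up for illustration; the element name "text" comes from the example above. The idea is to append every Characters event to a StringBuilder until the matching end element is reached, ignoring comment events in between.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class CommentSafeText {

    // Hypothetical helper: returns all character data found inside the first
    // <text> element, skipping any comments that interrupt it.
    static String readText(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.IS_COALESCING, true);
            XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));

            StringBuilder text = new StringBuilder();
            boolean inText = false;
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()
                        && event.asStartElement().getName().getLocalPart().equals("text")) {
                    inText = true;
                } else if (event.isEndElement()
                        && event.asEndElement().getName().getLocalPart().equals("text")) {
                    break; // the element is complete, stop collecting
                } else if (inText && event.isCharacters()) {
                    // Text before and after a comment arrives as separate events,
                    // so append each piece instead of keeping only the first.
                    text.append(event.asCharacters().getData());
                }
                // Comment events match none of the branches and are ignored.
            }
            reader.close();
            return text.toString();
        } catch (XMLStreamException e) {
            // Rethrown unchecked only to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints "thisismydocument"
        System.out.println(readText("<text>thisismy<!--comment-->document</text>"));
    }
}
```

The same accumulate-until-end-element pattern could be folded into an existing switch on the event type by appending in the CHARACTERS case and consuming the buffer in the END_ELEMENT case.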
try {
    XMLInputFactory factory = XMLInputFactory.newInstance();
    factory.setProperty("javax.xml.stream.isCoalescing", true);
    XMLEventReader eventReader =
            factory.createXMLEventReader(new FileReader(fileName));
    while (eventReader.hasNext()) {
        XMLEvent event = eventReader.nextEvent();
        switch (event.getEventType()) {
            case XMLStreamConstants.START_ELEMENT:
                StartElement startElement = event.asStartElement();
                String qName = startElement.getName().getLocalPart();
                if (qName.equalsIgnoreCase("page")) {
                    page = new DocumentPage();
                    Iterator<Attribute> attributes = startElement.getAttributes();
                    while (attributes.hasNext())
                    {
                        Attribute attribute = attributes.next();
                        switch (attribute.getName().toString().toLowerCase()) {
                            case "index":
                                pageIndex = attribute.getValue();
                                page.setPageIndex(pageIndex);
                                break;