
After upgrading from Java 1.2 and the Apache Xerces DOMParser to Java 1.7 and the Xerces JAXP DocumentBuilder, parsing completes without errors but does not “unwrap” CDATA sections, even though the DocumentBuilderFactory is initialized with “setCoalescing(true);”.

That is, input XML elements such as <ITEMDESC><![CDATA[ Sales Bom Material,Dist]]></ITEMDESC> are returned unmodified.

The code is shown below.

I’m new to XML parsing, so it’s likely that I’m missing something quite basic.

Our input XML has literally hundreds of different tags, so we’d like a solution that works without changing each element “get”.

Are there other requirements/hints/tips/tricks for getting “setCoalescing(true);” to work?

Thanks in advance for any suggestions.

Code:

        DocumentBuilderFactory aDocBuilderFactory = DocumentBuilderFactory.newInstance();
        aDocBuilderFactory.setValidating(m_dtdValidate);

        // Set to make sure that CDATA elements are automatically converted and collected into a single text element
        aDocBuilderFactory.setCoalescing(true);

        // Make sure that entity references are expanded, this includes the replacements for the reserved markup
        // characters
        aDocBuilderFactory.setExpandEntityReferences(true);

        // Ignore comments as they won't contain information to be processed
        aDocBuilderFactory.setIgnoringComments(true);

        // Get a document builder
        m_documentBuilder = aDocBuilderFactory.newDocumentBuilder();

        // Install entity resolver if required
        m_documentBuilder.setEntityResolver(new DocumentEntityResolver());

        m_document = m_documentBuilder.parse(pSource);
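
To narrow this down, a stripped-down check along the following lines may help. It is only a sketch: the class name `CoalescingCheck` and the helper `firstChildOfRoot` are made up for illustration, and the XML is the sample element from above passed in as a string rather than our real input. It parses the element with coalescing off and on, prints the node type of the element's first child, and prints which factory implementation JAXP actually selected, since a different parser on the classpath could behave differently.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical diagnostic class, not part of the original application.
public class CoalescingCheck {

    // Parses the XML string with the requested coalescing setting and
    // returns the first child node of the document element.
    static Node firstChildOfRoot(String xml, boolean coalescing) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setCoalescing(coalescing);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getFirstChild();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<ITEMDESC><![CDATA[ Sales Bom Material,Dist]]></ITEMDESC>";

        for (boolean coalescing : new boolean[] {false, true}) {
            Node child = firstChildOfRoot(xml, coalescing);
            // Node.CDATA_SECTION_NODE means the CDATA section was kept as-is;
            // Node.TEXT_NODE means it was converted to a plain text node.
            System.out.println("coalescing=" + coalescing
                    + " nodeType=" + child.getNodeType()
                    + " value=[" + child.getNodeValue() + "]");
        }

        // Shows which DocumentBuilderFactory implementation was picked up.
        System.out.println(DocumentBuilderFactory.newInstance().getClass().getName());
    }
}
```

If the first child is reported as a text node when coalescing is on, the factory setting is being honored, and the difference is more likely in how the element values are read afterwards than in the parser configuration.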
