From everything I've found while searching, serializing/deserializing XML that has mixed content is problematic with Jackson. Does anybody know of a way to handle the following XML using Java?

<xmlsample>
    <title>Yada yada yada <a href="component:tcm:757-228001" id="Link_1492103133595" title="yada" name="Link_1492103133595" xmlns="xhtml">yada</a> yada</title>
    <link>test</link>
</xmlsample>

I am using the following POJO:

import com.fasterxml.jackson.dataformat.xml.annotation.JacksonXmlRootElement;

@JacksonXmlRootElement(localName = "xmlsample")
public class XmlSample {

    private String title;
    private String link;

    public String getTitle() {
        return title;
    }
    public void setTitle(String title) {
        this.title = title;
    }
    public String getLink() {
        return link;
    }
    public void setLink(String link) {
        this.link = link;
    }
}

If the node has mixed content, as in the above example, I get the following error:

java.io.IOException: Expected END_ELEMENT, got event of type 1

If the node has plain text, then deserialization works.

I have tried using JsonNode, TextNode, ObjectNode, and Object instead of String as the field type. I have also tried a custom serializer and deserializer, but the error persists. In fact, processing never reaches the custom deserializer if the node contains HTML.

This XML comes from a third-party system (SDL Tridion) that I cannot change.

Any help would be greatly appreciated!

EDIT: I need to clarify that the node could contain markup or could contain plain text, so I can't create a POJO that represents the markup as you see it in the above XML. The markup could also be significantly more complex than in the example above. This is why I am just trying to force it into a String. I don't need to manipulate it; I just need to preserve it in the POJO so it can be returned to the database unchanged.

ShinyNewUser
  • Have you found a solution to your problem? We're dealing with the same issue and tried all the options you mentioned, but it still doesn't work. I'm thinking of moving to JAXB for de/serialization, because Jackson doesn't seem to cover all the cases that do not appear in JSON (attributes / multiple consecutive nodes with the same name). – Timi Oct 26 '17 at 07:52
  • Haven't found a solution yet. The Jackson GitHub tracker has open issues for this, but no resolution. – ShinyNewUser Nov 13 '17 at 18:19

1 Answer

You could try using CDATA to wrap the angle brackets:

<![CDATA[<]]>
<![CDATA[>]]>

Another workaround is to escape the brackets, so that "<" becomes "&lt;", ">" becomes "&gt;", and so on.
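
Since the XML comes from a system that can't be changed, the CDATA idea would have to be applied as a pre-processing step on the raw string before it reaches the mapper. Here is a minimal sketch of that, using only the JDK. The class name MixedContentPreprocessor and the method wrapInCdata are made up for illustration and are not part of Jackson. It relies on a regex, so it assumes the target element is not nested inside itself and that no attribute value contains ">". It leaves plain-text content alone and only wraps content that contains markup.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Jackson): wraps the inner content of a
// named element in a CDATA section so a mapper sees it as plain text.
final class MixedContentPreprocessor {

    static String wrapInCdata(String xml, String elementName) {
        String name = Pattern.quote(elementName);
        // Group 1: opening tag (optional attributes, not self-closing),
        // group 2: inner content, group 3: closing tag.
        Pattern p = Pattern.compile(
                "(<" + name + "(?:\\s[^>]*?)?(?<!/)>)(.*?)(</" + name + "\\s*>)",
                Pattern.DOTALL);
        Matcher m = p.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String inner = m.group(2);
            String replacement;
            if (inner.indexOf('<') < 0 || inner.startsWith("<![CDATA[")) {
                // No markup, or content already starts with CDATA: leave untouched.
                replacement = m.group();
            } else {
                // A literal "]]>" would close the section early, so split it
                // across two adjacent CDATA sections.
                String safe = inner.replace("]]>", "]]]]><![CDATA[>");
                replacement = m.group(1) + "<![CDATA[" + safe + "]]>" + m.group(3);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<xmlsample><title>Yada <a href=\"x\">yada</a> yada</title>"
                + "<link>test</link></xmlsample>";
        System.out.println(wrapInCdata(xml, "title"));
    }
}
```

The wrapped string can then be passed to the existing mapper in place of the original XML. I have not checked this against every Jackson version, so treat it as a starting point. One thing to be aware of: inside CDATA, entities such as &amp; are kept literally, which is what you want if the markup has to go back to the database unchanged.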

CKey