
Given a set of three XML files:

first.xml

<root>
    <item1>A</item1>
    <complexItem>
        <item2>B</item2>
        <item3>C</item3>
    </complexItem>
</root>

patch1.xml

<root>
     <item1>X</item1>
</root>

patch2.xml

<root>
     <complexItem>
         <item3>Y</item3>
     </complexItem>
     <differentItem>Z</differentItem>
</root>

I would like to end up with the XML:

patched.xml

<root>
    <item1>X</item1>
    <complexItem>
        <item2>B</item2>
        <item3>Y</item3>
    </complexItem>
    <differentItem>Z</differentItem>
</root>

So elements that appear only in a patch are added, and elements that already exist are overwritten with the patch's value. These additions and updates can take place at any level of the document tree. Ideally, this would be a Maven plugin that takes a list of files as arguments, though a solution in Java (i.e. an available library; I'm trying to avoid reinventing something that seems like it should already exist) is fine, as I can write the plugin myself. Each file (base and patches) will always have the same root element.


I should have added that there is no use case for removing elements from the tree (the hierarchical nature of our file replacements causes this to be an error case for the application using the patched files). I did a little more searching for a pre-built tool or library and couldn't find anything suitable, so I took Andrew's advice and built something from scratch with dom4j. I'm quite unfamiliar with dom4j, but here's roughly what I've come up with (without review, thorough error handling, proper commenting, etc.):

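import java.io.InputStream;
import java.util.Iterator;

import org.dom4j.Branch;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.io.SAXReader;

public void execute() {

    // Environment specific file loading removed

    SAXReader reader = new SAXReader();
    Document patchedDocument = null;
    for (InputStream is : loadedFiles) {
        Document d;
        try {
            d = reader.read(is);
        } catch (DocumentException e) {
            // Fail fast: carrying on with a null document would cause a NullPointerException below
            throw new IllegalStateException("Could not parse XML input", e);
        }
        if (patchedDocument == null) {
            // The first file is the base document
            patchedDocument = d;
        } else {
            // Every later file is a patch applied on top of the base
            patch(patchedDocument, d.getRootElement());
        }
    }

    // Environment specific file writing

}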

private void patch(Document patchedDocument, Element element) {

    for (Iterator<?> i = element.elementIterator(); i.hasNext();) {
        Element nextElement = (Element) i.next();
        if (nextElement.isTextOnly()) {
            // No child elements: look up the same location in the document being patched
            String path = nextElement.getUniquePath();
            Node n = patchedDocument.selectSingleNode(path);
            if (n != null) {
                // This already exists and needs content replacing
                n.setText(nextElement.getText());
            } else {
                // This doesn't exist and needs to be added to the tree
                addElement(patchedDocument, nextElement);
            }
        } else {
            // Has child elements: descend and patch each child in turn
            patch(patchedDocument, nextElement);
        }
    }
}

private Node addElement(Document patchedDocument, Element element) {
    Element parent = element.getParent();
    String parentPath = parent.getPath();
    Node n = patchedDocument.selectSingleNode(parentPath);
    if (n == null) {
        // The parent is missing too, so add the parent (with all its children) instead
        return addElement(patchedDocument, parent);
    } else {
        // Move the element out of the patch document and into the patched one
        ((Branch) n).add(element.detach());
        return n;
    }
}
Charles A
  • Did you mean to use both "complexType" and "complexItem", or is that a typo? – Andrew Swan Jan 17 '13 at 11:05
  • There is a similar page here: http://stackoverflow.com/questions/80609/merge-xml-documents . It's quite detailed, and you could develop something lightweight based on it. – Gavin Xiong Jan 17 '13 at 11:34

1 Answer


If the files are small enough to fit in memory, I'd write some Java code to slurp them all into separate DOMs, then merge them into one DOM based on the additive/destructive logic you've described, then spit the merged DOM back out into the target file. There are several Java libraries that could be used for the reading and writing of the XML files, e.g. dom4j. You don't want to be messing with trying to parse or construct the XML yourself.
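The read, merge, and write steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a finished tool: it uses the DOM API that ships with the JDK (javax.xml and org.w3c.dom) rather than dom4j, the class name XmlPatcher and its method names are hypothetical, and it assumes each parent has at most one child element with a given name and that namespaces do not matter.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration; not part of any existing library.
class XmlPatcher {

    /** Parses the base and every patch, applies the patches in order, and returns the merged XML. */
    static String merge(String baseXml, String... patchXmls) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document base = builder.parse(new InputSource(new StringReader(baseXml)));
            for (String patchXml : patchXmls) {
                Document patch = builder.parse(new InputSource(new StringReader(patchXml)));
                patch(base.getDocumentElement(), patch.getDocumentElement());
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(base), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not merge XML documents", e);
        }
    }

    /** Merges the child elements of patchParent into target, matching children by tag name. */
    static void patch(Element target, Element patchParent) {
        for (Node n = patchParent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (!(n instanceof Element)) {
                continue; // skip whitespace, comments, etc.
            }
            Element patchChild = (Element) n;
            Element existing = firstChildElement(target, patchChild.getTagName());
            if (existing == null) {
                // Additive: copy the new element (and its subtree) into the base document
                target.appendChild(target.getOwnerDocument().importNode(patchChild, true));
            } else if (firstChildElement(patchChild, null) != null) {
                // Both sides have this element and the patch has children: descend
                patch(existing, patchChild);
            } else {
                // Destructive: overwrite the content (this also drops any child elements
                // the existing element may have had)
                existing.setTextContent(patchChild.getTextContent());
            }
        }
    }

    /** Returns the first child element with the given tag name, or the first child element of any name if name is null. */
    static Element firstChildElement(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && (name == null || name.equals(((Element) n).getTagName()))) {
                return (Element) n;
            }
        }
        return null;
    }
}
```

Calling XmlPatcher.merge(first, patch1, patch2) with well-formed versions of the three files from the question should produce the structure of patched.xml, apart from whitespace. Because importNode copies rather than moves nodes, the patch documents are left unmodified while they are being iterated.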

Andrew Swan