
I need to write a script that takes several XML files and performs some operations based on their contents. To make it easier to go through all the elements, I wanted to merge all the files into one XML tree (in memory only). I tried using the appendNode() method, but I ran into very strange behaviour. Here's a snippet that shows the problem:

import groovy.xml.XmlUtil

def slurper = new XmlSlurper()
def a = slurper.parseText("<a></a>")
def b = slurper.parseText("<b>foo</b>")
a.appendNode(b)

println XmlUtil.serialize(a)

a."**".each { println (it.name()) }

It outputs:

<?xml version="1.0" encoding="UTF-8"?><a>
  <b>foo</b>
</a>

a

The serialized XML is correct, but the iterator does not return <b>.

However, if I add this line after appending:

a = slurper.parseText(XmlUtil.serialize(a))

output looks like this:

<?xml version="1.0" encoding="UTF-8"?><a>
  <b>foo</b>
</a>

a
b

<b> is there as I expect it to be.

What am I missing here? Why does serializing and parsing again change the output? I'm new to Groovy, so I imagine it could be something obvious; please help me understand why this happens. Or maybe there is a better way to merge XML files?

Szymon Stepniak
xersiee

1 Answer


It happens because XmlSlurper.parseText(String text) returns a GPathResult, which is:

Base class for representing lazy evaluated GPath expressions.

And according to the Groovy XML processing documentation:

XmlSlurper evaluates the structure lazily. So if you update the xml you’ll have to evaluate the whole tree again.

That's why you have to re-evaluate the XML tree with

a = slurper.parseText(XmlUtil.serialize(a))

to get your expression working.

If you use XmlParser instead, it works without re-evaluating the XML tree, because XmlParser builds a tree of Node objects eagerly, e.g.

import groovy.xml.XmlUtil

XmlParser parser = new XmlParser()
def a = parser.parseText("<a></a>")
def b = parser.parseText("<b>foo</b>")

a.append(b)

println XmlUtil.serialize(a)

a."**".each { println (it.name()) }

Output:

<?xml version="1.0" encoding="UTF-8"?><a>
  <b>foo</b>
</a>

a
b
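
As for merging several documents, the same XmlParser approach can probably be extended with a synthetic root element. The following is only a minimal sketch under a few assumptions: the `documents` list stands in for the text of your files, the `<merged/>` root name is arbitrary, and the import targets Groovy 3 or later (on Groovy 2.x, drop that import, since XmlParser lives in the default-imported groovy.util package there).

```groovy
import groovy.xml.XmlParser   // Groovy 3+; on Groovy 2.x remove this line

// Hypothetical inputs standing in for the contents of several XML files
def documents = ['<b>foo</b>', '<c><d>bar</d></c>']

def parser = new XmlParser()

// '<merged/>' is an arbitrary synthetic root that holds every document
def merged = parser.parseText('<merged/>')

// Node.append attaches each parsed document as a child of the root;
// no re-parsing is needed because XmlParser builds the tree eagerly
documents.each { xml -> merged.append(parser.parseText(xml)) }

// depthFirst() can also yield text content in some cases, so keep only nodes
def names = merged.depthFirst().findAll { it instanceof Node }.collect { it.name() }

println names
```

For real files you could swap `parser.parseText(xml)` for `parser.parse(file)` while iterating over a list of File objects.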
Szymon Stepniak