How to print the path for current XML node in Groovy?

Question

I am iterating through an XML file and want to print the GPath for each node that has a value. I spent a day reading the Groovy API docs and experimenting, but something that seems simple does not appear to be implemented in any obvious way.

Here is some code showing the different things you can get from a NodeChild.

    import groovy.util.XmlSlurper

    def myXmlString = '''
    <transaction>
        <payment>
            <txID>68246894</txID>
            <customerName>Huey</customerName>
            <accountNo type="Current">15778047</accountNo>
            <txAmount>899</txAmount>
        </payment>
        <receipt>
            <txID>68246895</txID>
            <customerName>Dewey</customerName>
            <accountNo type="Current">16288</accountNo>
            <txAmount>120</txAmount>
        </receipt>
        <payment>
            <txID>68246896</txID>
            <customerName>Louie</customerName>
            <accountNo type="Savings">89257067</accountNo>
            <txAmount>210</txAmount>
        </payment>
        <payment>
            <txID>68246897</txID>
            <customerName>Dewey</customerName>
            <accountNo type="Cheque">123321</accountNo>
            <txAmount>500</txAmount>
        </payment>
    </transaction>
    '''

    def transaction = new XmlSlurper().parseText(myXmlString)

    def nodes = transaction.'*'.depthFirst().findAll { it.name() != '' }

    nodes.each { node -> 
        println node
        println node.getClass()
        println node.text()
        println node.name()
        println node.parent()
        println node.children()
        println node.innerText
        println node.GPath
        println node.getProperties()
        println node.attributes()
        node.iterator().each { println "${it.name()} : ${it}" }
        println node.namespaceURI()
        println node.getProperties().get('body').toString()
        println node.getBody()[0].toString()
        println node.attributes()
    }

I found the post "groovy Print path and value of elements in xml", which came close to what I need, but it doesn't scale for deep nodes (see output below).

Example code from the linked post:

    transaction.'**'.inject([]) { acc, val -> 
        def localText = val.localText() 
        acc << val.name()

        if( localText ) {
            println "${acc.join('/')} : ${localText.join(',')}"
            acc = acc.dropRight(1) // or acc = acc[0..-2]
        }
        acc
    }

Output of the example code:

    transaction/payment/txID : 68246894
    transaction/payment/customerName : Huey
    transaction/payment/accountNo : 15778047
    transaction/payment/txAmount : 899
    transaction/payment/receipt/txID : 68246895
    transaction/payment/receipt/customerName : Dewey
    transaction/payment/receipt/accountNo : 16288
    transaction/payment/receipt/txAmount : 120
    transaction/payment/receipt/payment/txID : 68246896
    transaction/payment/receipt/payment/customerName : Louie
    transaction/payment/receipt/payment/accountNo : 89257067
    transaction/payment/receipt/payment/txAmount : 210
    transaction/payment/receipt/payment/payment/txID : 68246897
    transaction/payment/receipt/payment/payment/customerName : Dewey
    transaction/payment/receipt/payment/payment/accountNo : 123321
    transaction/payment/receipt/payment/payment/txAmount : 500

Besides getting help with this, I also want to understand why there isn't a simple method like node.path or node.gpath that returns the absolute path to a node.

score 1 · Accepted Answer · answered Apr 12 '19 at 09:40

You could do this sort of thing:

    import groovy.util.XmlSlurper
    import groovy.util.slurpersupport.GPathResult

    def transaction = new XmlSlurper().parseText(myXmlString)

    def leaves = transaction.depthFirst().findAll { it.children().size() == 0 }

    def path(GPathResult node) {
        def result = [node.name()]
        def pathWalker = [hasNext: { -> node.parent() != node }, next: { -> node = node.parent() }] as Iterator
        (result + pathWalker.collect { it.name() }).reverse().join('/')
    }

    leaves.each { node ->
        println "${path(node)} = ${node.text()}"
    }

Which gives the output:

    transaction/payment/txID = 68246894
    transaction/payment/customerName = Huey
    transaction/payment/accountNo = 15778047
    transaction/payment/txAmount = 899
    transaction/receipt/txID = 68246895
    transaction/receipt/customerName = Dewey
    transaction/receipt/accountNo = 16288
    transaction/receipt/txAmount = 120
    transaction/payment/txID = 68246896
    transaction/payment/customerName = Louie
    transaction/payment/accountNo = 89257067
    transaction/payment/txAmount = 210
    transaction/payment/txID = 68246897
    transaction/payment/customerName = Dewey
    transaction/payment/accountNo = 123321
    transaction/payment/txAmount = 500

I'm not sure that's what you want though, as you don't say why it "doesn't scale for deep nodes".

Thank you Tim, this seems to work, I will study it to learn how you did it :-) What I meant with the other solution not scaling is that some elements repeat in the output and they criss-cross. Look at this: transaction/payment/receipt/payment/payment/txID — ou_ryperd, Apr 12 '19 at 09:44
It works by walking back from the leaf node to the root (where parent == node) and collecting the node names, then it reverses the list and sticks them together into a string separated by `/` :-) — tim_yates, Apr 12 '19 at 11:33
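
The walk described in that comment can also be written as a plain loop. This is only a minimal sketch, not the answerer's code: `pathOf` is a hypothetical helper name, the tiny sample XML is made up for illustration, and the imports assume the `groovy.util` package locations used in the answer (newer Groovy versions moved these classes to `groovy.xml`). It tests for the root with `is()` (object identity) rather than `!=`, on the assumption that the root's `parent()` returns the root object itself.

```groovy
import groovy.util.XmlSlurper
import groovy.util.slurpersupport.GPathResult

// Hypothetical helper (not part of the Groovy API): builds a slash-separated
// path by walking from the given node up to the root.
String pathOf(GPathResult node) {
    def names = [node.name()]
    // Assumption: only the root node returns itself from parent(),
    // so an identity check tells us when to stop climbing.
    while (!node.parent().is(node)) {
        node = node.parent()
        names << node.name()
    }
    // Names were collected leaf-first, so reverse them before joining.
    names.reverse().join('/')
}

// Made-up sample document, just to exercise the helper.
def doc = new XmlSlurper().parseText('<a><b><c>1</c></b><d>2</d></a>')

// Leaves are the nodes without child elements.
def leaves = doc.depthFirst().findAll { it.children().size() == 0 }

leaves.each { leaf ->
    println "${pathOf(leaf)} = ${leaf.text()}"
}
```

The identity check is a deliberate variation: it avoids relying on how `!=` compares two GPathResult objects, which may matter when an element has a single child with the same text content.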
