0

I would like to find a given piece of text in an XML document that looks like this:

<s:Envelope xmlns:s="http://...">
<s:Body>
<About_ServiceResponse xmlns="http://...>
<About_ServiceResult xmlns:a="http://>
<a:businessServiceVersionStructureField> <a:BusinessServiceVersionStructureType>                                                        <a:businessServiceDBVersionNameField>V001</a:businessServiceDBVersionNameField>
<a:businessServiceVersionNameField>Some Service^V100</a:businessServiceVersionNameField>
           </a:BusinessServiceVersionStructureType>
        </a:businessServiceVersionStructureField>
     </About_ServiceResult>
  </About_ServiceResponse>
</s:Body>
</s:Envelope>

In this example, I would like to find the text "Some Service".

I have tried with XPath but could not get that to work. I have also tried with GPath, but all I could get there was all of the text concatenated into one long String.

How would you do this in GPath and/or XPath?

LarsH
Nyegaard
  • "I have tried with XPath but could not get that to work." What XPath expressions did you try, and what was the result? – LarsH Oct 01 '11 at 20:51

4 Answers

2

Try this XPath:

//*[contains(text(), 'Some Service')]

It returns all elements that contain a text node with Some Service.
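
For anyone who wants to try this expression from Groovy, here is a minimal sketch that evaluates it with the JDK's built-in javax.xml.xpath API (XPath 1.0). The namespace URIs http://foo, http://bar and http://baz are placeholders standing in for the elided ones in the question, and the variable names are only illustrative. Because the expression uses no prefixes, no namespace registration is needed.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.XPathConstants
import javax.xml.xpath.XPathFactory

// Sample document; the namespace URIs are placeholders (assumption).
def xml = '''<s:Envelope xmlns:s="http://foo">
  <s:Body>
    <About_ServiceResponse xmlns="http://bar">
      <About_ServiceResult xmlns:a="http://baz">
        <a:businessServiceVersionStructureField>
          <a:BusinessServiceVersionStructureType>
            <a:businessServiceDBVersionNameField>V001</a:businessServiceDBVersionNameField>
            <a:businessServiceVersionNameField>Some Service^V100</a:businessServiceVersionNameField>
          </a:BusinessServiceVersionStructureType>
        </a:businessServiceVersionStructureField>
      </About_ServiceResult>
    </About_ServiceResponse>
  </s:Body>
</s:Envelope>'''

// Parse into a namespace-aware DOM tree.
def factory = DocumentBuilderFactory.newInstance()
factory.namespaceAware = true
def doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml.getBytes('UTF-8')))

// Evaluate the expression and ask for a node set rather than a string.
def xpath = XPathFactory.newInstance().newXPath()
def nodes = xpath.evaluate("//*[contains(text(), 'Some Service')]", doc, XPathConstants.NODESET)

// Only the element that directly holds the text is selected.
assert nodes.length == 1
assert nodes.item(0).localName == 'businessServiceVersionNameField'
assert nodes.item(0).textContent == 'Some Service^V100'
```

Note that in XPath 1.0, contains(text(), ...) tests only the first text-node child of each element. That is enough here because the target element holds a single text node, while the first text node of every ancestor is just whitespace.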

Kirill Polishchuk
  • Okay, I tried your XPath in this online tool and it looks like it returns the correct nodes. But it also looks like the expression "//a:businessServiceVersionNameField[name()]" will do that. Is there any difference between the results of the two expressions? – Nyegaard Oct 02 '11 at 08:34
1

After registering the bindings of the prefixes to the corresponding namespaces, use:

  /*/s:Body
         /s:About_ServiceResponse
            /s:About_ServiceResult
               /a:businessServiceVersionStructureField
                  /a:BusinessServiceVersionStructureType
                      /a:businessServiceVersionNameField
                          /text()

When this XPath expression is evaluated against the following XML document (the provided one is severely malformed and I had to spend considerable time to make it well-formed):

<s:Envelope xmlns:s="http://...">
    <s:Body>
        <About_ServiceResponse xmlns="http://...">
            <About_ServiceResult xmlns:a="http://">
                <a:businessServiceVersionStructureField>
                    <a:BusinessServiceVersionStructureType>
                        <a:businessServiceDBVersionNameField>V001</a:businessServiceDBVersionNameField>
                        <a:businessServiceVersionNameField>Some Service^V100</a:businessServiceVersionNameField>
                    </a:BusinessServiceVersionStructureType>
                </a:businessServiceVersionStructureField>
            </About_ServiceResult>
        </About_ServiceResponse>
    </s:Body>
</s:Envelope>

Exactly the wanted text node is selected:

Some Service^V100

In case you want to select the element that is the parent of this text node, use:

  /*/s:Body
         /s:About_ServiceResponse
            /s:About_ServiceResult
               /a:businessServiceVersionStructureField
                  /a:BusinessServiceVersionStructureType
                      /a:businessServiceVersionNameField

XSLT-based verification:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:s="http://..." xmlns:a="http://">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:copy-of select=
  "/*/s:Body
         /s:About_ServiceResponse
            /s:About_ServiceResult
               /a:businessServiceVersionStructureField
                  /a:BusinessServiceVersionStructureType
                      /a:businessServiceVersionNameField
                          /text()
  "/>
  =======
  <xsl:copy-of select=
  "/*/s:Body
         /s:About_ServiceResponse
            /s:About_ServiceResult
               /a:businessServiceVersionStructureField
                  /a:BusinessServiceVersionStructureType
                      /a:businessServiceVersionNameField
  "/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied against the same XML document (above), the selected nodes are output (using "=======" as delimiter):

Some Service^V100
  =======
  <a:businessServiceVersionNameField xmlns:a="http://" xmlns="http://..." xmlns:s="http://...">Some Service^V100</a:businessServiceVersionNameField>
Dimitre Novatchev
  • Okay, but then the expression //a:businessServiceVersionNameField/text() would give me the text of all businessServiceVersionNameField elements, right? – Nyegaard Oct 02 '11 at 08:36
  • @Tatewaki: This expression *selects* all text-nodes any of which is a child of an `a:businessServiceVersionNameField` element. However, using the XPath `//` pseudo-operator most often results in gross inefficiency (slowness) as it causes the whole document tree (or a subtree, if the context node isn't the top node) to be searched. It is recommended to avoid using `//` whenever the structure of the document is statically known. – Dimitre Novatchev Oct 02 '11 at 14:47
1

Using Groovy with XmlSlurper/GPathResult

def xml = '''
<s:Envelope xmlns:s="http://foo">
  <s:Body>
    <About_ServiceResponse xmlns="http://bar">
      <About_ServiceResult xmlns:a="http://baz">
        <a:businessServiceVersionStructureField>
          <a:BusinessServiceVersionStructureType>
            <a:businessServiceDBVersionNameField>V001</a:businessServiceDBVersionNameField>
            <a:businessServiceVersionNameField>Some Service^V100</a:businessServiceVersionNameField>
          </a:BusinessServiceVersionStructureType>
        </a:businessServiceVersionStructureField>
      </About_ServiceResult>
    </About_ServiceResponse>
  </s:Body>
</s:Envelope>'''

def envelope = new XmlSlurper().parseText(xml)
envelope.declareNamespace(s:'http://foo', t:'http://bar', a:'http://baz')

assert 'Some Service^V100' == envelope.'s:Body'.
                                       't:About_ServiceResponse'.
                                       't:About_ServiceResult'.
                                       'a:businessServiceVersionStructureField'.
                                       'a:BusinessServiceVersionStructureType'.
                                       'a:businessServiceVersionNameField'.text()

assert 'Some Service^V100' == envelope.'Body'.
                                       'About_ServiceResponse'.
                                       'About_ServiceResult'.
                                       'businessServiceVersionStructureField'.
                                       'BusinessServiceVersionStructureType'.
                                       'businessServiceVersionNameField'.text()

Since the element names in your sample are unique, it can be done with or without registering the namespaces.
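
If the goal is to locate the element by its text rather than by its path, a depth-first GPath search is one option. The sketch below is a rough illustration: the URIs are the same placeholders as above, and the import assumes Groovy 3 or later (on older versions, drop the import and the bare XmlSlurper name resolves on its own). The leaf check matters because text() on a parent returns the text of all its descendants joined together, which is the "one long String" effect described in the question.

```groovy
import groovy.xml.XmlSlurper

// Sample document; the namespace URIs are placeholders (assumption).
def xml = '''<s:Envelope xmlns:s="http://foo">
  <s:Body>
    <About_ServiceResponse xmlns="http://bar">
      <About_ServiceResult xmlns:a="http://baz">
        <a:businessServiceVersionStructureField>
          <a:BusinessServiceVersionStructureType>
            <a:businessServiceDBVersionNameField>V001</a:businessServiceDBVersionNameField>
            <a:businessServiceVersionNameField>Some Service^V100</a:businessServiceVersionNameField>
          </a:BusinessServiceVersionStructureType>
        </a:businessServiceVersionStructureField>
      </About_ServiceResult>
    </About_ServiceResponse>
  </s:Body>
</s:Envelope>'''

def envelope = new XmlSlurper().parseText(xml)

// '**' walks the tree depth-first. Restricting the match to leaf elements
// (no child elements) skips the ancestors, whose text() also contains the
// search string because it includes the text of every descendant.
def hit = envelope.'**'.find { node ->
    node.children().size() == 0 && node.text().contains('Some Service')
}

assert hit.name() == 'businessServiceVersionNameField'
assert hit.text() == 'Some Service^V100'
// Split on the caret to keep only the service name.
assert hit.text().tokenize('^').first() == 'Some Service'
```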

John Wagenleitner
0

Using Groovy XmlSlurper.

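// Bind one prefix per namespace used in the document (URIs elided as in the question).
def xml = new XmlSlurper().parseText(yourXml).declareNamespace(ns1: 'http://..', ns2: 'http://..', ns3: 'http://..')
// Each step names one level of the tree, down to the element that holds the text.
def theText = xml.'ns1:Body'.'ns2:About_ServiceResponse'.'ns2:About_ServiceResult'.'ns3:businessServiceVersionStructureField'.'ns3:BusinessServiceVersionStructureType'.'ns3:businessServiceVersionNameField'.text()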
Abe