
I want to query a XOM document for nodes that contain a certain value, matched case-insensitively. Something like this:

doc.query('/root/book[contains(.,"case-insentive-string")]')

But contains() is case-sensitive.

  1. I tried to use regexes, but they are XPath 2.0 only, and XOM does not seem to support it.
  2. I tried contains(translate(."ABCEDF...","abcdef..."),"case-insentive-string")]', but that failed too.
  3. I tried to match subnodes and read the parent's attributes using getParent, but there is no method to read a parent's attributes.

Any suggestions?

millebii

2 Answers


2. I tried contains(translate(."ABCEDF...","abcdef..."),"case-insentive-string")]' failed too.

The proper way to write this is:

/root/book[contains(translate(., $vUpper, $vLower),
                    translate($vCaseInsentiveString, $vUpper, $vLower)
                    )
          ]

where $vUpper and $vLower are defined as (or should be replaced with) the strings:

'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

and

'abcdefghijklmnopqrstuvwxyz'

and $vCaseInsentiveString is defined as (or should be replaced with) the specific string to be matched case-insensitively.

For example, given the following XML document:

<authors>
  <author>
    <name>Victor Hugo &amp; Co.</name>
    <nationality>French</nationality>
  </author>
  <author period="classical" category="children">
    <name>J.K.Rollings</name>
    <nationality>British</nationality>
  </author>
  <author period="classical">
    <name>Sophocles</name>
    <nationality>Greek</nationality>
  </author>
  <author>
    <name>Leo Tolstoy</name>
    <nationality>Russian</nationality>
  </author>
  <author>
    <name>Alexander Pushkin</name>
    <nationality>Russian</nationality>
  </author>
  <author period="classical">
    <name>Plato</name>
    <nationality>Greek</nationality>
  </author>
</authors>

the following XPath expression (replace the variables with the corresponding strings):

   /*/author/name
              [contains(translate(., $vUpper, $vLower),
                        translate('lEo', $vUpper, $vLower)
                        )
              ]

selects this element:

<name>Leo Tolstoy</name>

Explanation: Both arguments of the contains() function are converted to lower-case, and then the comparison is performed.
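
As a minimal sketch of the substitution described above, the Java below builds the expression with the two alphabet strings pasted in and runs it against a shortened copy of the sample document. It uses the JDK's built-in XPath 1.0 engine (javax.xml.xpath) only so that the example is self-contained. That is an assumption of this sketch, not what the question uses. With XOM, the same expression string would presumably be passed to doc.query(...) instead. The names CaseInsensitiveXPath, buildQuery, find and SAMPLE are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CaseInsensitiveXPath {
    // The strings that stand in for $vUpper and $vLower.
    static final String UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static final String LOWER = "abcdefghijklmnopqrstuvwxyz";

    // Shortened version of the sample document from the answer.
    static final String SAMPLE =
        "<authors>"
        + "<author><name>Victor Hugo &amp; Co.</name><nationality>French</nationality></author>"
        + "<author><name>Leo Tolstoy</name><nationality>Russian</nationality></author>"
        + "<author period=\"classical\"><name>Plato</name><nationality>Greek</nationality></author>"
        + "</authors>";

    // Builds the XPath 1.0 expression by textual substitution of the variables.
    // Simplification: the needle must not contain a single quote, because
    // XPath 1.0 string literals have no escape mechanism.
    static String buildQuery(String needle) {
        if (needle.indexOf('\'') >= 0) {
            throw new IllegalArgumentException("needle must not contain a single quote");
        }
        return "/*/author/name[contains(translate(., '" + UPPER + "', '" + LOWER + "'), "
             + "translate('" + needle + "', '" + UPPER + "', '" + LOWER + "'))]";
    }

    // Returns the text of every <name> element that contains the needle,
    // ignoring ASCII letter case.
    static List<String> find(String xml, String needle) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            buildQuery(needle), new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getTextContent());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(find(SAMPLE, "lEo")); // prints [Leo Tolstoy]
    }
}
```

Note that translate() with these two strings only folds the 26 ASCII letters, so accented or non-Latin characters are still compared case-sensitively.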

Dimitre Novatchev

If you are using XOM, you can use Saxon to run XPath or XQuery against it. That gives you the much larger function library of XPath 2.0, which includes lower-case() and upper-case(). It also lets you choose your own collations (though in a somewhat product-specific way) for functions such as contains(), which means you can, for example, do matching that ignores accents as well as case.

Michael Kay
  • I do use Saxon 8, which also has regexes; maybe something is wrong with my config. I will check again. – millebii Jan 29 '11 at 19:19
  • Indeed, I get the following error: `Caused by: org.jaxen.UnresolvableException: No Such Function lower-case`. How do you get Saxon called? – millebii Jan 29 '11 at 19:29