I have a long local XML file like this:

<root>
  <div>this</div>
  <div>is</div>
  <div>a</div>
  ...
  <div>very</div>
  <div>long</div>
  <div>list</div>
</root>

I need to know:

  1. the total number of those div elements
  2. the position of the element "very"

I know that I can do that by running these two XPath 2.0 queries:

1.  count(/root/div)
2.  index-of(/root/div,"very")

but the XML file is long, so I would rather not have the XML parser pass through the whole file twice.

Is there a faster combination that returns, for example, an array with both results while passing through the file only once?

Imbuter

3 Answers

What makes you think you have to parse the XML document twice in order to run two XPath queries? I don't know what API you are using to run the XPath, but I don't know of an API that has that restriction.
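
A minimal sketch of that point, assuming Python's standard `xml.etree.ElementTree` as the API (an assumption; the question does not name one) and a shortened, hypothetical `SAMPLE` document. The text is parsed once, and both lookups then run against the same in-memory tree:

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the long file in the question (hypothetical data).
SAMPLE = (
    "<root><div>this</div><div>is</div><div>a</div>"
    "<div>very</div><div>long</div><div>list</div></root>"
)

root = ET.fromstring(SAMPLE)  # the document is parsed exactly once

# Query 1: how many div children does the root element have?
total = len(root.findall("div"))

# Query 2: 1-based position of the first div whose text is "very".
texts = [d.text for d in root.findall("div")]
position = texts.index("very") + 1 if "very" in texts else None

print(total, position)  # 6 4
```

ElementTree only implements a subset of XPath 1.0, so `count()` and `index-of()` are replaced here by plain Python; an XPath 2.0 processor would accept the original expressions against the parsed tree in the same way.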

Michael Kay
  • My logic tells me that if I run a command-line XPath query twice over a local XML file on my hard disk, the file will be read and parsed twice, with the only advantage being disk caching the second time. I don't even know yet whether there's a way to run two queries in a single command-line launch, because I have to solve a bigger problem first: getting a (simpler) command-line query to work. For example, with Saxon I even asked for help here http://stackoverflow.com/questions/17013688/how-to-extract-an-xpath-from-an-html-page-with-saxon-pe-commandline but without luck so far... – Imbuter Jun 11 '13 at 09:01
  • Correct, you don't want to do this from the command line. Take a look at something like xmlsh. – Michael Kay Jun 11 '13 at 17:22
  • I need to call it from a proprietary scripting language, so a single-line command would be better. According to this page http://www.xmlsh.org/XPathExtension xmlsh only supports "Version 0.1.0.0 an XPath", so the function "index-of" used in my example is not supported. Correct? – Imbuter Jun 12 '13 at 09:01
  • I can't help you with xmlsh, but I think you are misreading the spec. You don't need XPath extensions, and that version number is an xmlsh version number, not an XPath version number. – Michael Kay Jun 12 '13 at 11:50
  • Ah, you're right about the xmlsh version. However, xmlsh seems too unpopular a tool to be worth learning. What about the XQuery that I just posted? http://stackoverflow.com/a/17116961/1292671 – Imbuter Jun 14 '13 at 20:45
  • You can certainly achieve a lot more with XQuery than with XPath, but eventually, doing this from the command line is going to run out of steam. It's hard to believe that's the right long-term approach. – Michael Kay Jun 14 '13 at 22:58
Short answer: No, or at least I doubt it.

Long answer: It can be done in a single pass, but only by whatever is calling the XPath. In pseudo code:

FIND ALL DIVS USING XPath(/root/div), ASSIGN TO x
VAR i = 0
VAR v = -1
FOR EACH e IN x
    INCREMENT COUNTER i
    IF TEXT OF e EQUALS 'very' AND v EQUALS -1
        v = i
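
The loop above still relies on the XPath call having collected every div first. As a hedged sketch of a streaming variant, assuming Python and its standard `xml.etree.ElementTree.iterparse` (neither is named in the question; `count_and_find` and `SAMPLE` are hypothetical names), the counter and the position can be updated while the file is being read, so the document is traversed once:

```python
import io
import xml.etree.ElementTree as ET

def count_and_find(source, target="very"):
    """Return (number of /root/div elements, 1-based position of the
    first one whose text equals target, or None) in a single pass."""
    total = 0
    position = None
    depth = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            depth += 1
            continue
        # "end" event: the element and its text are now complete.
        if depth == 2 and elem.tag == "div":  # direct child of the root
            total += 1
            if position is None and elem.text == target:
                position = total
            elem.clear()  # drop the text and children we no longer need
        depth -= 1
    return total, position

# Shortened stand-in for the long file in the question (hypothetical data).
SAMPLE = (
    b"<root><div>this</div><div>is</div><div>a</div>"
    b"<div>very</div><div>long</div><div>list</div></root>"
)

print(count_and_find(io.BytesIO(SAMPLE)))  # (6, 4)
```

`iterparse` also accepts a file name, so `count_and_find("file.xml")` would work on a local file; the `BytesIO` wrapper is only there to keep the example self-contained.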
In XQuery:

let $div := /root/div
return
<result>
<cnt>{count($div)}</cnt>
<v-index>{index-of($div, 'very')}</v-index>
</result>
Imbuter