Despite the title, I'm asking a general question here:

Is there any way to debug an XPath statement, or perhaps translate it to English (i.e., something similar to this fantastic regex tool)?

Case in point: I've got this XPath query:

//transaction[@sumPrice != sum(id(@products)/@price)]

Suppose I'm a newb at XPath (I am). I want to break it down and get some output that will help me understand each component. However, if I pull out a certain part, say id(@products), I don't know where to put it in order to get some sort of feedback. It seems you can't break it up. How can I break it down and analyze it?

As a bonus if you can, what does my query say?

CodyBugstein
1 Answer

I don't know of a tool but a few basic concepts will help you here.

  • //transaction will select all elements named transaction in no namespace, located anywhere in your source document.
  • The [...] expression is a predicate. It is evaluated once for each element selected by the path step it is attached to, and it filters the list down to only those nodes for which the predicate is true.

To break such an expression down for debugging, you'd have to start from the left-hand end and add one step at a time, inspecting the results as you go. For a predicate, you'd have to manually iterate over the results of the expression that the predicate is attached to, and then evaluate (parts of) the predicate expression with the context node set to each node in that list in turn. Exactly how you do that depends on the XPath tool or library you are using.
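
As a rough illustration of that procedure, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The choice of tool and the sample document are assumptions made for illustration only. ElementTree implements just a subset of XPath (no id() or sum()), so the sketch only shows the "one step at a time, one context node at a time" idea, not a full evaluation.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
XML = """
<root>
  <transactions>
    <transaction sumPrice="5" products="prod1 prod2" />
    <transaction sumPrice="10" products="prod2 prod3" />
  </transactions>
</root>
"""

root = ET.fromstring(XML)

# Step 1: evaluate the path on its own, without the predicate.
# ".//transaction" is ElementTree's spelling of "//transaction" from the root.
transactions = root.findall(".//transaction")
print("step 1 matched", len(transactions), "nodes")


def inspect(context):
    """Evaluate the attribute references of the predicate against one context node."""
    return {
        "@sumPrice": context.get("sumPrice"),
        "@products": context.get("products"),
    }


# Step 2: treat each matched node in turn as the context node and look at
# the pieces of the predicate separately.
for node in transactions:
    print(inspect(node))
```

With an engine that supports full XPath 1.0, the same loop could evaluate sub-expressions such as id(@products) relative to each transaction node instead of reading attributes by hand.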


So in this specific case we're selecting all transactions whose sumPrice value is different from the value of

sum(id(@products) /@price)

id(@products) takes the value of the products attribute on the transaction being tested, splits it at whitespace into a series of tokens, then looks up the set of elements in the document that have an ID attribute whose value matches any of those tokens. (An ID attribute here means an attribute of type ID according to the document's DTD; it is not necessarily called id.) Finally, /@price gives you the price attribute of each of those elements, and sum totals them. For example, given this XML:

<!DOCTYPE root [
  <!-- rest of DTD omitted -->
  <!ATTLIST product ident ID #REQUIRED>
]>
<root>
  <transactions>
    <transaction sumPrice="5" products="prod1 prod2" />
    <transaction sumPrice="10" products="prod2 prod3" />
  </transactions>
  <product ident="prod1" price="3" />
  <product ident="prod2" price="2" />
  <product ident="prod3" price="4" />
</root>

The expression

//transaction[@sumPrice != sum(id(@products)/@price)]

would select the second transaction (as the sum of the prod2 and prod3 prices is not 10) but would not select the first (because prod1 + prod2 gives 5).
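
To check that arithmetic by hand, the predicate can be emulated in plain Python. This is only a sketch under stated assumptions: the standard library's ElementTree does not support id() or sum(), so the helpers id_lookup and price_sum below are hypothetical stand-ins, and the DOCTYPE is left out, with the ident attribute treated as the ID attribute by hand.

```python
import xml.etree.ElementTree as ET

# Same data as the example document, minus the DOCTYPE.
XML = """
<root>
  <transactions>
    <transaction sumPrice="5" products="prod1 prod2" />
    <transaction sumPrice="10" products="prod2 prod3" />
  </transactions>
  <product ident="prod1" price="3" />
  <product ident="prod2" price="2" />
  <product ident="prod3" price="4" />
</root>
"""

root = ET.fromstring(XML)

# Stand-in for the DTD: map each ID value to its element.
# Assumes `ident` is the ID-typed attribute.
by_id = {el.get("ident"): el for el in root.iter() if el.get("ident") is not None}


def id_lookup(idrefs):
    """Emulate id(): split on whitespace, return each matching element once."""
    found = []
    for token in idrefs.split():
        el = by_id.get(token)
        if el is not None and el not in found:
            found.append(el)
    return found


def price_sum(transaction):
    """Emulate sum(id(@products)/@price) for one context node."""
    nodes = id_lookup(transaction.get("products"))
    return sum(float(n.get("price")) for n in nodes if n.get("price") is not None)


# Emulate the predicate: keep transactions whose sumPrice differs from the sum.
selected = [
    t for t in root.iter("transaction")
    if float(t.get("sumPrice")) != price_sum(t)
]

for t in selected:
    print(t.get("products"), t.get("sumPrice"), price_sum(t))
```

Only the second transaction ends up in selected, since 2 + 4 is 6 rather than 10.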

Ian Roberts
  • Thanks for explanation, it makes sense, but in practice, it's not working like that... when I run this query on the XML you provided, it returns everything – CodyBugstein Jan 15 '14 at 16:33
  • @Imray that suggests the DTD isn't being processed properly, so the `ident` attributes aren't recognised as being IDs. This would mean the `id` function always returns an empty node set, whose sum is zero. You probably need a complete DTD rather than just the fragment I've given, and to make sure you turn on DTD validation in your parser. – Ian Roberts Jan 15 '14 at 16:35
  • That doesn't seem to be the issue, I have a working DTD and XML file, that validates – CodyBugstein Jan 15 '14 at 16:55
  • @Imray if you are definitely using a validating parser to load the document for the XPath engine then I agree, it should work (it's not enough that the document happens to be valid, it must be parsed with DTD validation enabled when you build the DOM tree that you'll be using for the XPaths). – Ian Roberts Jan 15 '14 at 17:55
  • Hmmm... I don't know about that. I'm using Notepad++ with XPatherizerNPP – CodyBugstein Jan 15 '14 at 18:03
  • @Imray looking at the code it appears to use the .NET XmlDocument.LoadXml method, and [this answer](http://stackoverflow.com/a/19127375/592139) suggests that doesn't do DTD validation. – Ian Roberts Jan 15 '14 at 23:14
  • thanks- you were correct it works now! Interestingly, it doesn't work if there are two of the same products in an order... – CodyBugstein Jan 15 '14 at 23:50
  • @Imray the `id` function is a union - it gives you the set of nodes that match any of the ids but each node appears in the set only once even if it was "referenced" multiple times. You'd need to change your representation to use a sub-element of `transaction` for each product and add a "quantity" attribute, but you can't validate that sum-of-products type calculation in pure XPath 1.0 (you could in 2.0). – Ian Roberts Jan 16 '14 at 10:31
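
The two behaviours discussed in these comments (an empty result when the DTD is not processed, and repeated references collapsing into one node) can be mimicked in plain Python. This is a sketch, not how any XPath engine is implemented: price_sum and its id_attribute parameter are hypothetical, with id_attribute=None standing in for a parser that never learned which attribute is of type ID.

```python
import xml.etree.ElementTree as ET

# Hypothetical order that references the same product twice.
XML = """
<root>
  <transaction sumPrice="6" products="prod1 prod1" />
  <product ident="prod1" price="3" />
</root>
"""

root = ET.fromstring(XML)


def price_sum(transaction, id_attribute):
    """Emulate sum(id(@products)/@price) for one transaction."""
    by_id = {}
    if id_attribute is not None:
        by_id = {
            el.get(id_attribute): el
            for el in root.iter()
            if el.get(id_attribute) is not None
        }
    # A set of tokens mirrors node-set semantics: each element appears once,
    # no matter how many times it is referenced.
    tokens = set(transaction.get("products").split())
    nodes = [by_id[token] for token in tokens if token in by_id]
    return sum(float(n.get("price")) for n in nodes)


t = root.find("transaction")
with_ids = price_sum(t, "ident")   # prod1 is counted once
without_ids = price_sum(t, None)   # nothing is found, so the sum is zero

print(with_ids, without_ids)
```

In the first case the sum is 3 rather than 6, so a transaction with sumPrice="6" would be reported as a mismatch. In the second case the sum is 0, so every transaction with a non-zero sumPrice would be selected.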