1

I want to parse XPath expressions. I am looking for a lexer/parser implementation in C++ or Python.

Here is all the information about XPath parsers I have managed to gather:

Does anyone know of other implementations, particularly in C++?

PS: I don't want to evaluate XPath expressions, only to tokenise them.

Patrick Marty
  • Do you want a solution that lexes/parses XPath expressions, or one that evaluates them with respect to a specific XML document? – robert Jun 29 '11 at 01:07
  • @robert: I don't want to evaluate XPath expressions, just tokenise them. So yes, I want a solution that lexes/parses XPath expressions. – Patrick Marty Jun 29 '11 at 10:03
  • First you say "just tokenise", then you immediately say "lexes/parses". Those are very different requests. It isn't clear to me that you have a clear understanding of your own requirements. What exactly do you want to do with the XPath information, and why? – Ira Baxter Jun 29 '11 at 13:28
  • @IraBaxter: you are right, I was not clear enough. I don't want to evaluate XPath expressions on XML documents. I want to represent an XPath expression as an abstract syntax tree (like the tree representation we use for arithmetic expressions) in order to manipulate it. I could use the XPath grammar and write my own parser, but before reinventing the wheel I want to know whether somebody has already done the job :) I am working on XPath query rewriting and equivalence. – Patrick Marty Jul 01 '11 at 08:52
  • @Patrick: if you want to do "rewriting" you need more than tokenization. You need a full parser that builds ASTs. Ideally you would also get other machinery to help you do the rewriting, otherwise you'll spend a lot of work building *that*, too. – Ira Baxter Jul 01 '11 at 18:23

3 Answers

2

Based on a comment by OP,

I am working on XPath queries rewriting and equivalence

what he needs is a parser that builds abstract syntax trees, plus ways to analyze and transform those trees. Analysis and "rewriting" can then be done procedurally by walking and modifying the AST; this is the traditional way to do it.
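To make the traditional approach concrete, here is a minimal Python sketch, not a complete XPath parser. It assumes a small subset of XPath 1.0 (location paths with name tests, `*`, `@`, `.`, `..`, `/`, `//` and explicit `axis::` prefixes; no predicates, functions or `node()` in the input). The names `tokenize`, `parse`, `Step`, `Path`, `drop_self_steps` and `unparse` are hypothetical and come from no existing library.

```python
import re
from dataclasses import dataclass
from typing import List

# Token patterns for the assumed subset; longer operators are listed first.
TOKEN_RE = re.compile(r"""
    (?P<WS>\s+) | (?P<DSLASH>//) | (?P<SLASH>/) | (?P<DCOLON>::) |
    (?P<DOTDOT>\.\.) | (?P<DOT>\.) | (?P<AT>@) | (?P<STAR>\*) |
    (?P<NAME>[A-Za-z_][\w.-]*)
""", re.VERBOSE)

def tokenize(expr):
    """Split expr into (kind, text) pairs, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(expr):
        m = TOKEN_RE.match(expr, pos)
        if m is None:
            raise ValueError(f"unexpected character {expr[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

@dataclass
class Step:            # one location step, always stored in unabbreviated form
    axis: str
    test: str

@dataclass
class Path:            # the AST root: a location path
    absolute: bool
    steps: List[Step]

def parse(expr):
    """Recursive-descent parse of the subset into a Path, expanding abbreviations."""
    toks, pos = tokenize(expr), 0

    def peek():
        return toks[pos][0] if pos < len(toks) else None

    def take(kind):
        nonlocal pos
        if peek() != kind:
            raise ValueError(f"expected {kind}, got {peek()}")
        pos += 1
        return toks[pos - 1][1]

    def step():
        if peek() == "DOT":
            take("DOT")
            return Step("self", "node()")
        if peek() == "DOTDOT":
            take("DOTDOT")
            return Step("parent", "node()")
        axis = "child"                      # default axis
        if peek() == "AT":
            take("AT")
            axis = "attribute"
        elif peek() == "NAME" and pos + 1 < len(toks) and toks[pos + 1][0] == "DCOLON":
            axis = take("NAME")
            take("DCOLON")
        test = take("STAR") if peek() == "STAR" else take("NAME")
        return Step(axis, test)

    absolute = peek() in ("SLASH", "DSLASH")
    steps = []
    if peek() == "SLASH":
        take("SLASH")
        if peek() is None:
            return Path(True, [])           # the bare root path "/"
        steps.append(step())
    elif peek() != "DSLASH":
        steps.append(step())
    while peek() is not None:
        if peek() == "DSLASH":              # "//" abbreviates an extra step
            take("DSLASH")
            steps.append(Step("descendant-or-self", "node()"))
        else:
            take("SLASH")
        steps.append(step())
    return Path(absolute, steps)

def drop_self_steps(path):
    """Example rewrite: remove self::node() steps, which select the context node."""
    kept = [s for s in path.steps if (s.axis, s.test) != ("self", "node()")]
    if not kept and not path.absolute:
        kept = [Step("self", "node()")]     # keep a lone "." so the path stays non-empty
    return Path(path.absolute, kept)

def unparse(path):
    """Print the AST back in unabbreviated syntax."""
    body = "/".join(f"{s.axis}::{s.test}" for s in path.steps)
    return ("/" if path.absolute else "") + body
```

For example, `unparse(drop_self_steps(parse("a/./b")))` yields `child::a/child::b`. Every further analysis or equivalence rule would need its own hand-written function over `Path` and `Step`, which is the cost of the procedural approach.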

But it seems that the focus should be OP's goals. For that he needs analysis and rewriting. But that doesn't have to be entirely done in the traditional, procedural way. Rather, it would be nice if the analysis/rewrites can be done directly using XPath notation.

I suggest he look at our DMS Software Reengineering Toolkit, which parses and builds ASTs, and in particular enables "rewriting" on the ASTs using the surface syntax. XPath "rewrites" could then be written directly as equivalences over XPath expressions. A motivating example of how this works can be seen in "Rewriting algebra equations using DMS". It should be obvious from that example that a grammar for XPath is easily defined.

Ira Baxter
0

Check out XQilla: http://xqilla.sourceforge.net/HomePage

Steve
  • I agree that there is an XPath parser/lexer in an XQuery/XSLT engine like XQilla, but I only need the parser, not the whole engine. – Patrick Marty Jun 29 '11 at 12:13
0

Xerces has an offshoot, xalan-c, for doing this:

http://xml.apache.org/xalan-c/overview.html

Martin York
  • I agree that there is an XPath parser/lexer in an XQuery/XSLT engine like Xerces, but I only need the parser, not the whole engine. I want an easy API to tokenise an XPath query into a sequence of location steps. – Patrick Marty Jun 29 '11 at 12:17