Time and memory efficient java XPath parser

Question

What I need is a Java XPath implementation that is more intuitive to use than VTD-XML and comparable to it in memory and time efficiency. I also need it to support nested XPath expressions for some additional performance gains.

In my current project I evaluate a lot of XPath expressions with VTD-XML, which is really fast and memory efficient, but difficult to learn and convoluted in its syntax.

I have already looked at XOM and Xalan. Xalan performs poorly compared to VTD-XML. XOM, on the other hand, is a good one, but as far as I know it lacks support for nested XPath expressions. By nested expressions I mean the ability to run an XPath search from some position in the document rather than always from the beginning.

Thanks for any answers.

Welcome to Stack Overflow. Unfortunately, questions asking people to recommend a tool or library are off-topic here. — Mathias Müller, Jan 13 '16 at 10:57
Also you might want to explain what exactly you consider "to perform nested XPath expressions". — Martin Honnen, Jan 13 '16 at 11:21
Hello, sorry about the off-topic question, I didn't know about this rule. In the past I've found lots of topics with recommended tools and libraries here, so I didn't know this was a problem. — Edmund K, Jan 13 '16 at 11:34
XOM exposes a `query` method on `Node` so you should be able to select one node with e.g. `Node foo = doc.query("//foo").get(0);` and then write a selection relative to that `foo` node with e.g. `Node bar = foo.query("bar").get(0);`. — Martin Honnen, Jan 13 '16 at 11:54
Ah, thanks a lot, that was exactly what I needed. I was trying that approach, but I wrote `foo.query("/bar").get(0);` and it didn't work. The additional slash before the expression was unnecessary. — Edmund K, Jan 13 '16 at 12:11
Which part of the VTD-XML syntax is hard to learn? I am all ears. — vtd-xml-author, Jan 13 '16 at 20:04
To be honest, I got this task from my supervisor and that is just the information he gave me. I had never used VTD-XML or any other XML parser before, so I'm really in no position to criticize this library myself. My job was to search for a possible replacement for VTD-XML: a parser that can process a lot of XPath expressions fast and be memory efficient. I have only used some basic syntax thus far, and apart from class names that are non-descriptive (for me), it looks OK compared to other parsers (Xalan, XOM). Performance-wise, it is still the best I've tested thus far. — Edmund K, Jan 14 '16 at 07:09

vtd-xml-author · Accepted Answer · 2016-01-16T02:45:50.250

I do not think you will easily find a replacement for VTD-XML that offers both fast XPath and memory savings. The fundamental reason is that every small object allocation (think element nodes, strings, attributes, etc.) incurs a little memory overhead, and those overheads accumulate during the construction of a DOM tree, leading to the significant memory overhead observed in object-based XML modeling APIs such as DOM.

As VTD-XML's underlying modeling approach is different from DOM's, its API style differs drastically from the DOM API as well. So if you are accustomed to DOM, there will be a learning curve (which is to be expected)...

If you use VTD-XML in ways it is not intended to be used, your code will certainly be convoluted and ugly. Ignore the underlying principle of reducing or eliminating object creation, and your app will end up sluggish. No tool in this world can help you with that.

score 1 · Answer 2 · answered Jan 27 '16 at 20:20

When searching with XPath, you begin from a context node. The XPath is relative to that context node. This doesn't have to be the root of the document.

In XOM specifically you can use the query() method on any node to search starting from that node as the context. E.g.

Nodes result = p.query("b");

will find the elements named b which are children of the p node.

Nodes result = p.query(".//b");

will find the elements named b which are descendants of the p node.
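
The same context-node behaviour can be checked without XOM. The sketch below deliberately swaps in the JDK's built-in `javax.xml.xpath` API on a DOM tree (not XOM's `query()`), so it runs with no extra dependencies; the class name `ContextNodeDemo`, its helper methods and the sample XML are made up for illustration. It also shows why the `/bar` attempt from the comments failed: an expression starting with `/` or `//` is absolute and is evaluated from the document root, whatever the context node is.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: relative vs. absolute XPath from a context node.
class ContextNodeDemo {
    // Sample document (an assumption for illustration): one <b> directly under <p>,
    // one nested deeper inside <p>, and one outside <p>.
    static final String XML =
            "<root><p><b>one</b><i><b>two</b></i></p><b>three</b></root>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the first node matched by expr, evaluated with 'context' as the context node.
    static Node find(Node context, String expr) throws Exception {
        return (Node) XPathFactory.newInstance().newXPath()
                .evaluate(expr, context, XPathConstants.NODE);
    }

    // Counts the nodes matched by expr, evaluated with 'context' as the context node.
    static int count(Node context, String expr) throws Exception {
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, context, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        Node p = find(doc, "/root/p");          // pick a context node once

        System.out.println(count(p, "b"));      // 1: children of p named b
        System.out.println(count(p, ".//b"));   // 2: descendants of p named b
        System.out.println(count(p, "//b"));    // 3: absolute, searches the whole document
        System.out.println(count(p, "/b"));     // 0: absolute, the root element is not named b
    }
}
```

Selecting the context node once and then running short relative expressions from it is the "nested" usage the question asks about; whether that actually saves time compared with one long absolute expression depends on the library and the document, and is not measured here.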
