I have a problem filtering specific nodes of a Jackrabbit JCR repository in Magnolia.

When I submit the following query: //element(*, standort)//*

I get:

33 nodes returned in 18ms
/standort/Standorte/MetaData
/standort/Standorte/standort-de
/standort/Standorte/standort-de/MetaData
/standort/Standorte/standort-de/Teststandort
/standort/Standorte/standort-de/Teststandort/MetaData
/standort/Standorte/standort-de/Hauptwerk-Köln
/standort/Standorte/standort-de/Hauptwerk-Köln/MetaData
/standort/Standorte/standort-de/Geschäftsstelle-Berlin
/standort/Standorte/standort-de/Geschäftsstelle-Berlin/MetaData
/standort/Standorte/standort-de/Geschäftsstelle-Frankfurt
/standort/Standorte/standort-de/Geschäftsstelle-Frankfurt/MetaData
/standort/Standorte/standort-de/Geschäftsstelle-Hamburg
/standort/Standorte/standort-de/Geschäftsstelle-Hamburg/MetaData
/standort/Standorte/standort-de/Geschäftsstelle-Hannover
/standort/Standorte/standort-de/Geschäftsstelle-Hannover/MetaData
/standort/Standorte/standort-de/Geschäftsstelle-Köln
/standort/Standorte/standort-de/Geschäftsstelle-Köln/MetaData
/standort/Standorte/standort-de/Werk-Leipzig
/standort/Standorte/standort-de/Werk-Leipzig/MetaData
/standort/Standorte/standort-de/Geschäftsstelle-München
/standort/Standorte/standort-de/Geschäftsstelle-München/MetaData
/standort/Standorte/standort-de/Geschäftsstelle-Stuttgart
/standort/Standorte/standort-de/Geschäftsstelle-Stuttgart/MetaData
/standort/Standorte/standort-de/Gelsdorf-(Mischwerk)
/standort/Standorte/standort-de/Gelsdorf-(Mischwerk)/MetaData
/standort/Standorte/standort-de/Gelsdorf-(Handläufe)
/standort/Standorte/standort-de/Gelsdorf-(Handläufe)/MetaData
/standort/Standorte/standort-de/KB-Roller-Tech-Kopierwalzen-GmbH
/standort/Standorte/standort-de/KB-Roller-Tech-Kopierwalzen-GmbH/MetaData
/standort/Standorte/standort-en
/standort/Standorte/standort-en/MetaData
/standort/Standorte/standort-en/Böttcher-UK-Ltd-
/standort/Standorte/standort-en/Böttcher-UK-Ltd-/MetaData

But I want only these nodes:

/standort/Standorte/standort-de/Teststandort
/standort/Standorte/standort-de/Hauptwerk-Köln
/standort/Standorte/standort-de/Geschäftsstelle-Berlin
/standort/Standorte/standort-de/Geschäftsstelle-Frankfurt
/standort/Standorte/standort-de/Geschäftsstelle-Hamburg
/standort/Standorte/standort-de/Geschäftsstelle-Hannover
/standort/Standorte/standort-de/Geschäftsstelle-Köln
/standort/Standorte/standort-de/Werk-Leipzig
/standort/Standorte/standort-de/Geschäftsstelle-München
/standort/Standorte/standort-de/Geschäftsstelle-Stuttgart
/standort/Standorte/standort-de/Gelsdorf-(Mischwerk)
/standort/Standorte/standort-de/Gelsdorf-(Handläufe)
/standort/Standorte/standort-de/KB-Roller-Tech-Kopierwalzen-GmbH
/standort/Standorte/standort-en/Böttcher-UK-Ltd-

That is, without the MetaData nodes and the parent nodes. I need everything beneath Standorte. The children of Standorte can be of type standort-de or standort-en. I hope this makes my problem clearer; I've shortened the output in the latest version of my question. So far I haven't found an XPath expression that helps, but that is due to my lack of XPath knowledge.

Thanks in advance!

meltac
  • Strange result! It should also select `/standort/Standorte`. If you are working with a PSVI, why don't you match `Teststandort` type annotation? –  Dec 15 '10 at 13:43
  • @Alejandro: Hm, I don't know. Maybe XPath works differently when selecting nodes in Jackrabbit. – meltac Dec 15 '10 at 15:05
  • Then the `Standorte` element has the `standort` type annotation, and the result reports the full absolute path of the selected nodes. –  Dec 15 '10 at 15:20
  • Good question, +1. See my answer for the solution -- a simple adjustment is all that is needed. :) – Dimitre Novatchev Dec 15 '10 at 16:29

2 Answers

The expression

//element(*, standort)//*

selects any element (final *) that is a descendant (second //) of an element anywhere in the document (//element()) that has been successfully validated against a schema-defined type definition for standort. (Thanks to @Alej for helping correct this statement and the following.)

So basically you are selecting every element that is a descendant of a validated standort element, assuming you have a schema successfully attached.

Try the XPath expression (updated):

/standort/Standorte/(standort-de | standort-en)/*
LarsH
  • @LarsH: This is XPath 2.0 and the second argument for `element()` node test is matching the type annotation. –  Dec 15 '10 at 13:38
  • I hope my revised question is easier to understand. Of course I didn't mean the '/' node but the parents of the leaves – meltac Dec 15 '10 at 15:00
  • @meltac, did you try the XPath expression `/standort/Standorte/(standort-de | standort-en)/*`, and what was the result? – LarsH Dec 15 '10 at 15:17
  • @Alej: thanks, I was lazy and didn't check the definitions of the args to element()... I assumed `*, standort` was a sequence, forgetting that a sequence could only be expressed as an argument by using extra parentheses. I will revise my answer. – LarsH Dec 15 '10 at 15:20
  • @LarsH: Thanks for your hint, but unfortunately it's not working for me. If I enter your query, I get an empty result set back. – meltac Dec 15 '10 at 17:40
  • @meltac: I think you'd better post some sample input XML. I'm wondering if it's in a namespace or something. – LarsH Dec 15 '10 at 21:09
  • @meltac, sorry, I just learned about jackrabbit/JCR that the top-level element is always `/jcr:root`... despite the misleading output you received from your query. So how about – LarsH Dec 15 '10 at 22:56
  • @meltac, do you have docs on exactly what XPath features Jackrabbit supports? "Even though Jackrabbit uses the XQuery grammar, it only implements the set of XPath features required by JCR-170 (and some extras, like predicates on location steps)." http://wiki.gxdeveloperweb.com/confluence/display/GXDEV/XPath+JCR+Sample+Queries – LarsH Dec 15 '10 at 23:21
  • @meltac: next attempt: `//element(*, standort)/standort-de | //element(*, standort)/standort-en` – LarsH Dec 15 '10 at 23:23
  • @meltac: if Jackrabbit's XPath doesn't support the union operator, the expression in the previous comment will not work. But in that case you could try `//element(*, standort)/standort-de` and `//element(*, standort)/standort-en` as separate queries, and put the results together. – LarsH Dec 15 '10 at 23:34
  • FYI, the thread http://www.mail-archive.com/users@jackrabbit.apache.org/msg06032.html indicates there is likely not a list of XPath features supported by Jackrabbit. :-( However the JSR-170 spec says that the union operator *is* required to be supported; but the following are not required: for/return, variables, if, **any axes** including "..". – LarsH Dec 15 '10 at 23:46
  • Thanks for your comments, but it won't work; neither expression works. The "and" operator returns the same error message as Dimitre's expression. Anyway, many thanks for your help. I will dig through the Jackrabbit docs to find a solution. There must be a way.... – meltac Dec 17 '10 at 13:59
  • @meltac, I don't know what "and" operator you're referring to... none of my expressions contain "and". However union should work as it's required by JSR-170. What does `//element(*, standort)/standort-de | //element(*, standort)/standort-en` do? – LarsH Dec 17 '10 at 14:41
  • @LarsH: Oh sorry, I copied the expressions from 4 comments above and didn't see that there were two, separated by an "and". And it was exactly the same expression as the one with the `|` operator. If I run your expression, I get zero results. This is confusing, as I get a result for each separate expression. But the `|` should be allowed, because Jackrabbit usually complains about misplaced or unknown characters. Thanks for the hint about the backquote notation – meltac Dec 17 '10 at 14:54
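
The comments above suggest running the two queries separately and putting the results together when the union operator does not behave as expected. A minimal sketch of that merge step, assuming each query's result has already been read into a list of absolute node paths (the function name `merge_results` and the sample lists are hypothetical and only stand in for whatever the two queries return):

```python
from typing import Iterable, List


def merge_results(*result_sets: Iterable[str]) -> List[str]:
    """Concatenate several query results, dropping duplicate paths
    while keeping the order in which paths were first seen."""
    seen = set()
    merged = []
    for result in result_sets:
        for path in result:
            if path not in seen:
                seen.add(path)
                merged.append(path)
    return merged


# Hypothetical stand-ins for the results of the two separate queries.
de_nodes = ["/standort/Standorte/standort-de/Teststandort"]
en_nodes = ["/standort/Standorte/standort-en/Böttcher-UK-Ltd-"]

for path in merge_results(de_nodes, en_nodes):
    print(path)
```

This only emulates the union on the client side; it says nothing about why the `|` query itself returned zero results.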

Use:

(//element(*, standort)//*)[not(ancestor-or-self::MetaData)]
Dimitre Novatchev
  • @Dimitre: Thanks, this query also looks very self-explanatory, but XPath returns the following error: null for statement: for $v in (//element(*, standort)//*)[not(ancestor-or-self::MetaData)] return $v: null – meltac Dec 15 '10 at 17:42
  • @meltac: There is no `$v` in my answer. I only guarantee that my answer is correct -- anything beyond it may or may not be correct. Also, it seems that you have a typo in your expression -- an `*` is missing in more than one place. If you don't get any node selected, but you really get the selected nodes reported in your question, then your XPath engine is broken. I have done two things: 1. Taken your expression and surrounded it by brackets. This should select the same nodes. 2. Appended a predicate that filters out any node that is `MetaData` or has a `MetaData` ancestor. This *should* work. – Dimitre Novatchev Dec 15 '10 at 17:54
  • @Dimitre - I think the `*` characters missing in his comment got parsed as italics markup. – LarsH Dec 15 '10 at 21:02
  • @LarsH: All the more then -- his Xpath engine must be quite buggy. – Dimitre Novatchev Dec 15 '10 at 21:06
  • @meltac, you should put backquotes around your code in comments to stop that from happening. +1 to @Dimitre for good answer. – LarsH Dec 15 '10 at 21:12
  • @Dimitre, the page http://wiki.gxdeveloperweb.com/confluence/display/GXDEV/XPath+JCR+Sample+Queries says that "Even though Jackrabbit uses the XQuery grammar, it only implements the set of XPath features required by JCR-170 (and some extras, like predicates on location steps)." I haven't found any documentation on what subset of XPath is supported, other than examples. So ancestor-or-self may not be supported. – LarsH Dec 15 '10 at 23:11
  • @LarsH, @meltac: This means that this question doesn't belong to the xpath-2.0 tag. – Dimitre Novatchev Dec 15 '10 at 23:20
  • @Dimitre, I agree in that it sounds like the XPath supported by Jackrabbit is not the standard XPath (just a subset). However the answer to the question should be standard XPath (2.0). – LarsH Dec 15 '10 at 23:29
  • @LarsH: My answer is standard XPath 2.0. However, it isn't useful to the OP. – Dimitre Novatchev Dec 16 '10 at 00:06
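
Since the comments suggest that Jackrabbit may not support `ancestor-or-self`, one possible workaround is to keep the original query and filter its result on the client side. The following is only a sketch under stated assumptions: it assumes the result is available as a list of absolute paths like those shown in the question, and the function name `location_nodes` and its `parent` parameter are hypothetical.

```python
from typing import Iterable, List


def location_nodes(paths: Iterable[str],
                   parent: str = "/standort/Standorte") -> List[str]:
    """Keep paths exactly two levels below `parent`
    whose last segment is not MetaData."""
    prefix = parent.rstrip("/") + "/"
    kept = []
    for path in paths:
        if not path.startswith(prefix):
            continue
        # Segments below the parent, e.g. ["standort-de", "Teststandort"].
        segments = path[len(prefix):].split("/")
        if len(segments) == 2 and segments[1] != "MetaData":
            kept.append(path)
    return kept


# A shortened copy of the query output from the question.
sample = [
    "/standort/Standorte/MetaData",
    "/standort/Standorte/standort-de",
    "/standort/Standorte/standort-de/MetaData",
    "/standort/Standorte/standort-de/Teststandort",
    "/standort/Standorte/standort-de/Teststandort/MetaData",
    "/standort/Standorte/standort-en/Böttcher-UK-Ltd-",
]

for path in location_nodes(sample):
    print(path)
```

Filtering by depth and by the last path segment drops both the parent nodes and every `MetaData` node without relying on any XPath axis.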