
I need to restrict the values allowed for an attribute in XML. To do this, I am using the following in my Relax NG schema:

<param "pattern">myRegEx</param>

This lets me specify a regular expression that the attribute value must conform to.

For some reason, the pattern does not seem to support the $ (end-of-string) anchor.

Example:

<param "pattern">Why it does not work??$</param>

This is supposed to match any string ending with the sentence "Why it does not work??", but the $ seems to be completely ignored during validation.

Does anyone understand why?

Thank you,

Nathaniel

  • What you've got there is not valid XML, and consequently not valid Relax NG. In other words, any Relax NG validator would refuse to work with what you show in your question. You should provide an [mcve](http://stackoverflow.com/help/mcve) in the question so as to avoid issues like this one. – Louis Aug 19 '14 at 11:46

1 Answer


The RNG 'pattern' parameter has the same* semantics as the pattern facet on an XSD simple type definition. XSD patterns apply to the entire lexical item being validated. They have no use for ^ and $ as start-of-string and end-of-string anchors, because an XSD pattern either matches the entire input string or does not match at all. In an XSD pattern, ^ and $ are therefore ordinary characters, so a trailing $ is matched as a literal dollar sign rather than ignored. (It might have been nicer for people who assume all regular expression languages have ^ and $ if XSD had specified that they are ignored at the beginning and ending of patterns, but that didn't happen.)

If you want to define a pattern which matches any text node ending in the string "Why does it not work??", instead of one matching either of the two strings "Why does it not work$" or "Why does it not wor$", you'll want to write

<param name="pattern">.*Why does it not work\?\?</param>

Note:

  • lose the $
  • escape the ?

(* There is one gratuitous incompatibility in the treatment of multiple patterns, but it's not relevant here.)
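The whole-string behaviour can be approximated outside a validator. The following is only a sketch: it uses Python's `re.fullmatch` as a stand-in for XSD pattern matching, and the helper name `rng_like_match` is made up for illustration. Python's regex dialect is not the XSD one, but for this particular pattern (`.*`, literal text, and escaped question marks) the two should agree.

```python
import re

# Pattern body copied from the <param name="pattern"> example above.
PATTERN = r".*Why does it not work\?\?"


def rng_like_match(pattern: str, value: str) -> bool:
    """Approximate XSD/RNG pattern semantics: the whole value must match.

    Hypothetical helper; re.fullmatch anchors at both ends implicitly,
    which is why no ^ or $ is needed in the pattern.
    """
    return re.fullmatch(pattern, value) is not None


# Ends with the sentence: accepted.
print(rng_like_match(PATTERN, "Tell me: Why does it not work??"))  # True
# Has trailing text after the sentence: rejected, with no $ in the pattern.
print(rng_like_match(PATTERN, "Why does it not work?? Anyone?"))   # False
```

The leading `.*` is what allows arbitrary text before the sentence; without it, only the exact string would be accepted.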

C. M. Sperberg-McQueen
  • So typically if I want to identify a sequence of numbers necessarily followed by space or comma, but that for the last number I don't want to authorize the comma I should do the following: ([0-9][, ]).*([0-9][ ]?){1,} – Nathaniel Perez Aug 21 '14 at 09:04
  • I don't think that's quite right. The regex you give matches a sequence consisting of (a) a single digit, (b) a single comma or single space, (c) any string of zero or more characters, and (d) one or more sequences consisting of (d1) a single digit and (d2) an optional space. So "9,8 7,this is some random stuff 654 3 2 1" matches, and "987,654,321" does not. That doesn't seem to match your prose description. If you have further questions about regular expressions in RNG, it would probably be best to ask a new question. – C. M. Sperberg-McQueen Aug 21 '14 at 14:12
  • Oh you are completely right, it should rather be ([0-9]+,)*([0-9][ ]?). Anyway thanks for help – Nathaniel Perez Aug 24 '14 at 07:12
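The two example strings from the comment above can be checked with a short sketch. It again assumes Python's `re.fullmatch` as an approximation of whole-string XSD matching; the helper name `whole_match` is hypothetical, and the regex is the first attempt quoted in the comments.

```python
import re

# First attempt from the comment thread, copied verbatim.
FIRST_ATTEMPT = r"([0-9][, ]).*([0-9][ ]?){1,}"


def whole_match(pattern: str, value: str) -> bool:
    """Return True only if the pattern matches the entire value."""
    return re.fullmatch(pattern, value) is not None


# One digit, a comma, arbitrary text, then a final digit: accepted.
print(whole_match(FIRST_ATTEMPT, "9,8 7,this is some random stuff 654 3 2 1"))  # True
# The second character is a digit, not a comma or space: rejected.
print(whole_match(FIRST_ATTEMPT, "987,654,321"))  # False
```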