I need to embed user input in my regular expression, so any regex special characters in it must be escaped. I don't know in advance what the string will be.

It would be something like

string pattern = "\\d+ " + myEscapeFunction(userData);

Which special characters do I need to escape? Or is there an equivalent function to Qt's QRegExp::escape?

sashoalm
  • I'm not sure there is a generic solution, because it depends on the expression standard. For example (AFAIR), some engines require ] to be escaped and some don't. You should look at your standard's escaping rules and do some string replacements... – Spook Mar 20 '13 at 12:37
  • That would seem to be the case. Now that I think about it, tr1::regex allows the expression standard to be set at runtime. I think it defaults to ECMAScript. – sashoalm Mar 20 '13 at 12:39

1 Answer

The list of characters that you have to escape depends on which of the various regular expression grammars you're using. If you're using the default ECMAScript, it looks like the list in the QRegExp::escape documentation is a good place to start. It says:

The special characters are $, (, ), *, +, ., ?, [, ], ^, {, | and }.

That list leaves out \ for some reason.

But it's slightly more complicated than that, because inside square brackets most of those characters are not special: only \ and ] are, along with ^ right after the opening [ and - between two characters. Also, \] has to stay unescaped.

Further, a ? that comes right after a ( is not special. For example, in (?=x) the ? should not be escaped.

I think that's pretty much it, but I haven't put enough time into this to be sure.
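
For what it's worth, here is a minimal sketch of such an escape function. It assumes C++11's std::regex with the default ECMAScript grammar, and that the escaped text ends up outside any bracket expression. The names escapeForRegex and matchesNumberThenLiteral are made up for illustration; the character list is the QRegExp::escape one quoted above plus \.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not a standard library function): puts a backslash
// in front of every character from the list above, plus the backslash
// itself, so the result matches `input` literally.
// Assumption: the result is used outside a [...] bracket expression.
std::string escapeForRegex(const std::string& input) {
    static const std::string special = "\\$()*+.?[]^{|}";
    std::string escaped;
    escaped.reserve(input.size() * 2);
    for (char c : input) {
        if (special.find(c) != std::string::npos) {
            escaped += '\\';
        }
        escaped += c;
    }
    return escaped;
}

// Usage sketch following the pattern in the question: one or more digits,
// a space, then the user's text taken literally.
bool matchesNumberThenLiteral(const std::string& text,
                              const std::string& userData) {
    const std::regex re("\\d+ " + escapeForRegex(userData));
    return std::regex_match(text, re);
}
```

Other grammars (basic, extended, awk and so on) have different rules, so the list would need adjusting if you pass a different syntax flag to the regex constructor.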

Pete Becker