This answer explains that to validate an arbitrary regular expression, one simply uses eval:
while (<>) {
    eval "qr/$_/;";    # string eval: the input line is compiled as Perl code
    print $@ ? "Not a valid regex: $@\n" : "That regex looks valid\n";
}
However, this strikes me as very unsafe, for what I hope are obvious reasons. Someone could input, say:
foo/; system('rm -rf /'); qr/
or whatever devious scheme they can devise.
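To make the concern concrete, here is a minimal sketch that builds the same string the loop above would hand to eval, but only prints it instead of evaluating it. The variable names are mine, and the input is the hostile line from my example.

```perl
use strict;
use warnings;

# Hypothetical hostile input, copied from the example above.
my $input = q{foo/; system('rm -rf /'); qr/};

# Build the same source string the loop passes to eval.
# It is only printed here, never evaluated.
my $code = "qr/$input/;";

print "$code\n";    # qr/foo/; system('rm -rf /'); qr//;
```

The printed line is three separate Perl statements, and the middle one is whatever the user wanted to run.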
The natural way to prevent such things is to escape special characters, but if I escape too many characters, I severely limit the usefulness of the regex in the first place. I believe a strong case can be made that at least the characters []{}()/-,.*?^$! and whitespace (and probably others) must be allowed unescaped in a user-facing regex interface for the regexes to be even minimally useful.
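As a rough illustration of why blanket escaping defeats the purpose, the sketch below runs a sample pattern through Perl's built-in quotemeta. The pattern and the test strings are made up for the example.

```perl
use strict;
use warnings;

my $user_pattern = 'a.*b';                      # hypothetical user input
my $escaped      = quotemeta($user_pattern);    # backslashes every non-word character

print "$escaped\n";                             # a\.\*b

# The original pattern behaves as a regex...
print "unescaped matches 'aXXb'\n" if 'aXXb' =~ /$user_pattern/;

# ...while the escaped one only matches the literal text "a.*b".
print "escaped does not match 'aXXb'\n" unless 'aXXb' =~ /$escaped/;
print "escaped matches 'a.*b'\n"        if 'a.*b' =~ /$escaped/;
```

After escaping, the input is safe but it is no longer a regex in any useful sense, only a literal string search.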
Is it possible to secure myself from regex injection, without limiting the usefulness of the regex language?