
I have an (assumed well-formed) regex R. I want to test whether the regex can only match a single literal string (it consists solely of letters, numbers, and escaped characters) or whether it could match anything else. This function, "HasWildCards", would work like this:

bool a = HasWildCards("asdf");//returns false
bool b = HasWildCards("asdf*");//returns true
bool c = HasWildCards("asdf[123]");//returns true
bool d = HasWildCards("asdf\\[123\\]");//returns false

I am using boost::regex, if that helps at all. I was thinking of checking if the regex expression matches something like this:

(^(([\[\^\$\.\|\?\*\+\(\{\}])))?(\\[QEdwsDWSbAZzB])?([^\\][\[\^\$\.\|\?\*\+\(\)\{\}])?

I've tested this on a few expressions (using the RegexTest tool of grepWin).

So: a non-escaped regex symbol at the start, a non-escaped flag, or a non-escaped regex symbol in the body. Is there an alternative? Did I screw something up? Is there a better way?

IdeaHat
  • `"[^\\\\][\\.\\^\\$\\[\\]\\?\\+\\*\\{\\}]"` This matches if a special character appears without an escape before it. You may need to extend the second character class to include other special characters I missed off the top of my head. All backslashes are doubled up because they are escaped inside the string literal. –  Jul 31 '13 at 19:23
  • @DrewMcGowen On a few expressions yeah, and whenever I break it I have to edit the regex... – IdeaHat Jul 31 '13 at 19:23
  • @MadScienceDreams you might want to mention that in your question, in case someone assumes you haven't actually tested anything – Drew McGowen Jul 31 '13 at 19:25
  • @Robadob yeah, it's gotta search for a symbol that's not escaped at the beginning of the line (yours requires that there be a non-escape character before the symbol), and it doesn't check for the other escape symbols (\Q\E, for example) – IdeaHat Jul 31 '13 at 19:29
  • Might this be an [XY problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem)? What do you need this for? – Martin Ender Jul 31 '13 at 19:45
  • @m.buettner I'm taking in regexes for a tree path as a command line argument and need to find all paths that match the regex. I want to allow the node delimiter ('/') to be matched by the regex (so .*foo[0-1] can span any length in the tree). While I could do a brute force search of everything, it would be much faster to skip the search on nodes that are "complete" (e.g. "/asdf/"). I could also split out each line as a regex. I guess I should really be checking if any of the wild cards can be replaced by the node delimiter... – IdeaHat Jul 31 '13 at 19:57
  • @Robadob: your check fails for something like `"\\\\*"` - a backslash (escaped with another backslash) repeated zero or more times. It's not enough to check for special characters with no backslashes before them - you need to check for special characters preceded by an even number (possibly zero) of backslashes. Personally, I'd not try to express that with a regexp, but write a simple single-pass algorithm that checks for special characters while keeping track of the length of the most recent run of backslashes. – Igor Tandetnik Jul 31 '13 at 22:49
  • @IgorTandetnik perhaps this then `"([^\\\\](\\\\\\\\)*[\\.\\^\\$\\[\\]\\?\\+\\*\\=\\{\\}])"` It's not great for trying to express with Regex, but it's fun to try :p –  Aug 01 '13 at 17:57
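
The single-pass scan suggested in the comments could look roughly like the sketch below. It is only an illustration under some assumptions: the metacharacter set and the list of class/anchor escapes (`\d`, `\w`, `\b`, ...) are taken from the expressions quoted in this thread and may not cover everything boost::regex accepts, a `\Q...\E` span is treated as literal text, and other escapes such as backreferences are simply treated as literals.

```cpp
#include <string>

// Sketch: returns true if `pattern` contains a regex metacharacter that is
// not escaped, or an escape that stands for a class or anchor (e.g. \d, \b).
// The character sets below are assumptions, not boost::regex's full syntax.
bool HasWildCards(const std::string& pattern) {
    const std::string special = ".^$|?*+()[]{}";
    const std::string classEscapes = "dwsDWSbBAZz";

    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const char c = pattern[i];
        if (c != '\\') {
            // Not preceded by an escaping backslash, so it counts if special.
            if (special.find(c) != std::string::npos) return true;
            continue;
        }
        if (i + 1 >= pattern.size()) break;  // trailing backslash: nothing to escape
        const char next = pattern[i + 1];
        if (next == 'Q') {
            // Everything up to the closing \E is literal text.
            const std::size_t end = pattern.find("\\E", i + 2);
            if (end == std::string::npos) break;  // quoted to end of pattern
            i = end + 1;  // the loop's ++i then steps past the 'E'
            continue;
        }
        if (classEscapes.find(next) != std::string::npos) return true;
        ++i;  // skip the escaped character; it is a literal
    }
    return false;
}
```

Because each backslash consumes the character after it, a run like `\\*` (an escaped backslash followed by `*`) is reported as a wildcard, which is the even-number-of-backslashes case pointed out above.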

1 Answer


Well, there's a quick two-step way to test for this. Instead of testing for escaped characters and wildcards in one regex, the first line of your function could remove escaped characters, then the second line would test the remaining string for wildcard-type expressions.

.*((\[.+?\])|(\{[0-9]*,[0-9]*\})|(\*)|(\+)).*

will match a string that contains any *, +, {#,#}, or [] expression. In your function, return whether or not the passed string matches this expression.
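
A minimal sketch of those two steps, assuming `std::regex` as a stand-in for boost::regex (the function name `HasWildCardsTwoStep` is made up for illustration). Note two limitations: the expression above only looks for `*`, `+`, `{#,#}`, and `[]`, so patterns using `?`, `.`, `|`, or groups are not detected, and stripping every escape also drops class escapes such as `\d`.

```cpp
#include <regex>
#include <string>

// Sketch of the two-step check; std::regex is used here as an assumption
// in place of boost::regex.
bool HasWildCardsTwoStep(const std::string& pattern) {
    // Step 1: remove each backslash together with the character it escapes.
    // Matching left to right means "\\\\*" (escaped backslash, then *) keeps the *.
    static const std::regex escaped(R"(\\.)");
    const std::string stripped = std::regex_replace(pattern, escaped, "");

    // Step 2: test what is left against the expression from the answer.
    static const std::regex wildcard(
        R"(.*((\[.+?\])|(\{[0-9]*,[0-9]*\})|(\*)|(\+)).*)");
    return std::regex_match(stripped, wildcard);
}
```

With the examples from the question, `"asdf\\[123\\]"` is reduced to `asdf123` in the first step, so the second step finds nothing to match.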

Radian_Mode