I need to get the following regular expression to work, but I'm having issues. Yes, it's parsing HTML. No, there's no better option available.
This is the regex:
test(.*)\/[^s].*(=|\/|Z)
I'm using the "U" modifier (so it's ungreedy), and "\" is my escape symbol.
Running it against this subject string:
test.com/sch/anythingwhateverZhello
results in a match, when I don't think it should. The captures are ".com/sch" and "Z", although I think I specifically told it to A) capture only up to the first "/", so the first capture should be ".com", and B) not match if the first character after that "/" is an "s". Interestingly, and this is probably the source of my problem, when I remove the [^s], the first capture works correctly. With it in, the asterisk gobbles everything up to the second "/", which makes no sense to me. I tried putting a question mark after the asterisk, as an extra hint to the regex that it should not be greedy, but this made no difference.
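For anyone who wants to reproduce this outside RexV2, here is a minimal Python sketch of the behaviour described above. It rests on one assumption: Python's re module has no "U" modifier, so I emulate it by writing every quantifier lazily (.*?). The names SUBJECT, WITH_CLASS, WITHOUT_CLASS and captures are only illustrative.

```python
import re

# Assumption: the PCRE "U" (ungreedy) modifier is emulated by making each
# quantifier lazy (.*?), because Python's re module has no such flag.
SUBJECT = "test.com/sch/anythingwhateverZhello"
WITH_CLASS = r"test(.*?)/[^s].*?(=|/|Z)"   # original pattern
WITHOUT_CLASS = r"test(.*?)/.*?(=|/|Z)"    # same pattern with [^s] removed


def captures(pattern, subject=SUBJECT):
    """Return the captured groups of the first match, or None if nothing matches."""
    m = re.search(pattern, subject)
    return m.groups() if m else None


if __name__ == "__main__":
    print(captures(WITH_CLASS))     # ('.com/sch', 'Z')
    print(captures(WITHOUT_CLASS))  # ('.com', '/')
```

With the [^s] in place, the first group stretches to the second "/"; without it, the first group stops at the first "/" as I expected.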
OK, so instead of a negated character class (I don't actually want to exclude just "s"; I want to exclude "sch" specifically), I next tried a negative lookahead:
test(.*)\/(?!sch).*(=|\/|Z)
Same problem! It still matches, and the first capture is ".com/sch".
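The lookahead attempt can be reproduced the same way. This is again only a sketch under the assumption that lazy quantifiers (.*?) stand in for the "U" modifier; LOOKAHEAD and lookahead_captures are made-up names. The second call uses a shortened subject with no second "/" (my own test string, not part of the real input), purely for contrast.

```python
import re

# Assumption: lazy quantifiers (.*?) emulate the PCRE "U" modifier.
SUBJECT = "test.com/sch/anythingwhateverZhello"
LOOKAHEAD = r"test(.*?)/(?!sch).*?(=|/|Z)"


def lookahead_captures(subject=SUBJECT):
    """Return the captured groups for the lookahead pattern, or None if no match."""
    m = re.search(LOOKAHEAD, subject)
    return m.groups() if m else None


if __name__ == "__main__":
    # Original subject: still matches, first capture runs to the second "/".
    print(lookahead_captures())                      # ('.com/sch', 'Z')
    # Hypothetical subject with only one "/": no match at all.
    print(lookahead_captures("test.com/schZhello"))  # None
```

So the pattern only matches when there is a later "/" for the first group to extend to, which seems to be where the unexpected ".com/sch" capture comes from.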
Any ideas what my blunder is here? (I've been using the RexV2 regex validator at http://www.rexv.org/, so it occurred to me that there might be a bug in that engine, but I can replicate this issue in my live environment too.)