Since negative lookaheads are unsupported, I broke mine out into several expressions that cover all cases. WAF lets you specify multiple expressions. It uses logical OR matching, so only one of them has to match. Using the example in the question, the solution could be...
joe[^aj]
joea[^n]
joean[^n]
joej[^e]
joeje[^n]
joe matches, unless he's followed by an a or a j. Then he's suspicious, so we go on to the next rule. If that a is followed by an n, then we're still suspicious, so we go on to the next rule. We repeat that process until we've decided whether or not the entire word is joeann or joejen.
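To sanity-check the five rules, here is a rough sketch in Python. It assumes Python's re module behaves like WAF's regex engine for these simple patterns and that WAF searches anywhere in the input. The names PATTERNS and any_match are made up for illustration.

```python
import re

# The five expressions from the list above. WAF ORs them together,
# so a request matches if any single expression matches.
PATTERNS = [
    r"joe[^aj]",
    r"joea[^n]",
    r"joean[^n]",
    r"joej[^e]",
    r"joeje[^n]",
]


def any_match(text: str) -> bool:
    """Return True if at least one pattern is found anywhere in text."""
    return any(re.search(p, text) for p in PATTERNS)


for sample in ["joe smith", "joeandy", "joeann", "joejen", "joe"]:
    print(sample, any_match(sample))
```

One thing this makes visible: every rule ends in a character class, so each one needs a character after the literal part. A bare joe at the very end of the input matches none of them as written.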
My particular use case was URI matching. I wanted to throttle requests to an entire directory, except for one subdirectory (and all its subdirectories).
Say we want to throttle /my/dir but not anything in /my/dir/safe. We would do it like so...
^/my/dir/?$
^/my/dir/[^s]
^/my/dir/s[^a]
^/my/dir/sa[^f]
^/my/dir/saf[^e]
^/my/dir/safe[^/]
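The per-letter rules in that list are mechanical, so they could be generated rather than typed by hand. This is only a sketch: exclusion_patterns is a hypothetical helper, not anything WAF provides, and it assumes Python 3.9+ and that every rule should be anchored with ^. The ^/my/dir/?$ rule for the directory itself still has to be added separately.

```python
import re


def exclusion_patterns(prefix: str, excluded: str) -> list[str]:
    """Build one anchored pattern per character of `excluded`.

    Pattern i matches inputs that start with `prefix`, agree with the
    first i characters of `excluded`, and then differ at character i.
    """
    return [
        "^" + re.escape(prefix + excluded[:i]) + "[^" + re.escape(ch) + "]"
        for i, ch in enumerate(excluded)
    ]


# Treating the trailing slash as part of the excluded word yields the
# final safe[^/] rule as well.
for pattern in exclusion_patterns("/my/dir/", "safe/"):
    print(pattern)
```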
We follow the same process of identifying each letter in sequence.
"You can't start with S. Ok, you can start with S, but you can't also have an A. Ok ok, I'll let it slide, but you cannot have an F too. Ok fine, you're persistent, but..."
Notice we have to include a rule for the trailing slash /. This covers the optional slash in /my/dir/safe/ and all subdirectories such as /my/dir/safe/whatever.
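If you want to try the URI rules before deploying them, something like the following Python sketch should do. It assumes Python's re module is a fair stand-in for WAF's engine on these patterns. THROTTLE_PATTERNS and is_throttled are names I made up for the example.

```python
import re

# The six URI expressions from above, ORed together as WAF would.
THROTTLE_PATTERNS = [
    r"^/my/dir/?$",
    r"^/my/dir/[^s]",
    r"^/my/dir/s[^a]",
    r"^/my/dir/sa[^f]",
    r"^/my/dir/saf[^e]",
    r"^/my/dir/safe[^/]",
]


def is_throttled(uri: str) -> bool:
    """Return True if any throttle pattern matches the URI path."""
    return any(re.search(p, uri) for p in THROTTLE_PATTERNS)


for uri in ["/my/dir", "/my/dir/other", "/my/dir/safely",
            "/my/dir/safe", "/my/dir/safe/", "/my/dir/safe/whatever"]:
    print(uri, is_throttled(uri))
```

Be aware that with the rules exactly as written, a path that stops partway through the word, such as /my/dir/sa, matches none of them and so is not throttled. If that matters for your setup, you would need extra rules for those cases.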