Blocking based on full URL and not just the URI in AWS WAF

Question

I am using AWS WAF across multiple CloudFront distributions which go to different URLs. Generally speaking, it is working well. However, we have noticed particular activity on a few of the underlying sites that I want to block, but I don't want to block it across all the sites.

It seemed simple enough to me to create a WAF rule that would match a regex on the URI and block based on that. However, it appears that AWS WAF does not use the host in its URI matching. For example this rule:

Inspect URI, Block based on RegEx with RegEx being:

^(http|https):\/\/(www)?\.?example\.net\/(.*)?\/*.html$

And these test URLs work in my regex tester:

http://example.net/blah.html
https://example.net/blah.html
http://www.example.net/blah.html
https://example.net/stuff/blah.html

When I apply it to the WAF, though, it does not block.

Is there something else I can do here to achieve what I am looking to do? I do not want to edit anything directly on my hosting servers because it would be more of a maintenance headache and it would not solve the problem I am attempting to solve (which is to stop bots from spamming bad URLs and spiking my server with 404s).

I also realize someone may suggest I could do a rate limit - which I do have in place - but the bots are coming from many different IPs so that doesn't solve this particular case. Instead, I just want to block some of the URL types that they keep trying to get to. In this case, it's thousands and thousands of HTML pages. It also does not take into account that I only want to block these requests for a very specific site.

"URI" matches in WAF are actually path matches. It's not clear what you are actually intending with your regex, but it's pretty clear that even though it might match, it isn't likely correct at `\/(.*)?\/*.html$`. The `?` in `(.*)?` has no effect inside `^...$`, and `\/*.html` matches zero or more `/` followed by any single character followed by `html`. What exactly are you trying to block? Is it just `/*.html` (read this as glob, not regex)? — Michael - sqlbot, Mar 10 '20 at 22:16
@Michael-sqlbot If I strip off this portion - ^(http|https):\/\/(www)?\.?example\.net\ - it matches the exact thing I am attempting to block. But it does it for every site under that WAF. I really only want it for a single site (but don't want to have to create a rule for everything) because that particular site is being targeted with old URLs that are no longer supported. But, if it's only a path match, then I guess there's not a way to accomplish what I'm attempting to do? — JasCav, Mar 11 '20 at 01:07
I think in WAF you can have a rule with two conditions: 1. the Host header matches www.example.com (or a regex), AND 2. the URI (path) matches. — James Dean, Mar 14 '20 at 14:41
@JamesDean That appears to be the only solution I can see. Unfortunately, that won't meet my goals, but I appreciate you providing the insight. — JasCav, Mar 17 '20 at 00:42
@JasCav I'm running into a similar situation and was wondering why a multi-part rule doesn't fit your need? It seems a `HTTP method` + `Header[host]` + `URI` should be able to cover the full URL? — nitsujri, May 28 '20 at 02:21
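
To make the comments above concrete, here is a minimal Python sketch (not WAF itself) of why the full-URL regex never matches once only the path is inspected, and how a Host condition ANDed with a path condition scopes the block to one site. The patterns `HOST_RE` and `PATH_RE`, the helper `should_block`, and the rule and metric names in `RULE_SKETCH` are assumptions made for illustration. `RULE_SKETCH` is only a rough guess at how such a rule might be written as a WAFv2 rule definition; check the field names against the AWS documentation before using it.

```python
import re
from urllib.parse import urlsplit

# Pattern from the question: it expects the scheme and host in the inspected text.
FULL_URL_RE = re.compile(r"^(http|https):\/\/(www)?\.?example\.net\/(.*)?\/*.html$")

# Hypothetical replacements: one pattern per request component.
HOST_RE = re.compile(r"^(www\.)?example\.net$")
PATH_RE = re.compile(r"^/.*\.html$")


def should_block(host: str, path: str) -> bool:
    """Return True only when BOTH the Host header and the path match."""
    return bool(HOST_RE.search(host.lower())) and bool(PATH_RE.search(path))


# Assumed shape of an equivalent WAFv2 rule: AND of a Host header regex
# and a URI path regex. Names here are illustrative, not from the question.
RULE_SKETCH = {
    "Name": "block-html-on-example-net",
    "Priority": 0,
    "Statement": {
        "AndStatement": {
            "Statements": [
                {
                    "RegexMatchStatement": {
                        "RegexString": HOST_RE.pattern,
                        "FieldToMatch": {"SingleHeader": {"Name": "host"}},
                        "TextTransformations": [{"Priority": 0, "Type": "LOWERCASE"}],
                    }
                },
                {
                    "RegexMatchStatement": {
                        "RegexString": PATH_RE.pattern,
                        "FieldToMatch": {"UriPath": {}},
                        "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
                    }
                },
            ]
        }
    },
    "Action": {"Block": {}},
    "VisibilityConfig": {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "blockHtmlOnExampleNet",
    },
}

url = "https://example.net/stuff/blah.html"
parts = urlsplit(url)

print(bool(FULL_URL_RE.search(url)))         # full URL, as typed into a regex tester
print(bool(FULL_URL_RE.search(parts.path)))  # path only, which is what the URI match sees
print(should_block(parts.netloc, parts.path))
print(should_block("other-site.example.org", parts.path))
```

The second check fails because the path `/stuff/blah.html` never starts with `http`, which fits the behaviour described in the question. Splitting the match into a Host condition and a path condition keeps the other sites behind the same WAF unaffected.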
