0

I am trying to use StackPath's EdgeRules, and their documentation is not very clear. I need to match URLs in multiple directories but exclude any URL that contains the extension m3u8 or the word segment. Their docs: EdgeRules

This works to limit it to 2 directories.

/(https://example.com(/(pics|vids)/).*)/

But then this doesn't work.

/(https://example.com(/(pix|vids)/).+(?!m3u8|segment).*)/

I've been trying to use https://regex101.com/ but nothing I try seems to work. I don't even know what kind of regex they use. Hopefully I can get some help with this.

Panama Jack
  • Two issues I can see with your regex: First is that you have unescaped forward slashes, and the second is that you have `.+` before your negative lookahead and `.*` after it. For example, if your URL is `https://example.com/pix/segment`, it can still match because the `.+` consumes `segment`, then the negative lookahead has nothing left to match. To fix this, move the `.+` inside the lookahead (remembering to add brackets around your alternation). – ApexPolenta Feb 09 '23 at 04:44
  • @ApexPolenta the documentation linked also has unescaped forward slashes. Might be a C# thing – BrendanOtherwhyz Feb 09 '23 at 05:02
  • @Panama Jack: Do any of the answers answer your question? If not, please update your question with more details. If yes, please consider accepting one of the answers – Peter Thoeny Mar 10 '23 at 21:20
  • @PeterThoeny Unfortunately not. They don't support negative lookaheads. – Panama Jack May 03 '23 at 07:17
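
The point made in the first comment can be checked outside EdgeRules. The following is a minimal sketch using Python's `re` module, which is only a stand-in: the regex flavor EdgeRules uses is unknown, and per the comment above it does not support negative lookaheads at all. The names `ORIGINAL` and `FIXED` are made up for this illustration.

```python
import re

# Pattern from the question, without the surrounding / delimiters.
ORIGINAL = re.compile(r"(https://example.com(/(pix|vids)/).+(?!m3u8|segment).*)")

# Variant with the wildcard moved inside the lookahead, as the comment suggests.
FIXED = re.compile(r"https://example\.com/(pix|vids)/(?!.*(m3u8|segment)).*")

url = "https://example.com/pix/segment"

# The greedy .+ consumes "segment", so the lookahead is evaluated at the end
# of the string. Nothing follows, the negative assertion succeeds, and the
# URL still matches.
assert ORIGINAL.match(url) is not None

# With .* inside the lookahead, the check scans the rest of the URL from the
# position right after the directory, so "segment" is found and rejected.
assert FIXED.match(url) is None
assert FIXED.match("https://example.com/pix/photo.jpg") is not None
```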

2 Answers

1

I can't test this, so apologies if something else is wrong...

The negative lookaheads go side by side, each with its own leading .*, rather than wrapped in parentheses and separated by an or (|). I also added an end-of-line anchor ($) after .m3u8.

(https://example.com(/(pix|vids)/)(?!.*\.m3u8$)(?!.*segment.*).*)

See this example: https://regex101.com/r/reVHWt/1
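
For anyone who wants to try this pattern locally, here is a small sketch in Python's `re` module. This is an assumption-laden check, not a test of EdgeRules itself: Python's engine supports lookaheads, whereas the EdgeRules flavor is undocumented. `ANSWER_PATTERN` and `is_allowed` are hypothetical names used only here.

```python
import re

# The pattern from this answer, unchanged.
ANSWER_PATTERN = re.compile(
    r"(https://example.com(/(pix|vids)/)(?!.*\.m3u8$)(?!.*segment.*).*)"
)


def is_allowed(url: str) -> bool:
    """Return True if the URL matches the rule (i.e. is not excluded)."""
    return ANSWER_PATTERN.match(url) is not None


assert is_allowed("https://example.com/pix/cat.jpg")
assert not is_allowed("https://example.com/vids/stream.m3u8")
assert not is_allowed("https://example.com/vids/segment_001.ts")
assert not is_allowed("https://example.com/docs/cat.jpg")

# Because of the $ anchor, .m3u8 is only excluded at the very end of the
# string, so a URL with a query string after the extension still matches.
assert is_allowed("https://example.com/vids/stream.m3u8?token=abc")
```

If URLs can carry query strings, dropping the `$` from the first lookahead would exclude those as well.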

  • Yes, in theory this should work. I checked the example and it handles the scenarios, but it is still not working with StackPath's EdgeRules. I have opened a ticket to see what's up. – Panama Jack Feb 09 '23 at 06:04
0

The EdgeRules docs do not mention which regex flavor they support, and the examples do not make it clear. Also, the example /(^http://example.com(/.*/)+.$)/ shows unescaped forward slashes inside a slash-delimited pattern, which suggests this is not a standard regex syntax.

I see no other way than using a negative lookahead to exclude arbitrary patterns. Assuming their regex does support it, you can try:

/^https://example.com/(pix|vids)/(?!.*\bm3u8\b)(?!.*\bsegment\b).*$/

Or with properly escaped special chars:

/^https:\/\/example\.com\/(pix|vids)\/(?!.*\bm3u8\b)(?!.*\bsegment\b).*$/

Explanation of regex:

  • ^ -- anchor at start of string
  • https:\/\/example\.com\/ -- literal https://example.com/
  • (pix|vids) -- literal pix or vids
  • \/ -- slash
  • (?!.*\bm3u8\b) -- negative lookahead for m3u8 anywhere in the rest of the string, bounded on both sides by \b (word boundary)
  • (?!.*\bsegment\b) -- ditto for segment
  • .*$ -- any other chars up to end of string
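
The breakdown above can be exercised with a short sketch in Python's `re` module. Python is used here only as a convenient engine that supports lookaheads and `\b`; whether EdgeRules does is exactly the open question. `PATTERN` and `is_allowed` are hypothetical names, and forward slashes need no escaping in a Python raw string.

```python
import re

PATTERN = re.compile(
    r"^https://example\.com/(pix|vids)/(?!.*\bm3u8\b)(?!.*\bsegment\b).*$"
)


def is_allowed(url: str) -> bool:
    """Return True if the URL matches the rule (i.e. is not excluded)."""
    return PATTERN.match(url) is not None


assert is_allowed("https://example.com/pix/cat.jpg")
assert not is_allowed("https://example.com/vids/playlist.m3u8")
assert not is_allowed("https://example.com/vids/stream.m3u8?token=abc")
assert not is_allowed("https://example.com/vids/segment/1.ts")

# \b requires a non-word character (or the string edge) on each side, and
# digits and underscores count as word characters. "segment1" therefore has
# no boundary after "segment" and is NOT excluded by this pattern.
assert is_allowed("https://example.com/vids/segment1.ts")
```

If names like `segment1.ts` should be excluded too, removing the `\b` anchors around `segment` is one option, at the cost of also rejecting any URL that merely contains those letters.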
Peter Thoeny