Is there a regex guru in here?

I searched for an answer and got close with this one: "Matching strings not ending with .html". Unfortunately, it doesn't meet my needs:

On IIS 7.5, I want to redirect any address without a trailing slash:

  • catalog
  • catalog/chapter

to

  • catalog/chapter/

but not

  • catalog/
  • catalog/chapter/
  • catalog/chapter/anything.any
  • catalog/chapter/anything.any4
  • catalog/chapter/anything.an

(extensions are between 2 and 4 characters long)

The tricky part is that I need to capture that first group, "catalog", because it's used in the conditions to check whether it exists in the rewrite tables. For now, I'm getting that first group with: ((.+?)(?=/|$))
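To make the behavior of that first-group pattern concrete, here is a minimal sketch in Python. It assumes Python's `re` module handles the lazy quantifier and the lookahead the same way as the engine on the IIS side; the helper name `first_segment` is made up for illustration.

```python
import re

# The pattern from the question: lazily consume characters until the next
# character is a slash or the end of the string.
FIRST_SEGMENT = re.compile(r"((.+?)(?=/|$))")

def first_segment(path):
    """Return the first path segment, or None if nothing matches."""
    m = FIRST_SEGMENT.match(path)
    return m.group(1) if m else None

for path in ["catalog", "catalog/chapter", "catalog/chapter/anything.any"]:
    print(path, "->", first_segment(path))
```

The pattern isolates "catalog" in all three cases, but on its own it says nothing about what follows the first segment, which is the part still missing.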

But then I'm stuck on checking whatever follows it (if there is anything).

I can't figure out how to do it.

Thanks a bunch for your help!

EDIT:

To be clearer, this is what I'm looking to achieve:

  • catalog ==> MATCH + captures "catalog" in group 1
  • catalog/chapter ==> MATCH + captures "catalog" in group 1
  • catalog/ ==> NO MATCH
  • catalog/chapter/ ==> NO MATCH
  • catalog/chapter/anything.any ==> NO MATCH
  • catalog/chapter/anything.any4 ==> NO MATCH
  • catalog/chapter/anything.an ==> NO MATCH
strem

1 Answer

You can transform your conditions into negative lookaheads:

^([^/]+)(?!.*\.\w+$)(?!.*/$).*$
        ^---- 1----^^-- 2--^

See a regex demo

The first lookahead fails the match when the URL ends with a file extension (\.\w+$ requires a dot followed by one or more word characters at the end of the string). The second lookahead fails the match if the URL ends with a /.

Note: escaping / might not be necessary in your environment if you are not using regex delimiters.
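As a quick sanity check, the pattern can be run against the test cases from the question. This is only a sketch in Python, assuming the `re` module behaves like the IIS regex engine for these constructs; `captured_segment` and `CASES` are hypothetical names used for illustration.

```python
import re

# The pattern from the answer: capture the first segment, then reject URLs
# that end with a file extension or with a trailing slash.
PATTERN = re.compile(r"^([^/]+)(?!.*\.\w+$)(?!.*/$).*$")

# Expected group 1 for each URL from the question (None means "no match").
CASES = {
    "catalog": "catalog",
    "catalog/chapter": "catalog",
    "catalog/": None,
    "catalog/chapter/": None,
    "catalog/chapter/anything.any": None,
    "catalog/chapter/anything.any4": None,
    "catalog/chapter/anything.an": None,
}

def captured_segment(path):
    """Return group 1 if the URL should be redirected, else None."""
    m = PATTERN.match(path)
    return m.group(1) if m else None

for path, expected in CASES.items():
    result = captured_segment(path)
    status = "ok" if result == expected else "UNEXPECTED"
    print(f"{path!r}: {result!r} ({status})")
```

One thing this sketch brings out: the lookaheads are evaluated after the first segment has been consumed, so a single-segment URL that is itself a file name (for example `index.html`) still matches. The question does not list such a case, so whether that matters depends on the site.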

Wiktor Stribiżew
  • Hi Wiktor, thanks for your help. Unfortunately, this is not exactly what I want. I want to: 1 - capture the catalog name, which is either the whole URL when there is no trailing slash, as in http://example.com/catalog, or everything before the first slash, as in http://example.com/catalog/anything/behind; 2 - if there's anything after /catalog, check whether it's a file or not. If it's not a file and has no trailing slash, then redirect. – strem Feb 12 '16 at 08:12
  • Please check `^([^/]+)(?!.*\.[^.]+$).*[^/]$` – Wiktor Stribiżew Feb 12 '16 at 09:19
  • Close, but it still matches `catalog/test/index.html`, which I don't want: that URL is already correct and doesn't need to be redirected. – strem Feb 12 '16 at 09:46
  • The [`^([^/]+)(?!.*\.[^.]+$).*[^/]$`](https://regex101.com/r/sU0aV5/2) cannot match `catalog/test/index.html`: that URL ends with an extension, so the lookahead rejects it. Also, I think the lookahead should be a bit more precise, like `^([^/]+)(?!.*\.[^/.]+$).*[^/]$`. – Wiktor Stribiżew Feb 12 '16 at 10:01
  • I finally found it! `^([^/]+)(?!.*\.\w{1,4}$)(?!.*\/$).*$` Thanks for your help, Wiktor! – strem Feb 12 '16 at 11:13
  • With that one, you limit to extensions from 1 to 4 characters. I updated the answer. The only difference between `^([^/]+)(?!.*\.\w+$)(?!.*/$).*$` and `^([^/]+)(?!.*\.[^/.]+$).*[^/]$` is that the latter requires at least one character to be in the URL after the first capture group. You have not provided a test case for that scenario, so I was not paying attention to that aspect. – Wiktor Stribiżew Feb 12 '16 at 11:23
  • I shortened the answer to include the relevant details only. Please consider accepting/upvoting if it worked/turned out helpful to you. – Wiktor Stribiżew Feb 12 '16 at 11:30
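
The difference between the two patterns discussed in the comments above shows up on single-segment URLs. The following is a small sketch in Python, assuming `re` backtracks the same way as the IIS engine; `group1` is a made-up helper name.

```python
import re

# Pattern from the answer, using two negative lookaheads.
TWO_LOOKAHEADS = re.compile(r"^([^/]+)(?!.*\.\w+$)(?!.*/$).*$")
# Pattern from the comments, which ends with a mandatory non-slash character.
TRAILING_CHAR = re.compile(r"^([^/]+)(?!.*\.[^/.]+$).*[^/]$")

def group1(pattern, path):
    """Return what group 1 captures, or None if the pattern does not match."""
    m = pattern.match(path)
    return m.group(1) if m else None

for path in ["catalog/chapter", "catalog", "c"]:
    print(path, group1(TWO_LOOKAHEADS, path), group1(TRAILING_CHAR, path))
```

Because `.*[^/]$` needs at least one character after the capture group, the second pattern gives back the last character of a lone segment: for `catalog` it captures `catalo`, and for a one-character URL it does not match at all. With a second segment present, both patterns capture `catalog`.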