
I've been trying to remove repeated (multiple) slashes from all URLs. For example, I want the following URLs to redirect:

  • http://example.com///test -> http://example.com/test
  • http://example.com//test -> http://example.com/test
  • http://example.com/test -> http://example.com/test

I've tried using the following RedirectMatch rule:
RedirectMatch 301 ^//+(.*)$ http://example.com/$1
However this does not do anything at all. The page does not redirect, and the URL - with the multiple slashes - appears in the server log.

I've also tried using the following RewriteRule rule:
RewriteRule ^//+(.*)$ http://example.com/$1 [R=301,L]
But this does not do anything either.


The odd thing is that the rule:
RedirectMatch 301 ^//*(.*)$ http://example.com/$1
does work as expected, meaning it redirects all URLs, even the correct ones. But as soon as I change the expression to ^//+(.*)$ it stops matching against anything.


The output of httpd -v is:

Server version: Apache/2.4.37 (centos)
Server built:   Oct  7 2019 21:42:02

And I'm running on CentOS 8.


Any help with this would be greatly appreciated!

MrWhite
user2370460

1 Answer


The easiest way to reduce multiple slashes anywhere in the URL-path is to rely on the "fact" that the URL-path the mod_rewrite RewriteRule pattern matches against has already been "processed", with multiple slashes reduced to single slashes. A separate condition that checks THE_REQUEST can then determine whether multiple slashes existed in the initial request.

Try the following, before any existing rewrites/redirects:

RewriteEngine On

RewriteCond %{THE_REQUEST} \s[^?]*//
RewriteRule ^/?(.*) /$1 [R=301,L]

The preceding condition (RewriteCond directive) checks the THE_REQUEST server variable (which contains the raw first line of the HTTP request and is not modified in any way) to determine whether 2 or more consecutive slashes are present in the URL-path.

For a request like http://example.com///test, THE_REQUEST would contain a string of the form:

GET ///test HTTP/1.1

The captured backreference ($1) already has multiple slashes reduced (as mentioned above), so the redirect target is the cleaned-up URL-path.
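To make the interplay between the condition and the rule more concrete, here is a rough Python model of it. This is only a sketch under a few assumptions: Python's re module stands in for Apache's PCRE, the function name redirect_target is made up for illustration, and the slash reduction that Apache performs internally is imitated with a simple substitution on the path.

```python
import re

# Same pattern as the CondPattern above: whitespace, then any run of
# non-"?" characters, then two slashes (i.e. "//" before the query string).
COND = re.compile(r"\s[^?]*//")


def redirect_target(request_line):
    """Return the redirect URL-path, or None when no redirect is needed.

    request_line imitates THE_REQUEST, e.g. "GET ///test HTTP/1.1".
    (Hypothetical helper, not part of Apache.)
    """
    if not COND.search(request_line):
        return None  # condition fails, so the rule is skipped
    # Pull the raw request target out of the request line.
    target = request_line.split(" ")[1]
    path, sep, query = target.partition("?")
    # Imitate the slash reduction: collapse runs of slashes in the path only.
    path = re.sub(r"/{2,}", "/", path)
    # The query string is passed through unchanged, as mod_rewrite does
    # by default when the substitution has no query string of its own.
    return path + sep + query


print(redirect_target("GET ///test HTTP/1.1"))       # /test
print(redirect_target("GET /test HTTP/1.1"))         # None
print(redirect_target("GET /a//b?x=1//2 HTTP/1.1"))  # /a/b?x=1//2
```

Note that the second request produces no redirect at all, which is what prevents a loop once the client follows the 301 to the cleaned-up URL.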


UPDATE:

I initially used the regex //+ (although // would have been sufficient) in the preceding CondPattern. However, this would also match multiple slashes in the query string (if any), not just in the URL-path. Since the rule that follows only reduces slashes in the URL-path, the condition would still match after the redirect, which could have resulted in a redirect loop if multiple slashes did occur in the query string.

The regex \s[^?]*//, on the other hand, only matches multiple slashes that occur before the first ?. So it only matches against the URL-path, not the query string.
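The difference between the two CondPatterns can be checked outside Apache. The following is a small sketch that uses Python's re module as a stand-in for PCRE; the two request lines are made-up examples and the helper name matches is hypothetical.

```python
import re

OLD_COND = re.compile(r"//+")        # the CondPattern used initially
NEW_COND = re.compile(r"\s[^?]*//")  # the updated CondPattern


def matches(pattern, request_line):
    """True if the pattern is found anywhere in the request line."""
    return pattern.search(request_line) is not None


# Multiple slashes in the URL-path: both patterns should trigger.
path_case = "GET /a//b HTTP/1.1"
# Clean URL-path, but "//" appears in the query string.
query_case = "GET /a/b?next=http://example.org/ HTTP/1.1"

for line in (path_case, query_case):
    print(line, "| old:", matches(OLD_COND, line), "| new:", matches(NEW_COND, line))
```

For the second request line the old pattern still matches even though there is nothing left to fix in the URL-path, which is the loop scenario described above. The updated pattern does not match it.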

I also modified the RewriteRule directive from the "simpler" RewriteRule (.*) $1 so that it works in any context. The previous rule was fine when used directly in the <VirtualHost> container, as per the question.


A quick look at your attempts:

I've tried using the following RedirectMatch rule:
RedirectMatch 301 ^//+(.*)$ http://example.com/$1
However this does not do anything at all. The page does not redirect, and the URL - with the multiple slashes - appears in the server log.

This will never match multiple slashes because the URL-path that the RedirectMatch directive matches against has already been processed to reduce multiple slashes.

I've also tried using the following RewriteRule rule:
RewriteRule ^//+(.*)$ http://example.com/$1 [R=301,L]
But this does not do anything either.

This fails for the same reason as above. And it is this "feature" that the answer above relies on to reduce multiple slashes in the request.

The odd thing is that the rule:
RedirectMatch 301 ^//*(.*)$ http://example.com/$1
does work as expected, meaning it redirects all URLs, even the correct ones. But as soon as I change the expression to ^//+(.*)$ it stops matching against anything.

Because the regex //* matches a single slash and there is always a single slash at the start of the URL-path that the RedirectMatch directive matches against. Specifically, the regex //* matches a single slash followed by 0 or more slashes (as denoted by the * quantifier). But since every URL matches, including the redirect target itself, this results in an endless redirect-loop.

Whereas the regex //+ matches at least 2 slashes. Specifically, a single slash followed by 1 or more slashes (as denoted by the + quantifier). So, for the reason mentioned above, it will not match, so no redirect occurs.
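The behaviour of the two quantifiers is easy to reproduce in isolation. This sketch again uses Python's re module in place of Apache's regex engine, and assumes (as described above) that the directive only ever sees the already-reduced URL-path; the helper name captured is hypothetical.

```python
import re

STAR = re.compile(r"^//*(.*)$")  # one slash, then ZERO or more slashes
PLUS = re.compile(r"^//+(.*)$")  # one slash, then ONE or more slashes


def captured(pattern, url_path):
    """Return what $1 would hold, or None if the pattern does not match."""
    m = pattern.match(url_path)
    return m.group(1) if m else None


reduced = "/test"  # what RedirectMatch matches against
raw = "///test"    # what the client actually sent (shown for contrast only)

print(captured(STAR, reduced))  # test  -> always matches, hence the loop
print(captured(PLUS, reduced))  # None  -> never matches, hence no redirect
print(captured(PLUS, raw))      # test  -> would match, but Apache never passes this form
```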

MrWhite
  • I'll update this answer later as there are still some potential issues with this. I'll also explain your earlier attempts. – MrWhite Dec 04 '19 at 16:14
  • So this does indeed work, however I'm a little uneasy about inspecting `THE_REQUEST`. I tried it with `REQUEST_URI` but this did not work - so I assume this has also had the multiple slashes reduced? I'd be curious to know more about when the reduction of multiple slashes actually happens, because they are present in the log. – user2370460 Dec 04 '19 at 17:23