I am currently using a RewriteMap directive inside my vhost to redirect a list of 800 URLs. It works quite well:

RewriteEngine On
RewriteMap redirects dbm=db:/data/apps/project/current/configuration/etc/httpd/conf/redirects.db
RewriteCond ${redirects:$1} !=""
RewriteRule ^(.*)$ ${redirects:$1} [redirect=permanent,last]

The mapping lives in a redirects.txt file, which is then converted to a db file:

httxt2dbm -f db -i /data/apps/project/current/configuration/etc/httpd/conf/redirects.txt -o /data/apps/project/current/configuration/etc/httpd/conf/redirects.db

For example, this kind of entry works fine:

/associations/old_index.php /

But when the URL contains spaces, it doesn't work (I suppose it will be the same with other special characters):

/Universités%20direct   /

How can I handle this case?

COil
  • try `/Universités\ direct` – user9517 Sep 29 '16 at 13:31
  • @Hanginoninquietdesperation Doesn't work, remember that the redirect is itself in the db file not in the vhost. – COil Sep 29 '16 at 13:46
  • The URL-path matched by the `RewriteRule` _pattern_ is %-decoded, so @Hanginoninquietdesperation is certainly on the right track I would say. Try surrounding the value in double quotes. ie. `"/Universités direct" /`? – MrWhite Sep 29 '16 at 13:49
  • Same result with the quotes. – COil Sep 29 '16 at 16:10

3 Answers

You can add a second rewrite map that uses the internal function 'escape', which turns spaces into %20, and apply it to the key before the lookup:

RewriteMap ec int:escape

RewriteMap redirects dbm=db:/data/apps/project/current/configuration/etc/httpd/conf/redirects.db

RewriteCond ${redirects:${ec:$1}} !=""

RewriteRule ^(.*)$ ${redirects:${ec:$1}} [redirect=permanent,last]

Then in your own rewrite map db you can have:

/Universités%20direct /

This should then match.
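Because the lookup key is escaped before it reaches the map, the keys in redirects.txt have to be written in the same escaped form. A minimal sketch of generating such lines follows. It assumes that `int:escape` percent-encodes non-ASCII bytes (such as the `é`) as well as spaces, which is not confirmed in this thread. The exact set of escaped characters and the case of the hex digits that Apache emits may differ from Python's `quote`, so compare the output against the rewrite log before relying on it. The helper name `map_line` is hypothetical.

```python
from urllib.parse import quote


def map_line(old_path: str, target: str) -> str:
    """Build one 'key value' line for a text rewrite map.

    The key is percent-encoded as UTF-8, leaving "/" untouched.
    Assumption: this approximates what int:escape does to the lookup key.
    """
    return f"{quote(old_path, safe='/')} {target}"


if __name__ == "__main__":
    for old_path in ["/associations/old_index.php", "/Universités direct"]:
        print(map_line(old_path, "/"))
```

The resulting file can then be passed to `httxt2dbm` as before.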

Dave M
Chris Lawton
  • Interesting, didn't know about this `escape` function. I'll give a try. thanks. – COil Jan 04 '18 at 11:27
  • imho this is the best answer. Way more robust than the other answers and minimal changes (just insert the escape map at the right position) if you already have your rules setup to work with urls without special characters. – Tom Cannaerts Jun 26 '18 at 11:08

You can solve this by extracting the encoded URI from the %{THE_REQUEST} variable and using that to do the lookup. The map then needs to contain the encoded URIs, of course. Something like the following:

RewriteEngine On
RewriteMap redirects dbm=db:/data/apps/project/current/configuration/etc/httpd/conf/redirects.db
RewriteCond %{THE_REQUEST} "\w+ ([^ ]+)"
RewriteRule ^ - [E=MYVAR:%1]

RewriteCond ${redirects:%{ENV:MYVAR}} !=""
RewriteRule ^ ${redirects:%{ENV:MYVAR}} [B,redirect=permanent,last]

I've only tested it with a text-based map rather than the DB one, though. It will probably need modification if you have to deal with URLs that carry query strings.
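To illustrate the query-string caveat, here is a small sketch of what a pattern of this shape captures from the request line. It is only a simulation in Python, not Apache itself. The pattern is a variant of the one in the `RewriteCond` above that also stops at `?`, and the function name `lookup_key` is hypothetical.

```python
import re
from typing import Optional

# Variant of the RewriteCond pattern: capture the request target up to the
# first space or "?", so a query string does not become part of the map key.
REQUEST_LINE = re.compile(r"^\w+ ([^ ?]+)")


def lookup_key(request_line: str) -> Optional[str]:
    """Return the still-encoded URL-path from a raw HTTP request line."""
    match = REQUEST_LINE.match(request_line)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(lookup_key("GET /Universit%C3%A9s%20direct?page=2 HTTP/1.1"))
```

The equivalent condition would presumably be `RewriteCond %{THE_REQUEST} "^\w+ ([^ ?]+)"`, though that form is untested here.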

Unbeliever

A workaround might be to internally rewrite URLs so that each space becomes a hyphen, and to include the hyphenated URL in your rewrite map instead.

If you only have some URLs that contain a single space within the URL then you could use something like the following directive before your existing directives:

RewriteRule ^(.+)\s(.+)$ $1-$2

And then use the following in your rewrite map:

/Universités-direct /

UPDATE: If you have some URLs that contain two spaces (e.g. /the force awakens) and some with one space, then you could add an additional rule:

RewriteRule ^(.+)\s(.+)\s(.+)$ $1-$2-$3
RewriteRule ^(.+)\s(.+)$ $1-$2

These rules do assume that no URL ends with a space and that no URL has more than one contiguous space.

If some URLs contain three spaces, then add another rule before the above:

RewriteRule ^(.+)\s(.+)\s(.+)\s(.+)$ $1-$2-$3-$4

I would tend to do this with multiple (simple) rules, rather than a generic "convert everything in a single rule", unless you specifically need that. A generic rule will run recursively, reducing multiple spaces to a single character. You will also likely need additional flags (i.e. DPI) to prevent a known rewrite bug in Apache.
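The behaviour of these per-space-count rules can be checked outside Apache with a rough simulation. The sketch below applies the three patterns in order, most spaces first, with every captured group carried into the substitution. It assumes Python's `re` treats these simple patterns the same way as Apache's regex engine, and the function name `apply_rules` is hypothetical.

```python
import re

# (pattern, substitution) pairs in the order the directives would appear:
# the rule for the most spaces comes first.
RULES = [
    (r"^(.+)\s(.+)\s(.+)\s(.+)$", r"\1-\2-\3-\4"),  # three spaces
    (r"^(.+)\s(.+)\s(.+)$", r"\1-\2-\3"),            # two spaces
    (r"^(.+)\s(.+)$", r"\1-\2"),                      # one space
]


def apply_rules(path: str) -> str:
    """Run each rule once, in order, feeding the result to the next rule."""
    for pattern, substitution in RULES:
        match = re.match(pattern, path)
        if match:
            path = match.expand(substitution)
    return path


if __name__ == "__main__":
    for path in ["/Universités direct", "/the force awakens", "/a  b"]:
        print(repr(path), "->", repr(apply_rules(path)))
```

The last input has two contiguous spaces and still contains a space afterwards, which is why the rules assume no URL has more than one space in a row.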

MrWhite
  • We are near. It works but only when there is only one space in the URL. The regexp (.+)\s(.+) must be tuned. Thanks. – COil Sep 30 '16 at 08:39
  • In this post there is the clue: https://stackoverflow.com/questions/5821120/301-redirect-to-replace-all-spaces-to-hyphens but I can't make it work: `RewriteRule ^([^\s%20]*)[\s%20]+(.*)$ $1-$2 [E=NOSPACE:1]` – COil Sep 30 '16 at 09:39
  • Do you _need_ a general solution that converts all spaces? Otherwise, I would keep it specific to your situation. The `%20` above doesn't seem to make sense in that context... the `RewriteRule` pattern matches the %-decoded URL anyway, but when used in a character class `%20` will match the literal characters `%`, `2` and `0` - which is not desirable. Also note that these general solutions will reduce multiple contiguous spaces to a single char. (I've updated my answer to handle two or three spaces.) – MrWhite Sep 30 '16 at 11:20