We use pretty URLs on our site. Some years ago I had an external technician add back links. He did a great job, but in one case he consistently added a link with a trailing space character.

https://www.example.com/item/item/%20

This has been indexed with the %20, and my back link reports show 87 sites pointing to the URL with %20 at the end.

If I can redirect this, my page /item/item/ would gain 87 back links.

We use rewrite rules, and I have tried every solution here on Stack Overflow, but none has worked. Some of the non-working attempts:

RewriteEngine on
RewriteRule ^(.*[^\ ])\ +$ /$1

RedirectRule (.*)\s$ $1 [R=301]

RewriteRule ^(.*/|)[\s%20]+(.+)$ $1$2 

I have also tried Redirect 301 directives, but these don't work either.

redirect 301 /item/item/%20 /item/item/

redirect 301 /item/item/+ /item/item/

Some details that may help: this is not a site-wide pattern. It is just one particular URL that was propagated out into the world incorrectly. The space is never anywhere else in the string; it is always at the end.

Thanks.

It would also work fine for me to convert the trailing %20 to a known character such as -, because I could then redirect /item/item/- to /item/item/.

anubhava
user35546
  • How is your server currently responding to these requests with a trailing _space_? A 403 Forbidden? How are these URLs routed? Is `/item/item/` entirely virtual or does it relate in some way to the filesystem? – MrWhite Jan 17 '20 at 00:21
  • It goes to our 404 page. To be clear, the incoming link looks like /item/item/%20 and this is what is recorded on our site. I think what confuses me is that htaccess rules work with the space translated, and in the htaccess syntax, a space is a delimiter. Also normal URL encoding like + is not used either. – user35546 Jan 18 '20 at 18:12
  • "htaccess rules work with the space translated" - Not necessarily. The URL-path matched by the `RewriteRule` _pattern_ is %-decoded, however, not all server variables are. "a space is a delimiter" - if the argument contains a space then you can surround the entire argument in double quotes (or backslash escape the space - as you have done, or use `\s` shorthand character class in regex). "normal URL encoding like + is not used either" - The `+` (encoded space) only applies to the query string part of the URL. In the URL-path, a `+` is a literal `+` (plus). – MrWhite Jan 18 '20 at 21:27
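
The three ways of writing a literal space that the last comment mentions might look like the following sketch. It assumes the rules live in a .htaccess file in the document root (so the matched path has no leading slash) and that /item/item/ is the only affected URL. The three rules are equivalent alternatives, so only one would be needed in practice.

```apache
RewriteEngine On

# Option 1: quote the whole pattern so the space is not read as an argument delimiter
RewriteRule "^item/item/ $" /item/item/ [R=301,L]

# Option 2: backslash-escape the space
RewriteRule ^item/item/\ $ /item/item/ [R=301,L]

# Option 3: use the \s whitespace class (also matches tabs and similar characters)
RewriteRule ^item/item/\s$ /item/item/ [R=301,L]
```

All three match the %-decoded path, which is why the pattern contains a space rather than %20.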

1 Answer

You can use this rule as your topmost rule, just below the RewriteEngine On line:

RewriteEngine On

RewriteRule ^(.*)(?:\s|\x20)+$ /$1 [L,NE,R=301]
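
Because the question says only one URL is affected, a narrower variant may be preferable to a site-wide rule. The sketch below is an assumption rather than part of the answer: it checks THE_REQUEST, which holds the raw request line (for example `GET /item/item/%20 HTTP/1.1`) and is therefore not %-decoded, so the encoded form %20 is matched literally.

```apache
RewriteEngine On

# Match only the raw, still-encoded request for /item/item/%20,
# optionally followed by a query string
RewriteCond %{THE_REQUEST} \s/item/item/%20[?\s]
# Redirect permanently to the clean URL
RewriteRule ^ /item/item/ [R=301,L]
```

Testing with R=302 first and switching to 301 once the redirect behaves as expected avoids browsers caching a faulty permanent redirect.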
anubhava