For example, “å” can be encoded as /%E5 (ISO-8859-1) and /%C3%A5 (UTF-8). All my filenames are UTF-8, so the ISO-8859-1 variants return a 404. I want both variants to work.

I have tried rewriting the incorrect URLs to the correct encodings with variations of the below configuration. I have not been able to actually match the locations so have not gotten anywhere.

rewrite ^/%E5$ /%C3%A5 permanent;
rewrite ^/%25E5$ /%25C3%25A5 permanent;
location = /%E5 { return 301 /%C3%A5; }

How am I supposed to match these percent-encoded locations?

Daniel

1 Answer

See here for the same issue in Apache - there, I recommended using an external program to handle the rewrite, since it's clunky to do it in native configuration.

For nginx, the best approach might be to embed some Perl in your configuration via ngx_http_perl_module: use perl_set to set a variable to a UTF-8-ified version of $r->uri, using the Encode module (see here), and rewrite (or, probably better, try_files) to that.
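
A rough sketch of what that could look like, untested and under a few assumptions: nginx is built with ngx_http_perl_module, Perl's Encode module is installed, and any request path that is not valid UTF-8 is ISO-8859-1. The variable name $utf8_uri is made up for illustration.

```nginx
# Goes in the http {} context. Avoid apostrophes inside the Perl code,
# since the whole sub is wrapped in a single-quoted nginx string.
perl_set $utf8_uri '
sub {
    use Encode ();
    my $r   = shift;
    my $uri = $r->uri;   # request URI as raw bytes, percent-escapes already decoded

    # Strict decoding can modify its input, so test a copy.
    my $copy = $uri;
    # Bytes that already form valid UTF-8 (including plain ASCII) pass through.
    return $uri if eval { Encode::decode("UTF-8", $copy, Encode::FB_CROAK); 1 };

    # Otherwise assume ISO-8859-1 and transcode to UTF-8 bytes.
    return Encode::encode("UTF-8", Encode::decode("ISO-8859-1", $uri));
}
';

server {
    location / {
        # Try the path as requested first, then its UTF-8 version.
        try_files $uri $utf8_uri =404;
    }
}
```

With try_files the file is served under the URL that was requested; if you would rather send clients to the canonical UTF-8 URL, a redirect is the alternative.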

Edit:

If you want to just do this by hand for specific URLs or specific characters, then you're just missing a couple things in your attempts:

  • You're working against escaped URLs, while the rewrite should be happening against the string after decoding escaped characters
  • You're hardcoding the matches to be for files that are just the special character, not files that contain the special character

Try something like this:

rewrite (*UTF8)^(.*)\xe5(.*)$ $1å$2;
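
For just one or two addresses, a regex location with a redirect may be enough. This is a hedged variation rather than a tested config: it assumes the regex is compiled without UTF-8 mode, so that \xe5 matches the single raw byte that /%E5 decodes to.

```nginx
# nginx matches locations against the decoded URI, so the ISO-8859-1
# request /%E5 is seen here as "/" followed by the byte 0xE5.
location ~ ^/\xe5$ {
    # Redirect to the UTF-8 spelling of /å. The target decodes to two
    # bytes (0xC3 0xA5), so it should not match this location again.
    return 301 /%C3%A5;
}
```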
Shane Madden
  • Can I achieve this on a smaller scale for one or two URLs? Seems like an over-engineered solution for just one or two addresses. – Daniel Jan 05 '15 at 18:23
  • @Aeyoun Your question is certainly worded like you want a general solution, not just for specific addresses. Edited the answer. – Shane Madden Jan 05 '15 at 18:48
  • Thanks for broadening the answer. `rewrite (*UTF8)^(.*)\xe5(.*)$ /elsewhere;` triggers a 500 response. – Daniel Jan 08 '15 at 18:31
  • @Aeyoun What's logged for the 500 response? What about if you make it a redirecting rewrite with `permanent`? – Shane Madden Jan 08 '15 at 18:46
  • I needed to `apt-get install libpcre3 libpcre3-dev` first. All my other regexes worked, so I did not suspect that was the problem. After doing that, almost all my previous attempts also started working. – Daniel Jan 09 '15 at 01:25
  • extra callout for the *why* of `\x`: `\x` is the regex escape sequence, and thus the regex-equivalent of `%` in URLs. (I think this deserves attention because [it stumped me in apache too](https://serverfault.com/q/1036007/303501) ) – Jules Kerssemakers Oct 16 '20 at 07:19