I have the following location block in my server configuration:

location ~ ^/media/(.+)/(.+)$ {
    error_log /home/user/Server/nginx/logs/error.log;
    access_log /home/user/Server/nginx/logs/access.log;
    proxy_pass https://bucketname.sgp1.digitaloceanspaces.com/$1/$2;
}

This works fine when the filename matched by the second capture group contains only English (ASCII) characters. But when the filename contains Unicode Hindi characters, the request returns 400 Bad Request.

In my access log it looks like this: [two screenshots of the access log, not reproduced here]

How can I fix this so that filenames containing Unicode characters are passed correctly to the DigitalOcean Spaces server?

Ivan Shatsky
Future King

1 Answer

You get the HTTP 400 Bad Request error because non-ASCII characters in a URI must be percent-encoded according to RFC 3986. nginx, however, matches locations against the normalized URI (partly so that internationalized filenames can be matched on the local filesystem), and one of the normalization steps is decoding percent-encoded characters; see the location directive documentation for details. The $1 and $2 captures therefore contain the decoded filename, and proxy_pass sends it upstream as is. nginx has no built-in way to re-encode those captured URI parts, although some third-party modules can do the job. The original, non-normalized request URI is still available in the $request_uri variable, so you can extract the needed part from it with the following map block (it must be placed outside the server block, at the http configuration level):

map $request_uri $mediafile {
    ~^/media/([^/?]+/[^/?]+)(?:$|\?)  $1;
}
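
To check what this pattern captures before touching the server, the regular expression can be tried outside nginx. The following is only a rough sketch in Python: the helper name mediafile and the sample URIs are made up for illustration, and it assumes the pattern behaves the same in Python's re module as in the PCRE engine nginx uses (which should hold for a pattern this simple).

```python
import re

# Same pattern as in the map block above.
MEDIA_RE = re.compile(r"^/media/([^/?]+/[^/?]+)(?:$|\?)")

def mediafile(request_uri: str) -> str:
    """Mimic the map: return capture group 1, or an empty string
    (the map default when nothing matches)."""
    m = MEDIA_RE.search(request_uri)
    return m.group(1) if m else ""

# Hypothetical raw request URIs, as they would appear in $request_uri.
print(mediafile("/media/photos/%E0%A4%A8.jpg"))      # percent-encoding is kept
print(mediafile("/media/photos/%E0%A4%A8.jpg?v=1"))  # query string is dropped
print(repr(mediafile("/media/photos/a/b.jpg")))      # three segments: no match
```

Note that the query string is not part of the captured value, so it is not forwarded unless you append it yourself.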

Then inside your server block you can use the following location instead (no need to use capture groups here anymore):

location ~ ^/media/[^/]+/[^/]+$ {
    ...
    proxy_pass https://bucketname.sgp1.digitaloceanspaces.com/$mediafile;
}
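
To see why the raw URI matters, here is a small Python sketch of the difference between the two forms. The filename is a made-up example, and unquote only approximates the decoding step of nginx's normalization (nginx also resolves relative path components and can merge slashes).

```python
from urllib.parse import quote, unquote

filename = "नमस्ते.jpg"  # hypothetical Hindi filename

# What a client puts on the wire, and what $request_uri holds.
raw_uri = "/media/photos/" + quote(filename)

# Roughly what a regex location matches against after decoding.
normalized_uri = unquote(raw_uri)

print(raw_uri)         # ASCII only, percent-encoded
print(normalized_uri)  # contains the raw Unicode characters again
```

Captures taken from the decoded form carry the non-ASCII characters, which is what the upstream rejected in the original configuration; a value taken from the raw form stays percent-encoded.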
Ivan Shatsky