16

In order to mirror a whole website as static HTML, I would like to convert URLs like http://example.com/script.php?t=12 to http://example.com/script.php_t=12.

Notice that the ? in the URL is converted to _.

This will allow nginx or apache to serve these files from disk as the raw HTML we obtained and saved from wget -- one file for each URL -- rather than as a PHP file.

Is it possible to do so via Nginx URL rewriting?

Jeff Atwood
  • 13,104
  • 20
  • 75
  • 92
Arpit Jalan
  • 163
  • 1
  • 1
  • 7
  • It's certainly possible, but really weird. What do you want it for? – Alexey Ten Mar 10 '15 at 09:20
  • 2
    One problem with this approach is that multiple GET parameters in an URL can be in any order, and when you do this conversion, you will change the semantics of the URL. – Tero Kilkanen Mar 10 '15 at 10:05
  • its for archiving an old forum in static html, but yeah having this work with `http://example.com/script.php?a=1&t=3` is going to need some super fancy rewrite action – Sam Saffron Mar 10 '15 at 13:06
  • 1
    @tero the querystring URLs are, in practice, always in the same order. So this is a non issue. – Jeff Atwood Mar 10 '15 at 20:08
  • Do you know the possible arguments? It always `t`? Also, doesn't http://serverfault.com/questions/321225/rewriting-a-query-string-part-as-a-path-part-using-nginx answer this? – chx Mar 12 '15 at 08:24
  • 1
    @chx it's for archiving an old forum, so the argument can be other than `t` like `f`, `u`, etc. – Arpit Jalan Mar 12 '15 at 08:28
  • https://twitter.com/edogawa_c/status/575935975827755009 – Egalitarian Mar 12 '15 at 08:31

6 Answers

15

I got this working using try_files:

location / {
    try_files "${uri}_${args}" =404;
}

This will try to find a file on disk named after the pattern you provided with a "_" instead of the "?".

Further configuration depends on how you saved static files like images or stylesheets. You can add a fallback that tries to read them from disk without the query string, like so:

location / {
    try_files "${uri}_${args}" $uri =404;
}
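
Putting this together with the comment below about `location ~ \.php$`, a fuller server block might look like the following. This is only a sketch: the `root` path is an assumed location for the wget output, and `default_type text/html` is my addition, on the assumption that the saved files keep names like `script.php_t=12`, whose extension nginx will not find in its MIME type table.

```nginx
server {
    listen 80;
    server_name example.com;      # host name taken from the question's URLs
    root /var/www/mirror;         # assumed directory holding the wget output

    # Archived pages: look for "<uri>_<args>" on disk, then the bare URI.
    location ~ \.php$ {
        # Names such as "script.php_t=12" have no known extension,
        # so tell nginx to send them as HTML.
        default_type text/html;
        try_files "${uri}_${args}" $uri =404;
    }

    # Images, stylesheets and other assets are served as saved.
    location / {
        try_files $uri =404;
    }
}
```

With this layout, a request for `/script.php?t=12` is answered from the file `script.php_t=12` under the assumed root, and anything that is not on disk gets a plain 404.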
Matthias Bayer
  • 776
  • 3
  • 7
  • 1
    very interesting approach, would this cause a potential disk read and "file not found" on all potential URLs though? – Jeff Atwood Mar 12 '15 at 09:00
  • From [try_files](http://nginx.org/en/docs/http/ngx_http_core_module.html#try_files): "Checks the existence of files in the specified order and uses the first found file for request processing". So this will not cause any additional disk reads for the URLs in your question. – Matthias Bayer Mar 12 '15 at 09:07
  • 1
    ok, if we put it in `location ~ \.php$` we can get `try_files` to work, but it does not work in `location /` – Jeff Atwood Mar 12 '15 at 09:57
  • +1 for example of using curly braces :) – Danila Vershinin Feb 21 '19 at 10:58
4

Something along these lines:

location ~ \.php$ {
  # only rewrite URLs with args
  if ($args != '') {
    rewrite .* "${uri}_${args}?" last;
  }
}
  • You left out the `?` my answer used. – chx Mar 12 '15 at 08:43
  • You're correct with regard to ?, as for your answer—it took me some time to test it on real server, so I haven't seen yours until I posted mine. And yours will rewrite all URL's even without arguments to variant with "_" what may be undesirable. – Max Gashkov Mar 12 '15 at 08:49
  • Solution from Matthias with try_files is actually more preferable to this. – Max Gashkov Mar 12 '15 at 08:54
  • we can do it without the if, when we specify a more strict `.*` clause in the first rewrite param. This is hugely helpful! – Jeff Atwood Mar 12 '15 at 09:02
  • @JeffAtwood I don't think that's possible—nginx rewrite (as well as location) patterns should apply to part of an URL before query string only. – Max Gashkov Mar 12 '15 at 11:03
  • It does work, we are testing that the matched files are ending in `.php` which is part of the path. Though you're right, a file ending in php could still have no params. – Jeff Atwood Mar 12 '15 at 21:35
1

This works on nginx/1.6.2.

rewrite ^/.*\.php$ "${uri}_${args}";

But personally I'd use try_files solution with a fallback to an original URI if there is any.

try_files $uri "${uri}_${args}";

E.g. if you have script.php on disk it'll try with it first, and then, if there's none, it'll go for script.php_t=12. try_files needs a recent-enough version of nginx.

And if this is not enough, you can do this inside an if:

return 301 "${uri}_${args}";
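
For context, the `return` line above could sit in a block like the one below. This is a minimal sketch rather than a tested configuration; wrapping it in `location ~ \.php$` and testing `$args` for emptiness are assumptions borrowed from the other answers.

```nginx
location ~ \.php$ {
    # Redirect only when the request actually carries a query string.
    if ($args != "") {
        # Sends the client to e.g. /script.php_t=12; "return" does not
        # append the original query string to the target.
        return 301 "${uri}_${args}";
    }
}
```

The redirected URL no longer ends in `.php`, so it does not match this location again and can be served as a plain static file.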
sanmai
  • 531
  • 5
  • 19
  • I like this, but we can't get the try_files bit to actually work, whereas the rewrite *does* work. (well, if you add a `?` to the end of your rewrite there so the query params aren't added to it..) – Jeff Atwood Mar 12 '15 at 09:40
  • @JeffAtwood `try_files` is going to stop at `$uri` if there's a location block matching that, or a file for it (the request without get args) - is that the case? – AD7six Mar 12 '15 at 10:27
  • @JeffAtwood `?` has no visible effect on static files, you would only see an extra query in your access logs; if you care about it, sure add a question mark – sanmai Mar 13 '15 at 02:49
1

I don't think you'll be able to do this with vanilla nginx but if you are willing to install the Lua module for nginx (http://wiki.nginx.org/HttpLuaModule) you can do it.

server {
    server_name so.dev;
    listen 80;

    location / {

        root /tmp;

        rewrite_by_lua '
            local uri = ngx.var.uri
            local params = ngx.req.get_uri_args(0)

            for key, value in pairs(params) do
                uri = string.format("%s_%s=%s", uri, key, value)
            end

            ngx.req.set_uri(uri)
            ngx.req.set_uri_args({})
        ';

    }
}

Tested it locally and seems to do what you are looking for. If you want to keep other params separated by ampersands, change the rewrite_by_lua block to be

local uri = ngx.var.uri
local param_string = ""
local params = ngx.req.get_uri_args(0)
local separator = ""

for key, value in pairs(params) do
    param_string = param_string .. separator .. key .. "=" .. value
    separator = "&"
end

ngx.req.set_uri(uri .. "_" .. param_string)
ngx.req.set_uri_args({})
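
One caveat with both variants: Lua's `pairs()` does not guarantee iteration order, so with several parameters the generated name may not match the file that was saved. A possible way around this, sketched below, is to reuse the raw query string from `ngx.var.args` instead of rebuilding it. It assumes a version of the Lua module that supports `rewrite_by_lua_block`, and keeps the `root /tmp` from the example above.

```nginx
location / {
    root /tmp;

    rewrite_by_lua_block {
        -- Raw query string exactly as the client sent it (nil if absent).
        local args = ngx.var.args

        if args and args ~= "" then
            -- /script.php?a=1&t=3  ->  /script.php_a=1&t=3
            ngx.req.set_uri(ngx.var.uri .. "_" .. args)
            -- Drop the query string now that it is part of the path.
            ngx.req.set_uri_args("")
        end
    }
}
```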
c17r
  • 11
  • 2
0

The wiki says

If you specify a ? at the end of a rewrite then Nginx will drop the original $args (arguments).

So then rewrite ^ ${uri}_$args? last; should work.

chx
  • 1,705
  • 2
  • 16
  • 25
0

The answers suggested above should work. However, notice how sensitive your URLs become. Because nginx checks whether a file with that exact name exists on the server, any extra parameter will throw it off.

http://example.com/script.php?t=12 // works
http://example.com/script.php?t=12&_utm=twitter // does not work

My suggestion is to leave the URL as is and route it to the correct file with PHP. You have access to the t parameter.
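
If you would rather stay in nginx than route through PHP, which is a different approach from the one suggested here, the built-in `$arg_<name>` variables can pick out a single parameter and ignore the rest. The sketch below is hypothetical: it assumes `t` is the only parameter that identifies a page of `script.php`, and that the saved files should be sent as HTML.

```nginx
# Hypothetical: only the "t" parameter selects the archived page.
location = /script.php {
    default_type text/html;
    # /script.php?t=12&_utm=twitter  ->  file "script.php_t=12"
    try_files "${uri}_t=${arg_t}" =404;
}
```

Each script and parameter combination would need its own block like this, so it only pays off when the set of parameters is small and known.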

Ibu
  • 153
  • 1
  • 6