I am trying to archive an old WordPress blog of mine. So far, the best way I have found to do this is with wget. The problem is the names of the files it generates, such as "index.html?cat=3&paged=3.html".

When I open this file in my browser from a local drive, it works fine, and when I put it on my local Apache server, the page also works fine. But when I put it on another web server, going to "index.html?cat=3&paged=3.html" just serves "index.html".

I assume it does this because it thinks "?cat..." is some sort of argument list for index.html, but I don't understand why this happens on one server and not on another.

One workaround would be to replace the "?" with "_" in all the filenames and in the links inside the files. However, I am still curious which setting in the server configuration causes this to be handled differently. (He asked, secretly hoping this was something simple that could be dropped into an .htaccess file.)

Jono

2 Answers

The part after the "?" is called the query part of the URL. It usually carries variable names and values that the web server passes to an external script or program (CGI, servlet, etc.).
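
To make the path/query split concrete, here is a minimal sketch using Python's standard `urllib.parse`. The host `example.com` is a placeholder and not part of the original question. It shows how a URL parser divides the name at the "?", and that percent-encoding the "?" as `%3F` keeps the whole filename inside the path. Whether a given server then maps that path to the file on disk is not something this sketch can show.

```python
from urllib.parse import quote, urlsplit

# example.com is a placeholder host used only for illustration.
url = "http://example.com/index.html?cat=3&paged=3.html"
parts = urlsplit(url)

# Everything after the first "?" is parsed as the query, not as the file name.
print(parts.path)   # /index.html
print(parts.query)  # cat=3&paged=3.html

# Percent-encoding the file name turns "?" into %3F, so a parser sees one path.
filename = "index.html?cat=3&paged=3.html"
encoded = quote(filename)
print(encoded)      # index.html%3Fcat%3D3%26paged%3D3.html

encoded_parts = urlsplit("http://example.com/" + encoded)
print(encoded_parts.path)   # /index.html%3Fcat%3D3%26paged%3D3.html
print(encoded_parts.query)  # (empty string)
```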

This is only a guess, but Apache may separate out the query part only when the path part maps to a directory for which scripting (CGI etc.) is enabled, for example via ScriptAlias or +ExecCGI.

RedGrittyBrick
  • Thanks for the idea. I tried a bunch of different CGI and PHP config options in the .htaccess file, but I could not find one that affected the URL arguments. – Jono Oct 25 '10 at 17:43

In the end I just renamed all the files by replacing the "?" with "_". So "index.html?cat=3&paged=3.html" became "index.html_cat=3&paged=3.html".

Then I had to update all the links in the HTML to reflect the new names, which was easy with regexxer. Not the best solution, but it did the trick. Thanks.
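
For anyone who would rather script the same rename-and-relink step, here is a rough Python sketch. It is not what I actually ran (I used regexxer). The function names `safe_name` and `rename_and_relink` are made up for this example, and it assumes that links spell the file name either literally or with the "?" written as `%3F`, optionally with "&" escaped as `&amp;`. Other encodings would need extra cases. Try it on a copy of the mirror first.

```python
import html
import os


def safe_name(name: str) -> str:
    """Replace the '?' that a web server would treat as the start of a query."""
    return name.replace("?", "_")


def rename_and_relink(root: str) -> dict:
    """Rename files containing '?' under root, then patch links in HTML files.

    Returns a mapping of old file name -> new file name.
    """
    renames = {}

    # Pass 1: rename the files on disk.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if "?" in name:
                new = safe_name(name)
                os.rename(os.path.join(dirpath, name), os.path.join(dirpath, new))
                renames[name] = new

    # Assumed link spellings: literal '?', '%3F', each with '&' or '&amp;'.
    replacements = []
    for old, new in renames.items():
        for spelled in (old, old.replace("?", "%3F")):
            replacements.append((spelled, new))
            replacements.append(
                (html.escape(spelled, quote=False), html.escape(new, quote=False))
            )
    # Longest search string first, so short names never pre-empt longer ones.
    replacements.sort(key=lambda pair: len(pair[0]), reverse=True)

    # Pass 2: rewrite links inside every HTML file.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith((".html", ".htm")):
                continue
            path = os.path.join(dirpath, name)
            # surrogateescape keeps non-UTF-8 bytes intact on the round trip.
            with open(path, encoding="utf-8", errors="surrogateescape", newline="") as f:
                text = f.read()
            patched = text
            for search, replacement in replacements:
                patched = patched.replace(search, replacement)
            if patched != text:
                with open(path, "w", encoding="utf-8", errors="surrogateescape", newline="") as f:
                    f.write(patched)

    return renames
```

Calling `rename_and_relink("path/to/mirror")` (the path is a placeholder) renames the files in place and returns the old-to-new name mapping, which is handy for checking what changed.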

Jono