12

I'm in the process of shutting down a site, and have replaced the old site with a single "nobody home" page at the root level of the site. Now I need to set up some redirection, so that any request to any part of the site, no matter how complicated, ends up at the root page.

I've tried what (I thought) ought to work: Creating an .htaccess file containing:

RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^(.*)$ https://www.example.com/ [L,R=301,NE]

but it mostly fails: Requests to http://www.example.com still get through, but https://www.example.com/doesnotexist.html throws a 404. (If there was no redirection going on, this would be correct, since that page doesn't exist on the site, but that's the point of the redirection: I want this request to be sent to https://www.example.com.)

Arggh. The answer to this is probably obvious to everyone but me; can anyone help out?

PS: I'm in a shared hosting situation, so I have to do this with a .htaccess file rather than hacking a full Apache configuration file.

Peter Mortensen
Jim Miller

3 Answers

45

If you are "shutting down a site" then you probably should not be "redirecting" the old site pages to a single page. An HTTP redirect sends a 301 response code, informing users and search engines the pages have moved. (Although mass redirects to a single page are likely to be seen as soft-404s by Google.)

Instead, you should serve a custom "410 Gone" response. A 410 informs search engines that the pages are gone and not coming back. Your "single page" is the custom error document.

For example:

ErrorDocument 410 /single-page.html

RewriteEngine On

# Trigger a 410 Gone for all user requests
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^ - [G]

The additional condition that checks against the REDIRECT_STATUS environment variable (which is empty on the initial request, but set to "410" after the RewriteRule is triggered and the HTTP response status is set) is to ensure two things:

  1. That the internal subrequest for the error document itself (i.e. /single-page.html in this example) does not trigger another 410. That would essentially result in an endless loop and prevent the custom error document from being served (a default server response would be sent instead in this scenario).

  2. That direct requests for /single-page.html itself also trigger a 410, without creating a rewrite loop.

(Aside: Unfortunately, the technique of using REDIRECT_STATUS in this way to detect an already triggered "error state" does not appear to work on LiteSpeed servers, since the env var is not updated during the request in the case of 4xx responses. However, it is updated for internal rewrites, i.e. 200 OK responses, so it's still a good solution to prevent general internal rewrite loops. It is unclear why there would be this difference, though. Seems like a bug.)
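
For servers where the REDIRECT_STATUS check is unreliable, one possible workaround is to exempt the error document by path in the RewriteRule pattern itself. This is only a sketch, not part of the approach above and untested on LiteSpeed; it assumes the error document lives at /single-page.html as in the example:

```apache
ErrorDocument 410 /single-page.html

RewriteEngine On

# Send 410 Gone for everything except the error document itself.
# In .htaccess the pattern is matched against the path without its leading slash.
RewriteRule !^single-page\.html$ - [G]
```

The trade-off is that a direct request for /single-page.html is then served with a 200 OK rather than a 410, so the second point above is lost.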

If you have images (and/or other external resources) that need to be displayed in the error document then see my answer to the following related question:

MrWhite
  • 1
    This looks like the right match for what I'm trying to do (spiders find out the page no longer exists, but humans get a viewable page explaining what's going on). Thanks! – Jim Miller Sep 28 '20 at 15:36
2

http://www.example.com/ works and https://www.example.com/doesnotexist.html doesn't because your RewriteCond only lets the rule run when the request is not made over HTTPS. Any request that already arrives via HTTPS (which I believe several major browsers do by default now, but I don't have a source for this right off the top of my head) skips the rule entirely and falls through to the normal 404.

I'm assuming you did that to prevent an infinite loop, which would otherwise infinitely rewrite and redirect every request to (what I assume is) the canonical url, with the client never being able to actually view the page.

I believe the following configuration is what you're looking for. Please note that I tested this on my local server, but I'm running version 2.4.46. As best as I can tell from an admittedly semi-half-assed glance over the version 2.2 mod_rewrite documentation, there weren't any significant changes in the upgrade to version 2.4, at least as it pertains to mod_rewrite specifically.

RewriteEngine On
RewriteCond expr "! %{REQUEST_URI} -strmatch '/'"
RewriteRule (.*) https://www.example.com/ [L,R=301,NE]

This configuration enables the rewrite engine, obviously, after which it checks the URI of the request. If it's not equal to "/", it rewrites the entire thing to only the document root, issues a 301 redirect like you did in your original, and stops processing any further rules. You may or may not need a RewriteBase "/" directive right after the RewriteEngine On. It didn't seem to matter when I tested this, as I was able to rewrite nested-directory URIs with no problem (i.e. /a/b/c successfully redirected to /), but the documentation does make a note to point this out, so here I am passing the warning down.
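
Note that the expr syntax in the RewriteCond above needs Apache 2.4. If the host might still be on 2.2, a roughly equivalent sketch using a plain regex condition would be the following (same example.com target as above; I dropped the NE flag since the target contains nothing that needs escaping):

```apache
RewriteEngine On

# Only act when the requested path is something other than the root,
# so the redirect target does not redirect again.
RewriteCond %{REQUEST_URI} !^/$
RewriteRule ^ https://www.example.com/ [L,R=301]
```

This is meant to behave the same as the expr version, just with syntax that both 2.2 and 2.4 accept.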

Again, I did test this, but the server version is different so hopefully you don't run into any problems from that. Please note that in order for this to work (barring technical limitations or differences stemming from the aforementioned version difference), your hosting provider needs to have enabled the AllowOverride directive, probably with the All option, although I wasn't able to pin down exactly what the minimum requirement for the rewrite directives to work was.
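
On the AllowOverride question: the Apache documentation lists the mod_rewrite directives (and ErrorDocument) with an override class of FileInfo, so `AllowOverride FileInfo` should be enough rather than `All`. Per-directory rewriting also needs FollowSymLinks or SymLinksIfOwnerMatch to be enabled. A sketch of what the host's server configuration would need to contain (the directory path is a made-up placeholder, and on shared hosting only the provider can set this):

```apache
# Server-level configuration, not .htaccess
<Directory "/var/www/example">
    # Permit mod_rewrite and ErrorDocument directives in .htaccess
    AllowOverride FileInfo
    # Required for RewriteEngine On to take effect in per-directory context
    Options FollowSymLinks
</Directory>
```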

  • 2
    You don't actually need a separate `RewriteCond` directive as the necessary check (that it's not root) can be performed in the `RewriteRule` itself. eg. `RewriteRule . / [R=301,L]` - redirects _something_ to root, where _something_ is anything other than root. The `NE` flag is not required. However, a 3xx redirect may not be the best option here. – MrWhite Sep 27 '20 at 02:32
  • 1
    "there weren't any significant changes in the upgrade to version 2.4, at least as it pertains to mod_rewrite specifically." - actually, you can't use Apache _expressions_ (`expr` argument on the `RewriteCond` directive) under Apache 2.2. – MrWhite Sep 27 '20 at 10:35
  • @MrWhite Hey, thanks for the feedback, I wasn't aware of that `RewriteRule . /` shortcut. I actually never used version 2.2 in any real capacity, either, so I really do appreciate these pointers. – Jose Fernando Lopez Fernandez Sep 28 '20 at 14:23
2

Shouldn't this work?

RewriteEngine On
RewriteRule ^ index.html
Hagen von Eitzen