18

There are several questions already posted here about returning a 404 instead of a 403 in special cases (e.g., .ht* files or a certain directory), but I can't figure out how to simply replace all 403 responses ("Okay, it exists, but you still have to find a way in") with 404s ("Sorry, never heard of it"). I'm hoping that there is a simple solution that won't require updating regexes or other bits of the .htaccess to match site changes, just a simple directive: "whenever you decide to return a 403, return a 404 instead" that applies to the whole site, regardless of configuration changes.

Then, if the top level .htaccess contains "Options -Indexes", and any given directory contains no index.html (or equiv), the bare directory URL will return 404, but if I ever add an index.html to that directory, the same bare directory URL will return the index.html, with no updates needed to any .htaccess file.

I don't even care if, in the event I ever password-protect a directory, a bad password returns a 404 (because of the 403 -> 404 mapping). In that case, I'm not hiding anything by returning a 404, but it causes no harm either. If there's a way to UNDO the general 403 -> 404 mapping for special cases (rather than DO it for special cases), though, that could be even more useful.

Of course, if I'm overlooking something, please set me straight.

EDIT: Drat. I was trying to write a good quality question here, but my description of the behavior of "Options -Indexes" in the second paragraph turns out to be wrong. Without that line, a bare directory URL shows "index.html" if it exists in the directory; otherwise, it reveals the contents of the directory. (That forwarding of /dir to /dir/index.html if index.html exists is the default setup of the Web host, unless I'm mistaken.) Adding the "Options -Indexes" line stops it airing my laundry in public (returning a 403, not a 404, but still better than exposing the dir contents), but now the bare directory URL returns a 403 even if index.html exists.

I wish a bare dir URL "mysite.com/mydir" displayed /mydir/index.html if it existed and "404" if it didn't, but clearly there's more to it than just replacing the 403s with 404s.

Glen
  • can you post your current configuration ( `.htaccess` and in Apache configuration file )? – Raptor May 09 '12 at 04:30
  • Actually, I don't have any .htaccess yet, nor do I have my own Apache config file (httpd.conf?). It's a new shared Web host, and I'm trying to figure out how to set things up. I assume my .htaccess will contain the line, "Options -Indexes", but I'm not even sure of that. – Glen May 09 '12 at 06:25
  • I had the same problem and used rewrite conds and rewrite rules. [See my answer to a similar question](http://stackoverflow.com/a/21074783/2747427). – jff Jan 12 '14 at 12:57

5 Answers

13

To complete one of the better answers here (as mentioned by @WebChemist and @JennyD), a good way to solve this is to return 404 Not Found from the document you use to handle 403 errors. Personally I do something like the following:

.htaccess in web root (relevant excerpt):

ErrorDocument 400 /http-errors.php
ErrorDocument 403 /http-errors.php
ErrorDocument 404 /http-errors.php

http-errors.php in web root (condensed working example):

<?php

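// Apache passes the original error status to the ErrorDocument script
$status = isset( $_SERVER['REDIRECT_STATUS'] ) ? (int) $_SERVER['REDIRECT_STATUS'] : 404;
// Report every 403 as a 404
if ( $status === 403 ) $status = 404;

$codes = array(
  400 => array( '400 Bad Request', 'The request cannot be fulfilled due to...' ),
  404 => array( '404 Not Found', 'The resource you requested was not found...' ),
  500 => array( '500 Internal Server Error', 'The request was unsuccessful...' )
);

// Fall back to 404 for any status that has no entry above
if ( ! isset( $codes[$status] ) ) $status = 404;

$title = $codes[$status][0];
$message = $codes[$status][1];

// Send the status line before any output so the client sees the new code
header( $_SERVER['SERVER_PROTOCOL'] . ' ' . $title );
echo "<h1>$title</h1>\n<p>$message</p>";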
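
The question also asks about undoing the mapping for special cases. A minimal sketch, assuming a hypothetical /private directory and a host that allows ErrorDocument in .htaccess files (AllowOverride FileInfo): an .htaccess in that directory can reset the 403 handler to Apache's built-in response, so only that subtree reports a real 403 again.

```apache
# /private/.htaccess ("/private" is a hypothetical directory name)
# Override the site-wide "ErrorDocument 403 /http-errors.php" for this
# directory and everything below it, restoring Apache's built-in 403 page.
ErrorDocument 403 default
```

The rest of the site keeps using http-errors.php, so nothing in the web-root .htaccess needs to change when such a directory is added.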
Marcel
  • By using the `ErrorDocument 403` directive, Apache has not already sent a 403 header - is that correct? I ask because while reading about the `headers_sent()` function I found out that headers are sent before the entire response is constructed. While normal users would not notice, I might not want hackers to see a 403 response followed by a 404. – Sean Letendre Aug 02 '18 at 06:39
3

You can easily do a fake 403 -> 404 by doing

ErrorDocument 403 /404.php

Where 404.php is your formatted 404 response page, but that would still return 403 in the response header. If you want to return a 404 header, you could make a 403 page with a header redirect that returns 404 code, but the redirect might still be visible....

Not really sure if you can do a catchall 403->404 substitution with pure .htaccess

Edit:

I played around with it a bit; here are 2 pure .htaccess methods you can try:

1) make any url ending in a slash redirect to an index page like:

RewriteRule ^(.*)/$ /$1/index.php [NC,L]

so if you do have an index page, it will show, but if not, it will 404 on the missing index page request. If you leave out the [R] flag, the url will still appear as the / non index.php canonical SEO friendly url :) can't say this won't cause problems elsewhere though..

2)

in a rewrite rule, you can use the R=404 flag to return a 404 code (the flag list must not contain spaces):

RewriteRule ^(.*)/$ - [R=404,NC,L]

However, adding RewriteCond %{REQUEST_FILENAME} !-f or RewriteCond %{REQUEST_FILENAME} -d

does not help, meaning all URLs with a trailing slash will 404 even when an index page is present. You could fix this by specifying folders to protect (or ignore) in additional RewriteCond lines, but that wouldn't exactly be "automatic", as you'd have to add more RewriteCond lines as more folders get added to protect|ignore.
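
One possible way around the per-folder bookkeeping is to test for the index file itself rather than for the directory. The following is only a sketch, not something tested on the asker's host; it assumes the index files are named index.html or index.php (adjust to whatever DirectoryIndex is in effect) and that mod_rewrite is usable from .htaccess.

```apache
RewriteEngine On

# Only consider requests that map to an existing directory...
RewriteCond %{REQUEST_FILENAME} -d
# ...that contains none of the assumed index files
RewriteCond %{REQUEST_FILENAME}/index.html !-f
RewriteCond %{REQUEST_FILENAME}/index.php !-f
# Answer 404 instead of letting "Options -Indexes" produce a 403
RewriteRule ^ - [R=404,L]
```

With conditions like these, adding an index file to a directory later should make the bare directory URL serve it without further .htaccess changes.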

WebChemist
  • Yes, the headers are probably the biggest issue, since exploits are usually automated. – Glen May 09 '12 at 07:18
  • I *think* that if you edit the 404.php script to contain a status code, this might work. But it's magic beyond what I've previously attempted... – Jenny D May 09 '12 at 07:39
  • tested the edited example code and these methods will return a header response of 404 – WebChemist May 09 '12 at 19:23
3

Here's a snippet from the .htaccess file in a directory that I have blocked for everyone except machines on my LAN. It returns a 404 error every time!!

# Turn rewrite engine on
RewriteEngine on
RewriteOptions Inherit

# Make it for a specific directory
# RewriteBase /phpMyAdmin 

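# Block everyone outside the 192.168.2.x LAN
# (REMOTE_ADDR always holds the client IP; the dots are escaped so they match literally)
RewriteCond %{REMOTE_ADDR} !^192\.168\.2\.
RewriteRule .? - [R=404,L]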
Jason
2

To the best of my knowledge, there is no such simple directive. It's still possible, but it's a bit more complicated than you may like.

You can use the ErrorDocument directive to point 403 responses at a script, and then have that script respond with a 404 status instead of a 403. There are some tips at apache.org that may be helpful.

Jenny D
-1

There is hope even though there probably isn't a blanket way to get 403 errors to report 404 errors.

I suspect what you really want is to ensure /~root (and any other existing, non HTTP user) doesn't give back 403 but 404.

The UserDir directive in Apache gives back 403 by design. I don't think that can currently be changed. BUT-- the 'UserDir' directive can be used more than once in the same Apache file like so:

UserDir public_html
UserDir disable root someuser1 someuser1

That configuration will allow the use of UserDir, but will also report the cherished 404 response header to all those who try to sniff around your system.

I found this tip from the following link:

http://www.if-not-true-then-false.com/2010/enable-apache-userdir-with-selinux-on-fedora-centos-red-hat-rhel/

I tested it myself in Apache 2.2.15 under RHEL 6.3 and it works-- even when testing with something like curl. I was also looking for a solution to this. Finally found something that I can work with!