
In my Apache error log I can see the following error being logged an enormous number of times every day:

[Tue Jan 15 13:37:39 2013] [error] [client 66.249.78.53] Request exceeded the limit of 10 internal redirects due to probable configuration error. Use 'LimitInternalRecursion' to increase the limit if necessary. Use 'LogLevel debug' to get a backtrace.

When I check the corresponding IP, date and time against the access log, I see the following:

66.249.78.53 - - [15/Jan/2013:13:37:39 +0000] "GET /robots.txt HTTP/1.1" 500 821 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

I've tested my robots.txt file in Google Webmaster Tools -> Health -> Blocked URLs and it's fine.

Also, when some images are accessed by bots, they throw the following error:

Error_LOG

[Tue Jan 15 12:14:16 2013] [error] [client 66.249.78.15] Request exceeded the limit of 10 internal redirects due to probable configuration error. Use 'LimitInternalRecursion' to increase the limit if necessary. Use 'LogLevel debug' to get a backtrace.

Accessed_URL

66.249.78.15 - - [15/Jan/2013:12:14:16 +0000] "GET /userfiles_generic_imagebank/1335441506.jpg?1 HTTP/1.1" 500 821 "-" "Googlebot-Image/1.0"

Actually, the above image URL (like several other image URLs in our access log) is no longer available on our site (it was available before a website revamp that we did in August 2012), and we get 404 errors when we visit those invalid resources ourselves.

However, once in a while it seems that bots (and even human visitors) generate this type of error in our access/error logs, but only for static resources such as images that don't exist and for our robots.txt file. The server throws a 500 error for them, yet when I try from my browser the images return 404 and robots.txt returns 200 (success).

We are not sure why this is happening, and how a valid robots.txt and an invalid image can throw a 500 error. We do have a .htaccess file, and we are sure that our (Zend Framework) application is not being reached, because we have a separate log for that. Therefore, the server itself (or .htaccess) is throwing the 500 error "once in a while", and I can't imagine why. Could it be due to too many requests to the server, or how can I debug this further?
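
One thing we could try, following the hint in the error message itself, is turning up the logging. A rough sketch, assuming Apache 2.2 (which our log format suggests) and example log paths; these directives would go in the server or virtual host configuration, not in .htaccess:

# Get a backtrace of the internal redirects in the error log
LogLevel debug

# Apache 2.2 only: trace each mod_rewrite step (remove once debugging is done)
RewriteLog /var/log/apache2/rewrite.log
RewriteLogLevel 3

# Optionally raise the recursion limit while investigating (default is 10)
LimitInternalRecursion 20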

Note that we only noticed these errors after our design revamp, but the web server itself stayed the same.

FR STAR
  • This must be a problem with your rewrite rules. If possible, disable the redirects and see if this still happens. – AlecTMH Jan 16 '13 at 14:39
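
Regarding the rewrite rules: the usual way this error arises is a front-controller rule that keeps matching its own output. A rough sketch of the conventional guarded Zend Framework .htaccess that avoids the loop (this is the standard pattern, not necessarily what the site is actually using):

RewriteEngine On
# Let anything that maps to a real file, symlink or directory pass through untouched
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [NC,L]
# Everything else goes to the front controller
RewriteRule ^.*$ index.php [NC,L]

Without the RewriteCond guards, a request that never maps to an existing file can be rewritten again on every pass until the 10-redirect limit is hit.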

1 Answer


It might be useful to log the domain that the client is accessing. Your server might be accessible via multiple domains, including the raw IP address. When you're testing, you're doing so via the primary domain and everything works as expected. What if you try to access the same files via your IP (http://1.2.3.4/robots.txt) vs. the domain (http://example.com/robots.txt)? Also example.com vs. www.example.com or any other variation that points to the server.
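
If you go down that route, one way to capture the requested host is a custom log format that prepends the Host header; a sketch, assuming the stock mod_log_config and example paths:

# Log the Host header in front of the usual combined format
LogFormat "%{Host}i %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined_with_host
CustomLog /var/log/apache2/access.log combined_with_host

%v (the canonical ServerName of the virtual host that answered) is an alternative if you only care about which vhost handled the request.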

Bots can sometimes hold on to IP/domain info long after an address has changed and may be attempting to access something that the rules were changed for months ago.

David Ravetti
  • Thank you for reminding me of this. Yes, I have an alias domain pointing to my images folder: `http://mydomain.com/images/` = alias domain (`http://subdomain.com`). Now, when I upload a basic .htaccess file to the `http://mydomain.com/images/` path, it actually throws a 404. After 3 days I checked the error_log file and no redirects had happened. – FR STAR Jan 22 '13 at 22:38
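
For reference, the "basic .htaccess" mentioned above would presumably just switch rewriting off in that directory; hypothetical contents, since the actual file isn't shown in the comment:

# Hypothetical .htaccess for the aliased /images/ directory:
# disable mod_rewrite here so requests for missing images fall through to a plain 404
RewriteEngine Off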