
I'm running Apache 2.2, and the server otherwise runs well. I noticed these odd entries in my access.log file. How can I prevent them? robots.txt doesn't seem to be working.

127.0.0.1 - - [17/Apr/2011:12:17:00 +0100] "GET / HTTP/1.1" 200 3022 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"
127.0.0.1 - - [17/Apr/2011:12:17:00 +0100] "GET /icons/blank.gif HTTP/1.1" 200 487 "http://localhost/" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"
127.0.0.1 - - [17/Apr/2011:12:17:00 +0100] "GET /icons/layout.gif HTTP/1.1" 200 616 "http://localhost/" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"
127.0.0.1 - - [17/Apr/2011:12:17:00 +0100] "GET /icons/folder.gif HTTP/1.1" 200 564 "http://localhost/" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"
127.0.0.1 - - [17/Apr/2011:12:17:00 +0100] "GET /icons/compressed.gif HTTP/1.1" 200 1379 "http://localhost/" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"
127.0.0.1 - - [17/Apr/2011:12:17:01 +0100] "GET /icons/image2.gif HTTP/1.1" 200 650 "http://localhost/" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"

Is this a bot attack, or has my server screwed up? How should I fix it to prevent this from happening again?

Chris S
  • Have you replaced the remote IP with `127.0.0.1` and your host with `localhost`? Or is Apache behind a proxy? It's really easy to spoof the User-Agent. – Lekensteyn Apr 18 '11 at 13:16

2 Answers


Do you have robots.txt?

If not, you should create one. You can read about it here.
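As a rough sketch of what such a file might contain and how to sanity-check it, the Python snippet below feeds a hypothetical robots.txt to the standard library's `urllib.robotparser`. The rules shown (block msnbot everywhere, allow everyone else) and the second agent name are assumptions for illustration only, not a recommendation for this particular site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block msnbot everywhere, allow all other crawlers.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# User-Agent string copied from the access.log excerpt in the question.
msnbot_agent = "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"

msnbot_allowed = parser.can_fetch(msnbot_agent, "/icons/blank.gif")
other_allowed = parser.can_fetch("SomeOtherBot", "/icons/blank.gif")  # made-up agent name

print("msnbot allowed:", msnbot_allowed)
print("other bots allowed:", other_allowed)
```

The file itself has to be reachable at `/robots.txt` in the document root for a crawler to find it.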

Ben Pilbrow
user74596

msnbot is the web crawler Microsoft uses for MSN/Live/Bing. It's not normally "dangerous". It will pick up a robots.txt file and obey it if the file is correctly configured (see user74596's answer), though it may take a day or two before the crawler fetches the file.
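One way to see whether the crawler has fetched robots.txt yet is to scan access.log for that request. The following is a minimal Python sketch, assuming the log uses Apache's combined format as in the question; the function name `robots_fetches` and the second sample line are hypothetical and only illustrate what a matching entry would look like.

```python
import re

# Apache "combined" format: host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def robots_fetches(lines, agent_substring="msnbot"):
    """Return parsed entries where the given agent requested /robots.txt."""
    hits = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match:
            continue  # skip lines that are not in the combined format
        entry = match.groupdict()
        if agent_substring in entry["agent"].lower() and entry["path"] == "/robots.txt":
            hits.append(entry)
    return hits

sample = [
    # First line is taken from the question's log excerpt.
    '127.0.0.1 - - [17/Apr/2011:12:17:00 +0100] "GET / HTTP/1.1" 200 3022 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"',
    # Hypothetical line, not from the original log: what a robots.txt fetch might look like.
    '127.0.0.1 - - [18/Apr/2011:09:00:00 +0100] "GET /robots.txt HTTP/1.1" 200 26 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"',
]

hits = robots_fetches(sample)
for hit in hits:
    print(hit["time"], hit["host"], hit["status"])
```

To run it against a real log, pass an open file object instead of `sample`. Keep in mind that the User-Agent header is easy to spoof, so a match only shows what the client claimed to be.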

If you want people to be able to find your website in the search engines listed above, you shouldn't block access to your site.

Chris S