The lanl.arxiv.org math and scientific preprint service (formerly known as xxx.lanl.gov) has a strict policy against bots that ignore its robots.txt, described on its Robots Beware page. On that page, they have a link labelled "Click here to initiate automated 'seek-and-destroy' against your site", which is forbidden by their robots.txt, but presumably badly behaved robots will follow it and reap the consequences. The question is: what are the actual consequences? I have never had the guts to click on that link to see what it does. What can they be doing that is both effective and legal?
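For context, here is a minimal sketch of how a well-behaved crawler would avoid such a link in the first place, using Python's standard `urllib.robotparser`. The robots.txt content, the `/trap/` path, the `example.org` host, and the `is_allowed` helper are all hypothetical and only for illustration; they are not arXiv's actual rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; NOT arXiv's actual file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""


def is_allowed(url: str, user_agent: str = "example-bot") -> bool:
    """Return True if the hypothetical rules above permit fetching `url`."""
    parser = RobotFileParser()
    # parse() accepts the robots.txt content as an iterable of lines.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)
```

A crawler that calls something like `is_allowed(url)` before every request never reaches the disallowed link, so only robots that skip this check end up on the trap page.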
2

HopelessN00b

Brian Campbell
- Heh, cool... Clicking the link... – Shog9 May 01 '09 at 16:18
- Shucks, just a page that holds the connection open for ages and ages. Nothing too interesting. Hey, why are there men with guns here – Shog9 May 01 '09 at 16:21
2 Answers
4
[reverse DNS result]: you've been identified as a robot operating in violation of the guidelines posted at arxiv.org.
If this determination is in error, please report to www-admin@arxiv.org so your problem can be investigated.
Scanning, Initialized:
10 minutes to Trinity...
9 minutes to Trinity...
8 minutes to Trinity...
7 minutes to Trinity...
6 minutes to Trinity...
5 minutes to Trinity...
4 minutes to Trinity...
3 minutes to Trinity...
2 minutes to Trinity...
1 minute to Trinity...
Ground zero. Have a nice day.
Contact
So... it's a page that would waste 10 minutes of a very naive bot's time. Probably useless for combating malicious bots, but might save some bandwidth when faced with a badly-written site-scraper.
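To make the idea concrete, here is a rough sketch of a page that trickles out a countdown like the one quoted above. This is a guess at the general technique (a slow streaming response), not arXiv's actual implementation; the function name `countdown_tarpit` and its parameters are made up for illustration.

```python
import time
from typing import Callable, Iterator


def countdown_tarpit(
    minutes: int = 10,
    seconds_per_step: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield the countdown one line at a time, pausing after each step.

    Hypothetical sketch: a web framework that supports streaming responses
    could send each yielded chunk to the client as it is produced, keeping
    the connection open for roughly `minutes * seconds_per_step` seconds.
    """
    yield "Scanning, Initialized:\n"
    for remaining in range(minutes, 0, -1):
        unit = "minute" if remaining == 1 else "minutes"
        yield f"{remaining} {unit} to Trinity...\n"
        # The injected sleep makes the delay easy to stub out in tests.
        sleep(seconds_per_step)
    yield "Ground zero. Have a nice day.\n"
```

Each open connection also ties up a little of the server's own resources for the whole countdown, so the cost is not borne by the bot alone.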
0
No consequences other than spinning for a bit. Most browsers (and probably their server) will just time out after a while. They probably cause more harm to themselves with this than to the bots.

Daniel A. White