
I am looking for a regularly updated database of bots, spiders, and crawlers. I want to be able to identify them in my IIS log files.

  • You are never going to distinguish bots from regular users with complete accuracy, because many bots actively try to defeat detection. But if you are happy with an approximation that gets it mostly right, there are things you can do. I find that the simplest way to tell a typical user and a legitimate crawler apart is the request for `/robots.txt`. Combining that with other signals (including databases of known bots) helps produce a more reliable result. – kasperd Apr 19 '15 at 06:44
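A minimal sketch of that `/robots.txt` heuristic applied to an IIS W3C Extended log: any client IP that requests `/robots.txt` is treated as a likely crawler. The log file name is hypothetical, and the field names assume the default IIS logging configuration; adjust to whatever `#Fields` your site actually logs.

```python
LOG_PATH = "u_ex150419.log"  # hypothetical IIS log file

def parse_w3c_log(path):
    """Yield each request as a dict keyed by the names in the #Fields directive."""
    fields = []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith("#Fields:"):
                fields = line.split()[1:]          # e.g. date time c-ip cs-uri-stem ...
            elif line and not line.startswith("#") and fields:
                yield dict(zip(fields, line.split()))

# First pass: collect client IPs that ever requested /robots.txt.
crawler_ips = {
    row["c-ip"]
    for row in parse_w3c_log(LOG_PATH)
    if row.get("cs-uri-stem", "").lower() == "/robots.txt" and "c-ip" in row
}

# Second pass: count how much of the traffic those clients account for.
total = crawler_hits = 0
for row in parse_w3c_log(LOG_PATH):
    total += 1
    if row.get("c-ip") in crawler_ips:
        crawler_hits += 1

print(f"{len(crawler_ips)} likely crawlers, {crawler_hits} of {total} requests")
```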

1 Answer


User-Agents.org has a pretty large database of user agents, spiders, etc. It seems to be updated regularly (last update was 2/28/2009). The data is available through RSS/XML.
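A rough sketch of matching IIS log user agents against such a list. It assumes the database has already been exported to a local text file with one known bot user-agent string per line (the file name is hypothetical); since the data is offered as RSS/XML, you would adapt the loader to whichever format you actually download.

```python
BOT_LIST = "known_bot_agents.txt"   # hypothetical local export of the database

with open(BOT_LIST, encoding="utf-8") as handle:
    known_bots = {line.strip().lower() for line in handle if line.strip()}

def looks_like_bot(cs_user_agent: str) -> bool:
    """Check an IIS cs(User-Agent) value (spaces are logged as '+') against the list."""
    agent = cs_user_agent.replace("+", " ").lower()
    return any(signature in agent for signature in known_bots)

# Example values as they would appear in the cs(User-Agent) column of an IIS log.
print(looks_like_bot("Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html)"))
print(looks_like_bot("Mozilla/5.0+(Windows+NT+6.1;+rv:36.0)+Gecko/20100101+Firefox/36.0"))
```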

Sam Cogan