Yesterday I set up some software that tracks all HTTP requests across our network of websites. After analyzing the first day of traffic, we found nearly a dozen IPs that were flat-out harvesting our data. It's pretty obvious when one IP browses 300 pages in the space of an hour, lol. I did a reverse lookup on these, and the majority were from Singapore, China, etc., so they weren't search engine bots.
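For anyone wanting to run the same kind of check, here is a rough Python sketch of the "300 pages in an hour" test. It is not what our tracking software does internally. The function name `find_heavy_ips`, the `(ip, timestamp)` input format, and the sample data are all made up for illustration, and the IPs are from the reserved documentation ranges.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_heavy_ips(requests, threshold=300, window=timedelta(hours=1)):
    """Return {ip: peak_count} for IPs that made at least `threshold`
    requests inside any sliding time window of length `window`.

    `requests` is an iterable of (ip, datetime) pairs (hypothetical format).
    """
    by_ip = defaultdict(list)
    for ip, ts in requests:
        by_ip[ip].append(ts)

    flagged = {}
    for ip, stamps in by_ip.items():
        stamps.sort()
        start = 0  # index of the oldest request still inside the window
        peak = 0   # largest number of requests seen in any one window
        for end, ts in enumerate(stamps):
            # Drop requests that are more than `window` older than this one.
            while ts - stamps[start] > window:
                start += 1
            peak = max(peak, end - start + 1)
        if peak >= threshold:
            flagged[ip] = peak
    return flagged


# Made-up sample data: one scraper-like IP and one ordinary visitor.
base = datetime(2024, 1, 1, 12, 0, 0)
sample = [("203.0.113.7", base + timedelta(seconds=10 * i)) for i in range(300)]
sample += [("198.51.100.20", base + timedelta(minutes=10 * i)) for i in range(5)]

print(find_heavy_ips(sample))
```

The sliding window matters because a fixed clock-hour bucket would miss a burst that straddles the top of the hour. The threshold of 300 is just the figure from our logs and would need tuning per site.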
Does anyone know of a service or website that maintains a list of bad IPs that should be blocked?