
I am running some web crawling jobs on an AWS-hosted server. The crawler scrapes data from an eCommerce website, but recently it has been getting timeout errors from the site. The website may be rate-limiting my visits based on my IP address. Allocating a new Elastic IP address solves the problem, but not for long.

My question: Is there any service I can use to automatically and dynamically allocate and associate new IPs to my instance? Thanks!
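There is no managed AWS service that rotates Elastic IPs for you, but the EC2 API exposes the three calls needed to script it yourself (allocate a new EIP, associate it, release the old one). Below is a minimal sketch using the boto3 EC2 client API; the function name `rotate_elastic_ip`, the instance ID, and the idea of passing the client in as a parameter are my own illustration, not anything from the question:

```python
def rotate_elastic_ip(ec2, instance_id, old_allocation_id=None):
    """Allocate a fresh Elastic IP, attach it to the instance, and
    release the previous one.

    `ec2` is a boto3 EC2 client, e.g. ec2 = boto3.client("ec2")
    (assumes boto3 is installed and credentials are configured).
    """
    # Allocate a new VPC-scoped Elastic IP.
    alloc = ec2.allocate_address(Domain="vpc")

    # Associate it with the instance. AllowReassociation lets the new
    # address replace whatever EIP is currently attached.
    ec2.associate_address(
        InstanceId=instance_id,
        AllocationId=alloc["AllocationId"],
        AllowReassociation=True,
    )

    # Release the old address so it doesn't sit idle (AWS charges
    # for unattached Elastic IPs).
    if old_allocation_id:
        ec2.release_address(AllocationId=old_allocation_id)

    return alloc["AllocationId"], alloc["PublicIp"]
```

You could run this from a cron job or a scheduled Lambda whenever timeouts start appearing. Note the caveat raised in the comments below: confirm the target site's acceptable use policy permits your crawling before automating IP rotation.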

Alex
    It seems as if you are asking us to help you evade a site's acceptable use policy. – Michael - sqlbot Jun 06 '16 at 10:33
  • @Michael-sqlbot you are absolutely correct :) That said, there is a way to do this, but you first have to confirm that the eCommerce website's acceptable use policy is not being violated. – Piyush Patil Jun 06 '16 at 17:32
  • I'm sorry, but you are both mistaken: AWS does not forbid crawling. It does ask you to limit the number of requests per second, and if the crawled party complains you need a mechanism in place to remove them from the pool. From the AWS policy: "Monitoring or Crawling. Monitoring or crawling of a System that impairs or disrupts the System being monitored or crawled." I have actually asked AWS support about this, and as long as you are not DDoS'ing or jeopardizing the third party, crawling is accepted. – Paul Ma Jun 12 '16 at 09:26

0 Answers