
I am scraping some web pages with Python and have been blocked by some of them. When I tried to open the same pages through the Tor Browser, I could not access them either. So I think either these sites have been able to track my IP address, or my Tor setup is misconfigured (which I doubt, because I checked my IP address in Chrome and in Tor and they are different). Does anyone know why this happens?

Also, I am trying to write a function or method in my Python code that changes my IP address automatically. From what I have seen, the best way is to do it through the Tor Browser (using it to fetch the data from the pages), but I have not been able to make it work. Do you have any recommendations for writing this function?

Thank you!

1 Answer


I would expect anti-scraping protection to also block visits from known Tor exit nodes, so I don't think they know it is you. Some websites buy or implement state-of-the-art anti-scraping services.
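
As a rough illustration of why Tor traffic is easy to block: the list of exit-node addresses is public, so a site only needs a set lookup per request. This is a minimal sketch, not how any particular protection service works. It assumes the Tor Project's bulk exit list is served at the URL below as plain text with one IP address per line; the function names are made up for this example.

```python
import urllib.request

# Assumed location of the Tor Project's bulk exit list (one IP address per line).
EXIT_LIST_URL = "https://check.torproject.org/torbulkexitlist"


def parse_exit_list(text):
    """Return the set of IP addresses in an exit list, skipping blanks and comments."""
    addresses = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            addresses.add(line)
    return addresses


def fetch_exit_nodes(url=EXIT_LIST_URL, timeout=10):
    """Download and parse the exit list (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_exit_list(response.read().decode("utf-8"))


def is_tor_exit(ip, exit_nodes):
    """True if the visitor's IP is a known Tor exit node."""
    return ip in exit_nodes
```

A site would call `fetch_exit_nodes()` periodically and reject any request for which `is_tor_exit(client_ip, exit_nodes)` is true, regardless of who is behind the connection.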

You could set up your own proxies at the homes of friends and family and use a very conservative crawl rate, or look for commercial residential proxy offerings.
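
A minimal sketch of that idea, using only the standard library: it cycles through a list of proxies and waits a random interval between requests. The class name `PoliteFetcher`, the default delays, and the proxy addresses in the usage note are placeholders I made up, not recommended values.

```python
import itertools
import random
import time
import urllib.request


class PoliteFetcher:
    """Rotate through a list of proxies and pause between requests."""

    def __init__(self, proxies, min_delay=10.0, max_delay=20.0):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._proxies = itertools.cycle(proxies)
        self.min_delay = min_delay
        self.max_delay = max_delay
        self._last_request = None

    def next_proxy(self):
        """Return the next proxy URL in round-robin order."""
        return next(self._proxies)

    def _wait(self):
        """Sleep until a random delay has passed since the previous request."""
        if self._last_request is not None:
            delay = random.uniform(self.min_delay, self.max_delay)
            remaining = delay - (time.monotonic() - self._last_request)
            if remaining > 0:
                time.sleep(remaining)
        self._last_request = time.monotonic()

    def fetch(self, url, timeout=30):
        """Fetch a URL through the next proxy and return the body as bytes."""
        self._wait()
        proxy = self.next_proxy()
        handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
        opener = urllib.request.build_opener(handler)
        with opener.open(url, timeout=timeout) as response:
            return response.read()
```

Usage would look like `PoliteFetcher(["http://203.0.113.10:8080", "http://203.0.113.11:8080"]).fetch("https://example.com/")`, with the placeholder addresses replaced by proxies you actually control. The random delay keeps requests from arriving at a fixed, easily detected interval.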

Niels van Reijmersdal
Thank you for your response, Niels. What I finally did was change the User-Agent when requesting the web page. It seems to be working. I don't know for how long... – Jesús Téllez May 22 '20 at 11:15