I'm trying to crawl a website with Scrapy using the Request object. I connect to the internet through a proxy that requires authentication, and the proxy's authentication step prevents me from crawling the website. The log shows:

DEBUG: Crawled (400) <GET http://auth4/robots.txt> (referer: None)

How can I authenticate with the proxy, or bypass it, to reach the website?

Thanks!!

user1454456

1 Answer

I think you need to set the user agent. Try setting USER_AGENT to 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:39.0) Gecko/20100101 Firefox/39.0' in your project's settings.py.
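A minimal sketch of that setting, assuming a standard Scrapy project layout where settings.py sits in the project package. USER_AGENT is Scrapy's setting for the default User-Agent header. Whether it clears the 400 response from the proxy is something you would have to test.

```python
# settings.py (fragment)
# Default User-Agent header that Scrapy sends with every request.
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:39.0) "
    "Gecko/20100101 Firefox/39.0"
)
```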

Edit: also check out the question "How to use scrapy with an internet connection through a proxy with authentication".
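For the proxy authentication itself, one common approach is to put the credentials in the proxy URL stored in request.meta['proxy'], which Scrapy's built-in HttpProxyMiddleware reads. Here is a rough sketch, not a tested solution for your setup. The host, port, user name, password, the AuthProxyMiddleware class name and the myproject module path are all placeholders I made up, and I'm assuming the proxy uses HTTP Basic authentication.

```python
from urllib.parse import quote

# Hypothetical placeholders: replace with your proxy's real values.
PROXY_HOST = "proxy.example.com"
PROXY_PORT = 8080
PROXY_USER = "username"
PROXY_PASS = "p@ss:word"


def build_proxy_url(user, password, host, port, scheme="http"):
    """Return a proxy URL with percent-encoded credentials embedded."""
    # Percent-encode so characters such as '@' or ':' in the
    # credentials do not break URL parsing.
    user_enc = quote(user, safe="")
    pass_enc = quote(password, safe="")
    return f"{scheme}://{user_enc}:{pass_enc}@{host}:{port}"


class AuthProxyMiddleware:
    """Downloader middleware (hypothetical name) that sets the proxy on every request."""

    def process_request(self, request, spider):
        # The built-in HttpProxyMiddleware is expected to pick this up
        # and send the credentials to the proxy.
        request.meta["proxy"] = build_proxy_url(
            PROXY_USER, PROXY_PASS, PROXY_HOST, PROXY_PORT
        )
        return None  # let Scrapy continue processing the request


# In settings.py (assumed module path; the number should be lower than
# HttpProxyMiddleware's default of 750 so this middleware runs first):
# DOWNLOADER_MIDDLEWARES = {
#     "myproject.middlewares.AuthProxyMiddleware": 350,
# }
```

For a single request you can skip the middleware and pass meta={'proxy': build_proxy_url(...)} directly when creating the Request.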

Pinguluk