I am using proxy rotation in my project to avoid being banned from a website. I have to scrape a list of URLs, from http://website/0001 to http://website/9999, and when the site detects that I am scraping, it redirects me to website/contact.html.
I already have my proxy list in the settings:
ROTATING_PROXY_LIST = [
'proxy1.com:8000',
'proxy2.com:8031',
# ...
]
And I created this spider:
# Get the relative URL from http://website/<page>
next_page_url = response.url[17:]
if next_page_url == "contact.html":
    # Should retry the same page with a different proxy
    absolute_next_page = response.urljoin(last_page)
    yield Request(absolute_next_page)
else:
    next_page_url = int(next_page_url) + 1
    last_page = str(next_page_url).zfill(4)
    absolute_next_page = response.urljoin(last_page)
    yield Request(absolute_next_page)
But it raises this error: UnboundLocalError: local variable 'last_page' referenced before assignment
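The error seems to come from last_page being assigned only in the else branch, so the contact.html branch reads it before any value exists; a local variable also does not survive from one callback invocation to the next. A minimal sketch of the URL arithmetic as plain functions, independent of Scrapy, is shown here. The helper names page_slug and next_page_url are hypothetical, and the sketch assumes the page number is always the last path segment.

```python
from urllib.parse import urljoin, urlsplit


def page_slug(url):
    """Return the last path segment, e.g. '0001' or 'contact.html'."""
    return urlsplit(url).path.rsplit("/", 1)[-1]


def next_page_url(url):
    """Return the URL of the following zero-padded page.

    Returns None when the last path segment is not a number
    (for example the contact.html ban page).
    """
    slug = page_slug(url)
    if not slug.isdecimal():
        return None
    return urljoin(url, str(int(slug) + 1).zfill(4))
```

In the spider, the originally requested URL could travel with the request in Request.meta (under a key of your choosing), so the ban branch can re-request that URL with dont_filter=True instead of depending on last_page.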
How can I specify that the proxy is dead in this spider? Or is there another way to do the same thing?
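One possible direction, assuming the ROTATING_PROXY_LIST setting comes from the scrapy-rotating-proxies package: that package lets a spider define a response_is_ban(self, request, response) method, and a response flagged there marks the proxy as dead and gets the request retried through another proxy. The sketch below isolates the decision in a plain function so it can be checked without Scrapy. The name looks_like_ban and its parameters are hypothetical, and the rule (a redirect to, or a response from, contact.html) is inferred from the description above.

```python
BAN_PAGE = "contact.html"  # ban landing page named in the question

REDIRECT_STATUSES = (301, 302, 303, 307, 308)


def looks_like_ban(status, url, location=None):
    """Guess whether a response means the current proxy was detected.

    status:   HTTP status code of the response
    url:      URL the response was served from
    location: decoded Location header of a redirect, if any
    """
    # A redirect whose target is the ban page.
    if status in REDIRECT_STATUSES and location:
        return location.rstrip("/").endswith(BAN_PAGE)
    # Otherwise, check whether the ban page itself was served.
    return url.rstrip("/").endswith(BAN_PAGE)
```

Inside the spider, response_is_ban could call this helper with response.status, response.url, and the decoded Location header. Checking the redirect itself matters because, once the redirect is followed, the retried request would target contact.html rather than the original page.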