HTTP Error 403: Forbidden using Python3 & How to change IP when Crawling

Question

Below is my code. I want to get historical data from Google Finance. I have tried several ways to fix the error shown below, but it still doesn't work. I think Google Finance may have blocked my IP. If so, is there a solution?

import ssl
import urllib.request

url = 'http://finance.google.com/finance/historical?q=AMZN'
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'User-Agent': 'Mozilla/5.0',
}
# Unverified SSL context: disables certificate checks
context = ssl._create_unverified_context()
htmlll = urllib.request.Request(url, headers=headers)
html = urllib.request.urlopen(htmlll, context=context).read().decode()
datalist = html.splitlines()
# Extract the name from the third line of the returned HTML
name = datalist[2].split('>')[2].split(':')[0].replace('amp;', '')
print(name)

The error is as follows:

Traceback (most recent call last):
File "C:/Users/admin/Desktop/Price & Volume Database/Global Intraday Pricing Data/Asia Intraday Pricing/test.py", line 30, in <module>
html = urllib.request.urlopen(htmlll,context=context).read().decode()
File "C:\Users\admin\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 162, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\admin\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 471, in open
response = meth(req, response)
File "C:\Users\admin\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 581, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Users\admin\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 509, in error
return self._call_chain(*args)
File "C:\Users\admin\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 443, in _call_chain
result = func(*args)
File "C:\Users\admin\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 589, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

I have added headers, and the web address itself works. I cannot find what's wrong with it. Thank you so much for your help!

Have you tried entering that URL into a web browser? I did, and I got the following message: `... but your computer or network may be sending automated queries. To protect our users, we can't process your request right now.` — gyx-hh, Mar 17 '18 at 22:18
Do you mean this URL? http://finance.google.com/finance/historical?q=AMZN It works on my computer. — econofutmist, Mar 17 '18 at 22:33
Actually, this is just a sample query from my whole code. When the code runs, it works for a while and then stops with a 403 Forbidden error. After several minutes it works again and continues until the next 403 Forbidden. So I think my computer may be blocked by Google Finance. What do you think? — econofutmist, Mar 17 '18 at 22:39
Why not use something other than Google Finance? This SO [link](https://stackoverflow.com/questions/10040954/alternative-to-google-finance-api) has really good suggestions. — gyx-hh, Mar 17 '18 at 22:53
It's a homework assignment from my professor. I need to use Google Finance to solve the problem. But thanks for your suggestion anyway. — econofutmist, Mar 17 '18 at 22:58
Haven't I seen this very same question today and somebody told you that you should have a look at the Google search API, because Google blocks scraping attempts? — Mr. T, Mar 18 '18 at 00:35
Possible duplicate of [Is it ok to scrape data from Google results?](https://stackoverflow.com/questions/22657548/is-it-ok-to-scrape-data-from-google-results) — Mr. T, Mar 18 '18 at 00:44
Google has closed their API, so I don't know how to find one. I'm still trying to figure it out. — econofutmist, Mar 18 '18 at 02:17
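
The pattern described in the comments above (requests succeed for a while, fail with 403 for several minutes, then succeed again) is consistent with server-side rate limiting rather than a permanent block, though nothing on this page confirms that. Under that assumption, one possible approach is to retry with an exponentially growing delay instead of failing on the first 403. The sketch below is only an illustration: the function name `fetch_with_backoff` and its parameters (`max_attempts`, `base_delay`, `opener`, `sleep`) are hypothetical and not part of the original code, and treating 429 as retryable alongside 403 is also an assumption.

```python
import time
import urllib.error
import urllib.request


def fetch_with_backoff(url, headers=None, max_attempts=5, base_delay=1.0,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch url, retrying with exponentially growing delays on HTTP 403/429.

    opener and sleep are injectable (hypothetical parameters) so the retry
    logic can be exercised without network access or real waiting.
    """
    request = urllib.request.Request(url, headers=headers or {})
    for attempt in range(max_attempts):
        try:
            with opener(request) as response:
                return response.read().decode()
        except urllib.error.HTTPError as err:
            # Re-raise errors that do not look like throttling, and give up
            # once the last allowed attempt has failed.
            if err.code not in (403, 429) or attempt == max_attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

With the `url` and `headers` from the question, a call might look like `html = fetch_with_backoff(url, headers=headers, base_delay=60.0)`. The 60-second base delay is a guess based on the "several minutes" mentioned above. If the server keeps refusing requests, backing off alone will not help, and the final `HTTPError` is raised to the caller.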


0 Answers