
Below is my code. I want to download historical data from Google Finance. I have tried several fixes, but the request still fails. I suspect Google Finance has blocked my IP address. If so, is there a workaround?

import ssl
import urllib.request

url = 'http://finance.google.com/finance/historical?q=AMZN'
# Browser-like headers, added in an attempt to avoid the 403
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'User-Agent': 'Mozilla/5.0',
}
context = ssl._create_unverified_context()  # skips certificate verification
htmlll = urllib.request.Request(url, headers=headers)
html = urllib.request.urlopen(htmlll,context=context).read().decode()
datalist = html.splitlines()
# Extract the name from the third line of the returned HTML
name = datalist[2].split('>')[2].split(':')[0].replace('amp;', '')
print(name)

The error is shown below:

Traceback (most recent call last):
File "C:/Users/admin/Desktop/Price & Volume Database/Global Intraday Pricing Data/Asia Intraday Pricing/test.py", line 30, in <module>
html = urllib.request.urlopen(htmlll,context=context).read().decode()
File "C:\Users\admin\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 162, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\admin\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 471, in open
response = meth(req, response)
File "C:\Users\admin\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 581, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Users\admin\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 509, in error
return self._call_chain(*args)
File "C:\Users\admin\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 443, in _call_chain
result = func(*args)
File "C:\Users\admin\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 589, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

I have added headers and the web address works, but I cannot find what is wrong. Thank you so much for your help!

  • Have you tried entering that URL into a web browser? I did, and I got the following message: `... but your computer or network may be sending automated queries. To protect our users, we can't process your request right now.` – gyx-hh Mar 17 '18 at 22:18
  • https://support.google.com/recaptcha/answer/6081888?hl=en – DYZ Mar 17 '18 at 22:22
  • Do you mean this url? http://finance.google.com/finance/historical?q=AMZN It works on my computer. – econofutmist Mar 17 '18 at 22:33
  • Actually, this is just a sample query from my full code. When the code runs, it works for a while and then stops with a 403 Forbidden. After several minutes it works again and continues until the next 403 Forbidden. So I think my computer may be blocked by Google Finance. What do you think? – econofutmist Mar 17 '18 at 22:39
  • Why not use something other than Google Finance? This SO [link](https://stackoverflow.com/questions/10040954/alternative-to-google-finance-api) has really good suggestions. – gyx-hh Mar 17 '18 at 22:53
  • It's a homework assignment from my professor. I need to use Google Finance to solve the problem. But thanks for your suggestion. – econofutmist Mar 17 '18 at 22:58
  • Haven't I seen this very same question today and somebody told you that you should have a look at the Google search API, because Google blocks scraping attempts? – Mr. T Mar 18 '18 at 00:35
  • Possible duplicate of [Is it ok to scrape data from Google results?](https://stackoverflow.com/questions/22657548/is-it-ok-to-scrape-data-from-google-results) – Mr. T Mar 18 '18 at 00:44
  • Google has closed their API, so I don't know how to find one. I'm still trying to figure it out. – econofutmist Mar 18 '18 at 02:17
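
The pattern described in the comments (requests succeed for a while, then fail with 403, then recover after a few minutes) looks more like rate limiting than a permanent IP block. If that is the case, one possible mitigation is to pause and retry with growing delays whenever a 403 comes back. The following is only a minimal sketch under that assumption: `fetch_with_backoff` and its parameters (`max_retries`, `base_delay`, `opener`, `sleep`) are hypothetical names introduced here, the 30-second starting delay is a guess, and nothing here is confirmed to satisfy Google's limits or terms of use.

```python
import time
import urllib.error
import urllib.request


def fetch_with_backoff(url, headers=None, max_retries=5, base_delay=30.0,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch url as text, waiting and retrying when the server answers 403 or 429.

    All parameters besides url are hypothetical knobs for this sketch.
    opener and sleep are injectable so the logic can be tested offline.
    """
    request = urllib.request.Request(url, headers=headers or {})
    for attempt in range(max_retries + 1):
        try:
            with opener(request) as response:
                return response.read().decode()
        except urllib.error.HTTPError as err:
            # Retry only on status codes that suggest throttling;
            # re-raise anything else, or give up after the last attempt.
            if err.code not in (403, 429) or attempt == max_retries:
                raise
            # Delay doubles each time: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)


# Example call (commented out so that importing this file makes no request):
# html = fetch_with_backoff(
#     'http://finance.google.com/finance/historical?q=AMZN',
#     headers={'User-Agent': 'Mozilla/5.0'},
# )
```

Adding a fixed pause between successful requests as well (for example `time.sleep(2)` inside the loop over tickers) may reduce how often the 403 is triggered in the first place, though the interval that works would have to be found by experiment.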
