I am trying to scrape quotes from Google Finance's new site, as the old one is going to be deprecated soon. I have written some code to extract stock quotes, but it is painfully slow: it takes about 2 minutes to return a single quote, and it returns only a few quotes each time I run the program.

    import urllib
    import re
    import time

    def get_quote(symbol):
        base_url = 'http://google.com/finance?q='
        content = urllib.urlopen(base_url + symbol).read()
        m = re.search('id="ref_(.*?)">(.*?)<', content)
        if m:
            quote = m.group(2)
            print quote,m
        else:
            quote = 'no quote available for: ' + symbol
        return quote
    while True:
        get_quote('AMZN')

Output:

1,500.00 <_sre.SRE_Match object at 0x109f66360>

1,500.00 <_sre.SRE_Match object at 0x109f66360>

1,500.00 <_sre.SRE_Match object at 0x109f66360>

If you print the variable m on each iteration of the loop, you will see that most of the time it is None, meaning the regex found no match in the returned page.

How do I fix this?
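
For reference, the loop above re-requests the page with no pause and throws away the return value, so failed lookups only show up as missing output. The following is a minimal sketch of a paced polling loop, assuming Python 3. The names `poll_quotes`, `fetch`, `interval_seconds`, and `max_polls` are hypothetical and not part of the original code; `fetch` stands in for any function that returns a quote string or `None`.

```python
import time


def poll_quotes(fetch, symbol, interval_seconds=15.0, max_polls=3, sleep=time.sleep):
    """Call fetch(symbol) up to max_polls times, pausing between calls.

    fetch is a hypothetical callable returning a quote string or None.
    Returns a list of (symbol, quote) pairs for the successful lookups only.
    """
    results = []
    for attempt in range(max_polls):
        quote = fetch(symbol)
        if quote is not None:
            results.append((symbol, quote))
        # Pause between requests, but not after the final one.
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    return results
```

Passing `sleep` as a parameter keeps the loop easy to exercise without waiting; in normal use the default `time.sleep` applies and a call might look like `poll_quotes(get_quote, 'AMZN')`, provided `get_quote` returns `None` on a failed lookup.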

  • https://www.alphavantage.co/ instead? Also, parsing HTML with regex is usually a bad idea. – G_M Feb 25 '18 at 04:32
  • Alpha Vantage offers 1-minute intraday quotes. I am looking to scrape from Google because it offers real-time quotes. – Urvish Ramaiy Feb 25 '18 at 04:34
  • Scraping from Google is a bad idea; they will detect it. Plus, if your code takes two minutes to get a single result, simply using Alpha Vantage speeds up your current solution by 100% – G_M Feb 25 '18 at 04:36
  • But will they detect it if I only want to pull a quote every 10-15 seconds? – Urvish Ramaiy Feb 25 '18 at 04:44
  • A better question might be why would you think they couldn't detect that a request every 10-15 seconds is automated? I'm not saying you shouldn't try, go ahead. – G_M Feb 25 '18 at 04:47
  • https://stackoverflow.com/q/22657548/8079103 – G_M Feb 25 '18 at 04:57
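
To illustrate the point in the comments about not parsing HTML with a regex, here is a minimal sketch that uses the standard-library `html.parser` instead, assuming Python 3. It assumes the quote sits in an element whose `id` starts with `ref_`, which is taken from the regex in the question and may not match the markup of the new Google Finance site. The names `RefQuoteParser` and `extract_quote` are hypothetical.

```python
from html.parser import HTMLParser


class RefQuoteParser(HTMLParser):
    """Collect the text of the first element whose id starts with 'ref_'."""

    def __init__(self):
        super().__init__()
        self._capturing = False
        self.quote = None

    def handle_starttag(self, tag, attrs):
        # Only look for a new candidate while no quote has been found yet.
        if self.quote is None:
            element_id = dict(attrs).get("id") or ""
            if element_id.startswith("ref_"):
                self._capturing = True

    def handle_data(self, data):
        if self._capturing:
            text = data.strip()
            if text:
                self.quote = text
            self._capturing = False

    def handle_endtag(self, tag):
        # An empty element yields no data; stop capturing at its end tag.
        self._capturing = False


def extract_quote(html):
    """Return the quote text, or None when no matching element is present."""
    parser = RefQuoteParser()
    parser.feed(html)
    parser.close()
    return parser.quote
```

Returning `None` rather than an error string lets the caller tell a missing quote apart from a real one, which is the case the question reports happening most of the time.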

1 Answer

How about this as an option?

from pandas_datareader import data
import matplotlib.pyplot as plt
import seaborn; seaborn.set()

goog = data.DataReader('GOOG', start='2004', end='2016',
                       data_source='google')
goog.head()

goog = goog['Close']
goog.plot();

enter image description here

ASH