How would I go about scraping sectors for tickers on Yahoo/Google Finance using Python?

Asked Nov 10 '16 at 20:55

I am trying to get sector classifications for tickers using Python. How should I go about scraping them?

This is a function I wrote, but I was wondering if there is a better way to go about it.

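import re
import urllib2
import HTMLParser

def get_sector(ticker):
    # Fetch the Google Finance page for this ticker (Python 2 standard library).
    req = urllib2.Request('http://google.com/finance?q=' + str(ticker))
    response = urllib2.urlopen(req)
    the_page = response.read()
    # The sector name is the text of the <a id=sector ...> link.
    output = re.search(r'Sector: <a id=sector href="(.*)" >(.*)</a>&gt;', the_page, flags=re.IGNORECASE)

    if output is not None:
        # Group 2 is the link text; unescape HTML entities such as &amp;.
        return HTMLParser.HTMLParser().unescape(output.group(2))
    else:
        return 'Not Found'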

I receive the following error when I call it for each ticker in the Russell 3000:

URLError:
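The error text above is cut off, so its cause can't be read from the post. One common way to keep a long loop over many tickers alive is to catch `URLError` per request, wait, and retry. Below is a minimal sketch of that idea, written for Python 3 (`urllib.request` and `urllib.error` replace `urllib2`). The name `fetch_with_retry` and its `retries`, `delay`, and `opener` parameters are hypothetical, introduced only for illustration.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, retries=3, delay=1.0, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if every attempt fails.

    `opener` is a hypothetical hook so the function can be exercised
    without network access; by default it is urllib.request.urlopen.
    """
    for attempt in range(1, retries + 1):
        try:
            response = opener(url)
            try:
                return response.read()
            finally:
                response.close()
        except urllib.error.URLError as exc:
            # HTTPError is a subclass of URLError, so it is caught here too.
            print("attempt %d for %s failed: %s" % (attempt, url, exc))
            if attempt == retries:
                return None
            # Wait a little longer after each failed attempt.
            time.sleep(delay * attempt)
    return None
```

With something like this, `get_sector` could return 'Not Found' when the fetch returns `None` instead of letting one failed request stop the whole loop.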

edited Nov 10 '16 at 21:02

Karan Teckchandani

@MooingRawr this doesn't look like bug-free code: _I am receiving the following error [...]_ so no, off-topic for CR. – t3chb0t Nov 10 '16 at 21:04
I think you need [beautifulsoup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) – Yevhen Kuzmovych Nov 10 '16 at 21:42
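
The suggestion above amounts to using an HTML parser instead of a regex. With BeautifulSoup the lookup would be roughly `soup.find('a', id='sector')`. The sketch below shows the same idea in Python 3 using only the standard library's `html.parser`, so it runs without installing anything. It assumes the page still contains a link of the form `<a id=sector href="...">`, which is taken from the regex in the question and may no longer match the live page. `SectorParser` and `extract_sector` are hypothetical names.

```python
from html.parser import HTMLParser


class SectorParser(HTMLParser):
    """Collect the text of the first <a id="sector"> element."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.in_sector = False
        self.sector = None
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Assumption: the sector link carries id=sector, as in the question's regex.
        if tag == "a" and self.sector is None and dict(attrs).get("id") == "sector":
            self.in_sector = True

    def handle_data(self, data):
        if self.in_sector:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.in_sector:
            self.in_sector = False
            self.sector = "".join(self._parts).strip()


def extract_sector(html):
    """Return the sector name found in `html`, or 'Not Found'."""
    parser = SectorParser()
    parser.feed(html)
    parser.close()
    return parser.sector if parser.sector else "Not Found"
```

Unlike the greedy `(.*)` groups in the regex, this does not depend on the exact spacing or attribute order inside the tag.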
