
I am trying to get sector classifications for stock tickers using Python. How would I go about scraping them?

This is the function I wrote, but I was wondering if there is a better way to go about it.

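import re
import urllib2
import HTMLParser

def get_sector(ticker):
    # Fetch the Google Finance page for this ticker.
    req = urllib2.Request('http://google.com/finance?q=' + str(ticker))
    response = urllib2.urlopen(req)
    the_page = response.read()
    # Capture the link text of the "Sector:" anchor (group 2); non-greedy so the match stops at the first closing tag.
    output = re.search(r'Sector: <a id=sector href="(.*?)" >(.*?)</a>&gt;', the_page, flags=re.IGNORECASE)

    if output is not None:
        # Decode HTML entities (e.g. &amp;) in the sector name.
        return HTMLParser.HTMLParser().unescape(output.group(2))
    else:
        return 'Not Found'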

I receive the following error when I iterate it over the list of tickers in the Russell 3000:

URLError:
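
One possible way to keep the loop running when a request fails is to retry with a pause. This is a minimal sketch that assumes the URLError is transient (for example, dropped connections or throttling when thousands of pages are requested in a row), which the error above does not confirm. The helper `fetch_with_retry` and its parameters `fetch`, `retries`, `delay`, and `sleep` are hypothetical names chosen for illustration. It is written for Python 3, where `URLError` lives in `urllib.error`; on Python 2 it would be imported from `urllib2` instead.

```python
import time
from urllib.error import URLError


def fetch_with_retry(fetch, ticker, retries=3, delay=1.0, sleep=time.sleep):
    """Call fetch(ticker), retrying on URLError with a growing pause.

    fetch is any callable that takes a ticker and returns a sector string
    (hypothetical here; it stands in for a function like get_sector).
    """
    for attempt in range(retries):
        try:
            return fetch(ticker)
        except URLError:
            # Pause before the next attempt; skip the pause after the last one.
            if attempt < retries - 1:
                sleep(delay * (attempt + 1))
    # Every attempt failed: use the same sentinel as get_sector.
    return 'Not Found'
```

It could then be used as `sectors = {t: fetch_with_retry(get_sector, t) for t in tickers}`, so a single failing ticker yields 'Not Found' instead of stopping the whole run.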
