3

I tried to fetch data from Google Finance with the following code:

import pandas_datareader.data as wb
import datetime as dt
start = dt.datetime(2015, 1, 1)
end = dt.datetime(2017, 1, 1)

# Use a separate name so the result does not shadow the datetime alias `dt`
df = wb.DataReader('FB', 'google', start, end)
df.head()

and I got this error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 24697: invalid start byte

But if I change 'google' to 'yahoo' (using Yahoo Finance instead), it works fine. So what is going wrong with the 'google' source?

saga
  • I got a similar error tonight when passing a list of stocks in a script that worked up until today. The link below may have something to do with this change: https://github.com/rsvp/fecon235/issues/7#issuecomment-332572738 – MARK CONNOLLY Nov 30 '17 at 01:21

2 Answers

3

There is an open issue here.

A quick fix is below. It is ported from the library source, pared down, with a few slight tweaks.

I believe the issue lies in how the body returned by requests.get() is read and decoded. (The traceback agrees with this.) For instance, data = requests.get(url).content returns raw bytes, and this will fail once those bytes are decoded as UTF-8, because of the byte 0xa0. Below, data = requests.get(url).text works, since requests decodes the body to a str itself.
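To illustrate why the raw bytes trip up a UTF-8 decoder, here is a minimal sketch. The payload is a hypothetical stand-in, not the actual Google response, and Latin-1 is only an assumed source encoding in which 0xa0 is a non-breaking space.

```python
# Hypothetical payload: ASCII text plus one 0xa0 byte, as in the traceback.
raw = b"Price:\xa0263.76"

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # 0xa0 is a UTF-8 continuation byte, so it cannot start a character.
    error_message = str(exc)

# Assumption: the bytes are Latin-1, where 0xa0 maps to U+00A0.
decoded = raw.decode("latin-1")

print(error_message)
print(repr(decoded))
```

The same byte decodes cleanly once a compatible encoding is used, which is consistent with the decoded .text working where the raw bytes do not.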

I really haven't tested this rigorously but the Google API does appear to be working okay. For instance, the export link generated by url does work just fine at the moment.

import datetime
from io import StringIO
from urllib.parse import urlencode  # stdlib function; avoids relying on pandas.io.common internals

import pandas as pd
import requests

BASE = 'http://finance.google.com/finance/historical'


def get_params(symbol, start, end):
    params = {
        'q': symbol,
        'startdate': start.strftime('%Y/%m/%d'),
        'enddate': end.strftime('%Y/%m/%d'),
        'output': "csv"
    }
    return params


def build_url(symbol, start, end):
    params = get_params(symbol, start, end)
    return BASE + '?' + urlencode(params)


start = datetime.datetime(2010, 1, 1)
end = datetime.datetime.today()
sym = 'SPY'
url = build_url(sym, start, end)

data = requests.get(url).text
data = pd.read_csv(StringIO(data), index_col='Date', parse_dates=True)

print(data.head())
#               Open    High     Low   Close     Volume
# Date
# 2017-11-30  263.76  266.05  263.67  265.01  127894389
# 2017-11-29  263.02  263.63  262.20  262.71   77512102
# 2017-11-28  260.76  262.90  260.66  262.87   98971719
# 2017-11-27  260.41  260.75  260.00  260.23   52274922
# 2017-11-24  260.32  260.48  260.16  260.36   27856514
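To check the export link by hand, the URL-building step can be run on its own without making any request. This is a sketch under two assumptions: it uses the standard library's urlencode, and it pins the end date to 2017-11-30 so the result is reproducible.

```python
import datetime
from urllib.parse import urlencode

BASE = 'http://finance.google.com/finance/historical'


def build_url(symbol, start, end):
    # Same query parameters as in the answer's get_params().
    params = {
        'q': symbol,
        'startdate': start.strftime('%Y/%m/%d'),
        'enddate': end.strftime('%Y/%m/%d'),
        'output': 'csv',
    }
    return BASE + '?' + urlencode(params)


# Fixed dates (an assumption for illustration) instead of datetime.today().
url = build_url('SPY', datetime.datetime(2010, 1, 1), datetime.datetime(2017, 11, 30))
print(url)
```

Note that urlencode percent-encodes the slashes in the dates as %2F.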

Edit: The issue should be fixed in version 0.6.0 of pandas_datareader. If not, please reopen it, as bashtage requested.

Or B
Brad Solomon
-2

from pandas_datareader import data

# With data_source='google' this hits the error; use 'yahoo' instead to get an answer
goog = data.DataReader('GOOG', start='2004', end='2016', data_source='google')
goog.head()