How to fix new unable to read URL error in python for yahoo finance

Question

I have been using this code to extract (scrape) stock prices from Yahoo Finance for the last year, but now it produces an error. Does anyone know why this is happening and how to fix it?


# Importing necessary packages
from pandas_datareader import data as web
import datetime as dt
import matplotlib.pyplot as plt
import pandas as pd
import os
import numpy as np

# Stock selection from Yahoo Finance
stock = input("Enter stock symbol or ticker symbol (e.g. General Electric is 'GE'): ")

# Visualizing the stock over time and setting up the dataframe
start_date = (dt.datetime.now() - dt.timedelta(days=40000)).strftime("%m-%d-%Y")
df = web.DataReader(stock, data_source='yahoo', start=start_date)
#THE ERROR IS ON THIS LINE^

plt.plot(df['Close'])
plt.title('Stock Prices Over Time',fontsize=14)
plt.xlabel('Date',fontsize=14)
plt.ylabel('Mid Price',fontsize=14)
plt.show()

RemoteDataError: Unable to read URL: https://finance.yahoo.com/quote/MCD/history?period1=-1830801600&period2=1625284799&interval=1d&frequency=1d&filter=history Response Text: b'\n \n \n \n Yahoo\n \n \n \n html {\n height: 100%;\n }\n body {\n background: #fafafc url(https://s.yimg.com/nn/img/sad-panda-201402200631.png) 50% 50%;\n background-size: cover;\n height: 100%;\n text-align: center;\n font: 300 18px "helvetica neue", helvetica, verdana, tahoma, arial, sans-serif;\n }\n table {\n height: 100%;\n width: 100%;\n table-layout: fixed;\n border-collapse: collapse;\n border-spacing: 0;\n border: none;\n }\n h1 {\n font-size: 42px;\n font-weight: 400;\n color: #400090;\n }\n p {\n color: #1A1A1A;\n }\n #message-1 {\n font-weight: bold;\n margin: 0;\n }\n #message-2 {\n display: inline-block;\n *display: inline;\n zoom: 1;\n max-width: 17em;\n _width: 17em;\n }\n \n \n document.write('&test=\'+encodeURIComponent(\'%\')+\'" width="0px" height="0px"/>');var beacon = new Image();beacon.src="//bcn.fp.yahoo.com/p?s=1197757129&t="+ne...

Scraping data off Yahoo Finance is not so simple. They require a session token with every request. Without this token, you get a generic HTML page instead of the data, hence your error. If you are using Chrome, open the Developer Console and go to the Network tab to see it. It's better to use a dedicated stock API like [AlphaVantage](https://www.alphavantage.co) — Code Different, Jul 02 '21 at 14:52
Is this new? I was surprised because this was working fine up until this week. — iceAtNight7, Jul 02 '21 at 15:06
As far as I know, they have had that anti-scraping mechanism for a while. Depending on what scraping library you use, you might have been able to bypass it until they updated their code. I stopped using Yahoo Finance several years ago. As I said, AlphaVantage is a better option — Code Different, Jul 02 '21 at 15:15
Ok, I will look into figuring out how to implement that then! Thank you for your help!! — iceAtNight7, Jul 03 '21 at 02:16
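
For anyone following the AlphaVantage suggestion from the comments above, here is a minimal sketch of building a daily-history request URL. It assumes Alpha Vantage's documented `TIME_SERIES_DAILY` query function with `datatype=csv`. The helper name `alpha_vantage_daily_url` and the `YOUR_API_KEY` placeholder are made up for illustration, and you need your own API key.

```python
from urllib.parse import urlencode

ALPHA_VANTAGE_BASE = "https://www.alphavantage.co/query"

def alpha_vantage_daily_url(symbol, api_key):
    """Build the URL for a daily price history CSV (hypothetical helper)."""
    params = {
        "function": "TIME_SERIES_DAILY",  # daily open/high/low/close/volume
        "symbol": symbol,
        "datatype": "csv",                # ask for CSV instead of the default JSON
        "apikey": api_key,                # placeholder; use your own key
    }
    return ALPHA_VANTAGE_BASE + "?" + urlencode(params)

url = alpha_vantage_daily_url("GE", "YOUR_API_KEY")
print(url)
```

The resulting URL can then be handed to `pd.read_csv(url)`. Check the AlphaVantage documentation for the request limits that apply to your key.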

score 7 · Accepted Answer · answered Jan 06 '22 at 10:58

I use this code to extract data from Yahoo:

import pandas as pd

stock_ticker = 'MCD'  # example symbol (the one from the question's error message)
start = int(pd.Timestamp('2007-01-01', tz='UTC').timestamp())  # convert to Unix timestamp (seconds)
end = int(pd.Timestamp('2020-12-31', tz='UTC').timestamp())  # convert to Unix timestamp (seconds)
url = f'https://query1.finance.yahoo.com/v7/finance/download/{stock_ticker}?period1={start}&period2={end}&interval=1d&events=history'
df = pd.read_csv(url)
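
The URL assembly in this answer can also be wrapped in a small helper, which keeps the date conversion in one place. This is only a sketch using the standard library: the names `to_unix` and `yahoo_csv_url` are hypothetical, and the endpoint and query parameters are simply the ones from the answer, which Yahoo may change or block at any time.

```python
import calendar
from datetime import date
from urllib.parse import urlencode

# Base URL taken from the answer above
Y_BASE = "https://query1.finance.yahoo.com/v7/finance/download/"

def to_unix(iso_day):
    """Seconds since the epoch for midnight UTC on an ISO date (yyyy-mm-dd)."""
    return calendar.timegm(date.fromisoformat(iso_day).timetuple())

def yahoo_csv_url(ticker, start_day, end_day):
    """Assemble the CSV download URL (hypothetical helper)."""
    query = urlencode({
        "period1": to_unix(start_day),
        "period2": to_unix(end_day),
        "interval": "1d",
        "events": "history",
    })
    return Y_BASE + ticker + "?" + query

url = yahoo_csv_url("MCD", "2007-01-01", "2020-12-31")
print(url)
```

As in the answer, the URL is then passed to `pd.read_csv(url)`.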

score 4 · Answer 2 · answered Jul 08 '21 at 13:43

I had the same problem. At some recent point pandas-datareader (pdr) stopped working with Yahoo (again). The alternatives all have drawbacks: AlphaVantage doesn't carry all the stocks that Yahoo does; the googlefinance package only gets current quotes as far as I can tell, not time series; the yahoo-finance package doesn't work (or I failed to get it to work); Econdb sends back some kind of weirdly-formed dataframe (maybe this is fixable); and Quandl has a paywall on non-US stocks.

So because I'm cheap, I looked into the Yahoo CSV download functionality and came up with this, which returns a df pretty much like pdr does:

import pandas as pd
from datetime import datetime as dt
import calendar
import io
import requests

# Yahoo history csv base url
yBase = 'https://query1.finance.yahoo.com/v7/finance/download/'
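yHeaders = {
    # browser-like User-Agent, per the author's follow-up comment below: Yahoo started returning 403 to the default python-requests agent
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0',
    'Accept': 'text/csv;charset=utf-8'
    }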

def getYahooDf(ticker, startDate, endDate=None): # dates in ISO format
    start = dt.fromisoformat(startDate) # To datetime.datetime object
    fromDate = calendar.timegm(start.utctimetuple()) # To Unix timestamp format used by Yahoo
    if endDate is None:
        end = dt.now()
    else:
        end = dt.fromisoformat(endDate)
    toDate = calendar.timegm(end.utctimetuple())
    params = { 
        'period1': str(fromDate),
        'period2': str(toDate),
        'interval': '1d',
        'events': 'history',
        'includeAdjustedClose': 'true'
    }
    response = requests.request("GET", yBase + ticker, headers=yHeaders, params=params)
    if response.status_code < 200 or response.status_code > 299:
        return None
    else:
        csv = io.StringIO(response.text)
        df = pd.read_csv(csv, index_col='Date')
        return df

And today this doesn't work either. I got a 403 Forbidden. I managed to fix it by spoofing the user agent in the headers, but I have a feeling Yahoo may jump on that soon too. It would be great if they would give us some guidelines, like "use this apikey", or "ok, but throttle your requests to 1000 an hour", or even "we're not an API, go away" — PeteCahill, Jul 10 '21 at 13:36
If I get this script to work well for me, you will have solved a big problem for me, since every so often Yahoo gives me the error that is the subject of this question. Dates in Python confuse inexperienced users like me. If the start date I want is "2017-1-4", is this what I should enter? "2019-04-05T00: 00: 0Z". Thanks for your solution. — efueyo, Jul 10 '21 at 16:40
Hi. You're welcome! The function takes dates in ISO format, i.e. yyyy-mm-dd. You don't need the time part. So your example date would be 2017-01-04 (assuming you mean the fourth of January and not the first of April ;-) ). As I mentioned in my earlier comment, you'll need to spoof the user agent in the header as Yahoo now seem to be rejecting requests from the python requests package. My headers dictionary now looks like `yHeaders = { 'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0', 'Accept': 'text/csv;charset=utf-8' }` — PeteCahill, Jul 11 '21 at 18:16
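
To make the date discussion in these comments concrete, here is a small sketch of turning a loosely written date such as "2017-1-4" into the zero-padded ISO form the function expects, and of the Unix timestamp that the answer's conversion produces for it. The helper name `to_iso_day` is hypothetical, and year-month-day order is assumed.

```python
import calendar
from datetime import datetime as dt

def to_iso_day(text):
    """Normalise 'yyyy-m-d' input to zero-padded ISO 'yyyy-mm-dd' (hypothetical helper)."""
    # strptime accepts months and days without leading zeros
    return dt.strptime(text, "%Y-%m-%d").date().isoformat()

iso_day = to_iso_day("2017-1-4")

# Same conversion as in getYahooDf: ISO date -> Unix timestamp (midnight UTC)
timestamp = calendar.timegm(dt.fromisoformat(iso_day).utctimetuple())

print(iso_day, timestamp)
```

With this, `getYahooDf(ticker, to_iso_day("2017-1-4"))` would receive the date in the format the answer's function parses.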

score 3 · Answer 3 · edited Nov 30 '21 at 18:43

This also works if you set the headers on a session object and pass that session to the data reader (a cached session is used here, for caching purposes):

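import datetime
import requests_cache
from pandas_datareader import data as web

# cache lifetime; 3 days is only an example value
expire_after = datetime.timedelta(days=3)
session = requests_cache.CachedSession(cache_name='cache', backend='sqlite', expire_after=expire_after)

# just add headers to your session and provide it to the reader
session.headers.update({
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0',
    'Accept': 'application/json;charset=utf-8',
})

# stock_names, start and end are assumed to be defined earlier
data = web.DataReader(stock_names, 'yahoo', start, end, session=session)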

edited Nov 30 '21 at 18:43

answered Aug 08 '21 at 16:59

Maks


What is `requests_cache` here? – jason m Nov 10 '21 at 01:43
It's an additional library to cache requests locally – Maks Nov 11 '21 at 08:44
You can also use `requests.sessions.Session` from `requests` package according to the docs – Maks Nov 11 '21 at 09:06
You should add it to your answer. This is not intuitive. – jason m Nov 11 '21 at 16:01
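
As these comments suggest, a plain `requests.Session` can stand in for the cached session when caching is not needed. The following is a minimal sketch: the function name `make_browser_like_session` is hypothetical, the header values are the ones from the answer, and the `DataReader` call is left as a comment because it needs network access and the caller's own `stock_names`, `start`, and `end`.

```python
import requests

def make_browser_like_session():
    """Return a plain requests.Session carrying the headers from the answer."""
    session = requests.Session()
    # update() adds or overrides headers while keeping requests' other defaults
    session.headers.update({
        "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0",
        "Accept": "application/json;charset=utf-8",
    })
    return session

session = make_browser_like_session()

# Then, as in the answer (names assumed to be defined by the caller):
# data = web.DataReader(stock_names, 'yahoo', start, end, session=session)
```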

score 0 · Answer 4 · edited Apr 11 '22 at 12:12

If you are using Google Colab, first upgrade the libraries:

!pip install --upgrade pandas-datareader

!pip install --upgrade pandas

Then restart the workspace (the Colab runtime) and re-run the notebook so that the upgraded versions are loaded.

Hope it works! :)

edited Apr 11 '22 at 12:12

answered Mar 28 '22 at 12:08

Jose


unfortunately doesn't work – olucube.com Mar 30 '22 at 10:36
Did you restart the workspace and re-run? – Jose Mar 31 '22 at 11:45
Yes, after restarting the google colab, it works! Thanks Jose! – olucube.com Apr 01 '22 at 07:05
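
Since the restart turned out to be the missing step here, it can help to check which versions are installed after the upgrade. This is a small sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# Report what pip has installed for the two packages upgraded above
for name in ("pandas", "pandas-datareader"):
    print(name, installed_version(name))
```

This reports what is installed on disk. If `pandas.__version__` in the running notebook shows an older version than the one printed here, the runtime has not been restarted since the upgrade.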

score 0 · Answer 5 · edited Apr 04 '22 at 14:06


pip install yfinance

import yfinance as yf

TWTR = yf.Ticker('TWTR')
ticker = TWTR.history(period='1y')[['Open', 'High', 'Low', 'Close', 'Volume']]  # return is

edited Apr 04 '22 at 14:06

baileythegreen


answered Apr 02 '22 at 10:52

Muhammad Movahedi



score 0 · Answer 6 · edited Mar 10 '23 at 08:06


!pip install yfinance

import yfinance as yf

start_date = '2010-01-01'
end_date = '2022-03-04'

df = yf.download('AAPL', start=start_date, end=end_date)

print(df)

edited Mar 10 '23 at 08:06

Wasit Shafi


answered Mar 04 '23 at 23:14

eliasso574gmailcom

