I'm trying to import the following Excel file into pandas: https://rbnz.govt.nz/-/media/ReserveBank/Files/Statistics/tables/b2/hb2-daily-close.xlsx
I tried the following:
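import pandas as pd
url = "https://www.rbnz.govt.nz/-/media/ReserveBank/Files/Statistics/tables/b2/hb2-daily-close.xlsx"
# Read columns A and H from the "Data" sheet; the header row is at index 4
df = pd.read_excel(url, sheet_name="Data", header=4, usecols="A,H")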
but I get the following HTTPError: HTTP Error 503: Service Temporarily Unavailable
I thought the problem was missing request headers, so I tried the following, but I keep getting the same error:
import pandas as pd
from urllib.request import Request, urlopen
url = "https://www.rbnz.govt.nz/-/media/ReserveBank/Files/Statistics/tables/b2/hb2-daily-close.xlsx"
req = Request(url)
# Pretend to be a desktop browser
req.add_header('User-Agent', 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0')
content = urlopen(req)
df = pd.read_excel(content, sheet_name="Data", header=4, usecols="A,H")
Any thoughts? Thanks
PS: It looks like the website is protected by Cloudflare (see the related question "How to get around Newspaper throwing 503 exceptions for certain webpages"). Selenium is probably the only solution here.
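
To check the Cloudflare suspicion rather than guess, one option is to look at the headers on the error response. Below is a minimal sketch, not a definitive test: `looks_like_cloudflare` and `probe` are hypothetical helper names, and the heuristic assumes a blocked request comes back as 403 or 503 with a `Server: cloudflare` or `CF-RAY` header. It only diagnoses the block; it does not get around it.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def looks_like_cloudflare(status, headers):
    """Heuristic (assumption): a 403/503 whose headers mention Cloudflare."""
    # Normalise header names and values so the check is case-insensitive.
    names = {k.lower(): str(v).lower() for k, v in headers.items()}
    blocked = status in (403, 503)
    return blocked and ("cf-ray" in names or names.get("server") == "cloudflare")


def probe(url, user_agent="Mozilla/5.0"):
    """Return (status, is_cloudflare) for url. Hypothetical helper."""
    req = Request(url, headers={"User-Agent": user_agent})
    try:
        with urlopen(req) as resp:
            return resp.status, False
    except HTTPError as err:
        # HTTPError carries the status code and the response headers.
        return err.code, looks_like_cloudflare(err.code, err.headers)


# Usage (performs a real request, so it is left commented out):
# status, is_cf = probe(url)
```

If `probe` reports a Cloudflare block, changing the User-Agent alone is unlikely to help, which would be consistent with the same 503 appearing in both attempts above.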