I downloaded an Excel file with four sheets from the web, but when I try to load it into a pandas DataFrame I can only read the first sheet. Reading any other sheet fails.

import requests
import pandas as pd

dls = "http://www.nasdaqomxnordic.com/digitalAssets/110/110149_the-nordic-list-july-12--2019.xlsx"

resp = requests.get(dls)

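# Save the downloaded bytes to disk; the with-block closes the file automatically
with open('test.xlsx', 'wb') as output:
    output.write(resp.content)

# sheet_name=1 selects the second sheet (indices are zero-based)
df = pd.read_excel("test.xlsx", sheet_name=1)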

I get the following error message: "TypeError: 'values' is not ordered, please explicitly specify the categories order by passing in a categories argument."

  • This person had the exact same issue. Have you tried something along the lines of the answers they received? https://stackoverflow.com/questions/52504709/how-to-explicitly-specify-the-categories-order-by-passing-in-a-categories-argum – Renato Byrro Jul 30 '19 at 22:34
  • @RenatoByrro Yes, I did, but the problem persists, so I came up with a less direct approach that works. – Guillem Fortó Jul 31 '19 at 23:48

1 Answer

There may be a more direct way to do this, but I ended up using the xlrd package, which lets me access any sheet by index, then xlsxwriter to copy that sheet into a new Excel file, and finally pandas to read the new file as a DataFrame.

import requests
import pandas as pd

dls = "http://www.nasdaqomxnordic.com/digitalAssets/110/110149_the-nordic-list-july-12--2019.xlsx"

resp = requests.get(dls)

with open('test.xlsx', 'wb') as output:
    output.write(resp.content)

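import xlrd  # note: xlrd 2.0+ reads only .xls files, so this step needs xlrd < 2.0

# Open the downloaded workbook and select the second sheet (zero-based index)
loc = 'test.xlsx'
wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(1)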

import xlsxwriter

# Copy every cell of the selected sheet into a new single-sheet workbook
workbook = xlsxwriter.Workbook('hello.xlsx')
worksheet = workbook.add_worksheet()
for i in range(sheet.nrows):
    for j in range(sheet.ncols):
        worksheet.write(i, j, sheet.cell_value(i, j))

workbook.close()

df = pd.read_excel("hello.xlsx")
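It may be possible to skip the intermediate hello.xlsx file by building the DataFrame straight from the rows that xlrd returns. The following is only a minimal sketch: rows_to_dataframe is a hypothetical helper (not part of pandas or xlrd), it assumes the first row of the sheet holds the column headers, and casting the header cells to str is my own assumption to keep column labels of a single type, not a confirmed fix for the error above.

```python
import pandas as pd


def rows_to_dataframe(rows, header_row=0):
    """Build a DataFrame from a list of row lists (hypothetical helper).

    rows       -- list of lists, one inner list per spreadsheet row
    header_row -- index of the row that holds the column names (assumed 0)
    """
    # Cast header cells to str so all column labels share one type
    header = [str(cell) for cell in rows[header_row]]
    # Everything after the header row is treated as data
    data = rows[header_row + 1:]
    return pd.DataFrame(data, columns=header)


# Usage with the xlrd sheet object from the code above:
# rows = [sheet.row_values(i) for i in range(sheet.nrows)]
# df = rows_to_dataframe(rows)
```

If the real headers sit on a different row of the sheet, pass that row's index as header_row; rows above it are ignored.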