I am working with very outdated files downloaded from a public website. They are in .xls format.
When I run the code below, pandas raises an error.
Code:
import pandas as pd
# Open the workbook once, then read one sheet from it by name.
wb = pd.ExcelFile("file_name.xls")
first_sheet = pd.read_excel(wb, sheet_name="First Tab Name")
Error:
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\xef\xbb\xbf<?xml'
When I open the file in Excel, I receive a message which reads: "The file format and extension of 'FileName.xls' don't match. The file could be corrupted or unsafe. Unless you trust its source, don't open it. Do you want to open it anyway?"
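The bytes quoted in the error (b'\xef\xbb\xbf<?xml') look like a UTF-8 byte-order mark followed by an XML declaration, which would suggest the download is XML text rather than a binary .xls workbook. Here is a minimal sketch of how the leading bytes of each file could be checked. The helper name sniff_format and the labels it returns are made up for illustration; only the byte signatures themselves are standard.

```python
# Well-known leading byte signatures.
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy binary .xls container
ZIP_MAGIC = b"PK\x03\x04"                          # .xlsx files are ZIP archives
UTF8_BOM = b"\xef\xbb\xbf"


def sniff_format(path):
    """Guess a file's real format from its first bytes (hypothetical helper)."""
    with open(path, "rb") as handle:
        head = handle.read(64)
    if head.startswith(OLE2_MAGIC):
        return "xls"
    if head.startswith(ZIP_MAGIC):
        return "xlsx"
    # Drop an optional UTF-8 BOM and leading whitespace before looking for markup.
    if head.startswith(UTF8_BOM):
        head = head[len(UTF8_BOM):]
    if head.lstrip().startswith(b"<?xml"):
        return "xml"
    return "unknown"
```

If this reports "xml" for the downloaded files, that would be consistent with both the XLRDError and the Excel warning about the extension not matching the format.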
However, after I use Excel to re-save the file as .xls or .xlsx, pandas reads it just fine.
I need to process several files at once as I download them, so manually re-saving each one is unfortunately not an option.
I've tried openpyxl, xlrd, and xls2xlsx, but I still receive the same error.
The file initially downloads as a zip archive. I am using the zipfile module to unzip it to the .xls file.
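For reference, the unzip step looks roughly like the sketch below. The function name extract_xls and its arguments are placeholders rather than my exact code. As far as I understand, zipfile writes each member's bytes out unchanged, so I would not expect this step to be what alters the file format.

```python
import zipfile
from pathlib import Path


def extract_xls(zip_path, out_dir):
    """Extract every member ending in .xls and return the extracted paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".xls"):
                # ZipFile.extract returns the path of the file it wrote.
                extracted.append(Path(archive.extract(name, path=out_dir)))
    return extracted
```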
I am at a loss as to what I could be missing.