
I am using Dataiku to read an Excel file from a folder. The download stream gives me back a bytes instance, which (apparently) holds the contents of the file, and I am trying to load it as a workbook.

import dataiku

# Read recipe inputs
WeeklyDataFolder = dataiku.Folder("WeeklyDataFolder")

# The download stream is binary, so read() returns a bytes object
with WeeklyDataFolder.get_download_stream("WeeklyData.xlsx") as f:
    excel_file = f.read()
    print(f"Excel File Read....type={type(excel_file)}")

Dataiku itself is not important here; what matters is that I get back an object of class "bytes".

When I try to pass this into xlrd using the file_contents parameter, I get an error.

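import xlrd

sheet_name = "Report Info"
row = 26  # xlrd row and column indices are 0-based
col = 3

# Pass the raw bytes via file_contents instead of a filename
wb = xlrd.open_workbook(file_contents=excel_file, on_demand=True)
sheet = wb.sheet_by_name(sheet_name)
cell_val = sheet.cell_value(row, col)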

The error I get is:

[22:40:15] [INFO] [dku.utils]  - *************** Recipe code failed **************
[22:40:15] [INFO] [dku.utils]  - Begin Python stack
[22:40:15] [INFO] [dku.utils]  - Traceback (most recent call last):
[22:40:15] [INFO] [dku.utils]  -   File "/dataiku/design-node/jobs/GLOBAL_PV_COMPLIANCE/Build_excel_bo_weekly_data_all_submissions_prepared__NP__2022-05-11T20-39-22.629/compute_excel_bo_weekly_data_all_submissions_prepared_NP/python-recipe/pyoutCueZQ0hx53Q7/python-exec-wrapper.py", line 208, in <module>
[22:40:15] [INFO] [dku.utils]  -     exec(f.read())
[22:40:15] [INFO] [dku.utils]  -   File "<string>", line 46, in <module>
[22:40:15] [INFO] [dku.utils]  -   File "/dataiku/design-node/code-envs/python/GlobalPvCompliance-Py-env/lib/python3.6/site-packages/xlrd/__init__.py", line 170, in open_workbook
[22:40:15] [INFO] [dku.utils]  -     raise XLRDError(FILE_FORMAT_DESCRIPTIONS[file_format]+'; not supported')
[22:40:15] [INFO] [dku.utils]  - xlrd.biffh.XLRDError: Excel xlsx file; not supported

Is there something I need to do to the bytes instance returned before passing it to xlrd?

I also tried this:

buf = io.StringIO(excel_file)
wb = xlrd.open_workbook(file_contents=buf, on_demand = True)
smackenzie
  • How do you interpret the last line of the error message? – Paul H May 11 '22 at 20:46
  • Well, I have now tried str(excel_file, "utf-8") inside StringIO, but am now getting this: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xab in position 12: invalid start byte – smackenzie May 11 '22 at 20:52
  • I copied the code straight out of a functioning Python notebook that has no issues reading the same xlsx file, although that one has the benefit of using a plain file path, @PaulH. The other project is using xlrd 1.1, and I am using 2+. I guess they removed xlsx support. – smackenzie May 11 '22 at 20:56
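
The guess in the last comment matches the traceback: xlrd recognised the bytes as an xlsx file and refused it, so nothing needs to be done to the bytes before passing them in. The sketch below illustrates two points from the thread, under stated assumptions. First, an .xlsx file is a ZIP archive of binary data, which is why decoding it as UTF-8 or feeding it to io.StringIO fails; io.BytesIO is the wrapper that accepts bytes. Second, a reader that handles xlsx can load the workbook from that in-memory buffer. The helper names `sniff_excel_format` and `read_cell_xlsx` are hypothetical, and the second helper assumes the third-party openpyxl package is available in the code environment; it is a possible approach, not the only one.

```python
import io
import zipfile


def sniff_excel_format(data: bytes) -> str:
    """Hypothetical helper: guess the container format from the raw bytes."""
    buf = io.BytesIO(data)  # bytes need BytesIO; StringIO only accepts str
    if zipfile.is_zipfile(buf):
        buf.seek(0)
        with zipfile.ZipFile(buf) as zf:
            # An xlsx workbook is a ZIP archive containing xl/workbook.xml
            if "xl/workbook.xml" in zf.namelist():
                return "xlsx"
        return "zip"
    # Signature of an OLE2 compound file, the usual container for legacy .xls
    if data[:8] == b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1":
        return "xls"
    return "unknown"


def read_cell_xlsx(data: bytes, sheet_name: str, row: int, col: int):
    """Sketch: read one cell with openpyxl, taking xlrd-style 0-based indices."""
    # Imported here so the rest of the module works without openpyxl installed
    from openpyxl import load_workbook  # third-party; assumed to be installed

    wb = load_workbook(io.BytesIO(data), data_only=True)
    ws = wb[sheet_name]
    # openpyxl counts rows and columns from 1, whereas xlrd counts from 0
    return ws.cell(row=row + 1, column=col + 1).value
```

With the variables from the question, the call would look like `cell_val = read_cell_xlsx(excel_file, "Report Info", 26, 3)`. Keeping the 0-based indices in the signature means the existing `row` and `col` values can be reused unchanged.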
