0

I am having troubles opening excel files (xlsx) in pandas. I had a small application that used to work, but then another application with the same code was failing miserably when opening the same files.

The files have not changed since and I cannot change them (they are provided externally)

After some digging I was able to find out that this seems to be linked to the version of openpyxl:

  • If the version is set to openpyxl==3.0.9: files are opened without failing.
  • If the version is openpyxl==3.1.1: 2 out of three files fail to open raising exception 'ReadOnlyWorksheet' object has no attribute 'defined_names'

File "path\venvtest\lib\site-packages\openpyxl\reader\workbook.py", line 109, in assign_names sheet.defined_names[name] = defn AttributeError: 'ReadOnlyWorksheet' object has no attribute 'defined_names'

import pandas as pd

folder = 'C:\\Path\\of\\folder\\'
FILES = [
    folder + 'file1.xlsx',
    folder + 'file2.xlsx',
    folder + 'file3.xlsx'
]

for file in FILES :
    #make sure the file is readable by pandas
    try :
        sheet_name = 'Inventory Summary'
        df = pd.read_excel(file, sheet_name=sheet_name, engine='openpyxl')
        print(f'file : OK')
     
    except Exception as e:
        print(f'file : ERROR : {file}')
        print(e)
        continue

I can obviously keep the old version for now, but I am curious to know if anyone has a clearer understanding of what is going on.

Wing
  • 642
  • 2
  • 5
  • 16
  • Known issue: https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1947 – Corralien Mar 08 '23 at 08:24
  • Here's the library's issue tracker for this item: https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1963 a commit appears to have been made to fix it in the next version – TZHX Mar 08 '23 at 08:28
  • Oh great, I missed it. Super weird behavior that took me forever to understand – Wing Mar 08 '23 at 09:38

0 Answers0