I'm trying to parse .ods files with pandas using the pd.read_excel() function, which uses odf under the hood. The problem is simple: some cells have comments, and pandas treats them as if they were regular cell content.

Here is a basic example. Given a very simple .ods file with a single comment:

[screenshot: spreadsheet with a comment attached to one cell]

Importing that file into a dataframe using

import pandas as pd 
df = pd.read_excel("example_with_comment.ods")

gives:

[screenshot: resulting DataFrame, with the comment text included in the cell value]

whereas I would like to retrieve only the content of the cell. Does anyone know how to drop the comments during parsing?

I'm using pandas 1.3.4.

Thanks a lot to anyone who can give me a hint!

Clej

1 Answer

It seems like a bug. Instead of read_excel, you could try this module:

https://pypi.org/project/pandas-ods-reader/

Liutprand
  • That does solve the bug. However, I cannot use some useful options from pandas.read_csv(), such as `skiprows=`; plus it adds a dependency. Well, I suppose it can't be helped. – Clej Mar 25 '22 at 16:47
  • It also loses a lot of features, like the automatic conversion from empty values to np.nan :/ – Clej Mar 25 '22 at 17:10
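
For anyone who would rather keep pd.read_excel() and its options, one possible workaround is to strip the comments from a copy of the file before pandas sees it. An .ods file is a zip archive whose cell data lives in content.xml, and comments are stored there as office:annotation elements. The sketch below uses only the standard library; the function names strip_annotations and strip_ods_comments are made up for this example, and it assumes the cleaned copy is only meant to be read by pandas, not edited again in a spreadsheet application (ElementTree may drop namespace declarations that are only used inside attribute values, such as formulas).

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace of <office:annotation>, the ODF element that holds a cell comment.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
ANNOTATION = f"{{{OFFICE_NS}}}annotation"


def strip_annotations(content_xml: bytes) -> bytes:
    """Return content.xml with every <office:annotation> element removed."""
    # Re-register the document's own prefixes so ElementTree keeps them
    # on output instead of inventing ns0, ns1, ...
    for _, (prefix, uri) in ET.iterparse(io.BytesIO(content_xml), events=["start-ns"]):
        try:
            ET.register_namespace(prefix, uri)
        except ValueError:
            pass  # reserved prefix; ElementTree will choose its own
    root = ET.fromstring(content_xml)
    # Snapshot the tree first so removing children does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == ANNOTATION:
                parent.remove(child)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def strip_ods_comments(src: str, dst: str) -> None:
    """Copy the .ods archive `src` to `dst`, dropping cell comments."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "content.xml":
                data = strip_annotations(data)
            # Passing the ZipInfo keeps each member's name, order and compression.
            zout.writestr(item, data)
```

After calling strip_ods_comments("example_with_comment.ods", "example_clean.ods"), the cleaned file can be passed to pd.read_excel("example_clean.ods", engine="odf") with the usual options such as skiprows=, and the comment text should no longer end up in the cell values. I have not checked this against every pandas version, so treat it as a starting point.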