
Python 3.8.10
Pandas 1.4.1

Hi everyone,

I have a spreadsheet in ODS (ODF) format. I am importing it with pandas, and the import seems to strip all newline characters from the cell text, but I want to keep them.

This test script replicates the issue:

#!/usr/bin/env python3

import pandas as pd

sheet = pd.read_excel("./test.ods", engine='odf')

print(sheet)
print('--------')
print(sheet.loc[0, 'A'])

test.ods looks like this: link to image

The output looks like this:

                A                   B   C
0  TestAbcEfgHijk  lallala12345121212  12
--------
TestAbcEfgHijk

Am I doing something dumb or is this a bug?

Edit: I am on Linux, in case it makes any difference.

Obsnold

1 Answer


Yes, this seems to be a bug in odfpy, the library pandas uses for the 'odf' engine: https://github.com/eea/odfpy/issues/114.
If I save the file as xlsx, I have no issues.
For the time being I will use xlsx.

Here's some code that shows the issue in odfpy directly, without pandas.

#!/usr/bin/env python3

import sys
from odf.opendocument import load
from odf.table import Table, TableRow, TableCell

infile = sys.argv[1]
doc = load(infile)

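# First sheet, second row (the first data row), first cell.
table = doc.getElementsByType(Table)[0]
row = table.getElementsByType(TableRow)[1]
cell = row.getElementsByType(TableCell)[0]

# Printing the cell gives its text with the line breaks missing.
print(cell)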
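
If converting to xlsx is not an option, one possible workaround is to read the cell text straight from the content.xml inside the .ods archive using only the standard library. The following is a rough sketch, not what pandas or odfpy do internally. It assumes that a multi-line cell is stored as several text:p elements (optionally with text:line-break inside them) and it ignores repeated rows/columns, text:s and text:tab. The names read_cell, cell_text, paragraph_text and demo_archive are made up for this example, and the demo values are invented to resemble the output in the question.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used in an ODS content.xml.
NS = {
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}
LINE_BREAK = "{%s}line-break" % NS["text"]


def paragraph_text(elem):
    """Return the text of one text:p, mapping text:line-break to a newline."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == LINE_BREAK:
            parts.append("\n")
        else:
            # Nested elements such as text:span: recurse into them.
            parts.append(paragraph_text(child))
        parts.append(child.tail or "")
    return "".join(parts)


def cell_text(cell):
    """Join the paragraphs of a table:table-cell with newlines."""
    return "\n".join(paragraph_text(p) for p in cell.findall("text:p", NS))


def read_cell(source, row, col, sheet=0):
    """Read one cell from an .ods file (a path or a binary file object).

    Assumption: no repeated or covered rows/cells precede the target cell.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    table = root.findall(".//table:table", NS)[sheet]
    rows = table.findall(".//table:table-row", NS)
    cells = rows[row].findall("table:table-cell", NS)
    return cell_text(cells[col])


# Hypothetical stand-in for test.ods: just enough XML for this sketch.
DEMO_XML = """<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
  <office:body><office:spreadsheet>
    <table:table table:name="Sheet1">
      <table:table-row>
        <table:table-cell><text:p>A</text:p></table:table-cell>
      </table:table-row>
      <table:table-row>
        <table:table-cell>
          <text:p>Test</text:p><text:p>Abc</text:p>
          <text:p>Efg<text:line-break/>Hijk</text:p>
        </table:table-cell>
      </table:table-row>
    </table:table>
  </office:spreadsheet></office:body>
</office:document-content>
"""


def demo_archive():
    """Build an in-memory zip that contains only the demo content.xml."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml", DEMO_XML)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # repr() makes the preserved newlines visible.
    print(repr(read_cell(demo_archive(), 1, 0)))
```

With a real file, the call would be something like read_cell("./test.ods", 1, 0), where row 1 is the first data row because row 0 holds the column headers. Whether this matches what the spreadsheet application shows depends on how the file was written, so treat it as a starting point.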
Obsnold