Does anyone know how to workaround a problem of OpenOffice Calc not handling new lines in cells correct?
I have a python script that is dynamically generating an excel workbook using openpyxl via pandas.
The script works fine but when I view cells in OpenOffice that contain newlines all the values are duplicated multiple times. If I open the same file using the Microsoft Excel Viewer everything is displayed correctly and if I use a character other than a new line (e.g. comma, #, etc) it displays fine in both also.
I have a workaround to go into the excel and replace the random character using a macro but would like to avoid that if possible as the process really needs to be completely automated. also because the file will be processed by an another internal tool, I do need these cells to be processed with a new line and I can't change the character.
I have also tried using chr(10) and/or chr(13) but in the former case it just get's replaced in the output by '\n' anyway as expected.
The code I'm currently using is similar to:
test_list = []
for x in range(1,18):
test_list.append([
"value1",
"\n".join(['element1', 'element2', 'element3']),
"value3"
])
data_df = pd.DataFrame(test_list)
fn = r'/path/to/excel/file.xlsx'
writer = pd.ExcelWriter(fn, engine='xlsxwriter')
data_df.to_excel(writer, sheet_name='Data', index=False, header=0)
workbook = writer.book
worksheet = writer.sheets['Data']
worksheet.set_column('A:ZZ',50,
workbook.add_format({'text_wrap': True}))
writer.save()
What happens with the Element data is that it shows in the OpenOffice Calc cell as something like:
Oddly the last item appears to be correct
The same data viewed as a list or via DataFrame.head() appears fine:
pprint(test_list)
[['value1', 'element1\nelement2\nelement3', 'value3'],
['value1', 'element1\nelement2\nelement3', 'value3'],
['value1', 'element1\nelement2\nelement3', 'value3'],
['value1', 'element1\nelement2\nelement3', 'value3'],
['value1', 'element1\nelement2\nelement3', 'value3'],
['value1', 'element1\nelement2\nelement3', 'value3'],
...
['value1', 'element1\nelement2\nelement3', 'value3']]
data_df.head(18):
0 1 2
0 value1 element1\nelement2\nelement3 value3
1 value1 element1\nelement2\nelement3 value3
2 value1 element1\nelement2\nelement3 value3
...
15 value1 element1\nelement2\nelement3 value3
16 value1 element1\nelement2\nelement3 value3
It's just when it get passed to the openpyxl library and viewed in OpenOffice.
Thanks