I have recently started using Pandas library to validate and manipulate .CSV-files and Excel-files. There is an issue I have just whitnessed. When assigning values to a numpy-series (column of the Pandas DataFrame) and the values I'd like to assign start with '=', then it seems that Pandas interprets those values as formulas.
Source (.CSV-file):
ID | Name | Size
--------------------------
0 | Foo | =500.000 cells
1 | Bar | Baz
Result (Excel-file):
ID | Name | Size
----------------
0 | Foo | 0 <- wrong value
1 | Bar | Baz
When I open the resulted Excel-file an error message is being displayed. Unfortunately the error message is in german, so I'll try to translate it as well as possible: "We have detected a problem with some of the content of 'FILENAME.xlsx'. Would you like to restore the content? If you trust this workbook, then press 'Yes'." When pressing 'Yes' the workbook is being opened and all the values that should be '=something' are '0'. Every other value was assigned correctly!
This is the line of code is use to assign the values.
ws.ix[some_row_index, some_col_index] = edit_set.ix[some_other_row_index, some_other_col_index].values
Is there a way to bypass this? Maybe by using another method to assign parameters or by setting a parameter?
EDIT: It seems that the problem occurs even if I don't change/set the value. So if the initial value, that the Excel-file that I open with Pandas, starts with '=' and I don't overwrite/change the value at all, the error still occurs when writing the DataFrame to_excel. I still haven't found a way to tell Pandas to "ignore formulas" when writing to Excel-file.
So I think the change in my code should happen in the following two lines:
writer = pd.ExcelWriter(path + name, engine='xlsxwriter', options={'strings_to_urls': False})
edit_final_file.to_excel(writer, sheet_name = 'Main', index=False, encoding='utf-8')