I am trying to load tabular data into a pandas DataFrame.
The original data look like this (.txt file):
µm nm
1.34E+00 1.39E+00
1.34E+00 1.61E+00
...
When I manually convert the file from .txt to .csv, by opening it in Excel and saving it as a .csv file, I obtain something like this:
µm;nm
1.339216;1.388997
1.340324;1.612847
1.341462;1.587352
1.342533;1.686544
...
This works fine in pandas with the following code:
import pandas as pd

file = 'filename.csv'
df = pd.read_csv(file, sep=';')
df
[Image: DataFrame from the manually obtained .csv file]
This is what I want. But since I am planning to deal with a lot of those files, I need to process them in batch, so I need to obtain the same DataFrame directly from the original files, which come as .txt.
But if I try to do that from the original data, it looks like this:
The code is as follows:
df2 = pd.read_csv('filename.txt', sep = ";", encoding = 'unicode_escape')
df2.to_csv('filename-2.csv', sep='\t', index=None)
df2
Please note that I use the 'unicode_escape' encoding to avoid the error message "'utf-8' codec can't decode byte 0xb5 in position 0: invalid start byte".
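As far as I understand, byte 0xb5 is the 'µ' character in single-byte encodings such as Latin-1 and cp1252, which would explain why UTF-8 decoding fails on the very first byte of the header. A minimal check of that, assuming the header starts with 'µm' as shown above (the raw bytes here are typed by hand, not read from my file):

```python
# Byte 0xb5 is the micro sign in Latin-1 and cp1252,
# but it is not a valid first byte of a UTF-8 sequence.
raw = b"\xb5m"  # assumed first two bytes of the header line

latin1_text = raw.decode("latin-1")
cp1252_text = raw.decode("cp1252")

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    # Same failure as the pandas error quoted above
    utf8_ok = False

print(latin1_text, cp1252_text, utf8_ok)
```

If that reasoning holds, the encoding is probably not the real problem here, only the separator.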
I tried to specify various separators, but without success so far.
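For reference, here is a minimal sketch of the kind of call I am after. It assumes, without my having confirmed it on the real files, that the columns in the .txt file are separated by tabs or spaces rather than ';' and that the file is Latin-1 encoded. The in-memory sample and the read_txt helper are made up for illustration:

```python
import io

import pandas as pd

# Stand-in for one original .txt file.
# Assumptions: tab-separated columns, Latin-1 encoding.
sample = "µm\tnm\n1.34E+00\t1.39E+00\n1.34E+00\t1.61E+00\n".encode("latin-1")


def read_txt(source):
    """Hypothetical helper: read one two-column .txt file into a DataFrame."""
    # sep=r"\s+" splits on any run of tabs or spaces;
    # values such as 1.34E+00 are parsed as floats.
    return pd.read_csv(source, sep=r"\s+", encoding="latin-1")


df2 = read_txt(io.BytesIO(sample))
print(df2)
```

For the batch part, the same helper could presumably be called in a loop over pathlib.Path(folder).glob("*.txt"), where folder stands for whatever directory holds the files.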
I hope someone will be able to help.
Thanks,
Sébastien.