I have a big DataFrame that I read via read_csv. So far so good; unfortunately, the numeric data is interpreted as 'str' because it uses commas ',' instead of points '.' as decimal separators. The df looks like this:
Nr Time Wavelength Temperature Wavelength1 Temperature1 ....
0 0 12:32:28 1.509,72 12,57 1.508,68 11,78 ....
1 1 12:32:29 1.509,73 12,49 1.508,69 11,82 ....
2 2 12:32:30 1.508,99 11,68 1.508,72 11,97 ....
. . . . . . . ....
. . . . . . . ....
I basically have 2 different kinds of values:
1. The temperature, e.g. 20,0, which is read in as a string
2. The wavelength, e.g. 1.509,72, which is also read in as a string
Here is the problem:
1. Changing the whole column 'Temperature' in the dataframe with
# 20,0 into 20.0
df.loc[:,'Temperature'] = df.loc[:,'Temperature'].replace(',', '.', regex=True).astype(float)
works at first glance: every Temperature is changed to the 20.0 form, BUT the dtype is not 'float'. It is still 'str'. Trying
df1 = df1.astype({"Temperature": float})
doesn't work either.
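To make this reproducible, here is a minimal sketch of the Temperature case (toy values, assumed to mirror the column above). Confusingly, on a plain Series in isolation the same chain does give me floats:

```python
import pandas as pd

# Toy data mimicking the Temperature column (assumed values)
s = pd.Series(['12,57', '12,49', '11,68'])

# Same replace + astype chain as on the full dataframe
converted = s.replace(',', '.', regex=True).astype(float)
print(converted.dtype)
```
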
2. Changing the whole column 'Wavelength' in the dataframe with
# 1.509,72 into 1509.72
df.loc[:,'Wavelength'] = df.loc[:,'Wavelength'].replace('.', '', regex=True).replace(',', '.', regex=True).astype(float)
just turns every value into '' (empty) and doesn't change the 'str' into a 'float'. But if I convert just a single value:
df.iloc[0,2] = float(df.iloc[0,2].replace('.','').replace(',','.'))
Everything works fine.
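A minimal repro of the Wavelength behavior I see (toy values assumed from the table above). With regex=True, replacing '.' seems to wipe out every character, while the per-value str.replace treats the dot literally:

```python
import pandas as pd

# Toy data mimicking the Wavelength column (assumed values)
s = pd.Series(['1.509,72', '1.509,73'])

# Column-wise: with regex=True, '.' apparently matches every character
emptied = s.replace('.', '', regex=True)
print(emptied.tolist())  # everything becomes ''

# Per-value: plain str.replace takes '.' literally and works
val = float(s.iloc[0].replace('.', '').replace(',', '.'))
print(val)
```
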
What am I missing? Is there an easy solution, ideally already while reading the file? Iterating through the whole DataFrame isn't really an option because it takes too much time with 250k rows and 80 columns.
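In case it helps, here is a self-contained sketch of the file layout as I understand it (separator, column names, and values are my assumptions); the read_csv decimal/thousands keywords are the kind of read-time direction I was wondering about:

```python
import io
import pandas as pd

# Assumed miniature of the real file: ';'-separated, German-style numbers
csv = "Temperature;Wavelength\n12,57;1.509,72\n12,49;1.509,73\n"

# decimal/thousands tell the parser how to interpret ',' and '.'
df = pd.read_csv(io.StringIO(csv), sep=';', decimal=',', thousands='.')
print(df.dtypes)
```
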
Cheers, Jonas