I download a bunch of CSV files from an AWS S3 bucket and put them in a DataFrame. Before uploading the DataFrame to SQL Server, I would like to convert its columns to the right datatypes.
When I run astype('float64') on a column I want to convert, it changes not only the datatype but also the data.
Code:
df['testcol'] = df['lineId'].astype('float64')
I attached a picture to visualize the error. As you can see, the data in the third column (testcol) differs from the data in the second column (lineId), even though only the datatype should have changed.
A pl_id can have multiple lineId values, which is why I added pl_id and sorted by it in the picture.
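Since the picture may not be readable for everyone, here is a minimal sketch of how the mismatch can be checked in code. The values below are made up for illustration and are not the real data from the bucket. They assume the lineId values are very large integers (above 2**53), which is only a guess at what the real column looks like. The `changed` column is a hypothetical helper name.

```python
import pandas as pd

# Hypothetical data: these IDs are invented, not taken from the real CSV files.
# Assumption: the real lineId values are large integers (above 2**53).
df = pd.DataFrame({
    "pl_id": [1, 1, 2],
    "lineId": [9007199254740993, 9007199254740995, 1234],
})

# Same conversion as in the question.
df["testcol"] = df["lineId"].astype("float64")

# Convert back to int64 and flag rows where the round trip no longer matches.
df["changed"] = df["testcol"].astype("int64") != df["lineId"]

print(df[["pl_id", "lineId", "testcol", "changed"]])
```

If `changed` is True only for the very large values, the difference probably comes from float64 not being able to represent every integer of that size exactly, rather than from astype() itself.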
Am I using astype() wrong, or is this a pandas bug?