I have a DataFrame loaded from a .csv file that contains a column of numeric values, but they are written with a comma (,) as the decimal separator instead of a period (.). I am trying to change this before casting the values, but it does not seem to work. I suspect my lack of Python experience is leading me to make wrong assumptions.

df['score'] = df['score'].replace(',', '.')
df['score'].astype('float64')

What is it I am doing wrong here?

SomeDutchGuy

2 Answers

As written in https://stackoverflow.com/a/56114791/5666087 and described in the documentation of pandas.read_csv, you should use

import pandas as pd

pd.read_csv(path_to_csv, decimal=",")

The documentation states

decimal : str, default '.'

Character to recognize as decimal point (e.g. use ',' for European data).

Here is an example

import io
import pandas as pd

data = """
item,cost
book,"19,99"
coffee,"2,50"
"""

df = pd.read_csv(io.StringIO(data), decimal=",")
df.head()
#      item   cost
# 0    book  19.99
# 1  coffee   2.50

df.dtypes
# item     object
# cost    float64
# dtype: object
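
In the example above the values are quoted because the comma is also the field delimiter. Files that use a comma as the decimal mark are often delimited with a semicolon instead, in which case no quoting is needed. A minimal sketch, assuming a hypothetical semicolon-delimited sample (the data is made up for illustration):

```python
import io
import pandas as pd

# Hypothetical sample: semicolon as field delimiter, comma as decimal mark.
data = """item;cost
book;19,99
coffee;2,50
"""

# sep sets the field delimiter; decimal sets the decimal mark.
df = pd.read_csv(io.StringIO(data), sep=";", decimal=",")
print(df.dtypes)
```

Here `cost` should be parsed directly as a float column, so no later string replacement or cast is required.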
jkr

You can try

df['score'] = df['score'].apply(lambda x : float(x.replace(',', '.')))

or

df['score'] = df['score'].str.replace(',', '.', regex=False).astype(float)
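
As for why the attempt in the question has no effect: `Series.replace` matches whole cell values by default, not substrings, and `astype` returns a new Series rather than modifying the column in place. A minimal sketch, using a hypothetical Series in place of `df['score']`:

```python
import pandas as pd

# Hypothetical comma-decimal strings standing in for df['score'].
s = pd.Series(["1,5", "2,25", "10,0"])

# Series.replace compares whole values, and no cell equals ",",
# so this call leaves every value unchanged.
unchanged = s.replace(",", ".")

# With regex=True the replacement is applied inside each string.
fixed = s.replace(",", ".", regex=True)

# astype returns a new Series, so the result has to be assigned.
scores = fixed.astype("float64")
print(scores)
```

Either `regex=True` on `Series.replace` or the `.str.replace` accessor does the substring replacement. In both cases the result of the cast has to be assigned back to the column.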
