I have hundreds of CSV files that use a comma as the field delimiter and also as the decimal separator. These files look like this:

ID,columnA,columnB
A,0,"15,6"
B,"1,2",0
C,0,

I am trying to read all these files in Python using pandas, but I am not able to split the values properly into three columns, maybe because of the decimal separator or because some values are wrapped in quotation marks.

I first tried the code below, but even with different encodings I could not get the result I wanted:

import pandas as pd

df = pd.read_csv("test.csv", sep=",")

Could anyone help me? The result should be a dataframe like this:

  ID  columnA  columnB
0  A      0.0     15.6
1  B      1.2      0.0
2  C      0.0      NaN
user026
  • You probably want `pd.read_csv('test.csv', sep=',', quotechar='"')` to define the quoting behavior – C.Nivs Jul 29 '22 at 00:47

1 Answer

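You just need to specify decimal=",". The quoted values are already handled, because the double quote is the default quotechar in read_csv, so a comma inside quotes is not treated as a field delimiter.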

from io import StringIO

import pandas as pd

file = '''ID,columnA,columnB
A,0,"15,6"
B,"1,2",0
C,0,'''

df = pd.read_csv(StringIO(file), decimal=",")
print(df)

Output:

  ID  columnA  columnB
0  A      0.0     15.6
1  B      1.2      0.0
2  C      0.0      NaN
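
Since the question mentions hundreds of files, the same option can be applied in a loop. This is only a minimal sketch, assuming all files sit in one folder and share the same header; the function name read_comma_decimal_csvs, the "*.csv" pattern and the extra source_file column are hypothetical names chosen for illustration, not something from the question.

```python
from pathlib import Path

import pandas as pd


def read_comma_decimal_csvs(folder, pattern="*.csv"):
    """Read every matching CSV in `folder` with decimal="," and stack the rows."""
    frames = []
    # sorted() keeps the row order reproducible across runs
    for path in sorted(Path(folder).glob(pattern)):
        df = pd.read_csv(path, decimal=",")
        # Hypothetical helper column recording which file each row came from
        df["source_file"] = path.name
        frames.append(df)
    # Note: pd.concat raises ValueError if no file matched the pattern
    return pd.concat(frames, ignore_index=True)
```

It would be called as read_comma_decimal_csvs("path/to/folder"), where the folder path is a placeholder. Drop the source_file line if the origin of each row does not matter.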
BeRT2me