I have this data in a CSV file and I want to read it in Python. The problem is that the decimal separator and the field delimiter are both commas. How can I read the CSV file so that the 1st, 2nd, 3rd, 4th, 6th, 8th, 10th, 11th and 15th commas are treated as delimiters and the other commas as decimal points? Thanks.
-
If that's the case, then your CSV file is invalid. Values that contain special characters, such as delimiters, quote characters or line terminators, must be quoted. How else should a parser know how to parse the values? If values are not quoted, delimiters within values must be escaped. Have a look at the quoting options of the [csv module](https://docs.python.org/3/library/csv.html#csv.QUOTE_MINIMAL) to get an idea of what valid data should look like. – Mike Scotty Mar 09 '18 at 22:08
-
You could conditionally split on commas with some pattern matching, but it would require making assumptions about a regular structure to the data, and it would get messy very fast. Like @MikeScotty said: better to get valid data first, since then separation becomes trivial. – thmsdnnr Mar 09 '18 at 22:10
-
Thank you for taking the time to respond. I will stop wasting time trying to parse these files and will ask for the raw data again. – Freyman Mendoza Plata Mar 09 '18 at 22:16
1 Answer
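As a workaround you could read the data as follows:

    with open('input.txt') as f_input:
        # Read the header normally; it contains no decimal commas
        data = [next(f_input).strip().split(',')]
        for line in f_input:
            # Strip the trailing newline before splitting on commas
            row = line.strip().split(',')
            # Keep date and time, then rejoin each following pair as "integer.fraction"
            data.append(row[:2] + ['.'.join([x, y]) for x, y in zip(*[iter(row[2:])] * 2)])

    print(data)
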
This would give you output starting like:
[
['$Date', '$Time', 'PIT_2612_EU', 'PIT_3611_EU', 'PT2614_EU', 'PIT_3614_UE', 'E1_QBRT_D3', 'E1_QBRT_D4', 'DT_2611_EU'],
['04/01/015', '00:00:00', '799.8047', '686.0352', '780.7617', '380.8594', '0.1058', '298.0', '8324219.0']
]
This first reads the header line in and splits on commas. Then for each data row it splits again on commas, keeps the first two values and then recombines each following pair with a `.`. The resulting `data` could be loaded into Pandas.
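The same idea can be wrapped in a small function that also checks the assumption the workaround relies on: every value after the first two columns must have exactly one decimal comma, so the remaining fields always come in pairs. The following is only a sketch under that assumption. The name `merge_decimal_pairs`, the `n_leading` parameter and the two-column sample are made up for illustration, with the sample row derived from the output shown above.

```python
import io


def merge_decimal_pairs(line, n_leading=2):
    """Split a line on commas, keep the first n_leading fields as they are,
    and rejoin the remaining fields pairwise as 'integer.fraction'."""
    fields = line.strip().split(',')
    head, tail = fields[:n_leading], fields[n_leading:]
    if len(tail) % 2:
        # An odd count means some value had no decimal part (or more than one comma)
        raise ValueError("cannot pair fields unambiguously: %r" % line)
    return head + [whole + '.' + frac for whole, frac in zip(tail[::2], tail[1::2])]


# Hypothetical in-memory sample standing in for the real file
sample = io.StringIO(
    "$Date,$Time,PIT_2612_EU,PIT_3611_EU\n"
    "04/01/015,00:00:00,799,8047,686,0352\n"
)

header = next(sample).strip().split(',')
rows = [merge_decimal_pairs(line) for line in sample if line.strip()]

# Convert the numeric columns from strings to floats
records = [row[:2] + [float(value) for value in row[2:]] for row in rows]

print(header)
print(records)
```

If Pandas is installed, `pandas.DataFrame(records, columns=header)` should turn the result into a data frame. A row whose values do not all contain a decimal comma raises `ValueError` instead of being silently mis-paired, which is the main risk of this workaround.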

Martin Evans