Issue with comma delimiter in CSV with Python

Question

I am working with a sample CSV file that looks like this:

3256221406917,DESCRIPTION1,"U Bio,  U",food
3256223662106,DESCRIPTION2,"U Bio,  U",food

I want to parse it on commas:

import csv

def import_csv(csvfilepath):
    data = []
    product_file = open(csvfilepath, "r")
    reader = csv.reader(product_file, delimiter=',')
    for row in reader:
        if row:  # avoid blank lines
            columns = [row[0], row[1], row[2], row[3], row[4]]
            data.append(columns)

    return data

However, it raises a "list index out of range" error when I run it.

I believe the trouble might come from the third and fourth columns, since there are opening and closing double quotes. But I don't understand why delimiter=',' does not seem to be applied.

Do you know why? Thank you for your help!

EDIT :

Thank you all. I was simply not sure why the "," was read after the '"', and whether there was a way to change that, but it seems simpler to remove the '"' beforehand!

What do you expect? The row looks like this: `['3256221406917', 'DESCRIPTION1', 'U Bio, U', 'food']` – Maurice Meyer, Dec 10 '18 at 14:07
You must check that all rows have 5 columns. You're trying to get row[4], which may not exist. Use assert len(row) == 5, 'len 5 expected %s' % ";".join(row) – user2652620, Dec 10 '18 at 14:09
The csv module supports reading fields that contain commas: https://stackoverflow.com/questions/8311900/read-csv-file-with-comma-within-fields-in-python – MGP, Dec 10 '18 at 14:10

score 0 · Answer 1 · answered Dec 10 '18 at 14:05

I believe you can use pandas for this:

import pandas as pd
df = pd.read_csv('your-data.csv', header=None)  # the sample has no header row
df_to_list = df.values.tolist()

I don't know pandas; I should read about it. I was wondering how to split on the "," before the '"' is handled while reading with the CSV reader, but it seems that is simply not how it works, and I should remove the '"' first. Thank you! – SidGabriel Dec 10 '18 at 14:12

Charles Landau · Answer 2 · 2018-12-10T14:22:23.933

I don't think you need csv.reader for this. If you want to enforce splitting on ALL commas, you can try this approach:

def import_csv(csvfilepath):
    data = []
    with open(csvfilepath, "r") as product_file:
        for r in product_file:
            # Split on every comma, ignoring quotes
            row = r.rstrip("\n").split(",")
            if len(row) == 5:  # Vary this to change the sensitivity
                columns = [row[0], row[1], row[2], row[3], row[4]]
                data.append(columns)

    return data
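
As a side note, splitting on every comma does not strictly require dropping the csv module. The sketch below assumes an in-memory copy of one line from the question (io.StringIO stands in for the real file) and compares the default reader with one created with quoting=csv.QUOTE_NONE, which tells the reader to treat quote characters as ordinary text. The variable names are made up for the illustration.

```python
import csv
import io

# One line from the question; io.StringIO stands in for the real file.
SAMPLE = '3256221406917,DESCRIPTION1,"U Bio,  U",food\n'

# Default behaviour: the comma inside the double quotes stays in one field.
default_row = next(csv.reader(io.StringIO(SAMPLE)))

# QUOTE_NONE: quote characters get no special treatment, so every comma splits.
raw_row = next(csv.reader(io.StringIO(SAMPLE), quoting=csv.QUOTE_NONE))

print(len(default_row), default_row)
print(len(raw_row), raw_row)
```

With QUOTE_NONE the double quotes remain in the resulting fields, so they would still need to be stripped if they are unwanted.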

edited Dec 10 '18 at 14:22

answered Dec 10 '18 at 14:11

I just tried it, but this seems to return only the first digit ("3") – SidGabriel Dec 10 '18 at 14:17
Yes, I had an error in there, you're right. I think I have edited it out now @SidGabriel – Charles Landau Dec 10 '18 at 14:22

score 0 · Answer 3 · answered Dec 10 '18 at 14:14

Try replacing

columns = [row[0], row[1], row[2], row[3], row[4]] with columns = [row[0], row[1], row[2], row[3]]

This is because there are only 4 columns in the CSV in your example: the comma inside the double quotes is part of the third field, not a delimiter.
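
To illustrate, here is a minimal sketch showing that csv.reader yields four fields per line for the sample in the question. The helper name read_rows and the in-memory sample are assumptions made for this example, not part of the original code. It keeps whatever fields each row has, rather than indexing a fixed number of columns.

```python
import csv
import io

# Sample data copied from the question; io.StringIO stands in for the real file.
SAMPLE = (
    '3256221406917,DESCRIPTION1,"U Bio,  U",food\n'
    '3256223662106,DESCRIPTION2,"U Bio,  U",food\n'
)

def read_rows(lines):
    """Return every non-empty CSV row as a list of fields (hypothetical helper)."""
    reader = csv.reader(lines, delimiter=",")
    # Keep all fields of each row instead of indexing row[0]..row[4].
    return [list(row) for row in reader if row]

rows = read_rows(io.StringIO(SAMPLE))
for row in rows:
    print(len(row), row)
```

With a real file, the same helper could be called as read_rows(open(path, newline="")), ideally inside a with block so the file is closed afterwards.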

Preetham

