0

I am working with a sample CSV file that looks like this:

3256221406917,DESCRIPTION1,"U Bio,  U",food
3256223662106,DESCRIPTION2,"U Bio,  U",food

I want to parse it, splitting on commas:

import csv

def import_csv(csvfilepath):
    data = []
    with open(csvfilepath, "r") as product_file:
        reader = csv.reader(product_file, delimiter=',')
        for row in reader:
            if row:  # avoid blank lines
                columns = [row[0], row[1], row[2], row[3], row[4]]
                data.append(columns)

    return data

However, it raises "IndexError: list index out of range" when I run it.

I believe the trouble might come from the third and fourth columns, since there are opening and closing double quotes. But I don't understand why delimiter=',' seems to be ignored there.

Do you know why? Thank you for your help!

EDIT:

Thank you all. I was simply not sure why the '"' quoting took precedence over the "," delimiter, and whether there was a way to change that, but it seems simpler to remove the '"' characters first!

SidGabriel
  • What do you expect? The row looks like this: `['3256221406917', 'DESCRIPTION1', 'U Bio, U', 'food']` – Maurice Meyer Dec 10 '18 at 14:07
  • You must check that every row has 5 columns. You are trying to read row[4], which may not exist. Use assert len(row) == 5, 'len 5 expected %s' % ";".join(row) – user2652620 Dec 10 '18 at 14:09
  • The csv module supports reading fields that contain commas: https://stackoverflow.com/questions/8311900/read-csv-file-with-comma-within-fields-in-python – MGP Dec 10 '18 at 14:10
  • Thank you for the link, MGP! – SidGabriel Dec 10 '18 at 14:22

3 Answers

0

I believe you can use pandas for this:

import pandas as pd

df = pd.read_csv('your-data.csv', header=None)  # header=None: the sample has no header row
df_to_list = df.values.tolist()
  • I don't know pandas; I should read about it. I was wondering how to make the CSV reader split on "," before handling the '"', but it seems that this is simply not how it works and that I should remove the '"' first. Thank you! – SidGabriel Dec 10 '18 at 14:12
0

I don't think you need csv.reader for this. If you want to enforce splitting on ALL commas, you can try this approach:

def import_csv(csvfilepath):
    data = []
    with open(csvfilepath, "r") as product_file:
        for r in product_file:
            # Drop the trailing newline, then split on every comma
            row = r.rstrip("\n").split(",")
            if len(row) == 5:  # Vary this to change the sensitivity
                columns = [row[0], row[1], row[2], row[3], row[4]]
                data.append(columns)

    return data
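
If you would rather keep csv.reader, a possible alternative is to tell it to stop treating double quotes specially by passing quoting=csv.QUOTE_NONE. The sketch below is only an illustration: the SAMPLE string and the split_on_all_commas name are assumptions made up for the example, and the quote characters stay inside the resulting fields.

```python
import csv
import io

# Hypothetical in-memory copy of the two sample lines from the question
SAMPLE = (
    '3256221406917,DESCRIPTION1,"U Bio,  U",food\n'
    '3256223662106,DESCRIPTION2,"U Bio,  U",food\n'
)

def split_on_all_commas(text):
    """Parse CSV text while treating double quotes as ordinary characters."""
    reader = csv.reader(io.StringIO(text), delimiter=",", quoting=csv.QUOTE_NONE)
    return [row for row in reader if row]  # skip blank lines

rows = split_on_all_commas(SAMPLE)
print(rows[0])
# ['3256221406917', 'DESCRIPTION1', '"U Bio', '  U"', 'food']
```

Each row now has 5 fields, so row[4] exists, but the third and fourth fields still carry the stray '"' characters and would need cleaning afterwards.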
Charles Landau
0

Try replacing

columns = [row[0], row[1], row[2], row[3], row[4]]

with

columns = [row[0], row[1], row[2], row[3]]

as there are only 4 columns in the CSV in your example.
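
To see the 4 columns for yourself, here is a minimal sketch that feeds the sample lines to csv.reader and checks the field count before indexing. The SAMPLE string, the parse_rows name and the expected_fields parameter are assumptions for illustration, not part of the original code.

```python
import csv
import io

# Hypothetical in-memory copy of the two sample lines from the question
SAMPLE = (
    '3256221406917,DESCRIPTION1,"U Bio,  U",food\n'
    '3256223662106,DESCRIPTION2,"U Bio,  U",food\n'
)

def parse_rows(text, expected_fields=4):
    """Return the non-blank rows, failing loudly on an unexpected field count."""
    rows = []
    for line_number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:  # avoid blank lines
            continue
        if len(row) != expected_fields:
            raise ValueError(
                f"line {line_number}: expected {expected_fields} fields, got {len(row)}"
            )
        rows.append(row)
    return rows

rows = parse_rows(SAMPLE)
print(rows[0])
# ['3256221406917', 'DESCRIPTION1', 'U Bio,  U', 'food']
```

The quoted field "U Bio,  U" comes back as a single value with its comma intact, which is why there is no row[4] to read.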

Preetham