I have the following code trying to iterate over some items:

Here is the input (a single line):

operation,sku,item_name,upc,ean,brand_name

filename=open("WebstoreItemTemplate.csv").read()
template=csv.reader(filename,delimiter=',')
for row in template:
    print row

I'm expecting the output to look the same, something like:

['operation','sku','item_name','upc','ean','brand_name']

Instead, I'm receiving the following output, with each letter treated as its own list. I've verified that the file is in CSV format, so I'm unsure what I'm doing wrong.

['o']
['p']
['e']
['r']
['a']
['t']
['i']
['o']
['n']
['', '']
['s']
['k']
['u']
['', '']
['i']
['t']
['e']
['m']
['_']
['n']
['a']
['m']
['e']
['', '']
['u']
['p']
['c']
['', '']
['e']
['a']
['n']
['', '']
['b']
['r']
['a']
['n']
['d']
['_']
['n']
['a']
['m']
['e']
Ben C Wang

2 Answers

Remove the `.read()` call and pass the file object itself:

import csv

with open("WebstoreItemTemplate.csv") as csvfile:
    template = csv.reader(csvfile)
    for row in template:
        print(row)

Which will give you:

['operation', 'sku', 'item_name', 'upc', 'ean', 'brand_name']

From the docs:

csv.reader(csvfile, dialect='excel', **fmtparams)

Return a reader object which will iterate over lines in the given csvfile. csvfile can be any object which supports the iterator protocol and returns a string each time its next() method is called — file objects and list objects are both suitable.

Basically this is happening:

In [9]: next(iter("foo"))
Out[9]: 'f'
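
The same effect can be reproduced without a file. This is a minimal sketch in Python 3 syntax; the sample `text` and the variable names are illustrative and not taken from the question:

```python
import csv

# Illustrative sample data, not the asker's file.
text = "operation,sku\n1,abc\n"

# Iterating over a string yields one character at a time, so csv.reader
# treats every character as if it were a complete line.
rows_from_string = list(csv.reader(text))

# Iterating over a list of lines yields whole lines, which is what
# csv.reader expects.
rows_from_lines = list(csv.reader(text.splitlines()))

print(rows_from_string[:3])
print(rows_from_lines)
```

A lone comma parsed as a "line" produces two empty fields, which is where the `['', '']` rows in the question come from.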
Padraic Cunningham
  • Thanks, it worked, but I'm curious why does .read() cause the issue? – Ben C Wang Jul 04 '15 at 21:15
  • @BenCWang, because you are iterating over the string character by character; with the file object, csv calls next each time, giving you a full line – Padraic Cunningham Jul 04 '15 at 21:16
  • The `csv.reader` expects an iterator of lines, which a file object provides. `.read()` produces a single string. You could do `open(filename).read().split("\n")` but passing the file object directly is much more efficient. – Constantinius Jul 04 '15 at 21:23

You just need to call splitlines() after calling read. Passing the file object is not always ideal or required.

For example, reading from a string:

import csv
rawdata = 'name,age\nDan,33\nBob,19\nSheri,42'
myreader = csv.reader(rawdata.splitlines())
for row in myreader:
    print(row[0], row[1])
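
One caveat, as far as I can tell: if the string may contain quoted fields with embedded newlines, `splitlines()` cuts those fields apart and the newline is lost. Wrapping the string in `io.StringIO` is one possible alternative. A sketch in Python 3, with made-up sample data:

```python
import csv
import io

# Made-up sample data: the second field of the last row spans two lines.
rawdata = 'name,note\nDan,"line one\nline two"\n'

# newline='' leaves line endings untranslated, which is what the csv
# documentation recommends for file objects passed to csv.reader.
reader = csv.reader(io.StringIO(rawdata, newline=''))
rows = list(reader)
print(rows)
```

Here the quoted field stays in one piece because the reader sees each line with its line ending intact.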

In my case, I just wanted to detect the encoding using chardet:

import csv
import chardet

with open("WebstoreItemTemplate.csv", "rb") as f:
    raw_data = f.read()  # read raw bytes, which chardet.detect expects
    encoding = chardet.detect(raw_data)['encoding']
    cr = csv.reader(raw_data.decode(encoding).splitlines())
...

Here are some practical examples that I have personally found useful: http://2017.compciv.org/guide/topics/python-standard-library/csv.html

Andreas