Combine csv in Python with skipping header row Error

Question

I've successfully combined all the CSV files in a directory, but I'm struggling to skip the first row (the header) of each file. The error I currently get is "'list' object is not an iterator". I have tried several approaches, including dropping the [open(thefile).read()] wrapper, but I still can't get it working. Here is my code:

 import glob
 files = glob.glob( '*.csv' )
 output="combined.csv"

 with open(output, 'w' ) as result:
     for thefile in files:
         f = [open(thefile).read()]
         next(f)   ## this line is causing the error 'list' object is not an iterator

         for line in f:
             result.write( line )
 message = 'file created'
 print (message)

You should close each file after reading it, either explicitly, or using 'with' as you did opening the file to which you are writing. — Fred Mitchell, Mar 13 '15 at 02:11
You might find [this answer](http://stackoverflow.com/questions/11349333/when-processing-csv-data-how-do-i-ignore-the-first-line-of-data/11350095#11350095) helpful. — martineau, Mar 13 '15 at 02:16

Avinash Raj · Accepted Answer · 2015-03-13T02:16:26.253

Use the readlines() method instead of read(), so that you can easily skip the first line.

f = open(thefile)
m = f.readlines()
for line in m[1:]:  # m[0] is the header row
    # write the line as-is; line.rstrip() would strip the newline and merge the rows (see comments)
    result.write(line)
f.close()

OR

with open(thefile) as f:
    m = f.readlines()
    for line in m[1:]:  # m[0] is the header row
        result.write(line)

You don't need to close the file object explicitly if the file was opened with a with statement.
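
To see why the slice skips the header, here is a minimal sketch that uses an in-memory file (io.StringIO stands in for a real CSV file, and the sample rows are made up for illustration). readlines() returns a list of strings that each still end in a newline, so writing a line unchanged keeps one row per line, while calling rstrip() on it removes the newline and runs the rows together.

```python
import io

# Hypothetical sample data standing in for one CSV file.
sample = io.StringIO("id,name\n1,alice\n2,bob\n")

lines = sample.readlines()   # list of strings, trailing newlines included
data_rows = lines[1:]        # everything after the header row

# Rows written unchanged stay on separate lines.
kept = "".join(data_rows)

# Rows passed through rstrip() lose their newline and merge.
stripped = "".join(row.rstrip() for row in data_rows)

print(repr(kept))
print(repr(stripped))
```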

edited Mar 13 '15 at 02:16

answered Mar 13 '15 at 02:09

Avinash Raj


@Avinash Raj it's telling me "invalid syntax" at m[1:] – jKraut Mar 13 '15 at 02:38
did you put the colon after `m[1:]` ? – Avinash Raj Mar 13 '15 at 02:39
When I use the exact example you gave it works, but it does not keep the same format as the original files, with each row of data on its own line – jKraut Mar 13 '15 at 02:51
try `result.write(line)` – Avinash Raj Mar 13 '15 at 03:07

mhawke · Answer 2 · 2015-03-13T03:02:24.097

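Here's an alternative using the oft-forgotten fileinput.input() function:

import fileinput
from glob import glob

FILE_PATTERN = '*.csv'
OUTPUT = 'combined.csv'

# Build the input list first and leave out the output file, which also matches the pattern.
files = [name for name in glob(FILE_PATTERN) if name != OUTPUT]

with open(OUTPUT, 'w') as result:
    for line in fileinput.input(files):
        # isfirstline() is true for the first line of each input file, i.e. its header
        if not fileinput.isfirstline():
            result.write(line)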

It's quite a bit cleaner than many other solutions.
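
As a rough, self-contained illustration of the fileinput approach, the sketch below wraps it in a function and runs it on two temporary files. The function name combine_without_headers, the file names, and the sample rows are all made up for the example. One caveat worth knowing: when fileinput.input() is given an empty list it reads from standard input, so the sketch returns early in that case.

```python
import fileinput
import os
import tempfile


def combine_without_headers(paths, output_path):
    """Copy every line except the first line of each input file."""
    paths = list(paths)
    with open(output_path, 'w') as result:
        if not paths:
            return  # an empty list would make fileinput read from stdin
        with fileinput.input(paths) as lines:
            for line in lines:
                # True only for the first line of each file (its header).
                if not fileinput.isfirstline():
                    result.write(line)


# Hypothetical sample files created in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, 'a.csv')
    b = os.path.join(tmp, 'b.csv')
    with open(a, 'w') as f:
        f.write("id,name\n1,alice\n")
    with open(b, 'w') as f:
        f.write("id,name\n2,bob\n")

    out = os.path.join(tmp, 'combined.txt')
    combine_without_headers([a, b], out)
    with open(out) as f:
        combined = f.read()

print(repr(combined))
```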

Note that the code in your question was not far off working. You just need to change

f = [open(thefile).read()]

to

f = open(thefile)

but I suggest that using with would be better still because it will automatically close the input files:

with open(output, 'w' ) as result:
    for thefile in files:
        with open(thefile) as f:
            next(f)
            for line in f:
                result.write( line )
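
A slightly more defensive version of the same loop might look like the sketch below. The function name combine_csv and the sample files are hypothetical. It makes two small changes: next(f, None) supplies a default so an empty input file does not raise StopIteration, and the output file is left out of the input list, since combined.csv itself matches '*.csv' once it exists.

```python
import glob
import os
import tempfile


def combine_csv(pattern, output_path):
    """Concatenate files matching pattern, skipping each file's first line."""
    target = os.path.abspath(output_path)
    # Collect inputs before opening the output, and never read the output itself.
    files = sorted(p for p in glob.glob(pattern)
                   if os.path.abspath(p) != target)
    with open(output_path, 'w') as result:
        for path in files:
            with open(path) as f:
                next(f, None)  # skip the header; the default covers empty files
                for line in f:
                    result.write(line)
    return files


# Hypothetical sample files, including an empty one.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "a.csv": "id,name\n1,alice\n",
        "b.csv": "id,name\n2,bob\n",
        "empty.csv": "",
    }
    for name, text in samples.items():
        with open(os.path.join(tmp, name), "w") as f:
            f.write(text)

    out = os.path.join(tmp, "combined.csv")
    pattern = os.path.join(tmp, "*.csv")

    combine_csv(pattern, out)
    with open(out) as f:
        first_run = f.read()

    # On the second run combined.csv matches the pattern but is excluded.
    combine_csv(pattern, out)
    with open(out) as f:
        second_run = f.read()

print(repr(first_run))
print(repr(second_run))
```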

score 0 · Answer 3 · answered Mar 13 '15 at 02:16

>>> a = [1, 2, 3]
>>> next(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list object is not an iterator

I am not sure why you chose to bracket the read, but you should recognize what is happening from the example above.

There is already a good answer. This is just an example of how you might look at the problem. Also, I would recommend first getting this to work with just a single file. Once that works, import glob and use your mini-solution in the bigger problem.
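
To connect the traceback to the code in the question, here is a small sketch using an in-memory file (io.StringIO stands in for open(thefile), and the sample rows are made up). Bracketing the read produces a one-element list holding the whole file as a single string, and a list cannot be passed to next(). The file object itself is an iterator over lines, so next() works on it directly.

```python
import io

# Hypothetical sample data standing in for open(thefile).
source = io.StringIO("id,name\n1,alice\n2,bob\n")

# What the question's code builds: a list with ONE string (the whole file).
wrapped = [source.read()]
print(len(wrapped))

# A list is iterable but is not an iterator, so next() rejects it.
try:
    next(wrapped)
except TypeError as exc:
    error = str(exc)
print(error)

# The file object is an iterator over lines, so next() skips the header.
source.seek(0)
header = next(source)
rest = list(source)
print(repr(header), rest)
```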
