0

I have a normal .csv file whose first row holds the titles for each column.

How can I create a new .csv file that has the same header (first row) but contains only every 5th row of the original file?

Thank you!

j.doe123

3 Answers

1

This takes any text file and writes out the first line and every 5th line after it. The file doesn't have to be parsed as a .csv if the columns aren't being accessed:

with open('a.txt') as f:
    with open('b.txt','w') as out:
        for i,line in enumerate(f):
            if i % 5 == 0:
                out.write(line)
Mark Tolonen
  • "if the columns aren't being accessed" is true only if there are no multi-line fields in the CSV. A legitimate field such as "line 1\nline 2" should stay a single field in the output. – dawg Feb 09 '16 at 17:26
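
To illustrate the comment about multi-line fields, here is a minimal sketch that samples by CSV record rather than by physical line. The function name `every_nth_record` and the sample data are made up for the example, and in-memory `io.StringIO` objects stand in for real files:

```python
import csv
import io

def every_nth_record(src, dst, n=5):
    """Copy the header plus every n-th data record from src to dst."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader, None)
    if header is None:  # empty input: nothing to copy
        return
    writer.writerow(header)
    # Count parsed records, not physical lines, so quoted newlines are safe.
    for i, row in enumerate(reader, start=1):
        if i % n == 0:
            writer.writerow(row)

# Hypothetical sample: record 2 has a quoted field spanning two physical lines.
sample = 'id,note\n1,a\n2,"line 1\nline 2"\n3,c\n4,d\n5,e\n'
out = io.StringIO()
every_nth_record(io.StringIO(sample), out)
result = out.getvalue()  # header plus record 5
```

With this input, a purely line-based loop would pick physical line 5, which is record 4 (`4,d`), because record 2 occupies two lines. Counting parsed records picks record 5 instead.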
0

This will read the file one line at a time and only write rows 5, 10, 15, 20...

import csv

count = 0

# open files and handle headers
with open('input.csv') as infile:
    with open('output.csv', 'w') as outfile:
        reader = csv.DictReader(infile)
        writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames)
        writer.writeheader()

        # iterate through file and write only every 5th row
        for row in reader:
            count += 1
            if not count % 5:
                writer.writerow(row)

(works with Python 2 and 3)

If you'd prefer to start with data row #1 and write rows 1, 6, 11, 16..., change the initialization at the top to:

count = -1
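
The effect of the starting value can be checked without any files. This is only a sketch: `pick_rows` is a hypothetical helper that mirrors the counter loop, and plain integers stand in for the data rows:

```python
def pick_rows(rows, step=5, count=0):
    """Mimic the counter loop: yield a row whenever count is a multiple of step."""
    for row in rows:
        count += 1
        if not count % step:
            yield row

data_rows = list(range(1, 21))  # stand-ins for data rows 1..20

picked_default = list(pick_rows(data_rows))            # count starts at 0
picked_shifted = list(pick_rows(data_rows, count=-1))  # count starts at -1
# picked_default is [5, 10, 15, 20]; picked_shifted is [1, 6, 11, 16]
```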
noahcoad
0

If you want to use the csv library, a tighter version would be...

import csv

# open files and handle headers
with open('input.csv') as infile:
    with open('output.csv', 'w') as outfile:
        reader = csv.DictReader(infile)
        writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames)
        writer.writeheader()

        # iterate through file and write only every 5th row
        writer.writerows(x for i, x in enumerate(reader) if i % 5 == 4)
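
Another way to express the same selection is `itertools.islice`, which swaps the `enumerate` filter for a start/step slice over the reader. A minimal sketch, assuming the input has a header row; the function name `copy_every_nth` and the sample data are invented for the example, and `io.StringIO` stands in for the real files:

```python
import csv
import io
from itertools import islice

def copy_every_nth(infile, outfile, n=5):
    """Write the header and data rows n, 2n, 3n, ... from infile to outfile."""
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    # Start at index n - 1 (the n-th data row), then take every n-th row.
    writer.writerows(islice(reader, n - 1, None, n))

# Hypothetical sample: 12 data rows under a two-column header.
sample = "n,sq\n" + "".join("%d,%d\n" % (i, i * i) for i in range(1, 13))
out = io.StringIO()
copy_every_nth(io.StringIO(sample), out)
result = out.getvalue()  # header plus data rows 5 and 10
```

Because `islice` is lazy, rows are streamed to the writer one at a time instead of being collected in a list first.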
noahcoad