I have a program that converts CSV files into pipe-delimited files and also counts the total number of lines. If the input has more than 7000 lines, I want to split the output: each output file should hold at most 7000 lines, with a new output file created for every further 7000 lines.

Any suggestions, ideas, or modifications will be highly appreciated.

Previous code, which converts the input into a single file:

import csv
input_file = input("Enter input file")
output_file = input("Enter Output file")

# count number of lines
def total_lines(input_file):
    with open(input_file) as f:
        return sum(1 for line in f)

# convert input files to output
def file_conversion(input_file, output_file):
    with open(input_file) as fin:
        with open(output_file, 'w', newline='') as fout:
            reader = csv.DictReader(fin, delimiter=',')
            writer = csv.DictWriter(fout, reader.fieldnames, delimiter='|')
            writer.writeheader()
            writer.writerows(reader)
            print("Successfully converted into", output_file)
Atom Store

1 Answer

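The third-party more-itertools package makes this easy: its chunked function splits any iterable into lists of a fixed maximum length.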

import csv
from more_itertools import chunked

def file_conversion(input_file, output_file_pattern, chunksize):
    with open(input_file) as fin:
        reader = csv.DictReader(fin, delimiter=',')
        # chunked() yields lists of at most `chunksize` rows
        for i, chunk in enumerate(chunked(reader, chunksize)):
            # build the name of the i-th output file from the pattern
            output_file = output_file_pattern.format(i)
            with open(output_file, 'w', newline='') as fout:
                writer = csv.DictWriter(fout, reader.fieldnames, delimiter='|')
                writer.writeheader()
                writer.writerows(chunk)
                print("Successfully converted into", output_file)

Example usage:

file_conversion('in.csv', 'out{:03}.csv', 7000)

which would generate files out000.csv, out001.csv, etc.
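
If installing a third-party package is not an option, the same chunking can probably be done with itertools.islice from the standard library. The following is a minimal sketch, not part of the original answer; the function name split_csv_to_pipe and the returned list of file names are hypothetical choices made for illustration.

```python
import csv
from itertools import islice


def split_csv_to_pipe(input_file, output_file_pattern, chunksize):
    """Convert a CSV into pipe-delimited files of at most `chunksize` rows each.

    Hypothetical helper: the name and the return value are illustrative only.
    """
    written = []
    with open(input_file, newline='') as fin:
        reader = csv.DictReader(fin, delimiter=',')
        fieldnames = reader.fieldnames
        index = 0
        while True:
            # islice pulls at most `chunksize` rows from the shared reader,
            # so each pass continues where the previous one stopped
            chunk = list(islice(reader, chunksize))
            if not chunk:
                break
            name = output_file_pattern.format(index)
            with open(name, 'w', newline='') as fout:
                writer = csv.DictWriter(fout, fieldnames, delimiter='|')
                writer.writeheader()  # every output file gets its own header
                writer.writerows(chunk)
            written.append(name)
            index += 1
    return written
```

It is called the same way, for example split_csv_to_pipe('in.csv', 'out{:03}.csv', 7000). Note that the 7000 limit counts data rows, so each output file has one extra line for the header.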

orlp
  • I want to improve this by changing the header. I tried writer = csv.DictWriter(fout, fieldnames=new_headers, extrasaction='ignore', delimiter='|'). It changes the headers, but the data does not show up in the output files. – Atom Store Dec 07 '20 at 08:33
  • @Atom Store That is a different question unrelated to this one. – orlp Dec 07 '20 at 10:37
  • Please refer to this question: https://stackoverflow.com/questions/65178599/header-modification-issue – Atom Store Dec 07 '20 at 11:03