When I merge CSV files with the following code, the data sometimes ends up in the wrong columns: instead of columns A-D, it lands in columns F-J. As far as I can tell, it is the first line of a new CSV file that gets put in the wrong columns, but this does not happen with every CSV file.

import glob
import codecs
import csv 

my_files = glob.glob("*.csv") 

header_saved = False 
with codecs.open('Final-US-Allies-Expects.csv','w', "UTF-8", 'ignore') as file_out: #save data to
    for filename in my_files:
        with codecs.open(filename, 'r', 'UTF-8', 'ignore') as file_in: 
            header = next(file_in) 
            if not header_saved: 
                file_out.write(header) #write header
                header_saved = True
            for line in file_in:
                file_out.write(line) #write next line

The original code comes from "Merging multiple CSV files without headers being repeated (using Python)" (my reputation is not high enough to add to the original question).

Visual of issue

I've attached a visual of the issue. I need every line to be written into the columns it belongs in.

Thanks for your help in advance.

1 Answer

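It looks like the code does not check whether each line ends with a newline character before writing it to the output file. If the last line of one file has no trailing newline, the first data line of the next file is appended to the same row, which would shift its values into the wrong columns. Could you try this?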

import glob
import codecs

my_files = glob.glob("*.csv")

header_saved = False
with codecs.open('output.csv', 'w', 'UTF-8', 'ignore') as file_out:
    for filename in my_files:
        with codecs.open(filename, 'r', 'UTF-8', 'ignore') as file_in:
            header = next(file_in)  # first line of each file is its header
            if not header_saved:
                # write the header once, making sure it ends with a newline
                file_out.write(header if header.endswith("\n") else header + "\n")
                header_saved = True
            for line in file_in:
                # add a newline if the line (e.g. the last one in a file) lacks it
                file_out.write(line if line.endswith("\n") else line + "\n")
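
As an alternative, here is a minimal sketch that lets the standard `csv` module handle the row endings, so a missing trailing newline in one file cannot glue two rows together. The function name `merge_csv_files` and its parameters are my own for illustration, and it assumes every input file has the same header and is UTF-8 encoded.

```python
import csv


def merge_csv_files(paths, out_path):
    """Merge CSV files into out_path, keeping only the first header row.

    Hypothetical helper: assumes all inputs share the same columns.
    """
    header_saved = False
    with open(out_path, "w", newline="", encoding="utf-8") as file_out:
        writer = csv.writer(file_out)
        for path in paths:
            with open(path, "r", newline="", encoding="utf-8",
                      errors="ignore") as file_in:
                reader = csv.reader(file_in)
                header = next(reader, None)  # None if the file is empty
                if header is None:
                    continue
                if not header_saved:
                    writer.writerow(header)  # write the header only once
                    header_saved = True
                for row in reader:
                    if row:  # skip completely blank lines
                        writer.writerow(row)  # writer adds the line ending
```

It could be called as `merge_csv_files(sorted(glob.glob("*.csv")), "output.csv")` after `import glob`. If the output file lives in the same folder as the inputs, it is probably worth filtering it out of the list first, since a second run would otherwise pick it up as an input.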