
I have been trying to solve a basic problem (using Python 2.7) and have got nowhere. I have a csv file with column headings such as:

a,b,c
1,2,3
1,2,3
1,2,3

This is saved as test1.csv

I have managed to take each column and pass them into an array, so each column heading is at the start, followed by the data. How would I then write this back into a new csv file with the same order and structure?

import csv

f = open('test1.csv','rb')

reader = csv.reader(f)
headers = reader.next()
print headers

column = {}
for h in headers:
    column[h] = []  

for row in reader:
    for h,v in zip(headers,row):
        column[h].append(v)
DGraham
  • I guess `pandas` is out of the question? – Leb Oct 21 '15 at 20:13
  • How is this a csv without commas? – MohitC Oct 21 '15 at 20:13
  • 1/ write the header 2/ use a SortedDict to keep the columns in order 3/ rotate your dict: `zip(*d.values())` (gives you a list for each line) – njzk2 Oct 21 '15 at 20:16
  • or if you don't like SortedDict, `zip(*map(lambda x: x[1], sorted(d.items(), key=lambda x: header.index(x[0]))))` (but I prefer the SortedDict) – njzk2 Oct 21 '15 at 20:18
  • Possible duplicate of [Writing Python lists to columns in csv](http://stackoverflow.com/questions/17704244/writing-python-lists-to-columns-in-csv) – wflynny Oct 21 '15 at 20:25
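
The rotation suggested in the comments can be sketched roughly as follows. This is only an illustration under a few assumptions: it targets Python 3 rather than the 2.7 used in the question, it swaps in the standard library's `OrderedDict` for the `SortedDict` named in the comment (assuming the goal is simply to keep the columns in header order), and the sample data and the helper name `columns_to_rows` are made up for the example.

```python
import csv
import io
from collections import OrderedDict

# Hypothetical in-memory columns mirroring test1.csv.
# OrderedDict keeps the columns in the order the headings were inserted.
columns = OrderedDict([
    ("a", ["1", "1", "1"]),
    ("b", ["2", "2", "2"]),
    ("c", ["3", "3", "3"]),
])


def columns_to_rows(columns):
    """Rotate a mapping of heading -> list of values into a list of row tuples."""
    return list(zip(*columns.values()))


rows = columns_to_rows(columns)

# Write the header and the rotated rows to an in-memory buffer
# (a real file opened with newline='' would work the same way).
buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(list(columns.keys()))
writer.writerows(rows)
csv_text = buf.getvalue()
```

Note that `zip` stops at the shortest column, so columns of unequal length would be silently truncated.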

1 Answer


You could read and write the data like this, using a defaultdict instead of a plain dictionary to simplify building the columns:

from collections import defaultdict
import csv

with open('test1.csv', 'rb') as f:
    reader = csv.reader(f)
    header = next(reader)
    columns = defaultdict(list)
    for row in reader:
        for col,value in zip(header, row):
            columns[col].append(value)
    num_rows = len(columns[header[0]])

# now write data back out in same order

with open('test2.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerows(
        tuple(columns[col][row] for col in header) for row in xrange(num_rows))
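
For readers on Python 3, the same approach might look something like the sketch below. It is not part of the original answer: the function names `read_columns` and `write_columns` are hypothetical, and it assumes every data row has as many fields as the header. The main differences from the Python 2 code are that files are opened in text mode with `newline=''` instead of `'rb'`/`'wb'`, and `zip` replaces the `xrange` index loop.

```python
import csv
from collections import defaultdict


def read_columns(path):
    """Return (header, columns); columns maps each heading to its list of values."""
    # In Python 3 the csv module expects text-mode files opened with newline=''.
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        columns = defaultdict(list)
        for row in reader:
            for col, value in zip(header, row):
                columns[col].append(value)
    return header, columns


def write_columns(path, header, columns):
    """Write the columns back out row by row, in the original header order."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        # zip(*...) rotates the per-column lists back into rows.
        writer.writerows(zip(*(columns[col] for col in header)))
```

Calling `header, columns = read_columns('test1.csv')` followed by `write_columns('test2.csv', header, columns)` would then reproduce the file with the same column order.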
martineau