I'm converting a fixed-width file to CSV. Because the file is too large to read at once, I'm reading it in chunks of 100000 rows and appending each chunk to the CSV. This works, but an index column is added to the rows even though I set index=False.

How can I write the CSV file without the index?

infile = filename
outfile = outfilename
cols = [(0,10), (12,19), (22,29), (34,41), (44,52), (54,64), (72,80), (82,106), (116,144), (145,152), (161,169), (171,181)]

for chunk in pd.read_fwf(infile, colspecs=cols, index=False, chunksize=100000):
    chunk.to_csv(outfile, mode='a')
smci
user3867061

1 Answer

The to_csv method has a header parameter that controls whether the header row is written. In this case, you probably want the header only on the first write, not on the chunks appended after it.

So, you could do something like this:

for i, chunk in enumerate(pd.read_fwf(...)):
    first = i == 0
    chunk.to_csv(outfile, header=first, mode='a')
Ami Tavory
  • Thanks, but I found the answer to my question hidden in the action of getting up, leaving work and going home. I should have put index=False in chunk.to_csv(outfile, index=False, mode='a') rather than in read_fwf()... – user3867061 Jun 25 '15 at 12:07
  • @user3867061 why not add it as an answer then – eis Jun 23 '18 at 06:40
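
Putting the two points from this thread together (header only on the first chunk, index=False passed to to_csv rather than to read_fwf), a minimal sketch might look like the following. The function name fwf_to_csv, the demo file names and the tiny two-column layout are made up for illustration and are not from the question. Opening the first chunk with mode='w' is also an assumption on my part, so that a file left over from an earlier run is not appended to.

```python
import os
import tempfile

import pandas as pd


def fwf_to_csv(infile, outfile, colspecs, chunksize=100000):
    """Convert a fixed-width file to CSV chunk by chunk, without an index column."""
    # Hypothetical helper: read_fwf only gets the parsing options here.
    reader = pd.read_fwf(infile, colspecs=colspecs, chunksize=chunksize)
    for i, chunk in enumerate(reader):
        first = i == 0
        chunk.to_csv(
            outfile,
            mode="w" if first else "a",  # fresh file first, then append
            header=first,                # column names written only once
            index=False,                 # the index is dropped here, on output
        )


# Small demonstration with made-up data (not the asker's file).
workdir = tempfile.mkdtemp()
demo_in = os.path.join(workdir, "demo.txt")
demo_out = os.path.join(workdir, "demo.csv")
with open(demo_in, "w") as f:
    f.write("name  qty\n")
    f.write("apple   3\n")
    f.write("pear   12\n")
    f.write("plum    7\n")

# chunksize=2 forces more than one chunk so the append path is exercised.
fwf_to_csv(demo_in, demo_out, colspecs=[(0, 5), (5, 9)], chunksize=2)
with open(demo_out) as f:
    result = f.read()
print(result)
```

For the file in the question, the call would take the asker's own infile, outfile and cols in place of the demo values.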