5

Initial Problem

When I run the following in ipython

import numpy as np
import pandas as pd

df = pd.DataFrame(np.round(9*np.random.rand(4,4), decimals=1))
df.index.name = 'x'
df.columns.name = 'y'

df.to_csv('output.csv')

df

it outputs the following result:

y    0    1    2    3
x                    
0  7.6  7.4  0.3  7.5
1  5.6  0.0  1.5  5.9
2  7.1  2.1  0.0  0.9
3  3.7  6.6  3.3  8.4

However when I open output.csv the "y" is removed:

x   0   1   2   3
0   7.6 7.4 0.3 7.5
1   5.6 0   1.5 5.9
2   7.1 2.1 0   0.9
3   3.7 6.6 3.3 8.4

How do I make it so that the df.columns.name is retained when I output the dataframe to csv?

Crude workaround

Current crude work-around is me doing the following:

df.to_csv('output.csv', index_label = 'x|y')

Which results in output.csv reading:

x|y 0   1   2   3
0   7.6 7.4 0.3 7.5
1   5.6 0   1.5 5.9
2   7.1 2.1 0   0.9
3   3.7 6.6 3.3 8.4

Something better would be great! Thanks for your help (in advance).

Context

This is what I am working on: https://github.com/SimonBiggs/Electron-Cutout-Factors

This is an example table: https://github.com/SimonBiggs/Electron-Cutout-Factors/blob/master/output/20140807_173714/06app06eng/interpolation-table.csv

Community
  • 1
  • 1
SimonBiggs
  • 816
  • 1
  • 8
  • 18
  • I think part of the reason there's no option for this (AFAIK) is that it's hard to know where the column index name would go- which cell do you think it belongs in, exactly? – Marius Aug 06 '14 at 03:06
  • It would be the same display set up that is used as ipython output. "y" would go in cell (1,1), "x" would go in cell (2,1), the top left hand data value would start in cell (3,2). – SimonBiggs Aug 06 '14 at 03:57

2 Answers2

9

You can pass a list to name the columns, then you can specify the index name when you are writing to csv:

df.columns = ['column_name1', 'column_name2', 'column_name3']
df.to_csv('/path/to/file.csv', index_label='Index_name')
WaveRider
  • 475
  • 4
  • 10
1

How about this? It's slightly different but hopefully usable, since it fits the CSV paradigm:

>>> df.columns = ['y{}'.format(name) for name in df.columns]
>>> df.to_csv('output.csv')
>>> print open('output.csv').read()
x,y0,y1,y2,y3
0,3.5,1.5,1.6,0.3
1,7.0,4.7,6.5,5.2
2,6.6,7.6,3.2,5.5
3,4.0,2.8,7.1,7.8
John Zwinck
  • 239,568
  • 38
  • 324
  • 436
  • Thanks for the offer, unfortunately that doesn't meet my needs. I care about the numbers that are written in the columns and index. – SimonBiggs Aug 06 '14 at 03:51
  • Can you be more specific about what problem my solution causes you? It seems like what you're asking for will otherwise require you to write your own header rows before asking Pandas to write the body of the CSV. You can do that if you really need to. – John Zwinck Aug 06 '14 at 05:44
  • **First:** I want the code to be as readable and transparent as possible. So minimal complication is important. **Second:** I am creating a look-up table to be used for interpolation. The axes represent "width" and "aspect ratio", hence the headings. The full code can be seen at my gihtub [here](https://github.com/SimonBiggs/Electron-Cutout-Factors). An example of one of these interpolation tables can be seen [here](https://github.com/SimonBiggs/Electron-Cutout-Factors/blob/master/output/20140806_150610/06app06eng/interpolation-table.csv). – SimonBiggs Aug 07 '14 at 03:34
  • I think your "crude workaround" is as good as you're likely to get. – John Zwinck Aug 07 '14 at 05:46
  • Okay, thanks John. Also, for future readers, the table link has changed. An example table should now be able to be found [here](https://github.com/SimonBiggs/Electron-Cutout-Factors/blob/master/output/20140807_173714/06app06eng/interpolation-table.csv) – SimonBiggs Aug 07 '14 at 07:42