How can I export and import a pandas dataframe with multiindex row and column index into/from a csv file?

Question

Pandas allows to export dataframes into csv with to_csv('path/to/file.csv'). However to me it is not clear how I can export (and import) a dataframe which uses MultiIndex for rows and columns like e.g. this one from the corresponding advanced docs:

first              bar                 baz                 foo          
second             one       two       one       two       one       two
first second                                                            
bar   one    -0.410001 -0.078638  0.545952 -1.219217 -1.226825  0.769804
      two    -1.281247 -0.727707 -0.121306 -0.097883  0.695775  0.341734
baz   one     0.959726 -1.110336 -0.619976  0.149748 -0.732339  0.687738
      two     0.176444  0.403310 -0.154951  0.301624 -2.179861 -1.369849
foo   one    -0.954208  1.462696 -1.743161 -0.826591 -0.345352  1.314232
      two     0.690579  0.995761  2.396780  0.014871  3.357427 -0.317441

To generate a dataframe like this with random data values in Jupyter notebook:

import pandas as pd
import numpy as np

column_index_matrix = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo']),
                       np.array(['one', 'two', 'one', 'two', 'one', 'two'])]
column_names = ['first', 'second']
column_multiindex = pd.MultiIndex.from_arrays(column_index_matrix, names=column_names)
column_multiindex

row_multiindex = column_multiindex

df = pd.DataFrame(np.random.randn(6, 6), index=row_multiindex, columns=column_multiindex)
df

WHen running df.to_csv(r'df.csv', index=True) the dataframe is exported into a csv file properly. However I don't know how to use pd.read_csv().

`df = pd.read_csv('df.csv', index_col=[0,1], header=[0,1])` from [Pandas - write Multiindex rows with to\_csv](https://stackoverflow.com/questions/17349574/pandas-write-multiindex-rows-with-to-csv). This is a duplicate and should be closed. — Trenton McKinney, Apr 22 '20 at 06:42

gosuto · Accepted Answer · 2020-04-22T15:49:00.217

You can pass multiple columns/rows to be used as index/header. Here I pass the 0th and 1st column to be used for the index (so two levels) and the 0th and 1st row to be used for the two header levels:

pd.read_csv('data.csv', index_col=[0, 1], header=[0, 1])

first              bar                 baz                 foo          
second             one       two       one       two       one       two
first second                                                            
bar   one     0.788793 -0.591498 -0.309037 -0.433105 -1.413536 -0.209560
      two    -0.354429  1.671837  1.527225  0.282775 -0.973088 -0.728555
baz   one    -0.180517  1.226219 -0.810984 -0.580251 -0.453205 -1.368015
      two    -0.040708 -0.836359 -2.043332  1.396955 -0.562718 -1.099926
foo   one    -0.612561  0.815998 -0.942997 -0.423395  0.157410 -0.537063
      two    -0.312878  0.194915 -1.420048 -0.944414 -0.560043 -0.036713

score 0 · Answer 2 · answered Apr 22 '20 at 06:10

0

I tried before but eventually considered that it was better to :

unstack the entire column multiindex
reset_index
to_csv

and then :

read_csv
pivot_table

It seems easier but probably occupies more memory.

answered Apr 22 '20 at 06:10

Kokli

155
7

Seems like `df.to_csv(r'df.csv', index=True)` exports properly into csv. However there is no equivalent [`pd.read_csv(..., index=True)`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html). – thinwybk Apr 22 '20 at 06:28
I am not sure you can read_csv with both index and column multiindex, cf https://stackoverflow.com/questions/52435825/want-multiindex-for-rows-and-columns-with-read-csv – Kokli Apr 22 '20 at 06:32

How can I export and import a pandas dataframe with multiindex row and column index into/from a csv file?

2 Answers2