Read several files, and stack them into a single multilevel data frame. Each file has the same column names

Question

Does anyone know how to use a multilevel index to stack several data frames into a single one, instead of keeping a list of data frames as I do below? Thanks

import glob
import pandas as pd

glist=glob.glob("./path/*.csv")

D=[]
for file in glist:
    X=pd.read_csv(file,names=['name1','name2','name3'],index_col = 0, header=0)
    D.append(X)

It is my first time posting a question, so I don't know how to submit text correctly. But I wanted to ask if anyone knows how to use the multilevel index to stack several dataframes into a single one instead of a list of dataframes like the one I am doing. Thanks — Dr.PP, Aug 10 '17 at 22:07
you need `pd.concat(D,axis=0,keys=['name1','name2','name3'])` — BENY, Aug 10 '17 at 22:08
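A note on `keys`: `pd.concat` expects one key per data frame in the list (for example, one label per file), not the column names. A minimal sketch, assuming two small in-memory frames stand in for the per-file frames in `D`, and that the labels `file_a` and `file_b` are made up for illustration:

```python
import pandas as pd

# Two small frames standing in for the per-file frames (hypothetical data).
D = [
    pd.DataFrame({'name2': [1, 2], 'name3': [3, 4]},
                 index=pd.Index(['a', 'b'], name='name1')),
    pd.DataFrame({'name2': [5, 6], 'name3': [7, 8]},
                 index=pd.Index(['a', 'b'], name='name1')),
]

# One label per frame; these become the outer level of the row index.
labels = ['file_a', 'file_b']

# names=['file'] names the new outer level.
stacked = pd.concat(D, axis=0, keys=labels, names=['file'])
```

Each row of `stacked` is then addressed by a `(label, original index)` pair, so `stacked.loc['file_b']` returns the rows that came from the second frame.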

BlooB · Answer 1 · 2017-08-10T22:28:27.623

Look here for a good start. You need to read the files into a list, then use pandas concat() to put them together:

import pandas as pd
import os
from os import path
dfs = [pd.read_csv(path.join('data',x)) for x in os.listdir("data") if path.isfile(path.join("data",x))]
df = pd.concat(dfs)

If you want to add new columns to a data frame, use assign. To join multiple dataframes based on multiple levels of index, look here.
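As a rough illustration of both points, here is a sketch using `DataFrame.assign` and `DataFrame.join` on two frames that share a two-level index. The level names `file` and `name1`, the column `extra`, and all values are assumptions made up for the example:

```python
import pandas as pd

# Left frame with a two-level row index (hypothetical data).
left_index = pd.MultiIndex.from_tuples(
    [('file_a', 'x'), ('file_a', 'y'), ('file_b', 'x')],
    names=['file', 'name1'],
)
left = pd.DataFrame({'name2': [1, 2, 3]}, index=left_index)

# Right frame covers only some of the same (file, name1) pairs.
right_index = pd.MultiIndex.from_tuples(
    [('file_a', 'x'), ('file_b', 'x')],
    names=['file', 'name1'],
)
right = pd.DataFrame({'extra': [10, 30]}, index=right_index)

# assign returns a new frame with the extra column; left is not modified.
with_new_col = left.assign(doubled=left['name2'] * 2)

# join aligns on both index levels; it is a left join by default,
# so pairs missing from right get NaN in 'extra'.
joined = with_new_col.join(right)
```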

To combine two dataframes into one with a hierarchical column index, do something like:

 pd.concat(dict(df1 = df1, df2 = df2),axis=1)
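To make the effect of the dict keys concrete, here is a small sketch with two made-up frames (`df1` and `df2` hold hypothetical data). With `axis=1` the keys become the outer level of the columns; with the default `axis=0` they become the outer level of the row index, which is closer to what the question asks for:

```python
import pandas as pd

# Two frames with the same columns and index (hypothetical data).
df1 = pd.DataFrame({'name2': [1, 2], 'name3': [3, 4]}, index=['a', 'b'])
df2 = pd.DataFrame({'name2': [5, 6], 'name3': [7, 8]}, index=['a', 'b'])

# axis=1: frames side by side, dict keys form the outer column level.
wide = pd.concat({'df1': df1, 'df2': df2}, axis=1)
from_df2 = wide['df2']                    # all columns that came from df2
value = wide.loc['a', ('df2', 'name3')]   # a single cell

# axis=0 (the default): frames stacked, dict keys form the outer row level.
tall = pd.concat({'df1': df1, 'df2': df2})
```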

Pandas also has a built-in function to merge two data frames, look here.
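For reference, a minimal sketch of `pd.merge`, assuming both frames share a key column called `name1` (the data is made up). Unlike `concat`, `merge` matches rows by key values rather than stacking frames:

```python
import pandas as pd

# Two frames sharing the key column 'name1' (hypothetical data).
left = pd.DataFrame({'name1': ['a', 'b', 'c'], 'name2': [1, 2, 3]})
right = pd.DataFrame({'name1': ['a', 'b', 'd'], 'name3': [10, 20, 40]})

# Inner merge (the default): keep only keys present in both frames.
inner = pd.merge(left, right, on='name1')

# Outer merge: keep every key from either frame, filling gaps with NaN.
outer = pd.merge(left, right, on='name1', how='outer')
```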

Dr.PP · Answer 2 · 2017-08-14T18:22:19.003

This seems to do what I wanted. Thank you Wen.

D = pd.DataFrame()
for file in glist:
    X = pd.read_csv(file, names=['name1', 'name2', 'name3'], header=0, index_col=0)
    D = pd.concat([X, D], axis=0)
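Stacking this way gives one flat frame, so a row can no longer be traced back to its file. A sketch that keeps the file as the outer index level, assuming the file name without its extension is an acceptable label; the temporary directory and the two sample files are made up so the example runs on its own:

```python
import glob
import os
import tempfile

import pandas as pd

# Write two throwaway CSV files so the sketch is runnable (hypothetical data).
tmpdir = tempfile.mkdtemp()
for stem, offset in [('first', 0), ('second', 10)]:
    sample = pd.DataFrame({
        'a': ['r1', 'r2'],
        'b': [offset + 1, offset + 2],
        'c': [offset + 3, offset + 4],
    })
    sample.to_csv(os.path.join(tmpdir, stem + '.csv'), index=False)

glist = sorted(glob.glob(os.path.join(tmpdir, '*.csv')))

# Read each file once and key it by its file name without the extension.
frames = {}
for file in glist:
    key = os.path.splitext(os.path.basename(file))[0]
    frames[key] = pd.read_csv(file, names=['name1', 'name2', 'name3'],
                              header=0, index_col=0)

# A single concat call; the dict keys become the outer index level 'file'.
D = pd.concat(frames, axis=0, names=['file'])

# Select all rows from one file, or one row label across all files.
second = D.loc['second']
r1_name2 = D.xs('r1', level=1)['name2']
```

Calling `pd.concat` once on the collected frames also avoids re-copying the growing frame on every loop iteration.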

edited Aug 14 '17 at 18:22

answered Aug 14 '17 at 18:10
