Reindex DataFrame Columns by Label Series

Question

I have a Series of Labels

pd.Series(['L1', 'L2', 'L3'], ['A', 'B', 'A'])

and a dataframe

pd.DataFrame([[1,2], [3,4]], ['I1', 'I2'], ['A', 'B'])

I'd like to have a dataframe with columns ['L1', 'L2', 'L3'] with the column data from 'A', 'B', 'A' respectively. Like so...

pd.DataFrame([[1,2,1], [3,4,3]], ['I1', 'I2'], ['L1', 'L2', 'L3'])

in a nice pandas way.

Could you create the sample data and show your expected result? – BENY Jun 29 '18 at 17:10
Hopefully, that is helpful in clarifying. The real problem has many labels and is a largish dataframe. – rhaskett Jun 29 '18 at 17:41
I think reindex is the right solution, but I can't seem to write it the correct way. – rhaskett Jun 29 '18 at 17:42

score 2 · Accepted Answer · answered Jun 29 '18 at 17:45


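Since you mention reindex:

import pandas as pd
s = pd.Series(['L1', 'L2', 'L3'], ['A', 'B', 'A'])
df = pd.DataFrame([[1, 2], [3, 4]], ['I1', 'I2'], ['A', 'B'])
# s.to_dict() keeps only the last label per duplicated key, so 'A' maps to 'L3'
df.reindex(s.index, axis=1).rename(columns=s.to_dict())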
Out[598]: 
    L3  L2  L3
I1   1   2   1
I2   3   4   3
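
Note that the output above has columns L3, L2, L3 rather than the requested L1, L2, L3: a dict can hold only one value per key, so `s.to_dict()` keeps just the last label for the duplicated index entry 'A'. A minimal sketch of one way around this, assuming it is acceptable to assign the labels by position instead of through a mapping (the name `res` is only illustrative):

```python
import pandas as pd

s = pd.Series(['L1', 'L2', 'L3'], index=['A', 'B', 'A'])
df = pd.DataFrame([[1, 2], [3, 4]], index=['I1', 'I2'], columns=['A', 'B'])

# The mapping loses 'L1' because the key 'A' appears twice in s.index.
mapping = s.to_dict()  # {'A': 'L3', 'B': 'L2'}

# Pick the source columns in the order given by s.index (duplicates allowed),
# then relabel them positionally with the values of s.
res = df.reindex(columns=s.index)
res.columns = s.tolist()

print(res)
```

This relies on `s` listing the source columns in the same order as the desired output columns, which is how the question sets it up.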

answered Jun 29 '18 at 17:45

BENY


If there is a cleaner way without `reindex` I'm happy to use that, but this looks great. – rhaskett Jun 29 '18 at 17:49
@rhaskett `df.loc[:, s.index].rename(columns=s.to_dict())` – BENY Jun 29 '18 at 17:50

score 1 · Answer 2 · answered Jun 29 '18 at 17:34

This will produce the dataframe you described:

import pandas as pd

data = [['A','B','A','A','B','B'],
        ['B','B','B','A','B','B'],
        ['A','B','A','B','B','B']]

columns = ['L1', 'L2', 'L3', 'L4', 'L5', 'L6']

pd.DataFrame(data, columns = columns)

score 0 · Answer 3 · answered Jun 29 '18 at 18:01

You can use the loc accessor:

s = pd.Series(['L1', 'L2', 'L3'], ['A', 'B', 'A'])
df = pd.DataFrame([[1,2], [3,4]], ['I1', 'I2'], ['A', 'B'])

res = df.loc[:, s.index]

print(res)

    A  B  A
I1  1  2  1
I2  3  4  3

Or the iloc accessor with columns.get_loc:

res = df.iloc[:, s.index.map(df.columns.get_loc)]

Both methods allow accessing duplicate labels / locations, in the same vein as NumPy arrays.
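
As a rough illustration of the positional route, here is a sketch that looks up all positions in one call with `Index.get_indexer` instead of mapping `get_loc` over each label. This is a swapped-in variant, not what the answer itself uses, and it assumes `df.columns` is unique (the requested labels in `s.index` may repeat); `positions` and `by_position` are illustrative names:

```python
import pandas as pd

s = pd.Series(['L1', 'L2', 'L3'], index=['A', 'B', 'A'])
df = pd.DataFrame([[1, 2], [3, 4]], index=['I1', 'I2'], columns=['A', 'B'])

# Integer position of each requested label within df.columns.
# get_indexer requires df.columns to be unique; s.index may contain repeats.
positions = df.columns.get_indexer(s.index)

# Selecting by position repeats column 'A', just like df.loc[:, s.index].
by_position = df.iloc[:, positions]

print(positions)
print(by_position)
```

Either way the result still carries the original labels A, B, A; renaming to the values of `s` is a separate step.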
