Is there a way in pandas to give the same column of a pandas dataframe two names, so that I can index the column by only one of the two names? Here is a quick example illustrating my problem:

import pandas as pd

index=['a','b','c','d']
# The list of tuples here is really just to 
# somehow visualize my problem below: 
columns = [('A','B'), ('C','D'),('E','F')]
df = pd.DataFrame(index=index, columns=columns)

# I can index like that:
df[('A','B')]
# But I would like to be able to index like this:
df[('A',*)] #error
df[(*,'B')] #error
Tim

1 Answer

3

You can create a multi-index column:

df.columns = pd.MultiIndex.from_tuples(df.columns)

Then you can do:

df.loc[:, ("A", slice(None))]

Or: df.loc[:, (slice(None), "B")]

Here slice(None) selects every label at that level, so (slice(None), "B") selects the columns whose second level is B, regardless of the first-level name. It means the same as :, which cannot be written directly inside a tuple. Alternatively, use pd.IndexSlice, which does accept the : syntax: df.loc[:, pd.IndexSlice[:, "B"]] for the second case.
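Putting the pieces together, here is a minimal runnable sketch of the selections described above. The frame is filled with zeros purely for illustration (the frame in the question is empty), and the variable names by_first, by_second and by_second_idx are just placeholders.

```python
import pandas as pd

index = ["a", "b", "c", "d"]
# Build the two-level column index directly from the tuples in the question.
columns = pd.MultiIndex.from_tuples([("A", "B"), ("C", "D"), ("E", "F")])
# Zeros are an assumption for illustration only.
df = pd.DataFrame(0, index=index, columns=columns)

# Columns whose first level is "A", whatever the second level is.
by_first = df.loc[:, ("A", slice(None))]

# Columns whose second level is "B", whatever the first level is.
by_second = df.loc[:, (slice(None), "B")]

# The same selection written with pd.IndexSlice, which allows ":".
by_second_idx = df.loc[:, pd.IndexSlice[:, "B"]]

print(by_first.columns.tolist())
print(by_second.equals(by_second_idx))
```

Each selection returns a DataFrame that keeps both column levels, so the result can be indexed further in the same way.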

Psidom
  • Thanks, that does the job. Could you please add a short sentence about slice(None)? – Tim Mar 29 '17 at 20:41