One efficient option is to transform to long form with pivot_longer from pyjanitor, using the .value
placeholder ---> the .value
determines which parts of the columns remain as headers:
# pip install pyjanitor
import pandas as pd
import janitor
df.pivot_longer(
index = ['Account', 'lookup'],
names_to = ('Year', '.value'),
names_pattern = r"(FY\d+)(.+)")
Account lookup Year USD local
0 Sales CA FY11 1000 800
1 Sales JP FY11 5000 10
2 Sales CA FY12 5000 4800
3 Sales JP FY12 6500 15
Another option is to use stack:
temp = df.set_index(['Account', 'lookup'])
temp.columns = temp.columns.str.split('(FY\d+)', expand = True).droplevel(0)
temp.columns.names = ['Year', None]
temp.stack('Year').reset_index()
Account lookup Year USD local
0 Sales CA FY11 1000 800
1 Sales CA FY12 5000 4800
2 Sales JP FY11 5000 10
3 Sales JP FY12 6500 15
You can also pull it off with pd.wide_to_long
after reshaping the columns:
index = ['Account', 'lookup']
temp = df.set_index(index)
temp.columns = (temp
.columns
.str.split('(FY\d+)')
.str[::-1]
.str.join('')
)
(pd.wide_to_long(
temp.reset_index(),
stubnames = ['USD', 'local'],
i = index,
j = 'Year',
suffix = '.+')
.reset_index()
)
Account lookup Year USD local
0 Sales CA FY11 1000 800
1 Sales CA FY12 5000 4800
2 Sales JP FY11 5000 10
3 Sales JP FY12 6500 15