4

I have a pandas.Dataframe with the following columns:

a_1  ab_1  ac_1    a_2   ab_2   ac_2
2      3     4      5     6      7 

How do I convert it into the following?

a    ab    ac 
2     3     4
5     6     7 

I was trying to use pandas melt to convert from wide to long format, but not sure of the syntax.

Stephen Rauch
  • 47,830
  • 31
  • 106
  • 135
user308827
  • 21,227
  • 87
  • 254
  • 417

6 Answers6

5

You can replace the columns by a multi-index and stack:

df.columns = pd.MultiIndex.from_tuples(df.columns.str.split('_').map(tuple))
df = df.stack()
hilberts_drinking_problem
  • 11,322
  • 3
  • 22
  • 51
3

Here is one way to do that:

Code:

df.columns = pd.MultiIndex.from_tuples(
    [c.split('_') for c in df.columns], names=['col', 'row'])

df.melt().pivot(index='row', columns='col', values='value')

How?

  1. Create a pandas.MultiIndex for the columns by splitting on _.

  2. melt the data frame and then pivot on the elements from the original column names.

Test Code:

df = pd.DataFrame(
    data=[range(2, 8)],
    columns='a_1 ab_1 ac_1 a_2 ab_2 ac_2'.split()
)
print(df)

df.columns = pd.MultiIndex.from_tuples(
    [c.split('_') for c in df.columns], names=['col', 'row'])

print(df.melt().pivot(index='row', columns='col', values='value'))

Results:

   a_1  ab_1  ac_1  a_2  ab_2  ac_2
0    2     3     4    5     6     7

col  a  ab  ac
row           
1    2   3   4
2    5   6   7

pandas < 0.20.0

If using pandas prior to 0.20.0, melt() like:

print(pd.melt(df).pivot(index='row', columns='col', values='value'))
Stephen Rauch
  • 47,830
  • 31
  • 106
  • 135
2

You can use split for MultiIndex and then reshape by stack and last use reset_index for remove MultiIndex:

df.columns = df.columns.str.split('_', expand=True)
df = df.stack().reset_index(drop=True)
print (df)
   a  ab  ac
0  2   3   4
1  5   6   7

df = df.stack().reset_index(level=0, drop=True)
print (df)
   a  ab  ac
1  2   3   4
2  5   6   7
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • I have a similar issue but for some reason, it doesn't work. Some issue with column names. Can you please help? https://stackoverflow.com/questions/56807104/wide-to-long-returns-empty-output-python-dataframe/56808192?noredirect=1#comment100170775_56808192 – The Great Jun 28 '19 at 14:04
1

You can split the dataframe into two, then rename the columns, and finally concatenate them:

cols = ['a', 'ab', 'ac']
df1 = df[["a_1", "ab_1", "ac_1"]]
df2 = df[["a_2", "ab_2", "ac_2"]]
df1.columns = cols
df2.columns = cols
df3 = pd.concat([df1, df2], ignore_index=True)
Diego Mora Cespedes
  • 3,605
  • 5
  • 26
  • 33
1

If you want to use pnd.melt you should probably use the value_vars and value_name parameters:

df_a = pnd.melt(df, value_vars=['a_1', 'a_2'], value_name='a')[['a']]
df_ab = pnd.melt(df, value_vars=['ab_1', 'ab_2'], value_name='ab')[['ab']]
df_ac = pnd.melt(df, value_vars=['ac_1', 'ac_2'], value_name='ac')[['ac']]
df_final = df_a.join(df_b).join(df_c)

Alternatively, taking a more functional approach:

col_prefixes = ['a', 'ab', 'ac']
df_cuts = map(lambda x: pnd.melt(df, value_vars=['%s_1' % x, '%s_2' % x], value_name=x)[[x]], col_prefixes)
df_final = reduce(lambda x, y: x.join(y), df_cuts)
peterfields
  • 316
  • 1
  • 5
1

There is builtin function wide_to_long for more see in documentation:

In [115]: df
Out[115]: 
   a_1  ab_1  ac_1  a_2  ab_2  ac_2
0    2     3     4    5     6     7

In [116]: df['id'] = df.index

In [117]: df
Out[117]: 
   a_1  ab_1  ac_1  a_2  ab_2  ac_2  id
0    2     3     4    5     6     7   0
In [118]: pd.wide_to_long(df, ['a','ab','ac'],i='id',j='num',sep='_')
Out[118]: 
        a  ab  ac
id num           
0  1    2   3   4
   2    5   6   7
shivsn
  • 7,680
  • 1
  • 26
  • 33
  • Hi, Can you please helo me with this post? I get an empty output when I changed the column names. Not sure whther the issue is with column names or anything else. Help please https://stackoverflow.com/questions/56807104/wide-to-long-returns-empty-output-python-dataframe/56808192?noredirect=1#comment100170775_56808192. – The Great Jun 28 '19 at 14:10