1

I want to use dataframe.melt function in pandas lib to convert data format from rows into column but keeping first column value. I ve just tried also .pivot, but it is not working good. Please look at the example below and please help:

       ID      Alphabet Unspecified: 1  Unspecified: 2
0      1          A          G              L
1      2          B         NaN            NaN
2      3          C          H             NaN
3      4          D          I              M
4      5          E          J             NaN
5      6          F          K              O

Into this:

        ID     Alphabet
0      1          A   
1      1          G        
2      1          L 
3      2          B
4      3          C    
5      3          H  
6      4          D   
7      4          I        
8      4          M 
9      5          E
10     5          J
11     6          F
12     6          K     
11     6          O   

  
matheo_eo
  • 13
  • 2

3 Answers3

0

Try (assuming ID is unique and sorted):

df = (
    pd.melt(df, "ID")
    .sort_values("ID", kind="stable")
    .drop(columns="variable")
    .dropna()
    .reset_index(drop=True)
    .rename(columns={"value": "Alphabet"})
)
print(df)

Prints:

    ID Alphabet
0    1        A
1    1        G
2    1        L
3    2        B
4    3        C
5    3        H
6    4        D
7    4        I
8    4        M
9    5        E
10   5        J
11   6        F
12   6        K
13   6        O
Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91
0

Don't melt but rather stack, this will directly drop the NaNs and keep the order per row:

out = (df
 .set_index('ID')
 .stack().droplevel(1)
 .reset_index(name='Alphabet')
 )

Output:

    ID Alphabet
0    1        A
1    1        G
2    1        L
3    2        B
4    3        C
5    3        H
6    4        D
7    4        I
8    4        M
9    5        E
10   5        J
11   6        F
12   6        K
13   6        O
mozway
  • 194,879
  • 13
  • 39
  • 75
0

One option is with pivot_longer from pyjanitor:

# pip install pyjanitor
import pandas as pd
import janitor

(df
.pivot_longer(
    index = 'ID', 
    names_to = 'Alphabet', 
    names_pattern = ['.+'], 
    sort_by_appearance = True)
.dropna()
)
    ID Alphabet
0    1        A
1    1        G
2    1        L
3    2        B
6    3        C
7    3        H
9    4        D
10   4        I
11   4        M
12   5        E
13   5        J
15   6        F
16   6        K
17   6        O

In the code above, the names_pattern accepts a list of regular expression to match the desired columns, all the matches are collated into one column names Alphabet in names_to.

sammywemmy
  • 27,093
  • 4
  • 17
  • 31