0

I have the following dataframe which has a unique URL in the first column, followed by a random number of unique keywords. I would like to transpose the keywords into a single row and forward fill the url as per my desired output below.

0                              1          2          3          4          5      
www.example.com/page1         car        dog        cat        bat        apple
www.example.com/page2         dog        car 
www.example.com/page3         cat        bat        apple

Desired Output:

0                              1          2          3          4          5   
www.example.com/page1         car       
www.example.com/page1         dog
www.example.com/page1         cat
www.example.com/page1         bat
www.example.com/page1         apple
www.example.com/page2         dog
www.example.com/page2         car
www.example.com/page3         cat
www.example.com/page3         bat
www.example.com/page3         apple

what I've tried: I've looked at these solutiosn, but neither seem to work for me. I've also consulted the docs on pd.melt but could not get it to work. (Still learning. Any help appreciated). Pandas create new date rows and forward fill column values Converting rows to columns using UNPIVOT

Lee Roy
  • 297
  • 1
  • 11

3 Answers3

2

Doing it with melt is pretty straight forward.

df = df.melt(id_vars='0', value_vars=df.columns[1:], value_name='1').drop('variable', axis=1).sort_values('0')

This only works properly if your empty cells are shown as NaN in your DataFrame. From the example you posted, it looks like they are actually not NaN values but rather empty strings ''. In that case you first need to replace them with: df.replace('',np.NaN, inplace=True). Otherwise you will get rows with empty entries in column 1.

Im also not sure why you want to keep the empty columns 1 to 5 because if you need to fill them later, you can just create them, it's the same procedure as if you tried to fill / override an existing empty column. But for the sake of it, you can add the empty columns back in:

df = pd.concat([df, pd.DataFrame(columns=list('2345'))]).fillna('')
sunnytown
  • 1,844
  • 1
  • 6
  • 13
  • Thanks, I didn't need to keep the columns - just thought it might emphasis what I needed the final output to be, but looks like it just added confusion! Your answer is perfect thanks! – Lee Roy Dec 21 '20 at 08:49
0

Not sure why you need the empty columns after but you can always re-add them

I got the output by stacking and merging with the original df

df.drop([0], axis=1).stack().droplevel(1).to_frame(name=1).merge(df[[0]], left_index=True, right_index=True, how='right')

Final add the other columns df[[2,3,4,5]] = ''

       1                      0 2 3 4 5
0    car  www.example.com/page1
0    dog  www.example.com/page1
0    cat  www.example.com/page1
0    bat  www.example.com/page1
0  apple  www.example.com/page1
1    dog  www.example.com/page2
1    car  www.example.com/page2
2    cat  www.example.com/page3
2    bat  www.example.com/page3
2  apple  www.example.com/page3
Kenan
  • 13,156
  • 8
  • 43
  • 50
  • Good point about the other columns, I actually don't need them. I thought it might make the example clearer! – Lee Roy Dec 21 '20 at 08:19
0

There are a few ways to do this. Setup first:

df_orig = pd.DataFrame.from_dict(
{'0': {0: 'www.example.com/page1',
  1: 'www.example.com/page2',
  2: 'www.example.com/page3'},
 '1': {0: 'car', 1: 'dog', 2: 'cat'},
 '2': {0: 'dog', 1: 'car', 2: 'bat'},
 '3': {0: 'cat', 1: None, 2: 'apple'},
 '4': {0: 'bat', 1: None, 2: None},
 '5': {0: 'apple', 1: None, 2: None}}
)

# first way, df.melt:
df = df_orig.melt(id_vars="0", value_vars=list("12345"), value_name="1")
df_melt = df.drop(columns="variable").dropna().sort_values("0", ignore_index=True)

# second way, df.unstack:
df_unstack = df_orig.set_index("0", drop=True).unstack().dropna().reset_index(0, drop=True).to_frame()

output:

                       0      1
0  www.example.com/page1    car
1  www.example.com/page1    dog
2  www.example.com/page1    cat
3  www.example.com/page1    bat
4  www.example.com/page1  apple
5  www.example.com/page2    dog
6  www.example.com/page2    car
7  www.example.com/page3    cat
8  www.example.com/page3    bat
9  www.example.com/page3  apple
anon01
  • 10,618
  • 8
  • 35
  • 58