I have a data frame that looks like this:
>>> df = pd.DataFrame({'P1':['ARF5','NaN','NaN'],'P2':['NaN','M6PR','NaN'],'P3':['NaN','NaN','NDUFAF7']})
>>> df
     P1    P2       P3
0  ARF5   NaN      NaN
1   NaN  M6PR      NaN
2   NaN   NaN  NDUFAF7
I have been trying to collapse it down to something like this:
        C1
0     ARF5
1     M6PR
2  NDUFAF7
All of the columns overlap, but I do not know to what degree. I also do not know how many columns this df will have at any iteration, since it is part of a pipeline whose output I need to aggregate.
I think in principle I need the functionality of combine_first, but applied across columns.
I tried something like this:
df['condensed'] = reduce(lambda x,y:x.combine_first(y),[df[:]])
or
df['condensed'] = reduce(lambda x,y:x.combine_first(y),[df['P1'],df['P2'],df['P3']])
But I have some issues figuring this out. Thanks for the help!
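
For reference, here is a minimal sketch of the combine_first idea folded over an arbitrary number of columns. It rests on two assumptions that are not confirmed above: that the 'NaN' entries in the example are literal strings (as written in the constructor) and so have to be turned into real missing values first, and that `reduce` is imported from `functools` (required on Python 3). The names `clean`, `condensed`, and `result` are only illustrative.

```python
from functools import reduce

import pandas as pd

df = pd.DataFrame({
    'P1': ['ARF5', 'NaN', 'NaN'],
    'P2': ['NaN', 'M6PR', 'NaN'],
    'P3': ['NaN', 'NaN', 'NDUFAF7'],
})

# Assumption: 'NaN' is a literal string here, so mask it out to get
# real missing values that combine_first can recognise.
clean = df.mask(df == 'NaN')

# Fold combine_first over every column, however many there are.
# Passing a list of Series (one per column) gives reduce something to
# iterate over; a one-element list such as [df[:]] would just return df.
condensed = reduce(
    lambda left, right: left.combine_first(right),
    [clean[col] for col in clean.columns],
)

# Put the collapsed values into a single-column frame named C1.
result = condensed.to_frame('C1')
print(result)
```

Where several columns hold a value in the same row, this keeps the value from the leftmost column, since combine_first only fills positions that are still missing.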