I have a list of biological samples that can show an event on Day 1, Day 2 or Day 3. My dataframe looks like this:

   Cell#  Event
0      0  Day 3
1      1  Day 2
2      2  Day 2
3      3  Day 1
4      4  Day 3
5      5  Day 3
6      6  Day 2
7      7      0
8      8  Day 1
9      9  Day 2

This means that Cell#0 showed the event on Day 3 and Cell#7 did not show the event at all.

I would like to reshape it to get a dataframe like this:

   Cell#  Day 1  Day 2  Day 3
0      0      0      0      1
1      1      0      1      1
2      2      0      1      1
3      3      1      1      1
4      4      0      0      1
5      5      0      0      1
6      6      0      1      1
7      7      0      0      0
8      8      1      1      1
9      9      0      1      1

That is, the value is 0 as long as the event has not happened; once it has happened, the value switches to 1 and stays 1 until the end.

I have been struggling with unstack (how to unstack (or pivot?) in pandas) and pivot, but I cannot find a solution.

Could you please let me know if you have any idea how to solve this?

Kind regards,

chalbiophysics

2 Answers

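Try with get_dummies (this needs numpy imported as np and assumes the "no event" value is stored as the string '0'):

import numpy as np
out = df.join(df.pop('Event').replace('0',np.nan).str.get_dummies())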
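
Note that get_dummies on its own marks only the day on which the event occurred, whereas the table requested in the question keeps the 1 for all later days. A minimal sketch of one way to get that cumulative shape, assuming the "no event" value is the string '0' and the day columns are exactly Day 1 to Day 3 (the `days` list and the sample frame are rebuilt here from the question for illustration), is to take a running maximum across the columns:

```python
import pandas as pd

# Sample data rebuilt from the question; '0' is assumed to be a string.
df = pd.DataFrame({
    "Cell#": range(10),
    "Event": ["Day 3", "Day 2", "Day 2", "Day 1", "Day 3",
              "Day 3", "Day 2", "0", "Day 1", "Day 2"],
})

days = ["Day 1", "Day 2", "Day 3"]  # assumed column order

# One-hot encode, keep only the day columns (this drops the '0' column)
dummies = (
    pd.get_dummies(df["Event"])
    .reindex(columns=days, fill_value=0)
    .astype(int)
)

# Running maximum along each row: once a 1 appears it carries to the right
out = df[["Cell#"]].join(dummies.cummax(axis=1))
print(out)
```

Listing the columns explicitly in `days` keeps the left-to-right order fixed, which matters because the running maximum depends on it.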
BENY

Try:

pd.concat([df.drop(['Event'], axis=1),pd.get_dummies(df['Event'])], axis=1).drop('0',axis=1)

Or

df.assign(**pd.get_dummies(df['Event'])).drop(['Event', '0'],axis=1)

Using unstack (note that reset_index(drop=True) discards the Cell# column, leaving only the default index):

df.assign(k=1).set_index(['Cell#','Event'])['k'].unstack().reset_index(drop=True).drop('0',axis=1).fillna(0)
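
The variants above produce a 1 only in the column of the event day. If the cumulative table from the question is what is needed, one possible alternative, sketched here under the assumption that every event label has the form "Day N" and that anything else (such as 0 or '0') means "no event", is to extract the day number and compare it against each day column. The names `event_day`, `flags` and `days` are illustrative only:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Cell#": range(10),
    "Event": ["Day 3", "Day 2", "Day 2", "Day 1", "Day 3",
              "Day 3", "Day 2", "0", "Day 1", "Day 2"],
})

days = [1, 2, 3]  # assumed set of observation days

# Day number of the event; rows that do not match "Day N" become NaN
event_day = pd.to_numeric(
    df["Event"].astype(str).str.extract(r"Day (\d+)")[0]
)

# A cell is flagged on day d when its event day is <= d.
# Comparisons with NaN are False, so cells without an event stay 0.
flags = event_day.to_numpy()[:, None] <= np.array(days)

out = pd.concat(
    [
        df[["Cell#"]],
        pd.DataFrame(flags.astype(int),
                     columns=[f"Day {d}" for d in days],
                     index=df.index),
    ],
    axis=1,
)
print(out)
```

Because the comparison is done on numbers rather than on column position, this version does not depend on the order in which the dummy columns happen to be created.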
Pygirl