How to reshape a pandas dataframe with boolean values

Question

I have a list of biological samples that can show an event at Day 1, Day 2 or Day 3. My dataframe looks like this:

   Cell#  Event
0      0  Day 3
1      1  Day 2
2      2  Day 2
3      3  Day 1
4      4  Day 3
5      5  Day 3
6      6  Day 2
7      7      0
8      8  Day 1
9      9  Day 2

This means that Cell#0 showed the event on Day 3 and Cell#7 did not show the event at all.

I would like to reshape it into a dataframe like this:

   Cell#  Day 1  Day 2  Day 3
0      0      0      0      1
1      1      0      1      1
2      2      0      1      1
3      3      1      1      1
4      4      0      0      1
5      5      0      0      1
6      6      0      1      1
7      7      0      0      0
8      8      1      1      1
9      9      0      1      1

That is, the value is 0 as long as the event has not happened; once it has happened, the value becomes 1 and stays 1 until the end.

I have been struggling with unstack (see "how to unstack (or pivot?) in pandas") and pivot, but I cannot find the solution.

Please let me know if you have any idea how to solve this.

Thank you!

If any answer helps, close the question by accepting it as an answer. — Pygirl, Mar 11 '21 at 08:09

score 1 · Answer 1 · answered Mar 08 '21 at 15:50

Try with get_dummies:

import numpy as np
out = df.join(df.pop('Event').replace('0', np.nan).str.get_dummies())
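Note that the one-liner gives a one-hot table: a 1 appears only in the column of the day the event occurred. The expected output in the question keeps the 1 on every later day as well. A minimal sketch of one way to get that, assuming the "no event" marker is the string '0' and that the day columns sort in chronological order, is to add a row-wise cumulative maximum:

```python
import numpy as np
import pandas as pd

# Sample data from the question; '0' (a string) is assumed to mean "no event".
df = pd.DataFrame({
    "Cell#": range(10),
    "Event": ["Day 3", "Day 2", "Day 2", "Day 1", "Day 3",
              "Day 3", "Day 2", "0", "Day 1", "Day 2"],
})

# One-hot encode the event day; rows that are NaN become all zeros.
dummies = df["Event"].replace("0", np.nan).str.get_dummies()

# Carry each 1 forward across the columns (Day 1 -> Day 2 -> Day 3).
out = df[["Cell#"]].join(dummies.cummax(axis=1))
print(out)
```

The `cummax(axis=1)` step relies on the columns being in day order, which holds here because `str.get_dummies` returns them sorted and the labels are 'Day 1' to 'Day 3'.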

BENY

Thank you! Indeed, it's working. It was harder than I thought! – chalbiophysics Mar 08 '21 at 15:52
I don't understand the role of the `str` accessor here. Wasn't it possible to get the dummies directly on the series returned by the `replace` method, without using `str`? – ashkangh Mar 08 '21 at 16:52
@ashkangh most of the time the 0 here is the str '0', so the dummies would include one more column, which is 0 – BENY Mar 08 '21 at 17:28
Sorry for asking again, but you already replaced those str `0` values with `nan`. – ashkangh Mar 08 '21 at 17:36
@ashkangh `str` here is the accessor that provides the function, not the 'str' type – BENY Mar 08 '21 at 18:02
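
To illustrate the exchange above: in the chained call, `get_dummies` is a method of the Series string accessor `.str`, and a Series has no `get_dummies` method of its own. The top-level function `pd.get_dummies` takes the Series as an argument instead. A small sketch with made-up values:

```python
import numpy as np
import pandas as pd

s = pd.Series(["Day 3", np.nan, "Day 1"])  # illustrative values only

# Method on the string accessor, convenient inside a method chain.
via_accessor = s.str.get_dummies()

# Top-level function; NaN rows are all zeros by default (dummy_na=False).
via_function = pd.get_dummies(s, dtype=int)

print(hasattr(s, "get_dummies"))  # the Series itself has no such method
print(via_accessor)
print(via_function)
```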

Pygirl · Accepted Answer · 2021-03-08T16:35:42.210

1

Try:

pd.concat([df.drop(['Event'], axis=1),pd.get_dummies(df['Event'])], axis=1).drop('0',axis=1)

Or

df.assign(**pd.get_dummies(df['Event'])).drop(['Event', '0'],axis=1)
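
One thing to watch when running either line on a recent pandas: since pandas 2.0, `pd.get_dummies` returns boolean columns by default, so `dtype=int` is needed to get 0/1 values. The sketch below rebuilds the question's data and assumes the "no event" marker is the string '0'; if the column held the integer 0, the label to drop would be `0` instead.

```python
import pandas as pd

# Sample data from the question; '0' (a string) is assumed to mean "no event".
df = pd.DataFrame({
    "Cell#": range(10),
    "Event": ["Day 3", "Day 2", "Day 2", "Day 1", "Day 3",
              "Day 3", "Day 2", "0", "Day 1", "Day 2"],
})

# One-hot encode as integers, then discard the "no event" column.
dummies = pd.get_dummies(df["Event"], dtype=int).drop(columns="0")

# Put the indicator columns next to Cell#.
out = pd.concat([df.drop(columns="Event"), dummies], axis=1)
print(out)
```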

Using unstack:

df.assign(k=1).set_index(['Cell#','Event'])['k'].unstack().reset_index().drop('0',axis=1).fillna(0)
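
For reference, the unstack route can be sketched end to end on the question's sample data. This version calls `reset_index()` at the end so that `Cell#` is restored as a column, and it adds a `cummax(axis=1)` step, which is not part of the answer, to reproduce the "stays 1 once the event has happened" layout. The string '0' as the "no event" marker is an assumption.

```python
import pandas as pd

# Sample data from the question; '0' (a string) is assumed to mean "no event".
df = pd.DataFrame({
    "Cell#": range(10),
    "Event": ["Day 3", "Day 2", "Day 2", "Day 1", "Day 3",
              "Day 3", "Day 2", "0", "Day 1", "Day 2"],
})

wide = (
    df.assign(k=1)
      .set_index(["Cell#", "Event"])["k"]
      .unstack()           # one column per distinct Event value
      .drop(columns="0")   # discard the "no event" column
      .fillna(0)
      .astype(int)
      .cummax(axis=1)      # optional: keep 1 on every day after the event
      .reset_index()       # bring Cell# back as a regular column
)
wide.columns.name = None   # remove the leftover 'Event' label on the columns
print(wide)
```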

edited Mar 08 '21 at 16:35

answered Mar 08 '21 at 15:56

Pygirl

Hello! The get_dummies function was the key to solving this problem. Thank you. – chalbiophysics Mar 08 '21 at 16:01
@chalbiophysics: You can also use unstack. I have added the code for that. – Pygirl Mar 08 '21 at 16:39