I have a DataFrame which have a column with statuses like this:
datetime | session | try | status
2020-09-17 10:00:01 | '1a' | '1a_1' | 'success'
2020-09-17 10:00:02 | '2a' | '2a_1' | 'fail'
2020-09-17 10:00:03 | '2a' | '2a_2' | 'success'
2020-09-17 10:00:03 | '3a' | '3a_1' | 'interrupted'
2020-09-18 13:00:04 | '4a' | '4a_1' | 'fail'
I want to resample data by day with counting status types by condition in sessions (not tries).
I can resample it by tries easyly like this:
df['date'] = df['datetime'].dt.date
df['ones'] = np.ones(df.shape[0])
piv = df.pivot_table(index='date', columns='status', values='ones', aggfunc=len).fillna(0)
And have:
day | success | fail | interrupted
2020-09-17 | 2 | 2 | 1
2020-09-18 | 0 | 1 | 0
But I want to aggregate it by session with condition no matter how much tries in session.:
- if 'success' status in session tries, then success +1, fail +0, interrupted +0;
- if 'interrupted' status in session AND NO 'success' status in session, then success +0, fail +0, interrupted +1;
- if NO 'interrupted' AND 'success' statuses in session, then then success +0, fail +1, interrupted +0.
So I shoul get something like this:
day | success | fail | interrupted
2020-09-17 | 2 | 0 | 1
2020-09-18 | 0 | 1 | 0
I stuck with function and all I come up with ends with "ValueError: The truth value of a Series is ambiguous". I will be very greatfull for any ideas.