I am looking for help with Groupby + Transform. I realize combined they are a very powerful tool, but am having trouble finding documentation/specific situations on what I want to accomplish. I would like to calculate the time delta based upon groups defined by multiple arguments. Specifically, I would like a column that displays the Assigned to Approved time difference by Casenumber AND Site AND only if there are both Assigned and Approval present. Please see below for my data set and code so far:
Dataset (start)
Casenumber Site CreatedDate NewValue
1 A 2018-03-30 16:47:03 Approved
2 A 2018-04-02 19:24:43 Assigned
2 A 2018-04-02 25:24:43 Delayed
2 A 2018-04-02 20:49:45 Approved
2 B 2018-04-02 19:24:43 Assigned
2 B 2018-04-02 22:49:45 Approved
2 B 2018-04-02 25:24:43 Delayed
Dataset (finish)
Casenumber Site CreatedDate NewValue Diff
1 A 2018-03-30 16:47:03 Approved N/A
2 A 2018-04-02 19:24:43 Assigned 0 days 0 hours
2 A 2018-04-02 20:49:45 Approved 0 days 1:25:02
2 A 2018-04-02 25:24:43 Delayed N/A
2 B 2018-04-02 19:24:43 Assigned 0 days 0 hours
2 B 2018-04-02 22:49:45 Approved 0 days 3:25:02
2 B 2018-04-02 25:24:43 Delayed N/A
What I've got so far
(df['Diff'] = df['CreatedDate'] -
(df.groupby(['Casenumber', 'Site'])['CreatedDate'].transform('first')))
Thank you for any help!