I have the following pd.DataFrame
In [155]: df1
Out[155]:
ORDER_ID ACQ DATE UID
2 3 False 2014-01-03 1
3 4 True 2014-01-04 2
4 5 False 2014-01-05 3
6 7 True 2014-01-08 5
7 8 False 2014-01-08 5
9 10 False 2014-01-10 6
0 11 False 2014-01-11 6
where each entry is an order, with values for ORDER_ID
, DATE
, UID
and ACQ
(indicates whether this is the first order for the associated UID
in the dataset).
I am trying to filter and keep all orders that were placed by users that have made their first order inside the time period covered in the dataset (i.e. at least one of the orders of such users satisfy ACQ == True
).
So, the desired output would be:
ORDER_ID ACQ DATE UID
3 4 True 2014-01-04 2
6 7 True 2014-01-08 5
7 8 False 2014-01-08 5
and I have managed to reach this by:
In [156]: df1.groupby('UID').filter(lambda x: x.ACQ.any() == True)
Out[156]:
ORDER_ID ACQ DATE UID
3 4 True 2014-01-04 2
6 7 True 2014-01-08 5
7 8 False 2014-01-08 5
However, when I try to find all the orders that were placed by users that have made their first order outside the time period covered in the dataset (i.e. All their orders should satisfy ACQ == False
) I seem to be lost. I have tried this:
In [159]: df1.groupby('UID').filter(lambda x: x.ACQ.all() == False)
Out[159]:
ORDER_ID ACQ DATE UID
2 3 False 2014-01-03 1
4 5 False 2014-01-05 3
6 7 True 2014-01-08 5 ## <- This order is an acquisition, therefore all orders with UID == 5 should be filtered out.
7 8 False 2014-01-08 5
9 10 False 2014-01-10 6
0 11 False 2014-01-11 6
How should I go about filtering out all the orders placed by users that have ALL their orders satisfy ACQ == False
?
Any ideas are very much appreciated, thanks!