I have a dataframe that looks like this:
Datetime | Category | ID
--------------------------
2020-01-30 | A | 1
2020-02-01 | B | 1
2020-02-02 | A | 1
2020-02-20 | A | 1
2020-01-28 | B | 2
2020-01-29 | C | 2
2020-01-30 | C | 2
2020-01-31 | D | 2
2020-02-01 | D | 2
2020-02-02 | D | 2
2020-02-03 | C | 2
I would like to obtain the top 2 most frequent categories for each ID within a 1 week window of the row (excluding the current row). Is that possible in pandas? I tried to do .rolling and .value_counts but it doesnt seem to work. Thanks!
Below is the dataframe which I would like to get:
Datetime | Category | ID
--------------------------
2020-01-30 | NaN | 1
2020-02-01 | [A, ""] | 1
2020-02-02 | [A, B] | 1
2020-02-20 | NaN | 1
2020-01-28 | Nan | 2
2020-01-29 | [B,""] | 2
2020-01-30 | [B,C] | 2
2020-01-31 | [B,C] | 2
2020-02-01 | [C,D] | 2
2020-02-02 | [C,D] | 2
2020-02-03 | [C,D] | 2
Thank you!
Edit The pd.get_dummies answer is great but as my dataset is huge, it is not efficient. Would really appreciate if anyone has an efficient solution for this! Thank you!