Given a DataFrame, I would like the group number of the values in one column, id1, within each group of a second column, id2.
I tried ngroup() to number the unique groups formed by id1 and id2.
Here is an example df:
    id1  id2
0  1123  123
1  1123  123
2  1124  123
3  1124  123
4  1125  123
5  1125  123
6  1125  123
7  1126  122
8  1126  122
9  1127  122
Using ngroup():
df['row_id'] = df.groupby(['id1','id2']).ngroup() + 1
But it gave me this output:
row_id = [1, 1, 2, 2, 3, 3, 3, 4, 4, 5]
I would like the last 3 values to start again at 1, since they belong to a new id2 group (122); thus my desired output is:
row_id = [1, 1, 2, 2, 3, 3, 3, 1, 1, 2]
#                              ^ restart (id2 switches from 123 to 122)
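
For reference, here is a minimal reproducible sketch of the example above. It rebuilds df, shows the global numbering that ngroup() produces, and then tries one possible way to get the restarting numbering: factorizing id1 separately inside each id2 group. The name global_ids is only illustrative, and the factorize-based step is an assumption about what would work here, not a confirmed answer. It numbers id1 values in order of first appearance within each id2 group.

```python
import pandas as pd

# Example data from the question.
df = pd.DataFrame({
    "id1": [1123, 1123, 1124, 1124, 1125, 1125, 1125, 1126, 1126, 1127],
    "id2": [123, 123, 123, 123, 123, 123, 123, 122, 122, 122],
})

# ngroup() numbers every (id1, id2) pair across the whole frame,
# so the count never restarts when id2 changes.
global_ids = df.groupby(["id1", "id2"]).ngroup() + 1
print(global_ids.tolist())    # [1, 1, 2, 2, 3, 3, 3, 4, 4, 5]

# Possible approach (assumption): factorize id1 within each id2 group.
# pd.factorize returns 0-based codes in order of first appearance, hence + 1.
df["row_id"] = df.groupby("id2")["id1"].transform(
    lambda s: pd.factorize(s)[0] + 1
)
print(df["row_id"].tolist())  # [1, 1, 2, 2, 3, 3, 3, 1, 1, 2]
```

If the numbering should follow the sorted order of id1 rather than order of appearance, df.groupby("id2")["id1"].rank(method="dense").astype(int) gives the same result on this data.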