Given df1 and df2:
df1:
id | item | sub_item |
---|---|---|
1 | 52 | AAA |
2 | 52 | AAA |
3 | 55 | BAA |
4 | 77 | CAA |
5 | 77 | CAA |
6 | 77 | CAA |
7 | 77 | CAB |
8 | 77 | CAC |
df2:
id | color |
---|---|
1 | #11 |
2 | #12 |
3 | #13 |
4 | #14 |
5 | #17 |
6 | #18 |
7 | #19 |
8 | #20 |
9 | #21 |
10 | #25 |
produce df_result:
id | item | sub_item | color |
---|---|---|---|
1 | 52 | AAA | #11 |
2 | 52 | AAA | #11 |
3 | 55 | BAA | #11 |
4 | 77 | CAA | #11 |
5 | 77 | CAA | #11 |
6 | 77 | CAA | #11 |
7 | 77 | CAB | #12 |
8 | 77 | CAC | #13 |
id
is unimportant. Each new sub-item, within an item group, gets new serial color number. Colors start re-assignment back to #11 for each new item. df2 will always be long enough to cover unique sub-items within an item group.
My intuition says to use groupby
on df1
, then merge
each group with df2
, but I need help with the code to make this happen.