How to merge dataframe to each groupby group?

Question

Given df1 and df2:

df1:

id	item	sub_item
1	52	AAA
2	52	AAA
3	55	BAA
4	77	CAA
5	77	CAA
6	77	CAA
7	77	CAB
8	77	CAC

df2:

id	color
1	#11
2	#12
3	#13
4	#14
5	#17
6	#18
7	#19
8	#20
9	#21
10	#25

produce df_result:

id	item	sub_item	color
1	52	AAA	#11
2	52	AAA	#11
3	55	BAA	#11
4	77	CAA	#11
5	77	CAA	#11
6	77	CAA	#11
7	77	CAB	#12
8	77	CAC	#13

id is unimportant. Each new sub-item, within an item group, gets new serial color number. Colors start re-assignment back to #11 for each new item. df2 will always be long enough to cover unique sub-items within an item group.

My intuition says to use groupby on df1, then merge each group with df2, but I need help with the code to make this happen.

Why do you want to use groupby? – xkudsraw Dec 23 '21 at 16:14 — xkudsraw, Dec 23 '21 at 16:14

score 1 · Answer 1 · answered Dec 23 '21 at 16:26

1

I'm not sure, but I think that's what you want

merged_df = pd.merge(
        df1,df2, how="left", on="Id"
    )

answered Dec 23 '21 at 16:26

timmy

65
5

Doesn't match the expected output – not_speshal Dec 23 '21 at 16:46

not_speshal · Answer 2 · 2021-12-23T17:07:24.033

1

Try:

df1["color"] = (df2.merge(df1["sub-item"].ne(df1.groupby("item")["sub-item"].transform("first"))
                                         .cumsum()
                                         .add(1)
                                         .rename("id"), 
                          how="right")["color"]
                )

>>> df1
   id  item sub-item color
0   1    52      AAA   #11
1   2    52      AAA   #11
2   3    55      BAA   #11
3   4    77      CAA   #11
4   5    77      CAA   #11
5   6    77      CAA   #11
6   7    77      CAB   #12
7   8    77      CAC   #13

edited Dec 23 '21 at 17:07

answered Dec 23 '21 at 16:46

not_speshal

22,093
2
15
30

Nice. That is definitely in the neighborhood of what I was looking for. I'll give it a go and let you know. Much appreciated. – user17749445 Dec 23 '21 at 16:57
@user17749445 - I edited my answer for optimization. Please use the updated version instead. – not_speshal Dec 23 '21 at 17:07

id	item	sub_item	color
1	52	AAA	#11
2	52	AAA	#11
3	55	BAA	#11
4	77	CAA	#11
5	77	CAA	#11
6	77	CAA	#11
7	77	CAB	#12
8	77	CAC	#13

id	item	sub_item	color
1	52	AAA	#11
2	52	AAA	#11
3	55	BAA	#11
4	77	CAA	#11
5	77	CAA	#11
6	77	CAA	#11
7	77	CAB	#12
8	77	CAC	#13

How to merge dataframe to each groupby group?

2 Answers2

id	item	sub_item	color
1	52	AAA	#11
2	52	AAA	#11
3	55	BAA	#11
4	77	CAA	#11
5	77	CAA	#11
6	77	CAA	#11
7	77	CAB	#12
8	77	CAC	#13