For the following dataframe:
import pandas as pd
df = pd.DataFrame({'group': ['a', 'a', 'b', 'b'], 'data': [5, 10, 100, 30]}, columns=['group', 'data'])
print(df)
  group  data
0     a     5
1     a    10
2     b   100
3     b    30
When we group by the group column, sum the data column, and assign the result to a new column, the result is:
df['new'] = df.groupby('group')['data'].sum()
print(df)
  group  data  new
0     a     5  NaN
1     a    10  NaN
2     b   100  NaN
3     b    30  NaN
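However, if we reset df to the original data and move the group column to the index,
df = pd.DataFrame({'group': ['a', 'a', 'b', 'b'], 'data': [5, 10, 100, 30]}, columns=['group', 'data'])
df.set_index('group', inplace=True)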
print(df)
       data
group
a         5
a        10
b       100
b        30
and then group and sum in the same way, we get:
df['new'] = df.groupby('group')['data'].sum()
print(df)
       data  new
group
a         5   15
a        10   15
b       100  130
b        30  130
Why does grouping by the column leave the new column filled with NaN, while grouping by the index fills it with the group sums?
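One way to investigate is to look at the grouped result on its own and compare its index labels with those of df in each case. The sketch below is only a diagnostic, not a confirmed explanation: it assumes that assigning a Series to a DataFrame column matches values by index label, and it uses reindex as a stand-in for that matching. The names sums, by_column, indexed and by_index are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'group': ['a', 'a', 'b', 'b'], 'data': [5, 10, 100, 30]},
                  columns=['group', 'data'])

# The grouped sums form a Series labelled by group value, not by row number.
sums = df.groupby('group')['data'].sum()
print(sums.index.tolist())

# Case 1: df keeps its default integer index, so none of its row labels
# (0, 1, 2, 3) can be found among the labels of sums ('a', 'b').
by_column = sums.reindex(df.index)
print(by_column.isna().all())

# Case 2: df is indexed by group, so every row label is present in sums
# and each row picks up the sum for its own group.
indexed = df.set_index('group')
by_index = sums.reindex(indexed.index)
print(by_index.tolist())
```

If the assumption holds, the first case yields only missing values and the second yields one sum per row, which would match the two outputs shown above.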