I have a large dataframe representing scores of products belonging to various product groups.
I need to:
1. Group all rows by beer_style.
2. For each beer_style, calculate the mean of each score column within that style.
3. For each beer_style, subtract that style's mean from the score of every product (beer) in the group. The result (Value - Mean) should replace the original Value; no additional column is needed.
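To make the goal concrete, here is a small sketch of the result I expect. The data is made up for illustration (only two of the five score columns, four rows), and it uses `groupby(...).transform("mean")`, which is just one possible way to express the three steps and not necessarily the best one:

```python
import pandas as pd

# Made-up data for illustration only; the real frame has many more rows and columns.
toy = pd.DataFrame({
    "beer_style": ["IPA", "IPA", "Stout", "Stout"],
    "review_overall": [4.0, 5.0, 2.0, 3.0],
    "review_taste": [3.0, 5.0, 1.0, 2.0],
})
score_cols = ["review_overall", "review_taste"]

# Steps 1 and 2: mean of each score column within each beer_style,
# broadcast back so it lines up with the original rows.
style_means = toy.groupby("beer_style")[score_cols].transform("mean")

# Step 3: replace each value with (value - mean of its style).
expected = toy.copy()
expected[score_cols] = toy[score_cols] - style_means
print(expected)
```

In this toy frame the IPA mean of review_overall is 4.5, so the two IPA rows should become -0.5 and 0.5.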
Here is what I tried:
def normalize(group):  # Normalize each group - elements with the same beer_style
    group.review_overall -= group.review_overall.mean()
    group.review_aroma -= group.review_aroma.mean()
    group.review_appearance -= group.review_appearance.mean()
    group.review_palate -= group.review_palate.mean()
    group.review_taste -= group.review_taste.mean()
    return group

df = df.groupby('beer_style').apply(normalize)
df.describe()
I got a table, but the numbers look suspicious: the mean values of all five score columns listed above are very close to zero.
I am not sure that my code actually does what I described.
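To sanity-check my expectation, I put together a small check on made-up data (not my real frame). If I understand the arithmetic correctly, subtracting each style's mean should leave every style with a mean of zero, so the overall mean would also be zero up to floating-point error. I am not certain this is what explains the describe() output above:

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only; groups are deliberately of unequal size.
toy = pd.DataFrame({
    "beer_style": ["IPA", "IPA", "IPA", "Stout", "Stout"],
    "review_overall": [4.0, 5.0, 3.0, 2.0, 3.0],
})

# Subtract each style's mean from the scores of that style.
centered = toy.groupby("beer_style")["review_overall"].transform(
    lambda s: s - s.mean()
)

# Mean of the centered scores within each style, and over the whole column.
per_style_means = centered.groupby(toy["beer_style"]).mean()
overall_mean = centered.mean()

print(per_style_means)
print(overall_mean)
```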
Please help.