Applying MEAN() to a group from DF

Question

I have a large dataframe representing scores of products belonging to various product groups.

I need to:

1. Group all rows by beer_style.
2. For each beer_style, calculate the mean score of that style.
3. For each beer_style, subtract that style's mean from the score of every beer in the group. The result (Value - Mean) should replace the original Value, so no additional column is needed.

Here is what I tried:

    def normalize(group):  # Normalize each group - elements same beer_style
        group.review_overall -= group.review_overall.mean()
        group.review_aroma -= group.review_aroma.mean()
        group.review_appearance -= group.review_appearance.mean()
        group.review_palate -= group.review_palate.mean()
        group.review_taste -= group.review_taste.mean()
        return group

    df = df.groupby('beer_style').apply(normalize)
    df.describe()

I got the table, but the numbers look suspicious: the mean values of all five review columns listed above are very close to zero.

I am not sure that I properly coded my goal.

Please help.

Are you sure you didn't clobber the original value of `df`? You reassign the results of your normalization back to `df` in your penultimate line. Generally, you would want to use a different variable in case you want to do some other calculations on the original data. — Alexander, Sep 08 '15 at 04:55
@Alexander - great point! This is only an example. I always make a tempDF from the original file and work on it. As a side note, I would hope there is a pre-built normalization function in numpy or a similar library that could be used. I would be very grateful for a reference. — Toly, Sep 08 '15 at 05:18
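
On the pre-built function asked about in the comment above: one candidate is pandas' own `groupby(...).transform("mean")`, which returns each group's mean aligned to the original rows, so the subtraction needs no custom function. The following is only a minimal sketch under assumptions: the toy data and the names `score_cols`, `group_means`, and `centered` are made up for illustration, and only two of the five review columns are shown. It also writes the result to a new variable rather than overwriting `df`.

```python
import pandas as pd

# Toy data: column names follow the question, the values are invented.
df = pd.DataFrame({
    "beer_style": ["IPA", "IPA", "Stout", "Stout", "Stout"],
    "review_overall": [4.0, 3.0, 5.0, 4.0, 3.0],
    "review_taste": [4.5, 3.5, 4.0, 2.0, 3.0],
})

# Hypothetical list of the score columns to center.
score_cols = ["review_overall", "review_taste"]

# Per-style mean of each score column, repeated on every row of that style.
group_means = df.groupby("beer_style")[score_cols].transform("mean")

# Work on a copy so the original scores stay available.
centered = df.copy()
centered[score_cols] = df[score_cols] - group_means

print(centered)
print(centered[score_cols].mean())
```

Once each style has been centered on its own mean, the values within every style sum to zero, so the overall mean of each column is also zero up to floating-point rounding. Near-zero means in `describe()` are therefore what this kind of normalization should produce, not by themselves a sign of a coding error.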

0 Answers