Weighted average in pandas with weights based on the value of a column?

Question

I have the following dataframe:

 id        type    side       score       
 601166    p       right      2  
 601166    p       left       6        
 601166    p       right      2  
 601166    p       left       4      
 601166    r       left       2  
 601166    r       left       2  
 601166    r       right      6  
 601166                       2  
 601009    r       left       6  
 601009    r       right      8  
 601939    p       left       2  
 601939    p       left       2

I have calculated the average score for each id, type and side with:

df_result = df.groupby(["id", "type", "side"])["score"].mean()

 id        type    side       mean       
 601166    p       right      2  
 601166    p       left       5        
 601166    r       right      6  
 601166    r       left       2   
 601166                       2

Now I would like to calculate a weighted average score for each id and type, with weights that depend on the per-side averages: the side (left or right) with the lower average score counts for 75%, and the side with the higher average score counts for 25%.

Example for id 601166 and type p: first calculate the average for each side (right: 2, left: 5). The side with the lower average (right) counts for 75% and the other side (left) for 25%, which gives 0.75 * 2 + 0.25 * 5 = 2.75. Rows with empty values can be skipped.

 id        type         mean       
 601166    p            2.75
 601166    r            3

Any idea how I can add this to my code?

Does your weight need to be grouped by type? To be clear, you just want to say the higher number (between left and right) gets a weight of 25 and the other 75? Should this be another column, or do you actually want it to be concatenated to the mean? — David Maddox, Oct 25 '21 at 20:32
The weight can be added as an extra column to make it easier to check the logic but in the end I just need one value for the mean which is based on these weights — olive, Oct 26 '21 at 08:49

score 3 · Answer 1 · answered Oct 26 '21 at 20:04

Would something like this suffice?

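# Mean score per (id, type, side)
df_result = df.groupby(["id", "type", "side"])["score"].mean()
# Regroup the per-side means by (id, type)
g = df_result.groupby(["id", "type"])
# Lowest side mean counts for 75%, highest for 25%
g.min() * 0.75 + g.max() * 0.25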

id      type
601009  r       6.50
601166  p       2.75
        r       3.00
601939  p       2.00
Name: score, dtype: float64
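
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and assumes the row with no type and side is stored as NaN (the question does not say how the blanks are encoded). The names `side_means`, `weighted` and `check` are only illustrative. The `check` table puts the lowest and highest side mean next to the weighted result, which may help with verifying the logic.

```python
import numpy as np
import pandas as pd

# Sample data from the question. The blank type/side cells are assumed to be NaN.
df = pd.DataFrame({
    "id": [601166] * 8 + [601009] * 2 + [601939] * 2,
    "type": ["p", "p", "p", "p", "r", "r", "r", np.nan, "r", "r", "p", "p"],
    "side": ["right", "left", "right", "left", "left", "left", "right",
             np.nan, "left", "right", "left", "left"],
    "score": [2, 6, 2, 4, 2, 2, 6, 2, 6, 8, 2, 2],
})

# Step 1: mean score per (id, type, side).
# groupby drops rows with NaN keys by default, so the blank row is skipped.
side_means = df.groupby(["id", "type", "side"])["score"].mean()

# Step 2: per (id, type), the lowest side mean counts for 75%,
# the highest side mean for 25%.
g = side_means.groupby(["id", "type"])
weighted = g.min() * 0.75 + g.max() * 0.25

# Optional check table: lowest and highest side mean next to the result.
check = g.agg(["min", "max"])
check["weighted"] = 0.75 * check["min"] + 0.25 * check["max"]

print(weighted)
print(check)
```

Note that when only one side is present for an id and type (601939, p in the sample), the minimum and maximum are the same value, so the result is simply that side's mean.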

answered Oct 26 '21 at 20:04 by hyit

Perfect, thank you for this elegant solution! – olive Nov 03 '21 at 13:40
