2

I have a data frame and a list of weight as follows

import pandas as pd
import numpy as np

data = [
  ['A',1,2,3,4],
  ['A',5,6,7,8],
  ['A',9,10,11,12],
  ['B',13,14,15,16],
  ['B',17,18,19,20],
  ['B',21,22,23,24],
  ['B',25,26,27,28],
  ['C',29,30,31,32],
  ['C',33,34,35,36],
  ['C',37,38,39,40],

]

df = pd.DataFrame(data, columns=['Name', 'num1', 'num2', 'num3', 'num4'])
df

Now I want to calculate weighted average with the help of list

weights = [10,20,30,40] = 100


for cols in df.columns:
    df[cols]=df.groupby(['dpid'])[cols].transform(lambda x:)

But I am not sure how to calculate the weighted average from each Name.

For A it has only three rows so, it should be calculated for num1 as (1*30+5*30+9*40)/100 if I am not wrong. and same for columns num2,num3, and num4.

For B it has 4 rows, so, it should be calculated as (13*10+17*20+21*30+25*40)/100.

Can somebody help me with this?

Bad Coder
  • 177
  • 11
  • Cna you have more than 4 rows of "B"? – Colim Jul 27 '22 at 18:52
  • assuming you can have a max of 4 rows with the same name, you can substract the last element from the list to a 100 and iterate. for example for A : first one will be 40, with 60 left to distribute. second row would be 30 with 30 left, last row takes all the rest so 30. I do wanna add if it is an assignement it is more likely that your teacher meant col1 has a weight of 10, col2 of 20... and you average by row. – Farid Fakhry Jul 27 '22 at 19:00

1 Answers1

2

IIUC, you can use a groupby.apply:

df.groupby('Name').agg(lambda g: sum(g*weights[:len(g)])/sum(weights[:len(g)]))

output:

           num1       num2       num3       num4
Name                                            
A      6.333333   7.333333   8.333333   9.333333
B     21.000000  22.000000  23.000000  24.000000
C     34.333333  35.333333  36.333333  37.333333
mozway
  • 194,879
  • 13
  • 39
  • 75