2

I have a df that looks like this

data =  [{'Stock': 'Apple', 'Weight': 0.2, 'Price': 101.99, 'Beta': 1.1},
         {'Stock': 'MCSFT', 'Weight': 0.1, 'Price': 143.12, 'Beta': 0.9},
         {'Stock': 'WARNER','Weight': 0.15,'Price': 76.12,  'Beta': -1.1},
         {'Stock': 'ASOS',  'Weight': 0.35,'Price': 76.12,  'Beta': -1.1 },
         {'Stock': 'TESCO', 'Weight': 0.2, 'Price': 76.12,  'Beta': -1.1 }]
data_df = pd.DataFrame(data)

and a custom function that will calculate weighted averages

def calc_weighted_averages(data_in, weighted_by):
    return sum(x * y for x, y in zip(data_in, weighted_by)) / sum(weighted_by)

I want to apply this custom formula to a all the columns in my df, my first idea was to do s.th. like this

data_df = data_df[['Weight','Price','Beta']]
data_df = data_df.apply(lambda x: calc_weighted_averages(x['Price'], x['Weight']), axis=1)

How can I keep my weighted_by column fixed and apply the custom function to the other columns? I should end up with a weighted average number for Price and Beta.

ThatQuantDude
  • 759
  • 1
  • 9
  • 26

1 Answers1

1

I think you need subset of all columns first and then use second argument Weight column:

s1 = data_df[['Price','Beta']].apply(lambda x: calc_weighted_averages(x, data_df['Weight']))
print (s1)
Price    87.994
Beta     -0.460
dtype: float64

Another solution without apply is faster:

s1 = data_df[['Price','Beta']].mul(data_df['Weight'], 0).sum().div(data_df['Weight'].sum())
print (s1)
Price    87.994
Beta     -0.460
dtype: float64
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252