I am having trouble calculating the volume-weighted average price (VWAP) using the .resample() method together with .apply(). Here is the setup. I have a dataframe called All with the following contents:

All.head()

                               price     volume buy_sell market_limit misc
dtime                                                                      
2020-08-06 11:26:45.705199957  395.23   0.064363      buy        limit     
2020-08-06 11:26:45.702500105  395.23   0.114847      buy        limit     
2020-08-06 11:26:45.700900078  395.23  30.000000      buy        limit     
2020-08-06 11:26:45.698899984  395.23  11.175000      buy        limit     
2020-08-06 11:26:45.696000099  395.23   2.415115      buy        limit   

All.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 38471 entries, 2020-08-06 11:26:45.705199957 to 2020-08-09 04:20:45.227400064
Data columns (total 5 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   price         38471 non-null  float64
 1   volume        38471 non-null  float64
 2   buy_sell      38471 non-null  object 
 3   market_limit  38471 non-null  object 
 4   misc          38471 non-null  object 
dtypes: float64(2), object(3)
memory usage: 1.8+ MB

The dataframe has a datetime index, so I can use the resample method to resample the data at a daily frequency. Now I want to calculate the volume-weighted average price using apply() and np.average():

All.resample('1D').apply(lambda x: np.average(x.price, weights = x.volume))

However, this results in the following error:

AttributeError: 'Series' object has no attribute 'price'

When I circumvent .resample() and use .groupby() instead, it does work. However, that requires additional steps to process the date, which I would really like to avoid.

All_alt = All.reset_index()
All_alt['dtime'] = All_alt['dtime'].apply(lambda x: x.date())
All_alt.head()

        dtime   price     volume buy_sell market_limit misc
0  2020-08-06  395.23   0.064363      buy        limit     
1  2020-08-06  395.23   0.114847      buy        limit     
2  2020-08-06  395.23  30.000000      buy        limit     
3  2020-08-06  395.23  11.175000      buy        limit     
4  2020-08-06  395.23   2.415115      buy        limit    


All_alt.groupby(['dtime']).apply(lambda x: np.average(x.price, weights = x.volume))

dtime
2020-08-06    396.889472
2020-08-07    381.178095
2020-08-08    387.182528
2020-08-09    397.162415
dtype: float64
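For what it's worth, the reset_index() and date conversion steps can probably be skipped by grouping on the DatetimeIndex directly with pd.Grouper. A minimal sketch, assuming a small made-up frame named `trades` in place of All (the values and the `vwap` helper are illustrative, not from the real data):

```python
import numpy as np
import pandas as pd

# Made-up stand-in for `All`: a DatetimeIndex plus price and volume columns.
idx = pd.to_datetime([
    "2020-08-06 11:26:45", "2020-08-06 18:00:00",
    "2020-08-07 09:30:00", "2020-08-07 15:45:00",
])
trades = pd.DataFrame(
    {"price": [100.0, 110.0, 200.0, 220.0], "volume": [1.0, 3.0, 2.0, 2.0]},
    index=idx,
)
trades.index.name = "dtime"

def vwap(group):
    # `group` is the sub-DataFrame for one day, so both columns are available.
    return np.average(group["price"], weights=group["volume"])

# Group on the index in daily bins; no reset_index() or date conversion needed.
daily_vwap = trades.groupby(pd.Grouper(freq="1D")).apply(vwap)
print(daily_vwap)
```

One caveat: if a day in the range has no trades at all, np.average is called with weights that sum to zero and raises, so this sketch assumes every daily bin is non-empty.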

Can anybody help me with the correct syntax using resample()? I don't understand why resample() creates a Series object when using apply().

Many thanks in advance.

friend1
  • Does this work as a workaround for errors? `df.resample('1D')['price','volume'].agg(lambda x: np.average(x.price, weights=x.volume, returned='True'))` – r-beginners Aug 09 '20 at 09:19
  • Thanks @r-beginners, I think you shed some light on my problem. I think that when using apply() we need a pandas.core.groupby.generic.DataFrameGroupBy, which is created by following `All.resample('1D')` with the required columns as a list, `['price','volume']`. Then I can use `apply(lambda x: np.average(x.price, weights=x.volume))`. Your suggestion of using agg() didn't work. To sum it up: `All.resample('1D')['price','volume'].apply(lambda x: np.average(x.price, weights = x.volume))` did the trick – friend1 Aug 09 '20 at 10:38
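
An alternative that stays with resample() and avoids apply() altogether is to resample the price-times-volume product and the volume separately, then divide. This is only a sketch under assumptions: `trades` is a made-up frame standing in for All, and `notional` and `total_volume` are names chosen here for illustration.

```python
import pandas as pd

# Made-up stand-in for `All`; note there are no trades on 2020-08-07.
idx = pd.to_datetime([
    "2020-08-06 11:26:45", "2020-08-06 18:00:00",
    "2020-08-08 09:30:00", "2020-08-08 15:45:00",
])
trades = pd.DataFrame(
    {"price": [100.0, 110.0, 200.0, 220.0], "volume": [1.0, 3.0, 2.0, 2.0]},
    index=idx,
)

# Daily sum of price * volume, and daily sum of volume.
notional = (trades["price"] * trades["volume"]).resample("1D").sum()
total_volume = trades["volume"].resample("1D").sum()

# VWAP per day; a day without trades gives 0 / 0, which pandas reports as NaN.
daily_vwap = notional / total_volume
print(daily_vwap)
```

Since both pieces are plain Series, the lambda never needs to see two columns at once, so the question of what object apply() passes to it does not come up.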