I have a dataframe with quarterly returns of financial entities and I want to calculate 1, 3, 5 10-year annualized returns. The formula for calculating annualized returns is:
R = product(1+r)^(4/N) -1
r are the quarterly return of an entity, N is the number of quarters
for example 3-year annualized return is:
R_3yr = product(1+r)^(4/12) -1 = ((1+r1)*(1+r2)*(1+r3)*...*(1+r12))^(1/3) -1
r1, r2, r3 ... r12 are the quarterly returns of the previous 11 quarters plus current quarter.
I created a code which provides the right results but it is very slow because it is looping through each row of the dataframe. The code below is an extract of my code for 1-year and 3-year annualized retruns (I applied the same concept for 5, 7, 10, 15 and 20-year returns). r_qrt is the field with the quarterly returns
import pandas as pd
import numpy as np
#create dataframe where I append the results
df_final = pd.DataFrame()
columns=['Date','Entity','r_qrt','R_1yr','R_3yr']
#loop thorugh the dataframe
for row in df.itertuples():
R_1yr=np.nan #1-year annualized return
R_3yr=np.nan #3-year annualized return
#Calculate 1 YR Annualized Return
date_previous_period=row.Date+ pd.DateOffset(years=-1)
temp_table=df.loc[(df['Date']>date_previous_period) &
(df['Date']<=row.Date) &
(df['Entity']==row.Entity)]
if temp_table['r_qrt'].count()>=4:
b=(1+(temp_table.r_qrt))[-4:].product()
R_1yr=(b-1)
#Calculate 3 YR Annualized Return
date_previous_period=row.Date+ pd.DateOffset(years=-3)
temp_table=df.loc[(df['Date']>date_previous_period) &
(df['Date']<=row.Date) &
(df['Entity']==row.Entity)]
if temp_table['r_qrt'].count()>=12:
b=(1+(temp_table.r_qrt))[-12:].product()
R_3yr=((b**(1/3))-1)
d=[row.Date,row.Entity,row.r_qrt,R_1yr,R_3yr]
df_final = df_final.append(pd.Series(d, index=columns), ignore_index=True)
df_final looks as below (only reporting 1-year return results for space limitations)
Date | Entity | r_qrt | R_1yr |
---|---|---|---|
2015-03-31 | A | 0.035719 | NaN |
2015-06-30 | A | 0.031417 | NaN |
2015-09-30 | A | 0.030872 | NaN |
2015-12-31 | A | 0.029147 | 0.133335 |
2016-03-31 | A | 0.022100 | 0.118432 |
2016-06-30 | A | 0.020329 | 0.106408 |
2016-09-30 | A | 0.017676 | 0.092245 |
2016-12-31 | A | 0.017304 | 0.079676 |
2015-03-31 | B | 0.034705 | NaN |
2015-06-30 | B | 0.037772 | NaN |
2015-09-30 | B | 0.036726 | NaN |
2015-12-31 | B | 0.031889 | 0.148724 |
2016-03-31 | B | 0.029567 | 0.143020 |
2016-06-30 | B | 0.028958 | 0.133312 |
2016-09-30 | B | 0.028890 | 0.124746 |
2016-12-31 | B | 0.030389 | 0.123110 |
I am sure there is a more efficient way to run the same calculations but I have not been able to find it. My code is not efficient and takes more than 2 hours for large dataframes with long time series and many entities.
Thanks