I have a dataframe with annual price and dividend data for numerous companies. I am looking to calculate the 3-year annualized return by adding all of the dividends received during the three years to the ending stock price, and then taking the CAGR. I know how to calculate the CAGR, but where I am stuck is adding dividends received over the period to the ending price.

Sample data:

       RIC  Date    Price   Dividend
0   RSG.AX  2018    0.814   0.000
1   RSG.AX  2017    0.889   0.015
2   RSG.AX  2016    0.937   0.012
3   RSG.AX  2015    0.181   0.000
4   RSG.AX  2014    0.216   0.000
5   RSG.AX  2013    0.494   0.000
6   QBE.AX  2018    7.119   0.352
7   QBE.AX  2017    8.331   0.202
8   QBE.AX  2016    8.961   0.389
9   QBE.AX  2015    9.159   0.363
10  QBE.AX  2014    9.156   0.302

Using the company RSG.AX (RIC=company code), an example calculation from 2015 to 2018 would be:

3-year return = (End price + cumulative dividends) / Start price = (0.814+0.015+0.012)/0.182 = 4.63

Annualized return = (return)^(1/years)-1 = (4.63)^(1/3)-1 = 0.66 = 66%

How do I do this with Python? Perhaps .groupby() would work to separate each company's data. Any help is appreciated!

nlieb7

2 Answers

Use shift() to get values from the rows above or below for the calculation.
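
As a quick illustration of what shift() does, here is a minimal sketch on a toy Series (the values are the RSG.AX dividends from the question; the variable names are only for illustration). With the data sorted newest-first, a negative shift pulls values up from the rows below, i.e. from earlier years.

```python
import pandas as pd

# Toy Series, newest value first (same ordering as the sample data)
s = pd.Series([0.000, 0.015, 0.012, 0.000], index=[2018, 2017, 2016, 2015])

# shift(-1) moves every value up one row, so each row sees the value
# from the row below it; the last row has no neighbour and becomes NaN.
next_row = s.shift(-1)
print(next_row)
```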


Method 1: use a loop over RIC

Loop over df.RIC.unique() and work on a copy of each company's rows, sub_df. Because the rows are sorted newest-first, shift(-n) fetches the value from n years earlier. Assuming each year's Price is post-dividend, the 3-year return would be:

sub_df['3-year Return'] = (sub_df.Price +
                           sub_df.Dividend +
                           sub_df.shift(-1).Dividend +
                           sub_df.shift(-2).Dividend) / sub_df.Price.shift(-3)

Write sub_df back into the original df with df.update() afterward, then calculate the annualized return from your formula with pow():

df['3-year Return'] = None
for ric in df.RIC.unique():
    sub_df = df.loc[df['RIC'] == ric].copy()
    sub_df['3-year Return'] = (sub_df.Price + 
                               sub_df.Dividend + 
                               sub_df.shift(-1).Dividend + 
                               sub_df.shift(-2).Dividend) / sub_df.Price.shift(-3)
    df.update(sub_df)
df['Annualized return'] = pow(df['3-year Return'], 1/3)  - 1
print(df)

       RIC    Date  Price  Dividend 3-year Return Annualized return
0   RSG.AX  2018.0  0.814     0.000       4.64641          0.668678
1   RSG.AX  2017.0  0.889     0.015       4.24074          0.618629
2   RSG.AX  2016.0  0.937     0.012       1.92105           0.24312
3   RSG.AX  2015.0  0.181     0.000          None               NaN
4   RSG.AX  2014.0  0.216     0.000          None               NaN
5   RSG.AX  2013.0  0.494     0.000          None               NaN
6   QBE.AX  2018.0  7.119     0.352      0.880227        -0.0416336
7   QBE.AX  2017.0  8.331     0.202       1.01409        0.00467449
8   QBE.AX  2016.0  8.961     0.389          None               NaN
9   QBE.AX  2015.0  9.159     0.363          None               NaN
10  QBE.AX  2014.0  9.156     0.302          None               NaN

Method 2: use groupby() and apply() with a custom function

Building on method 1, we can define a custom function and apply it to each RIC group:

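def three_year_return(group):
    # group is the sub-DataFrame for one RIC, not a single row
    group['3-year Return'] = (group.Price +
                              group.Dividend +
                              group.shift(-1).Dividend +
                              group.shift(-2).Dividend) / group.Price.shift(-3)
    return group

# group_keys=False keeps the original index instead of adding RIC as an index level
df = df.groupby(['RIC'], group_keys=False).apply(three_year_return)
df['Annualized return'] = pow(df['3-year Return'], 1/3)  - 1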


       RIC  Date  Price  Dividend  3-year Return  Annualized return
0   RSG.AX  2018  0.814     0.000       4.646409           0.668678
1   RSG.AX  2017  0.889     0.015       4.240741           0.618629
2   RSG.AX  2016  0.937     0.012       1.921053           0.243120
3   RSG.AX  2015  0.181     0.000            NaN                NaN
4   RSG.AX  2014  0.216     0.000            NaN                NaN
5   RSG.AX  2013  0.494     0.000            NaN                NaN
6   QBE.AX  2018  7.119     0.352       0.880227          -0.041634
7   QBE.AX  2017  8.331     0.202       1.014089           0.004674
8   QBE.AX  2016  8.961     0.389            NaN                NaN
9   QBE.AX  2015  9.159     0.363            NaN                NaN
10  QBE.AX  2014  9.156     0.302            NaN                NaN
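
If you would rather skip apply() altogether, the same numbers can probably be obtained by shifting within each group directly. This is only a sketch of an alternative, not one of the two methods above; it rebuilds the sample data from the question and assumes one row per year, sorted newest-first within each RIC.

```python
import pandas as pd

# Sample data from the question, newest year first within each RIC
df = pd.DataFrame({
    "RIC": ["RSG.AX"] * 6 + ["QBE.AX"] * 5,
    "Date": [2018, 2017, 2016, 2015, 2014, 2013, 2018, 2017, 2016, 2015, 2014],
    "Price": [0.814, 0.889, 0.937, 0.181, 0.216, 0.494,
              7.119, 8.331, 8.961, 9.159, 9.156],
    "Dividend": [0.000, 0.015, 0.012, 0.000, 0.000, 0.000,
                 0.352, 0.202, 0.389, 0.363, 0.302],
})

g = df.groupby("RIC")

# Shifting per group never pulls rows from one company into another
cum_div = df["Dividend"] + g["Dividend"].shift(-1) + g["Dividend"].shift(-2)
df["3-year Return"] = (df["Price"] + cum_div) / g["Price"].shift(-3)
df["Annualized return"] = df["3-year Return"] ** (1 / 3) - 1
print(df)
```

Rows without three earlier years in the same group come out as NaN, as in the tables above.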

FYI - the results look a bit different from your example because I found that you used 0.182 as start price while it should be 0.181 based on your sample data.
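
To check the first row by hand, here is the same arithmetic in plain Python, using the RSG.AX figures from the sample data with 0.181 as the start price (the variable names are only for illustration):

```python
# Inputs copied from the RSG.AX rows of the sample data
end_price = 0.814            # 2018 price
dividends = [0.015, 0.012]   # 2017 and 2016 dividends (2018 paid 0.000)
start_price = 0.181          # 2015 price

total_return = (end_price + sum(dividends)) / start_price
annualized = total_return ** (1 / 3) - 1
print(round(total_return, 6), round(annualized, 6))
```

This should agree with the 4.646409 and 0.668678 shown for RSG.AX 2018 in the table.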

henrywongkk

Keep it readable. If you really want to avoid building the dictionary, you can use .apply instead.

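import pandas as pd

result = {}
for ric, grp in df.groupby('RIC'):
    # rows are sorted newest-first, so the last row is the earliest year
    first, last = grp.iloc[-1], grp.iloc[0]
    start_price, end_price = first.Price, last.Price
    # exclude the start year's dividend, matching the question's example
    cum_div = grp.Dividend.iloc[:-1].sum()
    return_ = (end_price + cum_div) / start_price
    # Date holds integer years in the sample data; for datetime values
    # use (last.Date - first.Date).days / 365 instead
    years = last.Date - first.Date
    ann_return = return_ ** (1 / years) - 1
    result[ric] = ann_return

result_df = pd.DataFrame.from_dict(result, orient='index')
print(result_df)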

This covers the entire period in your data frame. If you want the three-year figure, change first to be grp.iloc[3] (after checking that the group has that many rows) and sum the dividends over grp.iloc[:3] only. This also relies on each group being sorted newest-first, as your sample suggests.
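
A minimal sketch of that three-year variant, rebuilt on the sample data from the question. YEARS and the column label are names I am introducing for illustration, and it assumes one row per year, sorted newest-first within each RIC.

```python
import pandas as pd

# Sample data from the question, newest year first within each RIC
df = pd.DataFrame({
    "RIC": ["RSG.AX"] * 6 + ["QBE.AX"] * 5,
    "Date": [2018, 2017, 2016, 2015, 2014, 2013, 2018, 2017, 2016, 2015, 2014],
    "Price": [0.814, 0.889, 0.937, 0.181, 0.216, 0.494,
              7.119, 8.331, 8.961, 9.159, 9.156],
    "Dividend": [0.000, 0.015, 0.012, 0.000, 0.000, 0.000,
                 0.352, 0.202, 0.389, 0.363, 0.302],
})

YEARS = 3  # window length (hypothetical name)
result = {}
for ric, grp in df.groupby("RIC"):
    if len(grp) <= YEARS:
        continue  # not enough history for a full window
    start_price = grp.Price.iloc[YEARS]        # price YEARS rows back
    end_price = grp.Price.iloc[0]              # most recent price
    cum_div = grp.Dividend.iloc[:YEARS].sum()  # dividends inside the window
    result[ric] = ((end_price + cum_div) / start_price) ** (1 / YEARS) - 1

result_df = pd.DataFrame.from_dict(result, orient="index",
                                   columns=["Annualized return"])
print(result_df)
```

This only reports the most recent window per company; a rolling figure for every year needs the shift-based approach from the other answer.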

modesitt