I am trying to adjust my portfolio of stocks and trying to calculate the adjusted mean price (which is a form of weighted-average).
Here is sample data:
import pandas as pd
import numpy as np
sample_dict = {'ticker': {1: 'ABCD',
2: 'ABCD', 3: 'ABCD', 4: 'ABCD', 5: 'ABCD', 6: 'ABCD', 8: 'EFGH',
9: 'EFGH', 10: 'EFGH', 11: 'EFGH', 12: 'EFGH', 13: 'EFGH'},
'Date': {1: "2018, 1, 10", 2: "2018, 1, 20",
3: "2018, 2, 7", 4: "2018, 4, 14",
5: "2018, 5, 25", 6: "2018, 7, 4",
8: "2018, 1, 10", 9: "2018, 1, 20",
10: "2018, 2, 7", 11: "2018, 4, 14",
12: "2018, 5, 25", 13: "2018, 7, 4"},
'Sell_buy': {1: 'buy', 2: 'buy', 3: 'sell', 4: 'buy', 5: 'sell', 6: 'buy',
8: 'buy', 9: 'buy', 10: 'buy', 11: 'buy', 12: 'sell', 13: 'sell'},
'Qtd': {1: 100.0, 2: 300.0, 3: 200.0, 4: 500.0, 5: 600.0, 6: 500.0,
8: 300.0, 9: 300.0, 10: 200.0, 11: 200.0, 12: 700.0, 13: 100.0},
'Price': {1: 8.0, 2: 10.0, 3: 12.0, 4: 9.0, 5: 13.0, 6: 14.0,
8: 8.0, 9: 10.0, 10: 12.0, 11: 9.0, 12: 13.0, 13: 14.0},
'Costs': {1: 10.93, 2: 12.52, 3: 11.39, 4: 14.5, 5: 14.68, 6: 14.96,
8: 10.93, 9: 12.52, 10: 11.39, 11: 14.5, 12: 14.68, 13: 14.96}}
sample_df = pd.DataFrame(sample_dict)
sample_df['Date']=pd.to_datetime(sample_df['Date'], dayfirst=True).dt.date
I was able to get the get an updated Adjusted quantity (based on buys and sells) without any issues:
#to calculate adjusted quantity. this works as expected
sample_df['Adj Qtd'] = sample_df.apply(lambda x: ((x.Sell_buy == "buy") - (x.Sell_buy == "sell")) * x['Qtd'], axis = 1)
sample_df['Adj Qtd'] = sample_df.groupby('ticker')['Adj Qtd'].cumsum()
However, I am not able to get the correct Adjusted Price. The condition here is if I sell a stock, my Adjusted Price should not change and remain the same as the last adjusted price when the buy was made for that stock.
I've tried the following to achieve this goal:
#Adjust Price. Works good until I reach the row where a sell was made
sample_df['Adjusted Price'] = sample_df.apply(lambda x: ((x.Sell_buy == "buy") - (x.Sell_buy == "sell")) * (x["Price"] * x["Qtd"] + x["Costs"]), axis = 1)
sample_df['Adjusted Price'] = sample_df.groupby('ticker')['Adjusted Price'].cumsum().div(sample_df['Adj Qtd'])
I could partially correct the adjusted prices on sell rows with the following:
# When it's a "sell", adjusted price is the same from above
sample_df.loc[sample_df['Sell_buy'] == 'sell',['Adjusted Price']] = np.NaN
sample_df.fillna(method='ffill', inplace=True)
ticker Date Sell_buy Qtd Price Costs Adj Qtd Adjusted Price
1 ABCD 2018-10-01 buy 100.0 8.0 10.93 100.0 8.109300
2 ABCD 2018-01-20 buy 300.0 10.0 12.52 400.0 9.558625
3 ABCD 2018-07-02 sell 200.0 12.0 11.39 200.0 9.558625
4 ABCD 2018-04-14 buy 500.0 9.0 14.50 700.0 8.466514
5 ABCD 2018-05-25 sell 600.0 13.0 14.68 100.0 8.466514
6 ABCD 2018-04-07 buy 500.0 14.0 14.96 600.0 8.544733
8 EFGH 2018-10-01 buy 300.0 8.0 10.93 300.0 8.036433
9 EFGH 2018-01-20 buy 300.0 10.0 12.52 600.0 9.039083
10 EFGH 2018-07-02 buy 200.0 12.0 11.39 800.0 9.793550
11 EFGH 2018-04-14 buy 200.0 9.0 14.50 1000.0 9.649340
12 EFGH 2018-05-25 sell 700.0 13.0 14.68 300.0 9.649340
13 EFGH 2018-04-07 sell 100.0 14.0 14.96 200.0 9.649340
This would work perfectly if there is no "sell" between "buy"s (as it did for the stock EFGH in this example). To be clear, when the transaction is a "sell", we must ignore adjusting the price and use the last adjusted price in the last buy type transaction for that particular stock.
I did all these calculus in excel and the output should be the following:
For further clarification, the excel formula for that selected cell was:
=IF(C3="buy";(I2*G2+D3*E3+F3)/G3;IF(G3<>0;I2;0))
I've also tried .groupby("ticker").apply() a function using shift() in order to use prior values from rows above, but I've failed.