My problem is quite common in finance.
Given an array w (1xN) of weights and a covariance matrix Q (NxN) of assets, one can calculate the variance of the portfolio using the quadratic expression w' * Q * w, where * is the dot product.
I want to understand the best way to perform this operation when I have a history of weights W (T x N) and a 3D structure (T, N, N) of covariance matrices.
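For a single date this is a plain quadratic form, e.g. (illustrative numbers only):
import numpy as np
w = np.array([0.6, 0.4])                 # 1xN weights for two assets
Q = np.array([[0.04, 0.006],
              [0.006, 0.09]])            # NxN covariance matrix
variance = w @ Q @ w                     # w' * Q * w
risk = np.sqrt(variance)                 # portfolio volatility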
import numpy as np
import pandas as pd
returns = pd.DataFrame(0.1 * np.random.randn(100, 4), columns=['A', 'B', 'C', 'D'])
covariance = returns.rolling(20).cov()
weights = pd.DataFrame(np.random.randn(100, 4), columns=['A', 'B', 'C', 'D'])
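The rolling covariance comes back as a long-format DataFrame with a (date, asset) MultiIndex, i.e. one NxN block per date (the first 19 dates are NaN because of the 20-period window):
covariance.shape                      # (400, 4): 100 dates x 4 assets as rows
covariance.xs(99, axis=0, level=0)    # the 4x4 covariance block for the last date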
My solution so far has been to convert the pandas DataFrames to numpy arrays, perform the calculation in a loop, and then convert back to pandas. Note that I need to explicitly check the alignment of labels, since in reality the covariance and the weights could be produced by different processes.
cov_dict = {key: covariance.xs(key, axis=0, level=0) for key in covariance.index.get_level_values(0).unique()}
def naive_numpy(weights, cov_dict):
    expected_risk = {}
    # Extract columns and index labels before passing to numpy arrays
    # Columns: keep only assets present in both the covariance and the weights
    cov_assets = cov_dict[next(iter(cov_dict))].columns
    avail_assets = [el for el in cov_assets if el in weights]
    # Indexes: keep only dates present in both the covariance and the weights
    cov_dates = list(cov_dict.keys())
    avail_dates = weights.index.intersection(cov_dates)
    sel_weights = weights.loc[avail_dates, avail_assets]
    # Main loop: quadratic form w' * Q * w per date, then sqrt for risk
    for t, value in zip(sel_weights.index, sel_weights.values):
        cov_t = cov_dict[t].loc[avail_assets, avail_assets].values  # align covariance labels with the weight vector
        expected_risk[t] = np.sqrt(np.dot(value, np.dot(cov_t, value)))
    # Back to a pandas Series on the original weights index
    expected_risk = pd.Series(expected_risk).reindex(weights.index).sort_index()
    return expected_risk
Is there a pure-pandas way to achieve the same result? Or is there any improvement to the code to make it more efficient? (Despite using numpy, it is still quite slow.)
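For reference, the direction I was considering for a fully vectorized version is to stack the covariance blocks into a (T, N, N) numpy array and express the quadratic form with np.einsum — a rough sketch, assuming dates and columns are already perfectly aligned (no label checking):
T, N = weights.shape
cov_arr = covariance.values.reshape(T, N, N)                              # stack of NxN matrices, one per date
var = np.einsum('ti,tij,tj->t', weights.values, cov_arr, weights.values)  # w' * Q * w for every date at once
expected_risk = pd.Series(np.sqrt(var), index=weights.index)
But I am not sure whether this is idiomatic, or how best to keep the label-alignment checks in a vectorized version.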