
I am trying to build a minimum-variance portfolio from one year of data, then rebalance it every month by recomputing the covariance matrix. My dataset starts in 1992 and ends in 2017.

The following code works when it is run outside a loop. Inside the loop, however, inverting the covariance matrix fails because the matrix is singular. I don't understand why this happens, since I reset every variable at the end of each iteration.

### Importing the necessary libraries ###
import pandas as pd
import numpy as np
from numpy.linalg import inv

### Importing the dataset ###
df = pd.read_csv("UK_Returns.csv", sep = ";")
df.set_index('Date', inplace = True)

### Define variables ###
stocks = df.shape[1]
returns = []
vol = []
weights_p =[]

### for loop to compute portfolio and rebalance every 30 days ###
for i in range (0,288):
  a = i*30
  b = i*30 + 252
  portfolio = df[a:b]
  mean_ret = ((1+portfolio.mean())**252)-1
  var_cov = portfolio.cov()*252
  inv_var_cov = inv(var_cov)
  doit = 0
  weights = np.dot(np.ones((1,stocks)),inv_var_cov)/(np.dot(np.ones((1,stocks)),np.dot(inv_var_cov,np.ones((stocks,1)))))
  ret = np.dot(weights, mean_ret)
  std = np.sqrt(np.dot(weights, np.dot(var_cov, weights.T)))
  returns.append(ret)
  vol.append(std)
  weights_p.append(weights)
  weights = []
  var_cov = np.zeros((stocks,stocks))
  inv_var_cov = np.zeros((stocks,stocks))
  i+=1

Does anyone have an idea how to solve this issue?

The error it yields is the following:

---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
<ipython-input-17-979efdd1f5b2> in <module>()
     21   mean_ret = ((1+portfolio.mean())**252)-1
     22   var_cov = portfolio.cov()*252
---> 23   inv_var_cov = inv(var_cov)
     24   doit = 0
     25   weights = np.dot(np.ones((1,stocks)),inv_var_cov)/(np.dot(np.ones((1,stocks)),np.dot(inv_var_cov,np.ones((stocks,1)))))

<__array_function__ internals> in inv(*args, **kwargs)

1 frames
/usr/local/lib/python3.6/dist-packages/numpy/linalg/linalg.py in _raise_linalgerror_singular(err, flag)
     95 
     96 def _raise_linalgerror_singular(err, flag):
---> 97     raise LinAlgError("Singular matrix")
     98 
     99 def _raise_linalgerror_nonposdef(err, flag):

LinAlgError: Singular matrix

Thank you so much for any help you can provide me with!

The data is shared in the following google drive: https://drive.google.com/file/d/1-Bw7cowZKCNU4JgNCitmblHVw73ORFKR/view?usp=sharing

  • Well, the error says it: the matrix is singular, and a singular matrix has no ordinary inverse, hence the error. Check your matrix contents to make sure it is non-singular, or use another method to get an inverse. – Ehsan Jun 04 '20 at 10:40
  • Thank you for your answer. I checked the inverse outside of the loop and it works. I also checked other subsets of the dataset and it works as well. That is why I really do not understand why it fails. The loop seems to change something, but I don't understand what, because it should not. – Grégoire Caye Jun 04 '20 at 10:42
  • Could you please elaborate on _I checked the inverse outside of the loop and it works_? Your selected matrix at iteration 224 seems to be singular (I guess your data is that way). Also, unrelated to the question, note that the `sqrt` in your loop is passed negative values. You might want to check that too. I also suggest converting your dataframe to numpy with `to_numpy()` to speed up your loop. – Ehsan Jun 04 '20 at 10:52
  • I ran the code outside the loop with i = 3, i = 4, etc., and it did not raise any error and gave me results for both the returns and the volatility of the portfolio. It is when I try to create a time series of those returns and volatilities through the for loop that it reports the matrices as singular. – Grégoire Caye Jun 04 '20 at 10:57
  • Please try `i=224` and see if it works outside the loop. – Ehsan Jun 04 '20 at 11:00
  • it works and yields me the following results: [array([0.14118882])] [array([[0.06050604]])] – Grégoire Caye Jun 04 '20 at 11:03
  • Could you please share the code you use outside the loop? If it is EXACTLY what is inside the loop, it does not work for me and throws the same error. You also have an `i+=1` at the end of your loop, which I think is unnecessary. – Ehsan Jun 04 '20 at 11:07
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/215298/discussion-between-gregoire-caye-and-ehsan). – Grégoire Caye Jun 04 '20 at 11:10
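The "check your matrix content" advice in the comments can be made concrete with a small diagnostic. The sketch below is only an illustration: `window_diagnostics` is a hypothetical helper, and the data is synthetic rather than the asker's `UK_Returns.csv`. It reports how many rows a window really contains and the rank of its covariance matrix. A window with fewer rows than assets cannot produce an invertible covariance matrix.

```python
import numpy as np
import pandas as pd

def window_diagnostics(df, start, length=252):
    """Hypothetical helper: summarise one estimation window of a returns frame."""
    window = df.iloc[start:start + length]
    rows, assets = window.shape
    if rows < 2:
        # The sample covariance is undefined with fewer than 2 rows,
        # so the window is reported as unusable without computing it.
        return {"rows": rows, "assets": assets, "has_nan": True, "rank": 0}
    cov = window.cov().to_numpy() * 252
    has_nan = bool(np.isnan(cov).any())
    rank = 0 if has_nan else int(np.linalg.matrix_rank(cov))
    return {"rows": rows, "assets": assets, "has_nan": has_nan, "rank": rank}

# Synthetic stand-in for the real data: 40 days of returns for 5 assets.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(0.0, 0.01, size=(40, 5)))

full = window_diagnostics(demo, start=0, length=40)    # all 40 rows
short = window_diagnostics(demo, start=37, length=40)  # only 3 rows remain
empty = window_diagnostics(demo, start=60, length=40)  # past the end of the data

for name, info in [("full", full), ("short", short), ("empty", empty)]:
    print(name, info)
```

Slicing past the end of a frame does not raise an error; it silently returns fewer rows, which is why the failure only shows up later in `inv`.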

1 Answer


It would be better to identify what is causing the matrix to be singular, but there are ways of living with singular matrices.

Try the pseudoinverse, `np.linalg.pinv()`. Mathematically, it always exists. See the `pinv` documentation.

Another way around it is to avoid computing the inverse matrix at all and find the least-squares solution of the system instead. See the `lstsq` documentation.

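Just replace `np.dot(X, inv_var_cov)` with the line below. `X` is a row vector in the question, so it is transposed to form the right-hand side, and the covariance matrix is symmetric, so the result matches the original product:

np.linalg.lstsq(var_cov, X.T, rcond=None)[0].T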
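As a rough sketch of how the whole weight computation could look without an explicit inverse: the function name `min_variance_weights` is made up for this example, and it assumes the covariance matrix contains no NaN values (`lstsq` will not help with those). For a singular matrix, `lstsq` returns the minimum-norm least-squares solution, which plays the role of the pseudoinverse.

```python
import numpy as np

def min_variance_weights(cov):
    """Weights proportional to the least-squares solution of cov @ x = 1.

    For an invertible matrix this equals inv(cov) @ 1 / (1' inv(cov) 1).
    """
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    # Solve the linear system instead of forming inv(cov) explicitly.
    x, *_ = np.linalg.lstsq(cov, ones, rcond=None)
    return x / x.sum()

# Invertible case: two uncorrelated assets with variances 1 and 4.
w_regular = min_variance_weights(np.diag([1.0, 4.0]))

# Singular case: two perfectly correlated assets; inv() would raise here.
w_singular = min_variance_weights(np.array([[1.0, 1.0], [1.0, 1.0]]))

print(w_regular)   # the lower-variance asset gets the larger weight
print(w_singular)
```

This only works around the singularity. If the matrix is singular because the window is too short or empty, fixing the window is the real solution.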

tstanisl
  • I am not sure if pinv guarantees inverse. Even in this case the SVD does not converge. Theory and computation are slightly different. – Ehsan Jun 04 '20 at 10:55
  • Thank you for the pinv option but I tried it from another post and it yields a different type of error: 1625 signature = 'D->DdD' if isComplexType(t) else 'd->ddd' -> 1626 u, s, vh = gufunc(a, signature=signature, extobj=extobj) 1627 u = u.astype(result_t, copy=False) 1628 s = s.astype(_realType(result_t), copy=False) ValueError: On entry to DLASCL parameter number 4 had an illegal value – Grégoire Caye Jun 04 '20 at 11:01
  • @GrégoireCaye, I see. It looks like this error is caused by the presence of NaN in `var_cov`. – tstanisl Jun 04 '20 at 11:12
  • @GrégoireCaye, there is a problem in the last iteration, for a=6750 b=7002. The variable `portfolio` becomes an empty DataFrame of shape `(0,103)`, which causes `cov()`/`mean()` to produce garbage. – tstanisl Jun 04 '20 at 11:17
  • Thank you for that comment ! Now I need to understand why that happens ^^ – Grégoire Caye Jun 04 '20 at 11:27
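One way to avoid empty or truncated windows is to derive the loop bounds from the length of the data instead of hard-coding `range(0, 288)`, which keeps stepping past the last row. A minimal sketch, where `rolling_window_bounds` is a hypothetical helper and the row count of 1000 is chosen purely for illustration:

```python
def rolling_window_bounds(n_rows, window=252, step=30):
    """Yield (start, stop) pairs so that every slice holds exactly `window` rows."""
    # The last valid start is n_rows - window; anything later would be truncated.
    for start in range(0, n_rows - window + 1, step):
        yield start, start + window

# Illustrative row count; with the real data this would be len(df).
bounds = list(rolling_window_bounds(1000))

print(len(bounds))   # number of full windows
print(bounds[0])     # first window
print(bounds[-1])    # last window that still fits in the data
```

In the question's loop this would replace the manual `a`/`b` arithmetic, for example `for a, b in rolling_window_bounds(len(df)): portfolio = df.iloc[a:b]`.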