6

I have a symmetric positive-definite matrix (e.g. Covariance matrix), and I want to calculate its inverse. In math, I know that it is more efficient to use Cholesky decomposition to invert the matrix, especially if your matrix is big. But I was not sure how does "numpy.lianlg.inv()" works. Say I have the following code:

import numpy as np

X = np.arange(10000).reshape(100,100)
X = X + X.T - np.diag(X.diagonal()) #  symmetry 
X = np.dot(X,X.T) # positive-definite

# simple inversion:
inverse1 = np.linalg.inv(X) 

# Cholesky decomposition inversion:
c = np.linalg.inv(np.linalg.cholesky(X))
inverse2 = np.dot(c.T,c)

Which one is more efficient (inverse1 or inverse2)? If the second one is more efficient, why is numpy.linalg.inv() not using this instead?

Babak
  • 497
  • 1
  • 7
  • 15
  • 1
    Regarding your last question - numpy does not know that your matrix is symmetric, so cannot use the latter method. Checking if the matrix is symmetric would be slow. – Eric Jun 03 '17 at 21:03
  • 2
    Note that also, `inv` does not take advantage of the fact that `cholesky` is triangular — it does not use lapack's `DTRTRI` – Eric Jun 03 '17 at 21:06
  • Your code doesn't run for me, claiming that X is not positive definite (presumably due to overflow) – Eric Jun 03 '17 at 21:10
  • 10
    In general, it's [bad idea](https://www.johndcook.com/blog/2010/01/19/dont-invert-that-matrix) to invert a matrix. `inv` is expensive and isn't numerically stable. Usually, you want to multiply the inverse with a vector, i.e., you want to solve a system of equations. In all such cases, it's better to just solve the system using something like `linalg.solve` (telling `solve` that the matrix is symmetric and positive definite will make `solve` use Cholesky). If you want to use the inverse multiple times, compute and store the decomposition for later use. – Praveen Jun 04 '17 at 00:09
  • 2
    if you do want to invert the cholesky factor, use `scipy.linalg.lapack.dtrtri` – Bananach Sep 23 '20 at 15:30

1 Answers1

1

With the following setup:

import numpy as np

N = 100
X = np.linspace(0, 1, N*N).reshape(N, N)
X = 0.5*(X + X.T) + np.eye(N) * N

I get the following timings with IPython's %timeit:

In [28]: %timeit np.linalg.inv(X)
255 µs ± 30.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [29]: %timeit c = np.linalg.inv(np.linalg.cholesky(X)); np.dot(c.T,c)
414 µs ± 15.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Eric
  • 95,302
  • 53
  • 242
  • 374