0

i'm trying to compute the cumulative distribution function of a multivariate normal using scipy.

i'm having trouble with the "input matrix must be symmetric positive definite" error.

to my knowledge, a diagonal matrix with positive diagonal entries is positive definite (see page 1 problem 2)

However, for different (relatively) small values of these diagonal values, the error shows up for the smaller values.

For example, this code:

import numpy as np
from scipy.stats import multivariate_normal
std = np.array([0.001, 2])
mean = np.array([1.23, 3])
multivariate_normal(mean=mean, cov=np.diag(std**2)).cdf([2,1])

returns 0.15865525393145702 while changing the third line with:

std = np.array([0.00001, 2])

causes the error to show up.

i'm guessing that it has something to do with computation error of floats. The problem is, when the dimension of the cov matrix is larger, the accepted positive values on the diagoanal are bigger and bigger.

I tried multiple values on the diagonal of the covariance matrix of dimension 9x9. It seems that when other diagonal values are very large, small values cause the error.

Rima
  • 3
  • 2

1 Answers1

0

Examining the stack trace you will see that it assumes the condition number as

1e6*np.finfo('d').eps ~ 2.2e-10 in _eigvalsh_to_eps

In your example the difference the smaller eigenvalue is 5e-6**2 times smaller than the largest eigenvalue so it will be treated as zero.

You can pass allow_singular=True to get it working

import numpy as np
from scipy.stats import multivariate_normal
std = np.array([0.000001, 2])
mean = np.array([1.23, 3])
multivariate_normal(mean=mean, cov=np.diag(std**2), allow_singular=True).cdf([2,1])
Bob
  • 13,867
  • 1
  • 5
  • 27