I am writing a tutorial, as part of which I wanted to show some of the issues that occur when analyzing data that has a singular covariance matrix.
I came across some odd behavior in numpy.
The behavior in question is this:
- Given a singular matrix A, np.linalg.inv(A) executes without raising an exception
- The returned matrix is not an inverse of the original, i.e. A @ inv(A) != I
My question is essentially: can anyone explain why this behavior is allowed? Is it a bug, or am I misunderstanding something? I am not looking for ways to compute the inverse or an approximate inverse of a singular matrix.
Moreover, what method does numpy use to test for singularity? It appears not to be checking the condition number alone (see the second example below).
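To make concrete what I mean by checking the condition number, this is the kind of explicit guard I had in mind (the helper name looks_singular and the 1/eps threshold are my own choices, purely for illustration, not anything numpy documents or does internally):
import numpy as np

def looks_singular(a, tol=None):
    # purely illustrative: flag the matrix as numerically singular when its
    # condition number exceeds a threshold; not what numpy does internally
    if tol is None:
        tol = 1.0 / np.finfo(a.dtype).eps
    return np.linalg.cond(a) > tol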
This question seems to relate to the same behavior (at least point 1 above); however, the accepted answer there (that the OP's test for singularity was failing because of very small numbers in their covariance matrix) does not apply in my case.
Below is a complete example demonstrating the issue. I am working in numpy 1.24.2 with Python 3.10. Any help is much appreciated.
Numpy inverts a singular matrix without error
import numpy as np
from scipy.stats import ortho_group

def generate_random_cov_matrix(sz, rank, eig_val_scale):
    """
    Generate a random covariance matrix with known rank by building it from a
    specified number of eigenvectors and eigenvalues.

    :param sz: the returned covariance matrix will be shaped sz x sz
    :param rank: the desired rank of the returned matrix
    :param eig_val_scale: eigenvalues will be randomly sampled from a chi2
        distribution (df=1) and scaled by this parameter
    :return: the covariance matrix, its eigenvectors and its eigenvalues
    """
    if rank > sz:
        raise ValueError('rank cannot be greater than size')

    eig_vecs = ortho_group.rvs(dim=sz)
    eig_vecs = eig_vecs[:, 0:rank]
    # randomly sample eigenvalues from chi square (df=1) just so that smaller eigenvalues are more likely
    eig_vals = np.random.chisquare(1, rank) * eig_val_scale
    # sort in descending order
    eig_vals = np.sort(eig_vals)
    eig_vals = eig_vals[::-1]
    # build the matrix from its eigenvectors and eigenvalues
    cov = eig_vecs @ np.diag(eig_vals) @ eig_vecs.T
    return cov, eig_vecs, eig_vals
# generate a singular 20 x 20 covariance matrix
n_variables = 20
rank = 10
cov_mat, _, _ = generate_random_cov_matrix(n_variables, rank, 100)

# is it singular, as expected?
assert np.isclose(np.linalg.det(cov_mat), 0.0)

# are the elements of the covariance matrix so small that the determinant could be
# nonzero but reported as zero due to machine precision?
print(np.mean(cov_mat))

# does numpy compute its inverse without raising an exception?
try:
    cov_inv = np.linalg.inv(cov_mat)
    print('No exception was raised')
except np.linalg.LinAlgError:
    print('An exception was raised')

# is the resulting inverse a proper inverse (is A @ inv(A) equal to the identity matrix)?
try:
    assert np.allclose(cov_mat @ cov_inv, np.identity(n_variables))
except AssertionError:
    print('The computed inverse is not a proper inverse')
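For what it's worth, here is a quick diagnostic of how far the returned matrix is from a true inverse (this assumes cov_inv was assigned in the try block above; the exact values vary from run to run):
# residual of A @ inv(A) against the identity, plus the reported condition number
residual = np.linalg.norm(cov_mat @ cov_inv - np.identity(n_variables))
print('||A @ inv(A) - I||: ', residual)
print('cond(A): ', np.linalg.cond(cov_mat))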
Numpy doesn't check the condition number alone
# runs without failure
arr = np.array([[1.0, 1.2], [2.0 + 1e-14, 2.4 + 1e-14]])
rank = np.linalg.matrix_rank(arr)
cond = np.linalg.cond(arr)
np.linalg.inv(arr)
print("rank: ", rank)
print("cond: ", cond)
# fails despite having approximately the same condition number
arr = np.array([[1.0, 1.2], [2.0 + 1e-15, 2.4 + 1e-15]])
rank = np.linalg.matrix_rank(arr)
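# confirm this matrix has approximately the same condition number as the previous one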
np.testing.assert_approx_equal(cond, np.linalg.cond(arr))
cond = np.linalg.cond(arr)
print("rank: ", rank)
print("cond: ", cond)
np.linalg.inv(arr)
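For contrast, an exactly-representable singular matrix does raise. This is why I suspect the check happens inside the LU factorization (an exact zero pivot) rather than via the condition number, though that is only my guess:
# this matrix is exactly rank 1; elimination hits an exact zero pivot, and
# np.linalg.inv raises LinAlgError('Singular matrix') here, at least on my machine
arr = np.array([[1.0, 2.0], [2.0, 4.0]])
try:
    np.linalg.inv(arr)
except np.linalg.LinAlgError as e:
    print('LinAlgError raised: ', e)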