Here is a very simple script generating a 2D gaussian with 10000 points. The covariance matrix estimated by np.cov seems really far from the generating one. What is the explanation and are there solutions ?
import numpy as np
import matplotlib.pyplot as plt
center=[0,0]
npoints=10000
data_covmat = np.array([[1,1],[1,0.5]])
lines=np.random.multivariate_normal(center,data_covmat,npoints)
print(f'2D gaussian centered at {center}, {npoints} points\nCovariance matrix =')
print(data_covmat)
plt.scatter(lines[:,0],lines[:,1],alpha=.1)
plt.axis('scaled')
plt.show()
print(f'Sample covariance matrix =\n{np.cov(lines,rowvar=False)}')
Covariance matrix =
[[1. 1. ] [1. 0.5]]
Sample covariance matrix =
[[1.23880367 0.74585136] [0.74585136 0.85974812]]