How do I make a histogram-based probability density estimate of each of the marginal distributions p(x1) and p(x2) of this data set:
import numpy as np
import matplotlib.pyplot as plt
linalg = np.linalg
N = 100
mean = [1,1]
cov = [[0.3, 0.2],[0.2, 0.2]]
data = np.random.multivariate_normal(mean, cov, N)
L = linalg.cholesky(cov)
# print(L.shape)
# (2, 2)
uncorrelated = np.random.standard_normal((2,N))
data2 = np.dot(L,uncorrelated) + np.array(mean).reshape(2,1)
# print(data2.shape)
# (2, 100)
plt.scatter(data2[0,:], data2[1,:], c='green')
plt.scatter(data[:,0], data[:,1], c='yellow')
plt.show()
For this you can use the hist function in Matlab or R. How does changing the bin width (or equivalently, the number of bins) affect the plot and the estimate of p(x1) and p(x2)?
I'm using Python. Is there something similar to Matlab's hist function, and how would I implement this with it?
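For reference, here is a minimal sketch of what such an estimate could look like in Python, assuming NumPy's `np.histogram` and Matplotlib's `Axes.hist` with `density=True`, which scales the bin heights so that the total area is 1. The fixed seed, the helper names `marginal_density` and `plot_marginals`, and the bin counts 5, 15 and 50 are assumptions chosen for illustration only, not part of the original problem.

```python
import numpy as np

# Same distribution as in the question; the fixed seed is an assumption
# added only so that the numbers are reproducible.
rng = np.random.default_rng(0)
N = 100
mean = [1, 1]
cov = [[0.3, 0.2], [0.2, 0.2]]
data = rng.multivariate_normal(mean, cov, N)  # shape (N, 2)


def marginal_density(samples, bins):
    """Histogram density estimate; heights are scaled so the area is 1."""
    density, edges = np.histogram(samples, bins=bins, density=True)
    return density, edges


def plot_marginals(data, bin_counts=(5, 15, 50)):
    """Hypothetical helper: one row per marginal, one column per bin count."""
    import matplotlib.pyplot as plt  # imported lazily; only plotting needs it

    fig, axes = plt.subplots(2, len(bin_counts),
                             figsize=(4 * len(bin_counts), 6), squeeze=False)
    for row in range(2):  # row 0 -> p(x1), row 1 -> p(x2)
        for col, bins in enumerate(bin_counts):
            axes[row, col].hist(data[:, row], bins=bins, density=True,
                                color="green")
            axes[row, col].set_title(f"p(x{row + 1}), {bins} bins")
    fig.tight_layout()
    return fig


# Numeric check: whatever the bin count, the estimate integrates to 1.
areas = {}
for bins in (5, 15, 50):
    density, edges = marginal_density(data[:, 0], bins)
    areas[bins] = float(np.sum(density * np.diff(edges)))
```

Calling `plot_marginals(data)` and then `plt.show()` would display the grid. With only 100 samples, I would expect a small number of wide bins to give a smooth but coarse estimate, and a large number of narrow bins to give a spiky one with empty bins, though the exact picture depends on the sample drawn.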