I am having some trouble understanding the proper way to marginalize variables out of probability distributions. As I understand it, the proper way is to sum (or integrate) over the variables being marginalized out, leaving only the variables to be kept. For the case of a normal distribution, the result is also a normal distribution. I can show this part with equations by doing the integrals, but when I try to check it in Python I get incorrect results: the peak of the resulting distribution is much higher than it should be.
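To be concrete, this is the identity I can show on paper, using the covariance matrix from the code below:

$$p(x) = \int_{-\infty}^{\infty} p(x, y)\,dy, \qquad \begin{pmatrix} x \\ y \end{pmatrix} \sim \mathcal{N}\!\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}\right) \;\Longrightarrow\; x \sim \mathcal{N}(0,\, 2).$$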
Here is an example (the code is from Marginalize a surface plot and use kernel density estimation (kde) on it):
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from scipy.stats import multivariate_normal, gaussian_kde
# Choose mean vector and variance-covariance matrix
mu = np.array([0, 0])
sigma = np.array([[2, 0], [0, 3]])
# Create surface plot data
x = np.linspace(-5, 5, 100)
y = np.linspace(-5, 5, 100)
X, Y = np.meshgrid(x, y)
rv = multivariate_normal(mean=mu, cov=sigma)
# Evaluate the pdf on the whole grid at once; the last axis holds the (x, y) pairs
Z = rv.pdf(np.dstack((X, Y)))
# Plot it
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, Z)
plt.show()
This makes a surface plot of the two-variable normal distribution. To get the marginal distribution of x, I sum out y (axis 0 of Z):
Zmarg_y = Z.sum(axis=0)  # collapse the y axis, leaving a function of x
plt.plot(x, Zmarg_y)
plt.show()
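Here is how I compare against the analytic marginal, which should be N(0, 2) for this covariance (note that scipy.stats.norm takes the standard deviation, so scale=np.sqrt(2)):

from scipy.stats import norm
# Analytic marginal of x: dropping y from N(mu, sigma) leaves N(0, 2)
plt.plot(x, Zmarg_y, label='summed over y')
plt.plot(x, norm(loc=0, scale=np.sqrt(2)).pdf(x), label='analytic marginal')
plt.legend()
plt.show()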
The summed curve is not the same as what I get if I simply drop the variable instead of marginalizing it out: its peak is much higher. I also tried this with a 3-variable Gaussian distribution, where I marginalized out 1 variable to get a 2-variable distribution (a sketch of what I did is at the end of this post). The result was again on a higher scale.

Is there a problem with normalization here? I am studying probability for the first time and am trying to understand every single detail, and I think I am misunderstanding something important about this. Thank you.
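For completeness, here is roughly what the 3-variable attempt looked like (a sketch; the diagonal covariance values are placeholders, not the exact ones I used):

# 3-variable Gaussian with an arbitrary diagonal covariance
rv3 = multivariate_normal(mean=[0, 0, 0], cov=np.diag([2.0, 3.0, 1.0]))
t = np.linspace(-5, 5, 100)
X3, Y3, W3 = np.meshgrid(t, t, t)
Z3 = rv3.pdf(np.stack((X3, Y3, W3), axis=-1))  # shape (100, 100, 100)
# Sum out the third variable (it varies along axis 2) to get a 2-variable array
Z3marg = Z3.sum(axis=2)
# The peak of Z3marg is again much higher than the pdf of the corresponding
# 2-variable Gaussian evaluated on the same x-y grid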