I am having trouble understanding the proper way to marginalize variables out of a probability distribution. As I understand it, the proper way is to sum (or integrate) over the variables being marginalized out, leaving only the variables to be kept. For a normal distribution, the result is also a normal distribution. I can show this part with equations by doing the integrals, but when I check it in Python I get incorrect results: the peak of the resulting distribution is much higher.

Here is an example (the code is from "Marginalize a surface plot and use kernel density estimation (kde) on it"):

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from scipy.stats import multivariate_normal, gaussian_kde

# Choose mean vector and variance-covariance matrix
mu = np.array([0, 0])
sigma = np.array([[2, 0], [0, 3]])
# Create surface plot data
x = np.linspace(-5, 5, 100)
y = np.linspace(-5, 5, 100)
X, Y = np.meshgrid(x, y)
rv = multivariate_normal(mean=mu, cov=sigma)
Z = np.array([rv.pdf(pair) for pair in zip(X.ravel(), Y.ravel())])
Z = Z.reshape(X.shape)
# Plot it
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
pos = ax.plot_surface(X, Y, Z)
plt.show()

This plots a two-variable normal distribution. If I sum over one variable to get the marginal distribution

Zmarg_y = Z.sum(axis=0)
plt.plot(x, Zmarg_y)
plt.show()

the result is not the same as if I simply drop that variable instead of marginalizing it out. I also tried this with a three-variable Gaussian distribution, marginalizing out one variable to get a two-variable distribution. The result was also on a larger scale. Is there a problem with normalization here? I am studying probability for the first time and am trying to understand every detail, and I think I am misunderstanding something important about this. Thank you.

sk1995
  • The trick here is that you are approximating a continuous distribution with a meshed sample. To "integrate", don't forget the dx term. Is your result too large by the reciprocal of the "delta" of your linspace'd `x` and `y`? (For `linspace(-5, 5, 100)` the delta is `0.1010`, with reciprocal `9.9`.) – AbbeGijly Apr 28 '20 at 19:46
  • @AbbeGijly Yes, I believe you are exactly correct! Thank you. I would like to understand your point better. How do you calculate this "delta" and include it in the sum? Can you direct me to somewhere I can read about this? Also, how can I "integrate" properly instead of just using summation? Thank you! – sk1995 Apr 28 '20 at 20:29
  • @AbbeGijly I believe I now understand this. Is it just (5-(-5))/(100-1), which is the length of every "segment" in the linear space? Of course it makes sense to include it in the "integral." I will read more about approximations like this. Thanks! – sk1995 Apr 28 '20 at 20:44
  • Using the "rectangle rule" for numerical integration, you add the areas of a bunch of rectangles, where the height of each is f(x) and the width is $\Delta x$. You had just left out the $\Delta x$ part. – AbbeGijly Apr 28 '20 at 21:17
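
The rectangle rule from the comments can be checked against the question's own grid. This is a minimal sketch, not a definitive implementation: the names `dy`, `marg_x`, `exact_x`, and `max_abs_err` are introduced here for illustration, and the sketch assumes that `np.meshgrid` with its default indexing puts `y` along axis 0, so `Z.sum(axis=0)` sums over `y` and leaves a function of `x`. The closed-form marginal used for comparison is a one-dimensional normal with variance `sigma[0, 0]`.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

# Same distribution and grid as in the question
mu = np.array([0, 0])
sigma = np.array([[2, 0], [0, 3]])
x = np.linspace(-5, 5, 100)
y = np.linspace(-5, 5, 100)
X, Y = np.meshgrid(x, y)  # default indexing: Z[i, j] is the density at (x[j], y[i])
Z = multivariate_normal(mean=mu, cov=sigma).pdf(np.dstack((X, Y)))

# Grid spacing: (5 - (-5)) / (100 - 1), roughly 0.101
dy = y[1] - y[0]

# Rectangle rule: sum of heights times the rectangle width
marg_x = Z.sum(axis=0) * dy

# Closed-form marginal of x for comparison: N(mu[0], sigma[0, 0])
exact_x = norm(loc=mu[0], scale=np.sqrt(sigma[0, 0])).pdf(x)

# Largest pointwise gap between the numerical and exact marginals
max_abs_err = np.max(np.abs(marg_x - exact_x))
print(max_abs_err)
```

The remaining gap is small but not zero, mostly because the grid stops at ±5 and cuts off part of the tails in `y`. A trapezoidal rule, for example `scipy.integrate.trapezoid(Z, x=y, axis=0)` in recent SciPy versions, is a common alternative to the plain sum.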

0 Answers