
I have two 1-D arrays of coordinates, x and y, and I would like to make a density plot. I found that scipy.stats.gaussian_kde() could help me, but I don't understand how it really works.

My code is:

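import numpy as np
from scipy import stats

n = 1000
xs, ys = np.random.normal(-3., 3., size=n), np.random.normal(1., 4., size=n)
# Grid limits (not defined in the original snippet; assumed here to be the data range).
xmin, xmax, ymin, ymax = xs.min(), xs.max(), ys.min(), ys.max()
x, y = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
positions = np.vstack([x.ravel(), y.ravel()])
values = np.vstack([xs, ys])
# Bandwidth value.
bw = 0.325
# A scalar bw_method is used directly as the KDE factor.
kernel = stats.gaussian_kde(values, bw_method=bw/np.asarray(values).std(ddof=1))
# Evaluate kernel in grid positions.
k_pos = kernel(positions)
kde = np.reshape(k_pos.T, x.shape)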

Is "kde" the Gaussian kernel density normalized to the range (0, 1)? How can I get the unnormalized density, so that it matches the physical dimension of a surface density (m^-2)?

Thanks for your help!

  • Are you trying to fit a 2D normal distribution to your data? If so, you do not need to use a KDE. In your case you can get the parameters... look at [this question][stackoverflow.com/questions/21566379/fitting-a-2d-gaussian-function-using-scipy-optimize-curve-fit-valueerror-and-m – Ben K. Oct 26 '16 at 12:11
  • No, the Gaussian kernel density is fine for what I need: it tells me where my data are grouped. I just don't understand what the output of stats.gaussian_kde is and how it is normalized. – MarcoGAstro Oct 26 '16 at 12:35
  • OK, I am not 100% sure how the normalisation is done. From what I see, it seems the distribution is normalised so that it integrates to 1, so if you want a surface density you would have to multiply kde by n. You can check this by calculating kde.sum()*(xmax-xmin)/100 * (ymax-ymin)/100, which should come to approximately 1 – Ben K. Oct 26 '16 at 22:31
  • Great! Thank you very much! – MarcoGAstro Dec 29 '16 at 10:40
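
The check suggested in the comments can be sketched as below. This is only a rough illustration, not a definitive recipe: the fixed seed, the `pad` margin around the data, and the names `integral`, `surface_density` and `n_recovered` are assumptions made for the example, and the default bandwidth is used instead of the custom `bw` from the question. The cell size is taken from the grid itself (100 points span 99 intervals, so it is close to, but not exactly, the range divided by 100).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is repeatable
n = 1000
xs = rng.normal(-3.0, 3.0, size=n)
ys = rng.normal(1.0, 4.0, size=n)
values = np.vstack([xs, ys])

# Assumed grid limits: pad the data range so that almost all of the
# estimated density falls inside the grid.
pad = 10.0
xmin, xmax = xs.min() - pad, xs.max() + pad
ymin, ymax = ys.min() - pad, ys.max() + pad
x, y = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
positions = np.vstack([x.ravel(), y.ravel()])

kernel = stats.gaussian_kde(values)  # default bandwidth (Scott's rule)
kde = kernel(positions).reshape(x.shape)

# Area of one grid cell, taken from the actual grid spacing.
dx = x[1, 0] - x[0, 0]
dy = y[0, 1] - y[0, 0]

# Riemann sum of the KDE over the grid: should be close to 1.
integral = kde.sum() * dx * dy

# Scaling by the number of points gives a number density per unit area;
# summing it over the grid should give back roughly n.
surface_density = n * kde
n_recovered = surface_density.sum() * dx * dy

print(integral, n_recovered)
```

If xs and ys are in metres, kde is a probability density in m^-2 and n * kde is a number of points per m^2. The sum only comes out near 1 when the grid covers essentially all of the data; a grid clipped to a small region gives a smaller value.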
