Is there a way to apply Kernel density estimation on data where each 2-D data point has an associated value?

Question

I am attempting to apply a KDE to my data which is structured as follows:

|x axis values|: 1, 2, 3, ...

|y axis values|: 5, 8, 10, ...

|coord point values|: 98, 35, 15, ...

I am able to apply the KDE to the X-Y coordinate pairs and estimate the distribution of the points, but this doesn't include the coord-point values.

So the question: Is there a way to estimate the KDE of the X-Y data points and also get a similar distribution on the Coordinate-point values?

Ideally, the output created would be two channel:

channel 1: The 2-D distribution of the X-Y coordinate pairs

channel 2: the 2-D estimated coord-point values on that same distribution

Update:

In the code below, I attempted to solve my question, but the resulting output distribution 'gg' is too large, as each point obtains some large distribution.

Currently I'm unsure of the best way to condense this; perhaps simply obtaining the max value would be enough?

import numpy as np
from sklearn.neighbors import KernelDensity


def kde3D(x, y, z, bandwidth, xbins=60j, ybins=60j, zbins=180j, **kwargs):
    """Build 3D kernel density estimate (KDE)."""

    # create grid of sample locations (default: 60x60x180)
    xx, yy, zz = np.mgrid[0:60:xbins,
                          0:60:ybins,
                          -180:180:zbins]

    # columns are ordered (y, x, z) for both the grid and the training data
    xy_sample = np.vstack([yy.ravel(), xx.ravel(), zz.ravel()]).T
    xy_train = np.vstack([y, x, z]).T

    kde_skl = KernelDensity(bandwidth=bandwidth, **kwargs)
    kde_skl.fit(xy_train)

    # score_samples() returns the log-density of each sample under the model;
    # np.exp() converts it back to a density
    gamma = np.exp(kde_skl.score_samples(xy_sample))
    gg = np.reshape(gamma, xx.shape)

    return xx, yy, zz, gg

Update 2:

I was able to solve the problem using the output of the 3D KDE, 'gg'. Applying np.argmax and np.max along the value (z) axis distils the output at each pixel to its maximum, which was good enough for the problem at hand: np.argmax gives the index of the most likely coord-point value, and np.max gives the density at that point. As a result, the output is no longer NxNxM but NxN.
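The reduction described above can be sketched as follows. This is a minimal illustration, not the exact code used: the helper name `collapse_along_z` and its `z_axis_values` argument are hypothetical, and it assumes 'gg' has shape (N, N, M) with the value axis last, as returned by kde3D.

```python
import numpy as np


def collapse_along_z(gg, z_axis_values):
    """Reduce an (N, N, M) density volume to two (N, N) maps.

    gg            : density volume with the value (z) axis last
    z_axis_values : 1-D array of length M holding the z grid coordinates
    """
    # index of the densest z bin at every (x, y) pixel
    best_index = np.argmax(gg, axis=2)
    # translate bin indices back into actual z values
    best_value = np.asarray(z_axis_values)[best_index]
    # density at that most likely z value
    peak_density = np.max(gg, axis=2)
    return best_value, peak_density
```

With the grid from kde3D, the z coordinates can be read off as `zz[0, 0, :]`. Note that np.argmax returns index 0 for pixels where the density is flat, so pixels with essentially zero density will report the first z value.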

score 0 · Answer 1 · answered Jul 26 '22 at 15:27


I'm not sure I understood your question correctly, but you could try fitting two KDEs:

Channel 1: Fit only the X-Y coordinate pairs.
Channel 2: Fit the X-Y coordinate pairs and use the z-value as sample weight.
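A rough sketch of the two-KDE idea, using the `sample_weight` argument of scikit-learn's `KernelDensity.fit`. The function name `two_channel_kde`, the grid bounds taken from the data range, and the default bandwidth are assumptions for illustration only. scikit-learn requires the weights to be positive, so values that can be negative (such as angles in -180..180) would need shifting first. The last output, a ratio of the weighted to the unweighted density, is an extra step not mentioned above: it gives a kernel-weighted local average of the values, which may be closer to an "estimated value" map than the weighted density itself.

```python
import numpy as np
from sklearn.neighbors import KernelDensity


def two_channel_kde(x, y, values, bandwidth=1.0, grid_size=50):
    """Return the grid plus unweighted and value-weighted 2-D densities."""
    x, y, values = (np.asarray(a, dtype=float) for a in (x, y, values))
    points = np.column_stack([x, y])

    # evaluation grid spanning the data range (an assumption of this sketch)
    xx, yy = np.mgrid[x.min():x.max():grid_size * 1j,
                      y.min():y.max():grid_size * 1j]
    grid = np.column_stack([xx.ravel(), yy.ravel()])

    # channel 1: density of the X-Y coordinate pairs alone
    kde_xy = KernelDensity(bandwidth=bandwidth).fit(points)
    density = np.exp(kde_xy.score_samples(grid)).reshape(xx.shape)

    # channel 2: same points, weighted by their associated values
    kde_w = KernelDensity(bandwidth=bandwidth).fit(points, sample_weight=values)
    weighted = np.exp(kde_w.score_samples(grid)).reshape(xx.shape)

    # optional: kernel-weighted local average of the values
    # (can be unreliable where `density` is numerically close to zero)
    local_mean = values.mean() * weighted / density

    return xx, yy, density, weighted, local_mean
```

Both densities are normalised separately, so the weighted channel shows where the high-valued points concentrate rather than the values themselves.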


akra1
