I am attempting to apply a KDE to my data which is structured as follows:
|x axis values|: 1, 2, 3, ...
|y axis values|: 5, 8, 10, ...
|coord point values|: 98, 35, 15, ...
I am able to apply the KDE to the X-Y coordinate pairs and estimate the distribution of the points, but this doesn't include the coord-point values.
So the question: Is there a way to estimate the KDE of the X-Y data points and also get a similar distribution on the Coordinate-point values?
Ideally, the output would have two channels:
channel 1: The 2-D distribution of the X-Y coordinate pairs
channel 2: the 2-D estimated coord-point values on that same distribution
Update:
In the code below, I attempted to solve my question, but the resulting output distribution 'gg' is too large, since every X-Y grid cell carries a full distribution over the coord-point axis.
I'm currently unsure of the best way to condense this; perhaps simply taking the max value would be enough?
import numpy as np
from sklearn.neighbors import KernelDensity

def kde3D(x, y, z, bandwidth, xbins=60j, ybins=60j, zbins=180j, **kwargs):
    """Build 3D kernel density estimate (KDE)."""
    # create grid of sample locations (default: 60x60x180)
    xx, yy, zz = np.mgrid[0:60:xbins,
                          0:60:ybins,
                          -180:180:zbins]
    # columns are ordered (y, x, z) for both the grid and the training data
    xy_sample = np.vstack([yy.ravel(), xx.ravel(), zz.ravel()]).T
    xy_train = np.vstack([y, x, z]).T
    kde_skl = KernelDensity(bandwidth=bandwidth, **kwargs)
    kde_skl.fit(xy_train)
    # score_samples() returns the log-density of each sample; exponentiate to get the density
    gamma = np.exp(kde_skl.score_samples(xy_sample))
    gg = np.reshape(gamma, xx.shape)
    return xx, yy, zz, gg
Update 2:
I was able to solve the problem using the output of the 3D KDE, 'gg'. Applying np.argmax and np.max along the coord-point axis distils the distribution at each pixel to its maximum, which was good enough for the problem at hand: np.argmax gives the index of the most likely coord-point value, and np.max gives the probability density at that point. As a result, the output is no longer NxNxM but NxN.
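For anyone landing here, the reduction described above can be sketched roughly as follows. This is a minimal illustration, not the exact code I ran: the helper name collapse_density and the tiny synthetic volume are made up for the example, and z_grid is assumed to be the 1-D coord-point axis of the grid (for the kde3D output, that would be zz[0, 0, :]).

```python
import numpy as np

def collapse_density(gg, z_grid):
    """Reduce an (Nx, Ny, Nz) density volume to two (Nx, Ny) channels.

    Hypothetical helper: gg is the 3D KDE output, z_grid the 1-D array of
    coord-point values along the last axis.
    """
    # Index of the z bin with the highest density in each (x, y) cell
    z_idx = np.argmax(gg, axis=2)
    # Density at that bin: the peak of the joint density per cell
    peak_density = np.max(gg, axis=2)
    # Map bin indices back to actual coord-point values
    z_at_peak = z_grid[z_idx]
    return peak_density, z_at_peak

# Tiny synthetic volume standing in for 'gg' (2 x 2 cells, 5 z bins)
z_grid = np.linspace(-180, 180, 5)
gg = np.zeros((2, 2, 5))
gg[0, 0, 1] = 0.7
gg[0, 1, 4] = 0.2
gg[1, 0, 2] = 0.5
gg[1, 1, 0] = 0.9

peak_density, z_at_peak = collapse_density(gg, z_grid)
print(peak_density.shape, z_at_peak.shape)
```

Note that np.max along the coord-point axis gives the peak of the joint density in each cell, not the marginal X-Y density; if the marginal is what you need for channel 1, summing gg along that axis would approximate it up to the bin width.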