I am trying to compute a PDF estimate from a KDE fitted with scikit-learn. I have seen two variants of scoring and am trying both: Statement A and Statement B below.
Statement A results in the following error:
AttributeError: 'KernelDensity' object has no attribute 'tree_'
Statement B results in the following error:
ValueError: query data dimension must match training data dimension
This seems like a silly error, but I cannot figure it out. Please help. The code is below:
from sklearn.neighbors import KernelDensity
import numpy
# d is my 1-D array data
xgrid = numpy.linspace(d.min(), d.max(), 1000)
density = KernelDensity(kernel='gaussian', bandwidth=0.08804).fit(d)
# statement A
density_score = KernelDensity(kernel='gaussian', bandwidth=0.08804).score_samples(xgrid)
# statement B
density_score = density.score_samples(xgrid)
density_score = numpy.exp(density_score)
If it helps, I am using scikit-learn version 0.15.2. I have run this successfully with scipy.stats.gaussian_kde, so the data itself should not be the problem.
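For comparison, here is a minimal sketch of the same computation with both inputs reshaped to 2-D. It rests on the assumption that KernelDensity expects arrays of shape (n_samples, n_features), which would explain the dimension mismatch in Statement B. The synthetic `d` is a hypothetical stand-in for my real data. Statement A is left out because it calls score_samples on a new estimator that was never fitted, which is probably why `tree_` is missing.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Hypothetical stand-in for the real 1-D data array `d`
rng = np.random.RandomState(0)
d = rng.normal(loc=0.0, scale=1.0, size=500)

xgrid = np.linspace(d.min(), d.max(), 1000)

# Assumption: the estimator wants shape (n_samples, n_features),
# so each 1-D array is turned into a single-feature column.
kde = KernelDensity(kernel='gaussian', bandwidth=0.08804).fit(d[:, np.newaxis])

# score_samples returns the log-density; exponentiate to get the PDF estimate
log_density = kde.score_samples(xgrid[:, np.newaxis])
pdf = np.exp(log_density)
```

If that assumption holds, `pdf` has one value per grid point and can be plotted directly against `xgrid`.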