Questions tagged [kernel-density]

Kernel density estimation is a non-parametric way to estimate the probability density function of a random variable.

Kernel density estimation is a fundamental data smoothing problem in which inferences about the population are made from a finite data sample. Kernel density estimates are closely related to histograms, but can be endowed with properties such as smoothness or continuity by using a suitable kernel.
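As a rough illustration of the description above, here is a minimal sketch of a one-dimensional Gaussian kernel density estimate in plain Python. It assumes the usual form of the estimator, the average of kernels centred on each sample and scaled by a bandwidth h. The function name `make_kde`, the sample values, and the bandwidth of 0.5 are made up for the example and do not come from any particular library.

```python
import math


def make_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian KDE built from `samples`.

    Hypothetical helper for illustration: the estimate at x is
    (1 / (n * h)) * sum(K((x - x_i) / h)) with K the standard normal pdf.
    """
    n = len(samples)
    # Normalising constant: 1 / (n * h * sqrt(2 * pi))
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Made-up data and bandwidth, chosen only for the example.
samples = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.5]
estimate = make_kde(samples, bandwidth=0.5)

# Check that the estimate behaves like a density: a Riemann sum over a
# wide grid should be close to 1.
step = 0.01
grid = [-6.0 + i * step for i in range(1201)]
area = sum(estimate(x) for x in grid) * step
print(f"density at 0.0: {estimate(0.0):.4f}, area under curve: {area:.4f}")
```

A larger bandwidth gives a smoother curve and a smaller one follows the individual samples more closely. Library implementations typically choose or expose the bandwidth in their own ways, so results from different packages need not match this sketch.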

http://en.wikipedia.org/wiki/Kernel_density_estimation

656 questions
6
votes
3 answers

How to normalize kde of scikit learn?

Let's say I have an array of shape (100000,1), representing samples of a variable X uniformly distributed between 0 and 1. I want to approximate the probability density of this variable, and I use Scikit-Learn KernelDensity to do that. The…
6
votes
3 answers

Comparing Kernel Density Estimation plots

I am a novice to R and stats. Could something like this be done in R: determining the density estimates of two samples (2 vectors)? I have done this using R and obtained 2 density curves for the 2 samples using kernel density…
Pradeep
  • 555
  • 8
  • 14
6
votes
1 answer

Is it possible to sample from a conditional density in R given some conditional data?

In R, using the np package, I have created the bandwidths for a conditional density. What I would like to do is, given some new conditional vector, sample from the resulting distribution. Current code: library('np') # Generate some test…
gdoug
  • 715
  • 1
  • 5
  • 16
6
votes
2 answers

Embedding Seaborn plot in WxPython panel

I would like to ask how I could embed a seaborn figure in wxPython panel. Similarly to this post, I want to embed an external figure in a wxPython panel. I would like a specific panel of my wxPython GUI to plot the density contours of my data based…
user_jt
  • 259
  • 4
  • 15
6
votes
2 answers

PDF estimation in Scikit-Learn KDE

I am trying to compute a PDF estimate from a KDE computed using the scikit-learn module. I have seen 2 variants of scoring and I am trying both: Statement A and B below. Statement A results in the following error: AttributeError: 'KernelDensity' object has no…
mlworker
  • 281
  • 3
  • 9
6
votes
1 answer

Relation between sigma and bandwidth in gaussian_filter and gaussian_kde

Applying the functions scipy.ndimage.filters.gaussian_filter and scipy.stats.gaussian_kde over a given set of data can give very similar results if the sigma and bw_method parameters in each function respectively are chosen adequately. For example,…
Gabriel
  • 40,504
  • 73
  • 230
  • 404
6
votes
1 answer

ggplot2 - Modify geom_density2d to accept weights as a parameter?

This is my first post to the R-community, so pardon me if it is silly. I would like to use the functions geom_density2d and stat_density2d in ggplot2 to plot kernel density estimates, but the problem is that they can't handle weighted data. From…
6
votes
0 answers

Kernel density estimation in C++

I am trying to use Kernel Density Estimation (KDE) to compute the pdf of d-dimensional sample data points. I have read the wiki page, which cites the library libAGF. However, this site has no examples or tutorials. I am reluctant to write the…
Aly
  • 15,865
  • 47
  • 119
  • 191
5
votes
3 answers

Kernel Density Estimation using scipy's gaussian_kde and sklearn's KernelDensity leads to different results

I created some data from two superposed normal distributions and then applied sklearn.neighbors.KernelDensity and scipy.stats.gaussian_kde to estimate the density function. However, using the same bandwidth (1.0) and the same kernel, both methods…
akra1
  • 144
  • 1
  • 1
  • 12
5
votes
0 answers

Obtaining the max density coordinates of a Seaborn jointplot

Given the following sample script: import seaborn as sns import pandas as pd import numpy as np # Generate some random multivariate data x, y = np.random.RandomState(8).multivariate_normal([0, 0], [(1, 0), (0, 1)], 1000).T # Add to a dataframe df =…
Dman2
  • 700
  • 4
  • 10
5
votes
1 answer

How to fit a curve to a histogram

I've explored similar questions asked about this topic but I am having some trouble producing a nice curve on my histogram. I understand that some people may see this as a duplicate but I haven't found anything currently to help solve my…
Brandon
  • 153
  • 1
  • 6
5
votes
1 answer

Estimating a probability distribution and sampling from it in Julia

I am trying to use Julia to estimate a continuous univariate distribution using N observed data points (stored as an array of Float64 numbers), and then sample from this estimated distribution. I have no prior knowledge restricting attention to some…
Chai
  • 53
  • 3
5
votes
1 answer

What is _passthrough_scorer and How Can I Change Scorers in GridsearchCV (sklearn)?

http://scikit-learn.org/stable/modules/generated/sklearn.grid_search.GridSearchCV.html (for reference) x = [[2], [1], [3], [1] ... ] # about 1000 data grid = GridSearchCV(KernelDensity(), {'bandwidth': np.linspace(0.1, 1.0, 10)},…
user5790923
5
votes
1 answer

ggplot2 density of circular data

I have a data set where x represents day of year (say birthdays) and I want to create a density graph of this. Further, since I have some grouping information (say boys or girls), I want to use the capabilities of ggplot2 to make a density…
mbarete
  • 399
  • 2
  • 17
5
votes
1 answer

inaccurate range of violin plots in seaborn

For some reason, the range of the plot is not accurate. There are no negative values in my data, yet when I set the range to -100 to 100, some portion of the distribution falls below the 0 mark.