implementation of transformed KDE in R

Question

I am following the steps at the end of this post to implement a transformed kernel density estimate (KDE) on the bounded support [0,+inf[. The transformation trick avoids the boundary bias that the ordinary KDE suffers on a bounded support (here, near zero). Basically, the KDE assigns weight to the region outside the support, where no observations can exist, so it severely underestimates the PDF at the boundary (as the figure below shows).
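For reference, the change-of-variables identity that the transformation trick relies on can be written as follows. The notation (X for the original observations, Y for their logs, hats for KDE estimates) is assumed here and may differ from the symbols used in the linked post.

```latex
Y = \log X, \qquad
f_X(x) = f_Y(\log x)\,\left|\frac{d}{dx}\log x\right| = \frac{f_Y(\log x)}{x}, \quad x > 0,
\qquad\text{so}\qquad
\hat f_X(x) = \frac{\hat f_Y(\log x)}{x}.
```

The key point is that the estimate of f_Y has to be evaluated at log x for the same x that appears in the denominator.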

1) Regular approach (we observe the undesirable boundary bias of the KDE near zero)

# sample from exponential distribution
obs=rexp(5e2)
hist(obs,freq=FALSE)
k=density(obs)
lines(k$x,k$y)

2) Transformation approach

# 1) log transform the obs
pseudo.obs=log(obs)
# 2) estimate the density of the pseudo obs with KDE
pseudo.k=density(pseudo.obs,n=length(obs))
# 3) estimate the density of the original obs
t.density=pseudo.k$y/obs
# plot estimation
lines(obs,t.density)

Instead of getting something similar to the blue line below as I should

I'm getting this horrible thing

I guess you should use something like `pseudo.k$x` and not `obs` to plot `t.density`. — , Sep 30 '15 at 09:03
well, I am estimating the distribution of the pseudo obs with a KDE and then dividing by the original values, which seems to be faithful to the formula above... — Antoine, Sep 30 '15 at 09:15
`pseudo.k$x` won't work because it deals with the transformed space, whereas we want a plot in the original space — Antoine, Sep 30 '15 at 09:24
I just gave you a hint. `obs` is not the correct space either, if I am not mistaken. — , Sep 30 '15 at 09:27
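A minimal sketch of what the hint in the comments seems to point at (an interpretation, not the commenter's code): `density()` returns `y` values evaluated on its own equally spaced grid `pseudo.k$x`, which lives on the log scale, so each `pseudo.k$y[i]` pairs with `exp(pseudo.k$x[i])` in the original space and not with `obs[i]`. The names `x.orig` and the seed are introduced here for illustration only.

```r
set.seed(1)                       # seed chosen only for reproducibility
obs <- rexp(5e2)
pseudo.k <- density(log(obs))     # KDE of the log-transformed sample

# pseudo.k$y[i] is the estimate at pseudo.k$x[i] (log scale),
# so the matching abscissa in the original space is exp(pseudo.k$x[i])
x.orig <- exp(pseudo.k$x)
t.density <- pseudo.k$y / x.orig  # f_X(x) = f_Y(log x) / x

hist(obs, freq = FALSE)
lines(x.orig, t.density, col = "blue")
```

Because the grid is equally spaced on the log scale, the back-transformed points are dense near zero and sparse in the right tail, which is usually acceptable for plotting.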

score 0 · Answer 1 · answered Sep 30 '15 at 19:08

I could use a KDE on my stupidity without using any transformation, because it is unbounded. Here is some code that works:

# everything before is the same
# 2) estimate the density of the pseudo obs with KDE and interpolate it
pseudo.k=approxfun(density(pseudo.obs))
# 3) estimate the density of the original obs on a grid in the original space
grid=seq(min(obs),max(obs),length.out=500)
# change of variables: f_X(x) = f_Y(log(x))/x (vectorized, no loop needed)
t.density=pseudo.k(log(grid))/grid
# plot result
lines(grid,t.density,col="red")
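The same idea can be wrapped in a small reusable function and checked against the true density, which is known here because the sample is exponential. This is only a sketch: `tkde` is a hypothetical helper name, the seed is arbitrary, and setting `yleft = 0, yright = 0` (so that points outside the KDE grid get density 0 instead of `NA`) is an assumption that the code above does not make.

```r
# Hypothetical helper (not from the post): log-transformed KDE for positive data
tkde <- function(obs, x) {
  stopifnot(all(obs > 0), all(x > 0))
  # interpolate the KDE of log(obs); return 0 outside the grid used by density()
  pseudo.k <- approxfun(density(log(obs)), yleft = 0, yright = 0)
  pseudo.k(log(x)) / x            # f_X(x) = f_Y(log x) / x
}

set.seed(42)                      # seed chosen only for reproducibility
obs <- rexp(5e2)
grid <- seq(min(obs), max(obs), length.out = 500)
est <- tkde(obs, grid)

hist(obs, freq = FALSE)
lines(grid, est, col = "red")
curve(dexp(x), add = TRUE, lty = 2)  # true density, for visual comparison
```

Near zero the estimate can still be noisy, since a single very small observation is divided by a very small x, so the overlay with `dexp` is a useful sanity check, not a guarantee.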
