What goes on behind kdeplot is that a kernel density estimate is fitted by placing a little normal density on every observation (see this illustration), and the kernels sitting near the truncation cutoff spill over it.
Using some example data:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy.stats import norm
np.random.seed(999)
data = pd.DataFrame({'a': np.random.exponential(0.3, 100),
                     'b': np.random.exponential(0.5, 100)})
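To see the spill-over concretely, here is a small sketch of my own (it assumes the Gaussian kernel and the bandwidth that statsmodels fits by default): the KDE is just the average of normal densities centered at the observations, so it is strictly positive below 0:
kde = sm.nonparametric.KDEUnivariate(data['a'])
kde.fit()
# hand-rolled Gaussian KDE: average of normal densities, one per point
xs = np.linspace(-0.5, 2, 500)
manual = np.mean([norm.pdf(xs, loc=xi, scale=kde.bw) for xi in data['a']], axis=0)
print(manual[xs < 0].max())  # > 0: the kernels near 0 spill past the cutoff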
If you use the clip= argument, it doesn't stop the evaluation at negative values:
for i in data.columns:
    # clip only drops observations outside the range; the evaluation
    # grid still extends below 0, so the curve keeps spilling over
    ax = sns.kdeplot(data[i], shade=True, gridsize=200, clip=(0, np.inf))

If you add cut=0, it will look odd. As you pointed out, you can truncate it at 0:
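For reference, that plot would come from the same loop with cut=0 (my reconstruction, since the original snippet isn't shown):
for i in data.columns:
    # cut=0 stops the grid at the data extremes, but the lost mass
    # below 0 is not redistributed, so the curve is abruptly chopped
    ax = sns.kdeplot(data[i], shade=True, gridsize=200, cut=0)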

There are two solutions proposed in this post on Cross Validated. The idea behind @whuber's approach is to give each kernel a weight equal to the inverse of its mass above the cutoff, so the mass it would leak below 0 is put back, and then to set the estimated density below 0 to zero. Here is a Python implementation of the R code provided by @whuber:
def trunc_dens(x):
    # fit an ordinary KDE first, just to get the bandwidth h
    kde = sm.nonparametric.KDEUnivariate(x)
    kde.fit()
    h = kde.bw
    # weight each kernel by the inverse of its mass above 0, so the
    # mass that would spill below the cutoff is put back
    w = 1 / (1 - norm.cdf(0, loc=x, scale=h))
    # refit with those weights (statsmodels needs fft=False for weights)
    d = sm.nonparametric.KDEUnivariate(x)
    d = d.fit(bw=h, weights=w / len(x), fft=False)
    d_support = d.support
    d_dens = d.density
    # zero out whatever is left below the cutoff
    d_dens[d_support < 0] = 0
    return d_support, d_dens
We can check how it looks for data['a']:
kde = sm.nonparametric.KDEUnivariate(data['a'])
kde.fit()
plt.plot(kde.support, kde.density)  # plain KDE, spills below 0
_x, _y = trunc_dens(data['a'])
plt.plot(_x, _y)                    # truncated, re-weighted KDE

You can plot it for both columns:
fig, ax = plt.subplots()
for i in data.columns:
    _x, _y = trunc_dens(data[i])
    ax.plot(_x, _y)
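As a quick sanity check of my own, you can integrate each truncated curve with numpy's trapezoidal rule; the re-weighting should keep the total mass close to 1:
for i in data.columns:
    _x, _y = trunc_dens(data[i])
    # area under the truncated density; should land near 1 for both columns
    print(i, np.trapz(_y, _x))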
