The bw_method= parameter (called bw= in older seaborn versions) is passed directly to scipy.stats.gaussian_kde. The docs there say: "If a scalar, this will be used directly as kde.factor". The explanation of kde.factor reads: "The square of kde.factor multiplies the covariance matrix of the data in the kde estimation." So it is a scaling factor relative to the spread of the data, not a bandwidth in data units: for 1D data, the standard deviation of each kernel is kde.factor times the standard deviation of the data. If more detail is needed, you could dive into scipy's source code, or into the research papers referenced in the docs.
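To make the quoted docs concrete, here is a small check of what a scalar bw_method does. It is only a sketch: the sample data, the seed and the value 0.3 are arbitrary choices for illustration, and it assumes scipy computes the data covariance with the unbiased (n-1) estimator, which is what np.cov does by default.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)  # arbitrary seed, illustration only
data = rng.normal(loc=5.0, scale=2.0, size=200)

factor = 0.3  # hypothetical scalar passed as bw_method
kde = gaussian_kde(data, bw_method=factor)

# For 1D input, kde.covariance is a 1x1 matrix holding the kernel variance.
kernel_var = kde.covariance[0, 0]
sample_var = np.var(data, ddof=1)  # unbiased sample variance (assumption, see above)

print("kde.factor:", kde.factor)
print("kernel std:", np.sqrt(kernel_var))
print("factor * data std:", factor * np.sqrt(sample_var))
```

The last two printed values should agree, which is why the same bw_method gives a wider curve for widely spread data and a narrower one for tightly clustered data.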
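If you really want to counter the scaling, you could divide it away: sns.kdeplot(np.array(data), ..., bw_method=0.01/np.std(data, ddof=1)). The ddof=1 is there because scipy, as far as I can tell, uses the unbiased (n-1) estimate of the data covariance; with the default np.std(data) the resulting bandwidth is close to 0.01 but slightly off, most noticeably for very small samples.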
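As a sanity check of the "divide it away" trick, the following sketch compares scipy's gaussian_kde against a plain sum of Gaussians with a fixed sigma in data coordinates. The helper manual_kde and the sample values are made up for this illustration; the ddof=1 reflects the assumption that scipy uses the unbiased covariance estimate.

```python
import numpy as np
from scipy.stats import gaussian_kde

def manual_kde(xs, data, sigma):
    # Hypothetical helper: one Gaussian of width sigma per data point, averaged.
    z = (xs[:, None] - data[None, :]) / sigma
    return np.exp(-z ** 2 / 2).sum(axis=1) / (len(data) * sigma * np.sqrt(2 * np.pi))

data = np.array([1.0, 2.0, 2.4, 2.5, 3.0])  # arbitrary sample values
sigma = 0.03                                 # desired bandwidth in data units
xs = np.linspace(0, 4, 300)

# Divide the desired bandwidth by the data's standard deviation so that
# scipy's internal scaling cancels out.
scipy_kde = gaussian_kde(data, bw_method=sigma / np.std(data, ddof=1))

matches = np.allclose(scipy_kde(xs), manual_kde(xs, data, sigma))
print("scipy and manual KDE agree:", matches)
```

If the two curves agree, passing the same bw_method expression to sns.kdeplot should give kernels of width sigma in data units.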
Or you could create your own version of a Gaussian KDE, with a bandwidth in data coordinates. It just sums one Gaussian curve per data point and normalizes by dividing by the number of curves, so that the total area under the curve is 1.
Here is some example code, with kde curves for 1, 2 or 20 input points:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

def gauss(x, mu=0.0, sigma=1.0):
    return np.exp(-((x - mu) / sigma) ** 2 / 2) / (sigma * np.sqrt(2 * np.pi))

def kde(xs, data, sigma=1.0):
    return gauss(xs.reshape(-1, 1), data.reshape(1, -1), sigma).sum(axis=1) / len(data)

sns.set()
sigma = 0.03
xs = np.linspace(0, 4, 300)
fig, ax = plt.subplots(figsize=(12, 5))
data1 = np.array([1, 2])
kde1 = kde(xs, data1, sigma=sigma)
ax.plot(xs, kde1, color='crimson', label=f'dist of 1, σ={sigma}')
ax.fill_between(xs, kde1, color='crimson', alpha=0.3)
data2 = np.array([2.4, 2.5])
kde2 = kde(xs, data2, sigma=sigma)
ax.plot(xs, kde2, color='dodgerblue', label=f'dist of 0.1, σ={sigma}')
ax.fill_between(xs, kde2, color='dodgerblue', alpha=0.3)
data3 = np.array([3])
kde3 = kde(xs, data3, sigma=sigma)
ax.plot(xs, kde3, color='limegreen', label=f'1 point, σ={sigma}')
ax.fill_between(xs, kde3, color='limegreen', alpha=0.3)
data4 = np.random.normal(0.01, 0.1, 20).cumsum() + 1.1
kde4 = kde(xs, data4, sigma=sigma)
ax.plot(xs, kde4, color='purple', label=f'20 points, σ={sigma}')
ax.fill_between(xs, kde4, color='purple', alpha=0.3)
ax.margins(x=0) # remove superfluous whitespace left and right
ax.set_ylim(ymin=0) # let the plot "sit" onto y=0
ax.legend()
plt.show()
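To check the normalization claim (total area under the curve equal to 1), you could integrate the hand-made KDE numerically. This is a minimal sketch that repeats the two helper functions so it runs on its own; the grid density and the hand-written trapezoid rule are choices made for this illustration.

```python
import numpy as np

def gauss(x, mu=0.0, sigma=1.0):
    return np.exp(-((x - mu) / sigma) ** 2 / 2) / (sigma * np.sqrt(2 * np.pi))

def kde(xs, data, sigma=1.0):
    return gauss(xs.reshape(-1, 1), data.reshape(1, -1), sigma).sum(axis=1) / len(data)

sigma = 0.03
xs = np.linspace(0, 4, 4001)  # fine grid, so the narrow peaks are well resolved

areas = {}
for name, data in [("2 points", np.array([1.0, 2.0])), ("1 point", np.array([3.0]))]:
    ys = kde(xs, data, sigma=sigma)
    # Trapezoid rule written out by hand, to stay independent of the NumPy version.
    areas[name] = np.sum((ys[1:] + ys[:-1]) / 2 * np.diff(xs))
    print(name, "area:", areas[name])
```

Both areas should come out very close to 1, no matter how many points go into the KDE. That is also why the single-point curve in the plot is twice as high as the peaks of the two-point curve with a distance of 1.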
