What is the unit of the y-axis when using distplot to plot a histogram? I have plotted different histograms together with a normal fit and I see that in one case, it has a range of 0 to 0.9 while in another a range of 0 to 4.5.

Asked by Harry (edited by nbro)

1 Answer

From help(sns.distplot):

norm_hist: bool, optional. If True, the histogram height shows a density rather than a count. This is implied if a KDE or fitted density is plotted.

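The y-axis is a probability density, so its unit is the inverse of the x-axis unit. The histogram is scaled so that its total area (bar height times bin width, summed over all bins) is 1. An individual bar is therefore not limited to a height of 1: when the bins are narrower than 1 on the x-axis, bars can rise well above 1, which is how one plot can top out near 0.9 and another near 4.5. Note that kde is True by default and overrides norm_hist, so norm_hist changes the y-units only if you explicitly set kde to False: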

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# 40 random integers in [0, 20)
data = np.random.randint(0, 20, 40)
fig, axs = plt.subplots(figsize=(6, 6), ncols=2, nrows=2)

# Rows toggle kde, columns toggle norm_hist.
# Note: distplot is deprecated since seaborn 0.11.
for row in (0, 1):
    for col in (0, 1):
        sns.distplot(data, kde=bool(row), norm_hist=bool(col), ax=axs[row, col])

axs[0, 0].set_ylabel('NO kernel density')
axs[1, 0].set_ylabel('KDE on')
axs[1, 0].set_xlabel('norm_hist=False')
axs[1, 1].set_xlabel('norm_hist=True')
plt.show()

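[Figure: 2×2 grid of distplot histograms. Rows: no kernel density (top), KDE on (bottom). Columns: norm_hist=False (left), norm_hist=True (right).]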
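To check the "area is 1" normalization numerically, here is a minimal sketch that uses numpy.histogram with density=True as a stand-in for the density scaling. That substitution is an assumption about what distplot does internally, and the two samples and the density_summary helper are hypothetical, chosen only to illustrate the effect of the x scale.

```python
import numpy as np

def density_summary(sample, bins=20):
    """Return (tallest bar height, total area) of a density histogram."""
    heights, edges = np.histogram(sample, bins=bins, density=True)
    widths = np.diff(edges)  # bin widths, in x-axis units
    return float(heights.max()), float(np.sum(heights * widths))

rng = np.random.default_rng(0)

# Hypothetical samples: same shape, different spread along x.
wide = rng.normal(loc=0.0, scale=1.0, size=1000)
narrow = rng.normal(loc=0.0, scale=0.1, size=1000)

for name, sample in (("wide", wide), ("narrow", narrow)):
    tallest, area = density_summary(sample)
    print(f"{name}: tallest bar = {tallest:.2f}, total area = {area:.2f}")
```

Both samples should report a total area of 1, while the tallest bar of the narrow sample comes out well above 1 because its bins are much narrower. The heights are per unit of x, not fractions of the dataset.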

Answered by cphlewis (edited by nbro)
  • This is helpful but I think it would be good to make explicit the idea that a density is scaled so that the area under the curve is 1. – mwaskom Apr 22 '15 at 18:23
  • No problem, Harry. Check it off as done if it answers your question. – cphlewis Apr 23 '15 at 04:46