19

Hi, I want to draw a histogram with a boxplot on top of it, showing Q1, Q2 and Q3 as well as the outliers, like the example image. (I am using Python and pandas.)

I have checked several examples using matplotlib.pyplot but could not find a good one. I would also like the histogram to show a density curve, as in the example image.

I also tried seaborn, which gave me the density curve along with the histogram, but I could not find a way to put a boxplot above it.

Can anyone help me do this with matplotlib.pyplot?

Isura Nirmal

4 Answers

35
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

sns.set(style="ticks")

x = np.random.randn(100)

f, (ax_box, ax_hist) = plt.subplots(2, sharex=True, 
                                    gridspec_kw={"height_ratios": (.15, .85)})

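# pass the data as x= so the box is drawn horizontally, aligned with the histogram
sns.boxplot(x=x, ax=ax_box)
# distplot draws the histogram plus a density curve (deprecated in newer seaborn, see the update below)
sns.distplot(x, ax=ax_hist)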

ax_box.set(yticks=[])
sns.despine(ax=ax_hist)
sns.despine(ax=ax_box, left=True)



sns.distplot has been deprecated since seaborn v0.11. Use sns.histplot for axes-level plots instead.

np.random.seed(2022)
x = np.random.randn(100)

f, (ax_box, ax_hist) = plt.subplots(2, sharex=True, gridspec_kw={"height_ratios": (.15, .85)})

sns.boxplot(x=x, ax=ax_box)
sns.histplot(x=x, bins=12, kde=True, stat='density', ax=ax_hist)

ax_box.set(yticks=[])
sns.despine(ax=ax_hist)
sns.despine(ax=ax_box, left=True)
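
Since the question asks about Q1, Q2, Q3 and the outliers, here is a minimal sketch of how those values can be computed directly, which may help when checking the boxplot. It assumes the default whisker rule of 1.5 times the interquartile range; the variable names and the seed are illustrative and not part of the answer above.

```python
import numpy as np

rng = np.random.default_rng(2022)  # seed chosen only for reproducibility
x = rng.standard_normal(100)       # illustrative sample data

# Q1, Q2 (the median) and Q3: the left edge, centre line and right edge of the box
q1, q2, q3 = np.percentile(x, [25, 50, 75])
iqr = q3 - q1

# assumed default whisker rule (1.5 * IQR): points beyond these fences are drawn as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = x[(x < lower_fence) | (x > upper_fence)]

print(f"Q1={q1:.3f}, Q2={q2:.3f}, Q3={q3:.3f}")
print(f"{outliers.size} point(s) outside [{lower_fence:.3f}, {upper_fence:.3f}]")
```

If a different `whis` value is passed to the boxplot call, the factor 1.5 in the sketch would need to change accordingly.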


Trenton McKinney
mwaskom
5

Solution using only matplotlib, just because:

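import numpy as np
import matplotlib.pyplot as plt

# sample data for illustration only (assumption: any 1-d numeric array works)
data = np.random.randn(100)

# start the plot: 2 rows, because we want the boxplot on the first row
# and the hist on the second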
fig, ax = plt.subplots(
    2, figsize=(7, 5), sharex=True,
    gridspec_kw={"height_ratios": (.3, .7)}  # the boxplot gets 30% of the vertical space
)

# the boxplot
ax[0].boxplot(data, vert=False)

# removing borders
ax[0].spines['top'].set_visible(False)
ax[0].spines['right'].set_visible(False)
ax[0].spines['left'].set_visible(False)

# the histogram
ax[1].hist(data)

# and we are good to go
plt.show()
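
The question also asks for a density curve over the histogram, which the matplotlib-only code above does not draw. One possible way to add it without seaborn is sketched below: a hand-written Gaussian kernel density estimate in NumPy. The helper `gaussian_kde_curve`, the Scott's-rule bandwidth, the sample data and the output filename are all assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
data = rng.standard_normal(200)  # illustrative sample data

def gaussian_kde_curve(sample, grid):
    """Gaussian kernel density estimate of `sample`, evaluated at each point of `grid`."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    # Scott's rule of thumb for the bandwidth in one dimension
    bandwidth = sample.std(ddof=1) * n ** (-1 / 5)
    z = (grid[:, None] - sample[None, :]) / bandwidth
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))

fig, ax = plt.subplots(
    2, figsize=(7, 5), sharex=True,
    gridspec_kw={"height_ratios": (.3, .7)}
)
ax[0].boxplot(data, vert=False)

# density=True puts the bars on the same scale as the density curve
ax[1].hist(data, bins=12, density=True, alpha=0.5)
grid = np.linspace(data.min() - 1, data.max() + 1, 400)
density = gaussian_kde_curve(data, grid)
ax[1].plot(grid, density)

fig.savefig("hist_box_kde.png")  # or plt.show() in an interactive session
```

If SciPy is available, `scipy.stats.gaussian_kde` would likely be a more robust choice than the hand-rolled helper.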
erickfis
2

Expanding on the answer from @mwaskom, I made a little adaptable function.

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

def histogram_boxplot(data, xlabel=None, title=None, font_scale=2, figsize=(9, 8), bins=None):
    """ Boxplot and histogram combined
    data: 1-d data array
    xlabel: xlabel
    title: title
    font_scale: the scale of the font (default 2)
    figsize: size of fig (default (9,8))
    bins: number of bins (default None / auto)

    example use: histogram_boxplot(np.random.rand(100), bins=20, title="Fancy plot")
    """

    sns.set(font_scale=font_scale)
    f2, (ax_box2, ax_hist2) = plt.subplots(2, sharex=True, gridspec_kw={"height_ratios": (.15, .85)}, figsize=figsize)
    sns.boxplot(x=data, ax=ax_box2)
    # histplot replaces the deprecated distplot; kde=True keeps the density curve
    sns.histplot(x=data, ax=ax_hist2, bins=bins if bins else "auto", kde=True, stat="density")
    if xlabel:
        ax_hist2.set(xlabel=xlabel)
    if title:
        ax_box2.set(title=title)
    plt.show()

histogram_boxplot(np.random.randn(100), bins=20, title="Fancy plot", xlabel="Some values")


-1
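import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

def histogram_boxplot(feature, figsize=(15, 10), bins=None):
    f, (ax_box, ax_hist) = plt.subplots(nrows=2, sharex=True,
                                        gridspec_kw={'height_ratios': (.25, .75)}, figsize=figsize)
    # histogram of counts (histplot replaces the deprecated distplot)
    sns.histplot(x=feature, kde=False, ax=ax_hist, bins=bins if bins is not None else 'auto')
    sns.boxplot(x=feature, ax=ax_box, color='red')
    # mark the mean (green, solid) and the median (yellow, dashed)
    ax_hist.axvline(np.mean(feature), color='g', linestyle='-')
    ax_hist.axvline(np.median(feature), color='y', linestyle='--')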
Gino Mempin
  • Please repair your code formatting and provide some context why your solution is preferable to the other answers provided. – DaveL17 May 16 '21 at 20:18