
How can I create a boxplot like this one using the Bokeh library in Python?

import seaborn as sns

df = sns.load_dataset("titanic")
sns.boxplot(x=df["age"])


Ahmed Adel
    Does this answer your question? [how can I create a single box plot?](https://stackoverflow.com/questions/72059571/how-can-i-create-a-single-box-plot) – mosc9575 Oct 12 '22 at 14:32

1 Answer


Here is a solution using some random data as input:

import numpy as np
import pandas as pd

from bokeh.plotting import figure, output_notebook, show
output_notebook()

series = pd.Series(list(np.random.randint(0,60,100))+[101]) # one outlier added by hand

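Here is the math the boxplot is based on: the quartiles, the interquartile range (IQR), and the whisker limits at 1.5 × IQR beyond the first and third quartiles. Values outside those limits are treated as outliers. The mean is calculated as well, although the plots below do not draw it.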

qmin, q1, q2, q3, qmax = series.quantile([0, 0.25, 0.5, 0.75, 1])
iqr = q3 - q1
upper = q3 + 1.5 * iqr
lower = q1 - 1.5 * iqr
mean = series.mean()

out = series[(series > upper) | (series < lower)]

if not out.empty:
    outlier = list(out.values)

These calculations are the same for both the vertical and the horizontal boxplot.
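As a minimal sketch, the same calculations can be wrapped in a function so they are easy to reuse and check. The name `box_stats` and the `whisker` parameter are hypothetical and not part of the original answer; the whisker ends are clamped to the data range, as in the plotting code that follows.

```python
import pandas as pd


def box_stats(series: pd.Series, whisker: float = 1.5) -> dict:
    """Hypothetical helper: statistics needed to draw one box."""
    q1, q2, q3 = series.quantile([0.25, 0.5, 0.75])
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    # Points beyond the fences are reported as outliers.
    outliers = series[(series < lower_fence) | (series > upper_fence)]
    return {
        "q1": q1,
        "q2": q2,
        "q3": q3,
        # Whiskers never extend past the actual data range.
        "lower": max(series.min(), lower_fence),
        "upper": min(series.max(), upper_fence),
        "outliers": outliers.tolist(),
    }


stats = box_stats(pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

With this small input only the value 100 lies beyond the upper fence, so it is the only entry in `stats["outliers"]`.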

vertical boxplot

k = 'age'
p = figure(
    tools="save",
    x_range= [k], # enable categorical axes
    title="Boxplot",
    width=400,   # Bokeh 3 names; on older 2.x releases use plot_width/plot_height
    height=500,
)

upper = min(qmax, upper)
lower = max(qmin, lower)

hbar_height = (qmax - qmin) / 500

# stems
p.segment([k], upper, [k], q3, line_color="black")
p.segment([k], lower, [k], q1, line_color="black")

# boxes
p.vbar([k], 0.7, q2, q3, line_color="black")
p.vbar([k], 0.7, q1, q2, line_color="black")

# whiskers (almost-0 height rects simpler than segments)
p.rect([k], lower, 0.2, hbar_height, line_color="black")
p.rect([k], upper, 0.2, hbar_height, line_color="black")

if not out.empty:
    p.circle([k] * len(outlier), outlier, size=6, fill_alpha=0.6)

show(p)


horizontal boxplot

To create a horizontal boxplot, hbar is used instead of vbar, the categorical range moves from x_range to y_range, and the x and y arguments are swapped in the segments and the rects.

k = 'age'
p = figure(
    tools="save",
    y_range= [k],
    title="Boxplot",
    width=400,   # Bokeh 3 names; on older 2.x releases use plot_width/plot_height
    height=500,
)

upper = min(qmax, upper)
lower = max(qmin, lower)

hbar_height = (qmax - qmin) / 500

# stems
p.segment(upper, [k], q3, [k], line_color="black")
p.segment(lower, [k], q1, [k], line_color="black")

# boxes
p.hbar([k], 0.7, q2, q3, line_color="black")
p.hbar([k], 0.7, q1, q2, line_color="black")

# whiskers (almost-0 width rects simpler than segments)
# rect takes (x, y, width, height), so the thin dimension is now the width
p.rect(lower, [k], hbar_height, 0.2, line_color="black")
p.rect(upper, [k], hbar_height, 0.2, line_color="black")

if not out.empty:
    p.circle(outlier, [k] * len(outlier),  size=6, fill_alpha=0.6)

show(p)
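Since the two variants differ only in which axis is categorical, they could be folded into one function. The sketch below is one possible way to do that, not part of the original answer: `single_boxplot` and its `horizontal` flag are hypothetical names, it assumes a Bokeh version that accepts `width`/`height` on `figure`, and it uses `scatter` in place of `circle` for the outliers.

```python
import pandas as pd
from bokeh.plotting import figure


def single_boxplot(series, label="age", horizontal=False):
    """Hypothetical helper: one box, vertical unless horizontal=True."""
    q1, q2, q3 = series.quantile([0.25, 0.5, 0.75])
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    lower = max(series.min(), lo_fence)   # whiskers clamped to the data range
    upper = min(series.max(), hi_fence)
    outliers = series[(series < lo_fence) | (series > hi_fence)].tolist()
    cap = (series.max() - series.min()) / 500  # near-zero thickness of the caps

    # The categorical range goes on whichever axis carries the label.
    range_kw = {"y_range": [label]} if horizontal else {"x_range": [label]}
    p = figure(tools="save", title="Boxplot", width=400, height=500, **range_kw)

    two = [label] * 2
    if horizontal:
        # stems, boxes, whisker caps
        p.segment(x0=[lower, upper], y0=two, x1=[q1, q3], y1=two, line_color="black")
        p.hbar(y=[label], height=0.7, left=q1, right=q2, line_color="black")
        p.hbar(y=[label], height=0.7, left=q2, right=q3, line_color="black")
        p.rect(x=[lower, upper], y=two, width=cap, height=0.2, line_color="black")
        if outliers:
            p.scatter(outliers, [label] * len(outliers), size=6, fill_alpha=0.6)
    else:
        p.segment(x0=two, y0=[lower, upper], x1=two, y1=[q1, q3], line_color="black")
        p.vbar(x=[label], width=0.7, bottom=q1, top=q2, line_color="black")
        p.vbar(x=[label], width=0.7, bottom=q2, top=q3, line_color="black")
        p.rect(x=two, y=[lower, upper], width=0.2, height=cap, line_color="black")
        if outliers:
            p.scatter([label] * len(outliers), outliers, size=6, fill_alpha=0.6)
    return p
```

Calling `show(single_boxplot(series, horizontal=True))`, with `show` imported from `bokeh.plotting`, should render the horizontal variant; omitting the flag gives the vertical one.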


mosc9575