I want to make a violin plot of binned data and, at the same time, plot a model prediction so that I can see how well the model describes the main part of each data distribution. I think my problem is that the x-axis of a violin plot does not behave like a regular numeric axis: it treats the values as category labels (like strings) that just happen to be numbers. That may not be a good description, but in the example below I would like a "normal" plot of a function, e.g. f(x) = 2*x**2, with the violins drawn in the background at x=1, x=5.2, x=18.3 and x=27.

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

np.random.seed(10)
collectn_1 = np.random.normal(1, 2, 200)
collectn_2 = np.random.normal(802, 30, 200)
collectn_3 = np.random.normal(90, 20, 200)
collectn_4 = np.random.normal(70, 25, 200)

ys = [collectn_1, collectn_2, collectn_3, collectn_4]
xs = [1, 5.2, 18.3, 27]

sns.violinplot(x=xs, y=ys)

xx = np.arange(0, 30, 10)
plt.plot(xx, 2*xx**2)

plt.show()

For some reason this code plots only bars instead of violins; that happens only in this example, not in my original code. In my real code I want different "half-violins" on each side, so I use sns.violinplot(x="..", y="..", hue="..", data=.., split=True).

Fuegon
  • When trying to run your code with Seaborn 0.11, `violinplot` didn't seem to accept the format of `ys`. It expects `'long'` format, so with `ys = np.concatenate([collectn_1, collectn_2, collectn_3, collectn_4]); xs = np.repeat([1, 5.2, 18.3, 27], [len(collectn_1), len(collectn_2), len(collectn_3), len(collectn_4)])` it works. The weird scaling comes from the `scale=` parameter which defaults to `'area'` and scales the width of the 3 last violins to make their area equal to the first violin (which is very short). Setting `scale='width'` gives more standard-looking violins. – JohanC Nov 24 '20 at 22:06
  • Thanks, the behaviour really depends a lot on the used seaborn version. – Fuegon Nov 25 '20 at 10:30
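
The reshaping described in JohanC's comment can be sketched as follows. This is a minimal illustration, not part of the original thread: the helper name `to_long_format` is hypothetical, and the random draws use NumPy's `default_rng` rather than the question's `np.random.seed`, so the samples differ from the original ones. The Seaborn call is left commented out because, as noted above, its behaviour depends on the installed version; `scale="width"` is the parameter name from the Seaborn 0.11 release mentioned in the comment and may be named differently in later releases.

```python
import numpy as np


def to_long_format(groups, positions):
    """Flatten per-group samples into parallel 'long' x and y arrays.

    Hypothetical helper: each sample in groups[i] gets the x value positions[i].
    """
    y_long = np.concatenate(groups)
    x_long = np.repeat(positions, [len(g) for g in groups])
    return x_long, y_long


rng = np.random.default_rng(10)
groups = [
    rng.normal(1, 2, 200),
    rng.normal(802, 30, 200),
    rng.normal(90, 20, 200),
    rng.normal(70, 25, 200),
]
xs = [1, 5.2, 18.3, 27]

x_long, y_long = to_long_format(groups, xs)

# Assumed usage with the Seaborn 0.11 API described in the comment:
# import seaborn as sns
# sns.violinplot(x=x_long, y=y_long, scale="width")
```

This only addresses the "bars instead of violins" symptom. The x values are still treated as categories, so the violins are spaced evenly rather than at x=1, 5.2, 18.3 and 27.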

1 Answer

I think that would be hard to do with seaborn because it does not provide an easy way to manipulate the artists that it creates, particularly if there are other things plotted on the same Axes. Matplotlib's violinplot allows setting the position of the violins, but does not provide an option for plotting only half violins. Therefore, I would suggest using statsmodels.graphics.boxplots.violinplot, which does both.

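import matplotlib.pyplot as plt
import matplotlib.ticker  # needed for MaxNLocator at the end
import numpy as np
import seaborn as sns
from statsmodels.graphics.boxplots import violinplot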

df = sns.load_dataset('tips')
x_col = 'day'
y_col = 'total_bill'
hue_col = 'smoker'

xs = [1, 5.2, 18.3, 27]
xx = np.arange(0, 30, 1)
yy = 0.1*xx**2
cs = ['C0','C1']

fig, ax = plt.subplots()

ax.plot(xx,yy)


for (_,gr0),side,c in zip(df.groupby(hue_col),['left','right'],cs):
    print(side)
    data = [gr1 for (_,gr1) in gr0.groupby(x_col)[y_col]]
    violinplot(ax=ax, data=data, positions=xs, side=side, show_boxplot=False, plot_opts=dict(violin_fc=c))

# violinplot above messes up which ticks are shown, the line below restores a sensible tick locator
ax.xaxis.set_major_locator(matplotlib.ticker.MaxNLocator())
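
For comparison, the Matplotlib option mentioned in the answer (full violins only, no half violins) might look like the sketch below. It is an illustration under assumptions rather than the answerer's code: the data are regenerated with `default_rng` using the means and standard deviations from the question, and `widths=2.0` is an arbitrary choice made so that the violins are visible on an axis spanning 0 to 30.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(10)
xs = [1, 5.2, 18.3, 27]
# Same distribution parameters as in the question, but different random draws
ys = [rng.normal(loc, sd, 200) for loc, sd in [(1, 2), (802, 30), (90, 20), (70, 25)]]

fig, ax = plt.subplots()

# positions= places each violin at its numeric x value, so the axis stays numeric
parts = ax.violinplot(ys, positions=xs, widths=2.0, showextrema=False)

# Model curve drawn on the same numeric axis
xx = np.linspace(0, 30, 200)
ax.plot(xx, 2 * xx**2, color="C1")
```

Because the x-axis stays numeric here, the curve and the violins share the same coordinates without any tick-locator fix.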

Diziet Asahi
  • Exactly what I was looking for. Thanks a lot! Interesting how many tools there are to plot violins :) – Fuegon Nov 25 '20 at 10:31