
I have a seaborn violin plot on the left, and matplotlib on the right.

As you can see, matplotlib cuts off part of the plot, and setting showextrema to True or False has no effect on this. How do I make the matplotlib violin plot keep those values?

import matplotlib.pyplot as plt
import seaborn as sns

a = [195.0, 245.0, 142.0, 237.0, 153.0, 238.0, 168.0, 145.0, 229.0, 138.0, 176.0, 116.0, 252.0, 148.0, 199.0, 162.0, 134.0, 163.0, 130.0, 339.0, 152.0, 208.0, 152.0, 192.0, 163.0, 249.0, 113.0, 176.0, 123.0, 189.0, 150.0, 207.0, 184.0, 153.0, 228.0, 153.0, 170.0, 118.0, 302.0, 197.0, 211.0, 159.0, 228.0, 147.0, 166.0, 156.0, 167.0, 147.0, 126.0, 155.0, 138.0, 159.0, 139.0, 111.0, 133.0, 134.0, 131.0, 156.0, 240.0, 207.0, 150.0, 207.0, 265.0, 151.0, 173.0, 157.0, 261.0, 186.0, 195.0, 158.0, 272.0, 134.0, 221.0, 131.0, 252.0, 148.0, 178.0, 206.0, 146.0, 217.0, 159.0, 190.0, 156.0, 172.0, 159.0, 141.0, 167.0, 168.0, 218.0, 191.0, 207.0, 164.0]

fig, axes = plt.subplots()

# Seaborn violin plot
sns.violinplot(data=a, width=0.6, color="w")

# Matplotlib violin plot
axes.violinplot(a, showmeans=True, showmedians=False, showextrema=False, widths=0.6)
axes.set_xticks([y+1 for y in range(2)])
plt.show()

[Image: seaborn violin plot (left) and matplotlib violin plot (right)]

Anderson
  • The minimum value in the list is 111. Why do you expect to have the plot go lower than that? Or why do you claim that data is removed? – ImportanceOfBeingErnest Aug 06 '17 at 18:06
  • @ImportanceOfBeingErnest Yes, that is true. I should clarify: why is the kernel density estimate cut off, and how do I allow it to extrapolate? – Anderson Aug 06 '17 at 18:27

1 Answer


A matplotlib violinplot evaluates the KDE only over the range of the input values, from the smallest to the largest observation. This is defined pretty deep in the code, so there is no easy option to change that.
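That said, if you want to stay in matplotlib, one possible workaround is to compute the KDE yourself on a wider grid and hand the result to Axes.violin, which draws a violin from precomputed statistics. The following is only a sketch under a few assumptions: extended_violin_stats is a hypothetical helper (not a matplotlib function), sample is just the first 12 values of a from the question, the bandwidth is whatever matplotlib.mlab.GaussianKDE picks by default (Scott's rule), and the cut parameter is borrowed from seaborn's naming.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.mlab import GaussianKDE

# First 12 values of `a` from the question, shortened to keep the example small.
sample = [195.0, 245.0, 142.0, 237.0, 153.0, 238.0,
          168.0, 145.0, 229.0, 138.0, 176.0, 116.0]


def extended_violin_stats(values, cut=2.0, points=100):
    """Hypothetical helper: build the stats dict that Axes.violin expects,
    with the KDE evaluated `cut` bandwidths beyond the data range."""
    values = np.asarray(values, dtype=float)
    kde = GaussianKDE(values)  # Scott's rule bandwidth by default
    # Standard deviation of the Gaussian kernel, in data units.
    bandwidth = float(np.sqrt(kde.covariance[0, 0]))
    coords = np.linspace(values.min() - cut * bandwidth,
                         values.max() + cut * bandwidth, points)
    return {
        "coords": coords,              # grid on which the KDE is evaluated
        "vals": kde.evaluate(coords),  # density at each grid point
        "mean": values.mean(),
        "median": np.median(values),
        "min": values.min(),
        "max": values.max(),
    }


stats = extended_violin_stats(sample, cut=2.0)

fig, ax = plt.subplots()
# Axes.violin draws from precomputed stats instead of raw data.
ax.violin([stats], positions=[1], widths=0.6,
          showmeans=True, showmedians=False, showextrema=False)

if __name__ == "__main__":
    plt.show()
```

With cut=0 the grid collapses back to the data range, so the result should look like the plain axes.violinplot output. Note that the "min" and "max" entries still hold the true data extremes, so extrema lines (if enabled) mark the data, not the end of the curve.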

In contrast, seaborn's violinplot gives you good control over the KDE range. By default, it extends the KDE curve by twice the KDE bandwidth beyond the extreme data points on each side. This is controlled by the cut argument, as in sns.violinplot(..., cut=2), which defaults to 2. If you set cut=0, the curve stops at the data extremes, and you obtain the same range as the matplotlib violinplot. Together with the option to manually choose the KDE bandwidth as a float, sns.violinplot(..., bw=0.2, cut=2), you have very good control over how the violinplot is displayed.
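To see the effect of cut, here is a minimal side-by-side sketch. It assumes sample is just the first 12 values of a from the question, and it leaves the bandwidth at seaborn's default, because the name of the bandwidth argument depends on the seaborn version (bw as written above; as far as I know, recent releases deprecate it in favor of bw_method and bw_adjust).

```python
import matplotlib.pyplot as plt
import seaborn as sns

# First 12 values of `a` from the question, shortened to keep the example small.
sample = [195.0, 245.0, 142.0, 237.0, 153.0, 238.0,
          168.0, 145.0, 229.0, 138.0, 176.0, 116.0]

fig, (ax_default, ax_cut0) = plt.subplots(1, 2, sharey=True)

# cut=2 (the default): the curve extends two bandwidths past the extremes.
sns.violinplot(y=sample, cut=2, width=0.6, color="w", ax=ax_default)
ax_default.set_title("cut=2")

# cut=0: the curve stops at the smallest and largest observations.
sns.violinplot(y=sample, cut=0, width=0.6, color="w", ax=ax_cut0)
ax_cut0.set_title("cut=0")

if __name__ == "__main__":
    plt.show()
```

The left panel should show the tails running past the data on both ends, while the right panel should end at 116 and 245, the extremes of this sample.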

In conclusion, just use the seaborn violinplot if you need fine-grained control over the range of the KDE curve.

ImportanceOfBeingErnest