
I am using `sns.distplot` with `hist=True` and `kde=True`. This works fine, but for some datasets (e.g. ones that contain only discrete values) the kernel density estimate line zig-zags, which looks very odd given that the histogram underneath is smooth. Manually adjusting the KDE bandwidth should fix this, but how can I set it for `sns.distplot`? The documentation does not mention it, and the `bw` parameter that works for `sns.kdeplot` does not exist there. How can I stop the zig-zagging?

Michael Baudin
lordy
  • Why don't you use `kdeplot` if that works for you? – ImportanceOfBeingErnest Aug 10 '18 at 10:33
  • Because it does not have the "hist=True" option – lordy Aug 10 '18 at 10:36
  • You may plot a `plt.hist` and a `kdeplot` in the same plot, or use the `kde_kws` keyword argument to set the bandwidth for the KDE curve on the `distplot`. – ImportanceOfBeingErnest Aug 10 '18 at 10:44
  • Yes, thanks! That is what I wanted. It works like a charm with `kde_kws={"bw": bandwidth}` – lordy Aug 10 '18 at 10:47
  • If the data is a sample from a discontinuous distribution, I think you should not use a general-purpose KDE. Of course, increasing the bandwidth would smooth out the zig-zags, but it would produce a very bad estimate of the distribution you are trying to approximate. Could you please print a sample with the same properties so we can see this more clearly? Please consider the following question: https://stackoverflow.com/questions/61797760/seaborn-kdeplot-not-enough-variation-in-data – Michael Baudin May 17 '20 at 14:10

1 Answer


You can set the desired bandwidth with the `bw` option, passed through the optional `kde_kws` parameter of seaborn's `distplot`.

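e.g.: `g = g.map(sns.distplot, "value", kde_kws={'bw': 0.1})` (here `g` is presumably a seaborn `FacetGrid`; calling `sns.distplot(data, kde_kws={'bw': 0.1})` directly takes the same argument).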
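To see why the bandwidth matters for discrete data, here is a minimal sketch in plain Python. It is not seaborn's implementation: `gaussian_kde`, `count_peaks`, the sample data and the two bandwidth values are all made up for illustration. It evaluates a simple Gaussian KDE on integer-valued data and counts the bumps in the resulting curve.

```python
import math


def gaussian_kde(samples, grid, bw):
    """Evaluate a Gaussian kernel density estimate with bandwidth `bw` on `grid`."""
    norm = len(samples) * bw * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((x - s) / bw) ** 2) for s in samples) / norm
        for x in grid
    ]


def count_peaks(ys):
    """Count strict local maxima (the 'zig-zags') in a curve."""
    return sum(1 for i in range(1, len(ys) - 1) if ys[i - 1] < ys[i] > ys[i + 1])


# Hypothetical discrete data: integer values 1..6 with different frequencies.
samples = [1] * 5 + [2] * 10 + [3] * 20 + [4] * 20 + [5] * 10 + [6] * 5
grid = [i / 100 for i in range(-200, 901)]  # evaluation points from -2.00 to 9.00

narrow = gaussian_kde(samples, grid, bw=0.1)  # bandwidth much smaller than the spacing
wide = gaussian_kde(samples, grid, bw=1.0)    # bandwidth comparable to the spacing

print(count_peaks(narrow))  # 6: one bump per distinct value
print(count_peaks(wide))    # 1: a single smooth hump
```

With a bandwidth well below the spacing between the distinct values, every value gets its own bump; with a larger one they merge into a single curve, which is the effect `kde_kws={'bw': ...}` has on the plot. As the comment above points out, a smoother curve is not necessarily a better estimate for discrete data. Also note that in seaborn 0.11 and later `distplot` is deprecated and `kdeplot` takes `bw_method` / `bw_adjust` instead of `bw`, so the exact keyword depends on your version.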

Ashutosh Kumar