
I am trying to understand the meaning of the `multiple` parameter in Seaborn's `kdeplot`. The following is taken from its documentation:

multiple{“layer”, “stack”, “fill”}

Method for drawing multiple elements when semantic mapping creates subsets. Only relevant with univariate data.

However, this doesn't help much, and the resulting plots look very different. I would appreciate it if someone could explain the options in more detail.

Here are the plots created by setting `multiple` to "layer", "stack", and "fill", respectively:

sns.displot(data=bg_vs_non_bg, multiple="layer", x="Value", hue="ClassName", kind="kde", col="Modality", log_scale=True, fill=True)

multiple="layer"

sns.displot(data=bg_vs_non_bg, multiple="stack", x="Value", hue="ClassName", kind="kde", col="Modality", log_scale=True)

multiple="stack"

sns.displot(data=bg_vs_non_bg, multiple="fill", x="Value", hue="ClassName", kind="kde", col="Modality", log_scale=True)

multiple="fill"

tdy
Melike

1 Answer


You can think of it like this:

| Option | Meaning | Explanation |
|---|---|---|
| layer | Original density | The densities are overlaid on each other, so the y-value just represents the original density of each curve. |
| stack | Stacked density | The densities are stacked on each other, so the y-value represents the stacked sum of the densities, i.e., the second curve's y-value is the sum of the first and second densities. |
| fill | Proportional density | The densities are normalized to sum to 1, so the y-value represents the proportional density of each curve relative to the others. |

Or in visual form:

visual comparison
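The three options can also be checked numerically. The following is a minimal sketch, not Seaborn's actual implementation: it uses a hand-rolled Gaussian KDE with a fixed bandwidth, and the `gaussian_kde` helper, the two hue groups, and the grid are all made up for illustration. The group-share scaling is meant to mirror the default `common_norm=True`, and the stacking order here is arbitrary and may differ from what Seaborn draws.

```python
import numpy as np

def gaussian_kde(samples, grid, bandwidth=0.3):
    """Plain Gaussian kernel density estimate on `grid` (hypothetical helper)."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    return kernels.mean(axis=1)

rng = np.random.default_rng(0)
groups = {  # made-up hue groups of unequal size
    "a": rng.normal(0.0, 1.0, size=300),
    "b": rng.normal(2.0, 1.0, size=100),
}
grid = np.linspace(-5, 7, 400)
n_total = sum(len(v) for v in groups.values())

# "layer": one curve per group, drawn on top of each other.
# Each density is scaled by its group's share of the data (assumed to
# correspond to common_norm=True), so the areas add up to about 1.
layer = np.array([gaussian_kde(v, grid) * len(v) / n_total for v in groups.values()])

# "stack": running sum across groups, so each curve sits on the previous ones.
stack = np.cumsum(layer, axis=0)

# "fill": divide by the total at every x, so the top curve is 1 everywhere.
fill = stack / stack[-1]

print(layer.max(axis=1))   # peak height of each original curve
print(stack[-1].max())     # peak height of the stacked total
print(fill[-1].min(), fill[-1].max())
```

In this sketch the last row of `fill` is 1 at every x, which is why a `multiple="fill"` plot always spans the full height of the axes.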

tdy
  • Thank you very much. Is there a reference for this? – Melike Mar 01 '23 at 23:46
  • @tdy You left out `common_norm=` which plays an essential role in this explanation. – JohanC Mar 01 '23 at 23:49
  • @JohanC Hmm does it? I thought `common_norm` changes the way each density is calculated, but the meaning of `multiple` stays the same either way. – tdy Mar 01 '23 at 23:56
  • @Melike I couldn't find a (detailed) explanation in the docs either. – tdy Mar 01 '23 at 23:58
  • For example, with `multiple='stack'` you only get the cumulative density when `common_norm=True` (the default). With `common_norm=False` you'd get N times the density (with N the number of hue values). The difference is especially striking with unbalanced data sets. With `multiple='layer'` you only get the original density when `common_norm=False`, otherwise you'll see the densities scaled down depending on the relative size of each hue group. – JohanC Mar 02 '23 at 00:06
  • Nice answer. My only nit is that "cumulative density" is a little confusing because that term is overloaded and would usually mean a cumulation of the density along the x axis. – mwaskom Mar 02 '23 at 01:40
  • @mwaskom Ah, good point. I just changed it to "stacked density." That's somewhat circular, but not sure if there's a better way to describe it unambiguously. – tdy Mar 02 '23 at 05:35
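
The `common_norm` point raised in the comments can be illustrated with a rough numerical sketch. It rests on assumptions: the `gaussian_kde` and `area` helpers, the fixed bandwidth, and the 900/100 split are invented for illustration, and the scaling by group share is only intended to mimic what `common_norm=True` does, not to reproduce Seaborn's code.

```python
import numpy as np

def gaussian_kde(samples, grid, bandwidth=0.3):
    """Plain Gaussian kernel density estimate on `grid` (hypothetical helper)."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    return kernels.mean(axis=1)

rng = np.random.default_rng(1)
# Deliberately unbalanced, made-up hue groups: 900 vs. 100 observations.
groups = [rng.normal(0.0, 1.0, size=900), rng.normal(3.0, 1.0, size=100)]
grid = np.linspace(-6, 9, 600)
dx = grid[1] - grid[0]
n_total = sum(len(g) for g in groups)

def area(curve):
    """Approximate area under a curve on the grid (simple Riemann sum)."""
    return float(curve.sum() * dx)

# Assumed analogue of common_norm=False: every density is normalized on its own.
independent = [gaussian_kde(g, grid) for g in groups]

# Assumed analogue of common_norm=True: scale each density by its group share.
shared = [d * len(g) / n_total for d, g in zip(independent, groups)]

print([round(area(d), 2) for d in independent])  # each close to 1
print([round(area(d), 2) for d in shared])       # close to the group shares
print(round(area(sum(independent)), 2))          # stacked total, close to N = 2
print(round(area(sum(shared)), 2))               # stacked total, close to 1
```

Under these assumptions, stacking independently normalized curves yields a total area of about N, while the shared normalization keeps the stacked total at about 1 and shrinks the smaller group's curve in a layered plot.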