
Using the seaborn Python library, I am trying to draw several overlapping density plots in the same figure, and I want each line to have its own color and label. With the seaborn objects interface I can draw the density plots inside a for loop, but I cannot add a color or label to each one.

I understand there are other ways to do this, e.g. building a DataFrame with all the data and the corresponding labels first and then passing it to seaborn's Plot(). But I was wondering whether the code below (using the seaborn objects interface) could work with some modifications. Please advise.

  • Plot using Seaborn objects

Code:

Here I am setting color=s_n, where s_n is the number of samples drawn from the normal distribution. I want to label each density plot with its number of samples (please also see the desired plot towards the end of the post).

import scipy.stats as st
import seaborn.objects as so

num_samples = 2000
normal_distr = st.norm(1,1)

sp = so.Plot()

for s_n in range(10,num_samples,400):
    sample_normal = normal_distr.rvs(s_n)
    sp = sp.add(so.Line(),so.KDE(),x=sample_normal,color=s_n)
sp.show()

The plot looks like this; it does not color or label each density line separately.

KDE plot without individual color for each density

  • Desired Plot

If I use seaborn's kdeplot directly, I get the desired plot (below). But I was wondering whether I can use seaborn objects instead of kdeplot.

Code using kdeplot:

import scipy.stats as st
import seaborn as sns
import matplotlib.pyplot as plt

num_samples = 2000
normal_distr = st.norm(1,1)

for s_n in range(10,num_samples,400):
    sample_normal = normal_distr.rvs(s_n)
    sns.kdeplot(x=sample_normal, label=s_n)    
plt.legend()

The (desired) plot:

KDE plot with individual color for each density

Trenton McKinney
rkmalaiya

1 Answer

I guess the trick here would be to prepare your df so that you can forgo the loop and use the color kwarg as it's meant to be used:

import scipy.stats as st
import seaborn.objects as so
import pandas as pd

num_samples = 2000
normal_distr = st.norm(1,1)

df = pd.concat([
    pd.DataFrame(
        {'sn': str(s_n),
        'values': normal_distr.rvs(s_n)}
        )
    for s_n in range(10,num_samples,400)
])

The resulting DataFrame looks like this:

        sn    values
0       10  0.976926
1       10 -0.501831
2       10  1.748071
3       10  0.968493
4       10  0.593531
...    ...       ...
1605  1610  0.311484
1606  1610  1.332424
1607  1610  1.531519
1608  1610  1.240953
1609  1610 -0.793144

Then plotting can be done in a single line:

so.Plot(df, x='values').add(so.Line(), so.KDE(common_norm=False), color='sn').show()
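For a reproducible variant of the two steps above, here is a minimal sketch rather than a definitive implementation. It assumes NumPy's seeded generator in place of scipy.stats (a swap made only so the draws can be repeated), and the helper names `build_long_frame` and `kde_by_sample_size` are hypothetical, not part of seaborn. `ignore_index=True` is also an addition, used to give the combined frame a unique row index.

```python
import numpy as np
import pandas as pd


def build_long_frame(sizes, seed=0):
    """Return one row per drawn value, tagged with its sample size as a string."""
    rng = np.random.default_rng(seed)  # assumption: seeded NumPy RNG instead of scipy.stats
    frames = [
        pd.DataFrame({
            "sn": str(n),  # string label, so each sample size is treated as a category
            "values": rng.normal(loc=1, scale=1, size=n),
        })
        for n in sizes
    ]
    # ignore_index=True renumbers the rows 0..N-1 across all samples.
    return pd.concat(frames, ignore_index=True)


def kde_by_sample_size(df):
    """Build (but do not show) a Plot with one KDE line per value of 'sn'."""
    # Imported here so the data preparation above runs without seaborn installed;
    # seaborn.objects requires seaborn 0.12 or later.
    import seaborn.objects as so

    return (
        so.Plot(df, x="values", color="sn")
        .add(so.Line(), so.KDE(common_norm=False))
    )


df = build_long_frame(range(10, 2000, 400), seed=0)
```

Calling `kde_by_sample_size(df).show()` should then render the figure; mapping `color` in the `Plot(...)` call rather than in `.add(...)` is just a stylistic choice here.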


Tranbi
  • Thanks for the answer. So if I understand correctly, providing a vector (rather than a column name) works for x but not for color. In this line sp.add(so.Line(),so.KDE(),x=sample_normal,color=s_n) I provided direct values for both x and color. For x it worked and the lines were drawn, but for color it did not: the lines do not have different colors and there is no legend. – rkmalaiya Apr 20 '23 at 22:35
  • You can also pass a vector for color. It's just more convenient to work with column names. This would work as well: `so.Plot(x=df['values']).add(so.Line(), so.KDE(common_norm=False), color=df['sn']).show()` – Tranbi Apr 20 '23 at 22:50
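To illustrate the vector form mentioned in the last comment without building a DataFrame, here is a rough sketch. It assumes a seeded NumPy generator in place of scipy.stats, and the helper names `stack_samples` and `kde_from_vectors` are hypothetical. The key point is that the color vector has one label per drawn value, so it has the same length as the x vector.

```python
import numpy as np


def stack_samples(sizes, seed=0):
    """Return (values, labels): all draws in one array plus a matching label array."""
    rng = np.random.default_rng(seed)  # assumption: seeded NumPy RNG instead of scipy.stats
    sizes = list(sizes)
    values = np.concatenate([rng.normal(loc=1, scale=1, size=n) for n in sizes])
    # Repeat each size label once per drawn value so both vectors line up.
    labels = np.repeat([str(n) for n in sizes], sizes)
    return values, labels


def kde_from_vectors(values, labels):
    """Build (but do not show) a Plot from two equal-length vectors."""
    # Imported here so stack_samples runs without seaborn installed;
    # seaborn.objects requires seaborn 0.12 or later.
    import seaborn.objects as so

    return so.Plot(x=values, color=labels).add(so.Line(), so.KDE(common_norm=False))


values, labels = stack_samples(range(10, 2000, 400), seed=0)
```

`kde_from_vectors(values, labels).show()` should then draw one line per label; here both vectors are passed to `Plot(...)`, whereas the comment passes `color` in `.add(...)`.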