Suppose I want to plot the following data using sns.kdeplot:

import numpy as np

np.random.seed(42)
x = [np.random.randint(0, 10) for _ in range(10)]

x
[6, 3, 7, 4, 6, 9, 2, 6, 7, 4]

But now, instead of having each value, suppose I have the probability of each one:

# y is a pd.Series
y
6    0.3
7    0.2
4    0.2
9    0.1
3    0.1
2    0.1
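
For context, these probabilities are just the relative frequencies of the values in x. A minimal sketch of how such a Series could be built, assuming pandas is available (this is only an illustration, not necessarily how y was created in the real data):

```python
import pandas as pd

# The sample values shown above
x = [6, 3, 7, 4, 6, 9, 2, 6, 7, 4]

# Relative frequency of each distinct value: index = value, data = probability
y = pd.Series(x).value_counts(normalize=True)
print(y)
```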

Is it possible to build the kdeplot from these probabilities?

I think seaborn probably calculates these values internally, so it might be possible.

Trenton McKinney
Bruno Mello
  • There is a `weights` parameter (on v0.11.0+) that may be useful, but I am not exactly sure what you are looking for with "build kdeplot from these probabilities". In general I would say that a KDE plot is not a good approach for visualizing the distribution of a variable that takes a small number of discrete values. – mwaskom Dec 20 '20 at 19:00
  • Great, I think that works. My original data is a lot larger than this, I just used this small dataset as an example! @mwaskom – Bruno Mello Dec 20 '20 at 22:00

1 Answer

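You should be able to accomplish this with the weights parameter of kdeplot (added in v0.11.0). The weights need one entry per observation, so pass the distinct values as x and their probabilities as weights, something like

sns.kdeplot(x=y.index.to_numpy(), weights=y.to_numpy())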
mwaskom