
I am trying to plot the age distribution broken down by the survived, sex, and class variables.

from matplotlib import pyplot
import seaborn

titanic = seaborn.load_dataset("titanic")

# Split violins of age by survival and sex, one row per passenger class
g = seaborn.catplot(data=titanic, x='survived', y='age',
                    hue='sex', split=True,
                    row='class', kind='violin', legend=False)
pyplot.show()

The result is shown in the picture below.

If you look at the first subplot, in the area I circled, the age distribution extends into negative numbers, which doesn't make sense.

How can I solve this problem? The age data does not contain any negative numbers.

[Image: the resulting violin plots, with the first subplot circled]

Dohun

1 Answer


The particular violin plot you circled is based on only 3 values: [2, 25, 50]. A violin plot draws a kernel density estimate (KDE) fitted to the data, here just these 3 points. The KDE is a smooth curve that is not bounded by the observed values, so in your case a significant portion of it falls below zero.
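To make this concrete, here is a minimal standard-library sketch of a Gaussian KDE built on those three ages. It assumes a bandwidth chosen by Scott's rule (sample standard deviation times n^(-1/5)), which I believe is what seaborn uses by default, so treat the numbers as an approximation of what the plot shows. The helper names are made up for this illustration.

```python
import math
import statistics


def scott_bandwidth(values):
    """Scott's rule: sample standard deviation times n ** (-1/5) (assumed default)."""
    n = len(values)
    return statistics.stdev(values) * n ** (-1 / 5)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kde_mass_below(values, threshold):
    """Fraction of the Gaussian KDE's area that lies below `threshold`.

    The KDE is an average of one Gaussian bump per data point, so its
    area below the threshold is the average of the bumps' CDF values.
    """
    bw = scott_bandwidth(values)
    return sum(normal_cdf((threshold - v) / bw) for v in values) / len(values)


ages = [2, 25, 50]  # the three ages quoted above
mass_below_zero = kde_mass_below(ages, 0)
print(f"bandwidth: {scott_bandwidth(ages):.1f}")
print(f"share of the density below age 0: {mass_below_zero:.0%}")
```

With so few points the bandwidth is wide compared to the data, and the bump centred on age 2 puts almost half of its own area below zero.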

If you want, you can limit the plotting range of the violin plots to the range of the observed data by adding the parameter cut=0 (see the seaborn violinplot documentation).
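As a rough sketch of what cut controls: the density is drawn from cut bandwidths below the smallest observation to cut bandwidths above the largest one. The function below is a hypothetical helper, not part of seaborn, and it assumes Scott's rule for the bandwidth and a default of cut=2.

```python
import statistics


def violin_range(values, cut=2.0):
    """Return the (low, high) interval over which the density is drawn.

    Assumes the bandwidth follows Scott's rule and that the violin
    extends `cut` bandwidths past the extreme data points.
    """
    bw = statistics.stdev(values) * len(values) ** (-1 / 5)
    return min(values) - cut * bw, max(values) + cut * bw


ages = [2, 25, 50]
default_range = violin_range(ages)         # cut=2, the assumed default
clipped_range = violin_range(ages, cut=0)  # stops at the observed min and max
print(default_range)
print(clipped_range)
```

In the question's call, this corresponds to passing cut=0 to seaborn.catplot alongside kind='violin'. Note that cut=0 only truncates where the curve is drawn. The estimate itself is still based on 3 points.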

Keldorn
  • Oh, so the part below zero does not mean that any age value is actually negative; it means the kernel density estimate of the age distribution extends below 0. Right? – Dohun May 04 '20 at 08:18
  • Right. You're basically fitting a curve to only 3 points. Nothing in the process forces the fitted curve to be y=0 for x<0. If you had more data, the fit would be better constrained and the tail below zero would be smaller, which is why this is not as bad on your other plots. – Keldorn May 04 '20 at 08:22
  • Thank you! But what do you mean by 3 values: [2, 25, 50]? Does it mean that only three observations were used to create that orange violin plot? So there are only 3 records that match class=First, survived=0, sex=female. Am I right? – Dohun May 04 '20 at 08:32
  • "there are only 3 records that match class=First, survived=0, sex=female. Am I right?" Yes, that is it. – Keldorn May 04 '20 at 08:41
  • OK, thank you for answering all my questions. Sorry to bother you. – Dohun May 04 '20 at 09:10