I am trying to replace a seaborn.FacetGrid with a seaborn.catplot, but the catplot does not label the hue properly in the Embarked = C facet.

Dataset: Titanic


e = sns.FacetGrid(data=train_df, col='Embarked')
e.map_dataframe(sns.pointplot, 'Pclass', 'Survived', hue='Sex', palette='deep')
e.add_legend()

[Image: FacetGrid output. Embarked C: male is properly presented as hue]


But my seaborn.catplot shows:

sns.catplot(x='Pclass', y='Survived', hue='Sex', data=train_df, kind='point', col='Embarked')

[Image: catplot output. Embarked C: male is not properly presented as hue]

afsharov
  • Your first plot is wrong. You can force a consistent order on the hue values, either by providing `hue_order=['male','female']` or by making that column categorical (`train_df['Sex'] = pd.Categorical(train_df['Sex'])`) – JohanC Jun 16 '21 at 06:31

2 Answers

JohanC already hinted at the answer in his comment. I will just spell it out and complete it.

Here is what the documentation of seaborn.catplot says about ordering:

As in the case with the underlying plot functions, if variables have a categorical data type, the levels of the categorical variables, and their order will be inferred from the objects. Otherwise you may have to use alter the dataframe sorting or use the function parameters (orient, order, hue_order, etc.) to set up the plot correctly.

This means you can, for example, use the hue_order parameter to make sure the hue levels are plotted in the order you want:

order, hue_order: lists of strings, optional
Order to plot the categorical levels in, otherwise the levels are inferred from the data objects.

Here is how to use it in your case:

sns.catplot(x='Pclass', y='Survived', hue='Sex', hue_order=['male', 'female'], data=train_df, kind='point', col='Embarked')

Alternatively, as described in the documentation and pointed out by JohanC, you can convert the column train_df['Sex'] to a categorical dtype. Seaborn will then infer the order from the category levels.
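A rough sketch of the categorical route follows. The toy frame and the helper name plot_by_port are made up for illustration; only the column names come from the question. Passing categories explicitly pins the order, whereas a bare pd.Categorical(...) sorts string levels alphabetically, which would put female first.

```python
import pandas as pd

# Hypothetical stand-in for train_df; only the column names match the question.
toy_df = pd.DataFrame({
    "Embarked": ["S", "C", "S", "C"],
    "Sex": ["male", "female", "female", "male"],
    "Pclass": [3, 1, 2, 1],
    "Survived": [0, 1, 1, 0],
})

# Explicit categories fix the level order; without them pandas sorts the levels.
toy_df["Sex"] = pd.Categorical(toy_df["Sex"], categories=["male", "female"])

# A subset keeps the full, ordered category list, so every facet sees the same levels.
facet_c_levels = list(toy_df.loc[toy_df["Embarked"] == "C", "Sex"].cat.categories)
print(facet_c_levels)


def plot_by_port(df):
    """Draw the point plot from the question; hue order comes from the categorical dtype."""
    import seaborn as sns  # imported lazily so the pandas part above runs on its own

    return sns.catplot(x="Pclass", y="Survived", hue="Sex",
                       data=df, kind="point", col="Embarked")
```

Calling plot_by_port(train_df) after the same conversion on the real data should then give the same hue order in every facet, without passing hue_order.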

afsharov

Thanks, JohanC. Yes, my FacetGrid plot was wrong. I checked it manually:

train_df[(train_df['Embarked']=='C') & (train_df['Survived']==1)].groupby('Sex').count()['Survived']

output:

Sex
female    64
male      29

More females than males survived among the Embarked = C passengers. With FacetGrid, hue_order should be specified; otherwise the plot may give a wrong result.
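For anyone landing here, a minimal sketch of why this happens and how the FacetGrid version could be fixed. It assumes, as I understand it, that map_dataframe calls the plotting function once per facet subset, so for plain string columns the hue levels are taken in order of first appearance within each subset. The toy frame and the names first_seen_order and facetgrid_with_fixed_hue are hypothetical, not part of seaborn.

```python
import pandas as pd

# Hypothetical stand-in for train_df; only the column names match the question.
toy = pd.DataFrame({
    "Embarked": ["S", "C", "S", "C"],
    "Sex": ["male", "female", "female", "male"],
    "Pclass": [3, 1, 3, 1],
    "Survived": [0, 1, 1, 0],
})


def first_seen_order(df, facet_col, hue_col):
    """Return the hue levels in order of first appearance within each facet subset."""
    return {
        facet: list(group[hue_col].unique())
        for facet, group in df.groupby(facet_col, sort=False)
    }


# The two facets disagree on which level comes first, so their colors can swap.
print(first_seen_order(toy, "Embarked", "Sex"))
# {'S': ['male', 'female'], 'C': ['female', 'male']}


def facetgrid_with_fixed_hue(df, hue_order=("male", "female")):
    """FacetGrid version of the plot with the hue order pinned for all facets."""
    import seaborn as sns  # imported lazily so the check above runs without seaborn

    grid = sns.FacetGrid(data=df, col="Embarked")
    # Keyword x/y are used because newer seaborn versions expect them as keywords.
    grid.map_dataframe(sns.pointplot, x="Pclass", y="Survived", hue="Sex",
                       order=sorted(df["Pclass"].unique()),  # same x order in every facet
                       hue_order=list(hue_order), palette="deep")
    grid.add_legend()
    return grid
```

With hue_order (and order) passed through map_dataframe, every facet should map male and female to the same colors, which is what catplot does on its own.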