Here's code from a Kaggle Titanic competition kernel:

grid = sns.FacetGrid(train_df, row='Embarked', size=2.2, aspect=1.6)
grid.map(sns.pointplot, 'Pclass', 'Survived', 'Sex', palette='deep')
grid.add_legend()

It produces the wrong plot, one with reversed colors. I'd like to know how to fix this exact code fragment. I tried adding keyword parameters to the grid.map() call, order=["male", "female"] and hue_order=["male", "female"], but then the plots come out empty.

sdd

1 Answer

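In the call grid.map(sns.pointplot, 'Pclass', 'Survived', 'Sex', palette='deep'), the x category is Pclass and the hue category is Sex. order applies to the x values, so order=["male", "female"] matches nothing in Pclass, which is why your plots came out empty. Hence you need to add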

order = [1,2,3], hue_order=["male", "female"]

Complete example (where I used the titanic dataset that ships with seaborn - what wordplay!):

import seaborn as sns
import matplotlib.pyplot as plt

df = sns.load_dataset("titanic")

# seaborn 0.9 renamed FacetGrid's `size` argument to `height`
grid = sns.FacetGrid(df, row='embarked', height=2.2, aspect=1.6)
grid.map(sns.pointplot, 'pclass', 'survived', 'sex', palette='deep',
         order=[1, 2, 3], hue_order=["female", "male"])
grid.add_legend()

plt.show()

Note that while hue_order is definitely required, you may leave out order. seaborn will then emit a warning, but the correct order is guaranteed because those values are numerical and hence sorted automatically.
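
Why the colors can end up swapped without hue_order can be sketched in plain Python. This is only an illustration, assuming that each facet infers its hue levels from its own subset of the data in order of first appearance (which is how I understand seaborn treats non-categorical columns). The helpers infer_levels and assign_colors, the color names, and the sample lists are hypothetical and not part of seaborn's API.

```python
def infer_levels(values):
    """Return the unique values in order of first appearance."""
    seen = []
    for value in values:
        if value not in seen:
            seen.append(value)
    return seen


def assign_colors(values, palette, hue_order=None):
    """Map each hue level to a color, as one facet would do on its own."""
    levels = hue_order if hue_order is not None else infer_levels(values)
    return dict(zip(levels, palette))


palette = ["blue", "orange"]  # stand-in for the first two palette colors

# Hypothetical 'Sex' values as they happen to appear in two facets
facet_s = ["male", "female", "female", "male"]
facet_c = ["female", "male", "male"]

# Without hue_order, each facet builds its own mapping and they can disagree
inferred_s = assign_colors(facet_s, palette)
inferred_c = assign_colors(facet_c, palette)

# With an explicit hue_order, every facet uses the same mapping
fixed_s = assign_colors(facet_s, palette, hue_order=["female", "male"])
fixed_c = assign_colors(facet_c, palette, hue_order=["female", "male"])

print(inferred_s, inferred_c)
print(fixed_s, fixed_c)
```

With a fixed hue_order the mapping no longer depends on which value happens to come first in a facet, so the legend matches every row of the grid.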

ImportanceOfBeingErnest
  • When I edited the original kernel on Kaggle, I found that I had to use the column names with their proper casing, otherwise I got a KeyError – the_interest_seeker May 28 '18 at 09:31
  • @the_interest_seeker This answer uses the dataset which is distributed with seaborn (`sns.load_dataset("titanic")`). This differs from other commonly used datasets by different column names and also some columns might be missing. – ImportanceOfBeingErnest May 28 '18 at 10:33
  • Great answer but I would've +1 only for the pun – Yair Daon Dec 07 '20 at 12:41