I want to visualize category counts with a seaborn catplot, but one of the hue levels is not important and does not need to appear in the visualization. How can I select specific hue levels to show in a catplot without changing or removing any values in the column?

SA_H

1 Answer

You could remove the rows with that value from a copy of the dataframe, which leaves the original data untouched. If the column is Categorical, you may also need to reset its categories, because otherwise the legend will still list all the original categories.

Here is an example:

import seaborn as sns
import pandas as pd

tips = sns.load_dataset('tips')
tips['day'].dtype # CategoricalDtype(categories=['Thur', 'Fri', 'Sat', 'Sun'], ordered=False)
# create a subset, a copy is needed to be able to change the categorical column
tips_weekend = tips[tips['day'].isin(['Sat', 'Sun'])].copy()
tips_weekend['day'].dtype # CategoricalDtype(categories=['Thur', 'Fri', 'Sat', 'Sun'], ordered=False)
tips_weekend['day'] = pd.Categorical(tips_weekend['day'], ['Sat', 'Sun'])
tips_weekend['day'].dtype # CategoricalDtype(categories=['Sat', 'Sun'], ordered=False)
sns.catplot(data=tips_weekend, x='smoker', y='tip', hue='day')

catplot with reduced hue levels
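
As a possible alternative to rebuilding the column with pd.Categorical, pandas can drop the categories that no longer occur via Series.cat.remove_unused_categories(). The following is only a minimal sketch: it uses a small hand-made dataframe as a stand-in for the tips dataset, and the helper name keep_hue_levels is hypothetical, not part of seaborn or pandas.

```python
import pandas as pd


def keep_hue_levels(df, column, levels):
    """Return a filtered copy of df that keeps only `levels` in `column`.

    Hypothetical helper for illustration; the original dataframe
    is left untouched.
    """
    subset = df[df[column].isin(levels)].copy()
    if isinstance(subset[column].dtype, pd.CategoricalDtype):
        # drop categories that no longer occur, so they vanish from the legend
        subset[column] = subset[column].cat.remove_unused_categories()
    return subset


# small stand-in for the tips dataset (values are made up for illustration)
df = pd.DataFrame({
    'smoker': ['Yes', 'No', 'No', 'Yes', 'No', 'Yes'],
    'tip': [1.0, 2.5, 3.0, 1.5, 2.0, 4.0],
    'day': pd.Categorical(['Thur', 'Fri', 'Sat', 'Sun', 'Sat', 'Sun'],
                          categories=['Thur', 'Fri', 'Sat', 'Sun']),
})

weekend = keep_hue_levels(df, 'day', ['Sat', 'Sun'])
print(list(weekend['day'].cat.categories))  # categories of the filtered copy
print(list(df['day'].cat.categories))       # original categories are unchanged
```

The filtered copy can then be passed to the plot in the same way as above, for example sns.catplot(data=weekend, x='smoker', y='tip', hue='day').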

For the follow-up question, a histplot with multiple='fill' can show the percentage distribution:

import seaborn as sns
import pandas as pd
from matplotlib.ticker import PercentFormatter

tips = sns.load_dataset('tips')
tips_weekend = tips.copy()
tips_weekend['day'] = tips_weekend['day'].apply(lambda x: x if x in ['Sat', 'Sun'] else 'other')
# fix a new order
tips_weekend['day'] = pd.Categorical(tips_weekend['day'], ['other', 'Sat', 'Sun'])

ax = sns.histplot(data=tips_weekend, x='smoker', hue='day', stat='count', multiple='fill',
                  palette=['none', 'turquoise', 'crimson'])
# remove the first label ('other') in the legend
# (matplotlib >= 3.7 names this attribute legend_handles; older versions use legendHandles)
ax.legend(handles=ax.legend_.legend_handles[1:], labels=['Sat', 'Sun'], title='day')
ax.yaxis.set_major_formatter(PercentFormatter(1))
# add percentages
for bar_group in ax.containers[:-1]:
    ax.bar_label(bar_group, label_type='center', labels=[f'{bar.get_height() * 100:.1f} %' for bar in bar_group])

seaborn showing percentage distribution
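
The percentages shown on the bars can also be computed directly, which may help to check the plot. This is a sketch using pd.crosstab with normalize='index' on a small made-up dataframe (not the real tips data): each smoker group is normalized over all days, and only the Sat and Sun columns are kept afterwards, so the kept shares need not add up to 100%.

```python
import pandas as pd

# made-up stand-in for the tips dataset
df = pd.DataFrame({
    'smoker': ['Yes', 'Yes', 'Yes', 'Yes', 'No', 'No', 'No', 'No', 'No'],
    'day': ['Sat', 'Sun', 'Thur', 'Fri', 'Sat', 'Sat', 'Sun', 'Thur', 'Thur'],
})

# share of each day within every smoker group; each row sums to 1
shares = pd.crosstab(df['smoker'], df['day'], normalize='index')

# keep only the days of interest; the other days still count in the denominator
weekend_pct = shares[['Sat', 'Sun']] * 100
print(weekend_pct.round(1))
```

Because the normalization happens before the columns are selected, the other days contribute to the total without being displayed, which matches what the 'none' palette entry does in the plot.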

JohanC
  • As a follow-up question: if we remove the other days, how can you show what percentage of all days, for Yes and for No, are Sat and Sun? – SA_H Dec 14 '21 at 00:51
  • Thank you for the reply. In this case we have Sat, Sun, and the other days [Mon, Tue, Wed, Thursday, Friday]. Sat and Sun together are not 100% because the other days exist, but we don't want to visualize the percentages for the other days, only those for Saturday and Sunday. – SA_H Dec 14 '21 at 01:55
  • @SA_H Have you been able to find a solution? – Anna K Nov 01 '22 at 15:55