
Say I have data that I want to show as a box plot overlaid with a swarm plot in seaborn, where the colors of the points carry additional information about the data.

Question: How can I get the box plots for a given x-axis value to sit close to each other (as hue does) without recoding x as a combination of the hue value and the x-axis value?

For example, here I want to overlay the points on the box plot and additionally color the points by 'sex':

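from matplotlib import pyplot as plt
import seaborn as sns

# assumption: df is seaborn's titanic dataset, as used in the answer
df = sns.load_dataset('titanic')

plt.figure(figsize=(5, 5))

# the boxes are split by 'embarked' ...
sns.boxplot(x='class', y='age',
            hue='embarked', dodge=True, data=df)

# ... but the points are split by 'sex', so they do not line up with the boxes
sns.swarmplot(x='class', y='age',
              dodge=True,
              color='0.25',
              hue='sex', data=df)

plt.legend(bbox_to_anchor=(1.5, 1))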

EDIT: The idea is to get something that looks like the 'S' box for 'Third' in the plot. (I made a mock-up in PowerPoint, so there the boxplot and the swarmplot use the same hue and the points sit on the appropriate boxes.)


Is there a way to make this plot without first recoding the x-axis as 'first-S', 'first-C', 'first-Q', 'second-S', etc. and then adding hue by 'sex' in both plots?

wiscoYogi

1 Answer


Using original x as col and hue as x

To work with two types of hue, seaborn's alternative is to create a FacetGrid. The original x= then becomes the col= (or the row=), and one of the hues becomes the new x=.

Here is an example. Note that aspect= controls the width of the individual subplots (the width being height*aspect).

from matplotlib import pyplot as plt
import seaborn as sns

df = sns.load_dataset('titanic')
g = sns.catplot(kind='box', x='embarked', y='age', hue='sex', col='class',
                dodge=True, palette='spring',
                height=5, aspect=0.5, data=df)
g.map_dataframe(sns.swarmplot, x='embarked', y='age', hue='sex', palette=['0.25'] * 2, size=2, dodge=True)
for ax in g.axes.flat:
    # use title as x-label
    ax.set_xlabel(ax.get_title())
    ax.set_title('')
    # remove y-axis except for the left-most columns
    if len(ax.get_ylabel()) == 0:
        ax.spines['left'].set_visible(False)
        ax.tick_params(axis='y', left=False)
plt.subplots_adjust(wspace=0)
plt.show()

combining sns.boxplot and sns.swarmplot with two hue variables
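
As a quick check of the sizing rule mentioned above (width = height*aspect), here is a small arithmetic sketch for the values used in this example. The variable names are only illustrative, and the total assumes the three 'class' columns sit side by side, before seaborn adds any extra room for a legend placed outside the grid.

```python
# values passed to sns.catplot in the example above
height = 5      # height of each subplot, in inches
aspect = 0.5    # width-to-height ratio of each subplot
n_cols = 3      # one column per 'class' value (First, Second, Third)

facet_width = height * aspect      # width of a single subplot, in inches
grid_width = n_cols * facet_width  # combined width of the three subplots

print(f"each subplot: {facet_width} in wide, all subplots: {grid_width} in wide")
```

So lowering aspect= makes every subplot narrower while height= stays the same, which is why a very small aspect leaves too little room for the swarms.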

Only using hue for the swarmplot, without dodge

Here is a variant where the boxplot doesn't use hue, but the swarmplot does. A bit more padding can be added inside the subplots, and the boxes can be made to touch via width=1. Suppressing the outliers of the boxplot (showfliers=False) looks cleaner, as they would otherwise overlap with the same points drawn by the swarmplot.

from matplotlib import pyplot as plt
import seaborn as sns
import pandas as pd

df = sns.load_dataset('titanic')
df['embarked'] = pd.Categorical(df['embarked'], ['S', 'C', 'Q'])  # force a strict order
g = sns.catplot(kind='box', x='embarked', y='age', col='class',
                dodge=True, palette='summer', width=1, showfliers=False,
                height=5, aspect=0.5, data=df)
g.map_dataframe(sns.swarmplot, x='embarked', y='age', hue='sex', palette=['b', 'r'], size=2, dodge=False)
g.add_legend()
for ax in g.axes.flat:
    # use title as x-label
    ax.set_xlabel(ax.get_title())
    ax.set_title('')
    # remove y-axis except for the left-most columns
    if len(ax.get_ylabel()) == 0:
        ax.spines['left'].set_visible(False)
        ax.tick_params(axis='y', left=False)
# the x-axes are shared between the subplots (catplot's default), so widening one widens all
xmin, xmax = ax.get_xlim()
ax.set_xlim(xmin - 0.2, xmax + 0.2)  # add a bit more spacing between the groups
plt.subplots_adjust(wspace=0)
plt.show()

catplot using hue for swarmplot
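
For comparison, here is a minimal sketch of the single-axes route the question hoped to avoid: combining 'class' and 'embarked' into one x variable and using hue only for the swarm. The column name class_embarked, the figure size, and the colors are assumptions for illustration. The data frame is a small synthetic stand-in with the same column names, so the snippet runs without downloading anything; with the real data you would use sns.load_dataset('titanic') instead.

```python
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

# synthetic stand-in for the titanic columns used above (illustrative values only)
rng = np.random.default_rng(0)
n = 120
df = pd.DataFrame({
    'class': rng.choice(['First', 'Second', 'Third'], size=n),
    'embarked': rng.choice(['S', 'C', 'Q'], size=n),
    'sex': rng.choice(['male', 'female'], size=n),
    'age': rng.uniform(1, 80, size=n).round(),
})

# hypothetical combined key such as 'First-S', used as the new x variable
df['class_embarked'] = df['class'] + '-' + df['embarked']
order = [f'{c}-{e}' for c in ['First', 'Second', 'Third'] for e in ['S', 'C', 'Q']]

fig, ax = plt.subplots(figsize=(9, 5))
# one grey box per class/embarked combination, without outlier markers
sns.boxplot(data=df, x='class_embarked', y='age', order=order,
            color='lightgrey', showfliers=False, ax=ax)
# points colored by 'sex' and drawn on top of their box (no dodging)
sns.swarmplot(data=df, x='class_embarked', y='age', order=order,
              hue='sex', dodge=False, size=3, ax=ax)
ax.tick_params(axis='x', rotation=45)
plt.tight_layout()
plt.show()
```

This keeps everything in one subplot, at the cost of longer tick labels and no visual grouping by class, which is what the catplot variants above provide.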

JohanC
  • My issue is that one of my groups (e.g. the x-axis in the scatter grid) doesn't have as much data as the other group, and I wanted the two to overlap rather than plot side by side. Is there a somewhat Pythonic way to accomplish this, or am I stuck with FacetGrid? – wiscoYogi Jan 19 '23 at 18:01
  • You can change the spacing to zero (`plt.subplots_adjust(wspace=0)`) to make the facet grid look like a single subplot. The `aspect=` of `sns.catplot` can be reduced to use less width (but this doesn't work well for this example, as the swarms need some width). You might want to edit your post and add a sketch of how you'd want the plot to look. – JohanC Jan 19 '23 at 18:25
  • I added a variant which only uses hue for the swarmplot – JohanC Jan 21 '23 at 23:46