
I want to plot a "highlighted" point on top of a swarmplot, like this:

enter image description here

The swarmplot doesn't have a y-axis, so I have no idea how to plot that point.

import seaborn as sns
sns.set(style="whitegrid")
tips = sns.load_dataset("tips")
ax = sns.swarmplot(x=tips["total_bill"])
Chris Maverick
  • Probably just using the specific x-value (`tips["total_bill"]` in this case) and zero as y-value is sufficient. The scatter dots come in an order left to right. Or you could sort the complete data frame via this column before calling `swarmplot`. – JohanC Apr 03 '20 at 07:33
  • I'm trying to sort the data. I have a lot of subplots and each one needs sorting, so it's kind of tricky. – Chris Maverick Apr 03 '20 at 09:49

2 Answers


You can highlight one or more points with the hue parameter: add a constant grouping variable for the y axis (so that all points appear as a single group), then use a second variable to mark the points you're interested in.

You can then remove the y-axis label, the y ticks, and the legend.

import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="whitegrid")

# Get data and mark point you want to highlight
tips = sns.load_dataset("tips")
tips['highlighted_point'] = 0
tips.loc[tips.total_bill > 50, 'highlighted_point'] = 1

# Add holding 'group' variable so they appear as one
tips['y_variable'] = 'testing'

# Use 'hue' to differentiate the highlighted point
ax = sns.swarmplot(x=tips["total_bill"], y=tips['y_variable'], hue=tips['highlighted_point'])

# Remove legend
ax.get_legend().remove()

# Hide y axis formatting 
ax.set_ylabel('')
ax.get_yaxis().set_ticks([])
plt.show()

Output plot

wcanners

This approach relies on knowing the index of the data point you wish to highlight, but it should work. If you have multiple swarmplots on a single Axes instance, it becomes slightly more complex.

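import matplotlib.pyplot as plt
from matplotlib.collections import PathCollection
import seaborn as sns

sns.set(style="whitegrid")
tips = sns.load_dataset("tips")
ax = sns.swarmplot(x=tips["total_bill"])

# The swarm dots are drawn as a PathCollection; take the first one on the Axes
swarm = next(a for a in ax.get_children() if isinstance(a, PathCollection))
offsets = swarm.get_offsets()  # (x, y) position of every plotted dot

# Draw the dot stored at position 50 again, in orange and on top of the swarm
ax.scatter(offsets[50, 0], offsets[50, 1], marker='o', color='orange', zorder=10)
plt.show()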

enter image description here

William Miller
  • One more question: how do you know the data of that point? I want to know where the "50" point is in the original data. – Chris Maverick Apr 02 '20 at 05:31
  • `offsets = a.get_offsets()` stores the locations of the plotted points - which, critically, are stored in the same order they were plotted in. So the data of `offsets[50]` should be the same as the data from `tips["total_bill"].values[50]`. There's no other way to back out the data from the swarmplot – William Miller Apr 02 '20 at 05:37
  • I thought the index in `get_offsets()` is the index of the sorted list? The original data is not sorted. – Chris Maverick Apr 02 '20 at 05:46
  • @ChrisMaverick You might be correct, I may be confusing it with `matplotlib.lines.Line2D.get_data()`, though the documentation doesn't specify – William Miller Apr 02 '20 at 05:49
  • So... should I sort the original data or something? – Chris Maverick Apr 02 '20 at 05:54
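
Since the thread leaves open whether `get_offsets()` follows the original row order or a sorted order, one way around the question is to avoid relying on the index at all and match on the x-value instead. The following is only a sketch under that assumption: `ordering_of` and `nearest_offset` are hypothetical helper names (not part of seaborn or matplotlib), and they work on plain arrays, so they can be tried without a plot.

```python
import numpy as np


def ordering_of(offsets, values):
    """Guess how the x-coordinates in `offsets` relate to the original `values`.

    Returns "original" if they match row for row, "sorted" if they match the
    ascending sort of `values`, and "unknown" otherwise. If `values` is
    already sorted, both orders coincide and "original" is returned.
    """
    xs = np.asarray(offsets, dtype=float)[:, 0]
    values = np.asarray(values, dtype=float)
    if xs.shape != values.shape:
        return "unknown"
    if np.allclose(xs, values):
        return "original"
    if np.allclose(xs, np.sort(values)):
        return "sorted"
    return "unknown"


def nearest_offset(offsets, value):
    """Return the (x, y) row of `offsets` whose x-coordinate is closest to `value`."""
    offsets = np.asarray(offsets, dtype=float)
    row = np.argmin(np.abs(offsets[:, 0] - value))
    return offsets[row]
```

With the `offsets` array from the answer above, something like `x, y = nearest_offset(offsets, tips["total_bill"].iloc[50])` followed by `ax.scatter(x, y, color='orange', zorder=10)` should mark a dot carrying that bill value, whichever order the offsets are stored in. If several rows share the same value, any one of their dots may be picked.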