
I'm trying to make a pairplot with scatter plots off the diagonal and histograms on the diagonal, but when I add a hue the histograms become invalid.

My code before hue:

import seaborn as sn
sn.pairplot(dropped_data)

Output: Output1

My code after adding hue:

sn.pairplot(dropped_data, hue='damage rating')

Output: Output 2

What I have tried:

sn.pairplot(dropped_data, hue='damage rating', diag_kind='hist', kind='scatter')

Output: Output 3

As you can see, when using a hue, the diagonal histograms go all weird and become incorrect. How can I fix this?

2 Answers

0

I assume the question here is "how to have a hue mapping for the scatterplot but not the diagonal when the hue variable is numeric". If so:

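import seaborn as sns

mpg = sns.load_dataset("mpg")
# Keep the hue mapping for the scatter plots, but draw the diagonal without it
sns.pairplot(mpg, hue="cylinders", diag_kws=dict(hue=None, color=".2"))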


mwaskom
0

It looks like the hue column is continuous and contains only unique values. With a hue, the diagonal is built up of kdeplots, one per hue value, and those won't work when each kde is built from only one value.

One way to tackle this is to use stacked histplots. This might be slow when a lot of data is involved.

Another approach is to make the hue column discrete, e.g. by rounding its values.

A reproducible example

First, let's try to recreate the problem with easily reproducible data:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

np.random.seed(20220226)
df = pd.DataFrame({f'Sensor {i}': np.random.randn(100) for i in range(1, 4)})
df['damage'] = np.random.rand(100)

sns.pairplot(df, hue="damage")

sns.pairplot using continuous hue
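A quick way to check whether a hue column has this problem is to count its distinct values. The sketch below rebuilds the same test data; the variable name `n_levels` is just for illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(20220226)
df = pd.DataFrame({f'Sensor {i}': np.random.randn(100) for i in range(1, 4)})
df['damage'] = np.random.rand(100)

# Each distinct hue value gets its own curve on the diagonal
n_levels = df['damage'].nunique()
print(f"{n_levels} distinct hue values for {len(df)} rows")
```

If the number of distinct values is close to the number of rows, each group contains about one point, and a per-group kde cannot be estimated.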

Working with a stacked histogram

sns.pairplot(df, hue="damage", diag_kind='hist', diag_kws={'multiple': 'stack'})

sns.pairplot with stacked histplot

Making the hue column discrete via rounding:

df['damage'] = (df['damage'] * 5).round() / 5  # round to multiples of 0.2
sns.pairplot(df, hue="damage")

sns.pairplot with rounded hue values

JohanC