I have a scatterplot and I want to color each point based on another value (naively assigned with np.random.random() in this case).
Is there a way to use seaborn to map a continuous value for each point (one not directly associated with the data being plotted) to a color along a continuous gradient?
Here's my code to generate the data:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn import decomposition
import seaborn as sns; sns.set_style("whitegrid", {'axes.grid' : False})
%matplotlib inline
np.random.seed(0)
# Iris dataset
DF_data = pd.DataFrame(load_iris().data,
                       index = ["iris_%d" % i for i in range(load_iris().data.shape[0])],
                       columns = load_iris().feature_names)
Se_targets = pd.Series(load_iris().target,
                       index = ["iris_%d" % i for i in range(load_iris().data.shape[0])],
                       name = "Species")
# Scaling mean = 0, var = 1
DF_standard = pd.DataFrame(StandardScaler().fit_transform(DF_data),
                           index = DF_data.index,
                           columns = DF_data.columns)
# Sklearn for Principal Component Analysis
# Dims
m = DF_standard.shape[1]
K = 2
# PCA (How I tend to set it up)
Mod_PCA = decomposition.PCA(n_components=m)
DF_PCA = pd.DataFrame(Mod_PCA.fit_transform(DF_standard),
                      columns=["PC%d" % k for k in range(1,m + 1)]).iloc[:,:K]
# Plot
fig, ax = plt.subplots()
ax.scatter(x=DF_PCA["PC1"], y=DF_PCA["PC2"], color="k")
ax.set_title("No Coloring")
Ideally, I wanted to do something like this:
# Color classes
cmap = {obsv_id:np.random.random() for obsv_id in DF_PCA.index}
# Plot
fig, ax = plt.subplots()
ax.scatter(x=DF_PCA["PC1"], y=DF_PCA["PC2"], color=[cmap[obsv_id] for obsv_id in DF_PCA.index])
ax.set_title("With Coloring")
# ValueError: to_rgba: Invalid rgba arg "0.2965562650640299"
# to_rgb: Invalid rgb arg "0.2965562650640299"
# cannot convert argument to rgb sequence
but the color argument didn't accept the continuous values.
I want to use a color palette like:
sns.palplot(sns.cubehelix_palette(8))
I also tried something like the call below, but it can't work as written because it never receives the values I stored in my cmap dictionary above:
ax.scatter(x=DF_PCA["PC1"], y=DF_PCA["PC2"], cmap=sns.cubehelix_palette(as_cmap=True))
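For what it's worth, here is a minimal sketch of the kind of call I think I'm after. It assumes that matplotlib's `c` argument (rather than `color`) is the one that takes one scalar per point and maps it through `cmap`. `DF_demo` and `values` are hypothetical stand-ins for `DF_PCA` and the dictionary above, filled with random numbers, so this is an illustration rather than a confirmed solution:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.RandomState(0)

# Hypothetical stand-in for DF_PCA: 150 observations, two coordinates each
obsv_ids = ["iris_%d" % i for i in range(150)]
DF_demo = pd.DataFrame(rng.randn(150, 2), columns=["PC1", "PC2"], index=obsv_ids)

# One continuous value per observation, keyed by the same index labels
values = pd.Series(rng.random_sample(150), index=obsv_ids)

fig, ax = plt.subplots()
points = ax.scatter(x=DF_demo["PC1"], y=DF_demo["PC2"],
                    # c= takes the scalars; cmap= decides how they become colors
                    c=values.loc[DF_demo.index].to_numpy(),
                    cmap=sns.cubehelix_palette(as_cmap=True))
fig.colorbar(points, ax=ax, label="value")  # legend for the gradient
ax.set_title("With Coloring")
```

Looking the values up with `values.loc[DF_demo.index]` keeps each scalar aligned with its point even if the two were built in different orders.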