I have code that creates a scatterplot matrix with density curves, and I want everything colored by a categorical variable in the data set. Each category needs to match a specific color, but I can't get the colors to change from the defaults.

Below is an example of the concept I'm trying to accomplish with a well-known dataset (since my data has sensitive information and can't be posted).

For example, with the crabs dataset in R (from the MASS package), I would assign each categorical variable to a color or symbol like this:

library(GGally)  # provides ggpairs()
library(MASS)    # provides the crabs dataset

species <- ifelse(crabs$sp == "B", "blue", "orange")
gender <- ifelse(crabs$sex == "M", "O", "+")

Then I want exactly those symbols and colors in my matrix and density plots:

ggpairs(crabs, columns=4:8, aes(color=species, shape=gender),
        lower=list(continuous="smooth"), diag=list(continuous="densityDiag"))

However, this outputs the following:

[Image: the resulting scatterplot matrix, drawn in the default coral and teal colors]


But the coral color should be blue, and the teal color should be true orange.
data_life

2 Answers

You may need to specify the colours for each group using scale_color_manual() and scale_fill_manual(). Note that the number of colour values must equal the number of categories, not the number of rows in the data, which is why unique(species) is used here:

species <- ifelse(crabs$sp == "B", "blue", "orange")
gender <- ifelse(crabs$sex == "M", "O", "+")

ggpairs(crabs, columns=4:8, aes(colour=species, shape=gender),
        lower=list(continuous="smooth"), diag=list(continuous="densityDiag")) +
  scale_color_manual(values = unique(species)) +
  scale_fill_manual(values = unique(species))
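
One caveat with unique(species): it returns colours in order of first appearance, while the scale assigns them to the categories in sorted order. The two orders happen to agree for crabs, but they may not for other data. A minimal sketch of a more explicit variant, assuming a named vector (species_cols is a name introduced here for illustration), which ggplot2's manual scales match to the category values by name:

```r
library(GGally)  # provides ggpairs()
library(MASS)    # provides the crabs dataset

species <- ifelse(crabs$sp == "B", "blue", "orange")
gender <- ifelse(crabs$sex == "M", "O", "+")

# Names are the category values that appear in `species`;
# elements are the colours to draw them in.
# (species_cols is an illustrative name, not part of the original answer.)
species_cols <- c(blue = "blue", orange = "orange")

ggpairs(crabs, columns = 4:8, aes(colour = species, shape = gender),
        lower = list(continuous = "smooth"),
        diag = list(continuous = "densityDiag")) +
  scale_colour_manual(values = species_cols) +
  scale_fill_manual(values = species_cols)
```

Because the colours are matched by name, the result should not depend on row order or on how the categories sort.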

S-SHAAF

It's not easy to change the default colours of ggpairs. Here the issue is specifically with the correlation text in the upper-right panels, which means scale_colour_identity() won't work. You could define the colours using manual scales instead:

species <- ifelse(crabs$sp == "B", "blue", "orange")
gender <- ifelse(crabs$sex == "M", "O", "+")

ggpairs(crabs, columns=4:8, aes(color=species,
                                shape=gender),
        lower=list(continuous="smooth"),
        diag=list(continuous="densityDiag")) +
  scale_fill_manual(values = c("blue", "orange")) +
  scale_colour_manual(values = c("blue", "orange"))

Here, I'd suggest you call the values in species something more meaningful to get better labels in the correlation plot. I would also generally recommend passing in a column of your data for the colours in aes() rather than an external vector. You could do something similar for the points with scale_shape_manual().
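
The suggestions in the paragraph above could be combined roughly as follows. This is a sketch under assumptions: crabs2, the new column names, the labels, and the two named vectors are all illustrative choices, and the behaviour may vary between GGally versions.

```r
library(GGally)  # provides ggpairs()
library(MASS)    # provides the crabs dataset

# Work on a copy and store the grouping variables as columns,
# using readable labels (these names and labels are illustrative).
crabs2 <- crabs
crabs2$species <- ifelse(crabs2$sp == "B", "Blue", "Orange")
crabs2$gender <- ifelse(crabs2$sex == "M", "Male", "Female")

# Named vectors map each label to a colour or a point shape.
species_cols <- c(Blue = "blue", Orange = "orange")
gender_shapes <- c(Male = 1, Female = 3)  # pch 1 = open circle, pch 3 = plus sign

# The new columns are appended at the end, so columns 4:8 are unchanged.
ggpairs(crabs2, columns = 4:8, aes(colour = species, shape = gender),
        lower = list(continuous = "smooth"),
        diag = list(continuous = "densityDiag")) +
  scale_colour_manual(values = species_cols) +
  scale_fill_manual(values = species_cols) +
  scale_shape_manual(values = gender_shapes)
```

With labels such as "Blue" and "Orange", the correlation panels should show the species names rather than the colour strings.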

nrennie