I am working on the following matrix:

twitter_us_no_clit_matrix_nohave <- structure(c(NA, 52, 22, 52, 111, 0, 3, 16, 5, 0, 0, 80, NA, 14, 
243, 14, 41, 1, 177, 3, 0, 20, 26, 8, NA, 13, 21, 10, 0, 0, 0, 
0, 0, 59, 29, 7, NA, 15, 20, 0, 8, 8, 0, 0, 111, 9, 273, 1733, 
NA, 60, 3, 14, 164, 0, 8, 20, 171, 53, 932, 135, NA, 0, 33, 3, 
0, 144, 21, 0, 3, 1, 2, 0, NA, 1, 0, 0, 0, 23, 81, 14, 146, 20, 
19, 5, NA, 1, 1, 0, 11, 4, 4, 8, 11, 0, 2, 12, NA, 0, 0, 1, 0, 
4, 32, 0, 16, 0, 3, 0, NA, 0, 0, 40, 2, 1, 0, 0, 0, 1, 5, 0, 
NA), .Dim = c(11L, 11L), .Dimnames = list(c("WILL_", "WOULD_", 
"MAY_", "MIGHT_", "CAN_", "COULD_", "SHALL_", "SHOULD_", "MUST_", 
"OUGHT TO_", "USED TO_"), c("_WILL", "_WOULD", "_MAY", "_MIGHT", 
"_CAN", "_COULD", "_SHALL", "_SHOULD", "_MUST", "_OUGHT TO", 
"_USED TO")))

And I have the following plot:

library(tidyverse)  # provides %>%, mutate(), fct_rev() and ggplot2

reshape2::melt(twitter_us_no_clit_matrix_nohave, value.name = "Freq") %>%
  mutate(label = ifelse(is.na(Freq) | Freq == 0, "", as.character(Freq))) %>%
  ggplot(aes(Var2, fct_rev(Var1))) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = label), color = "black") +
  scale_fill_gradient(low = "grey", high = "purple", na.value = "black") +
  scale_x_discrete(NULL, expand = c(0, 0), position="top") +
  scale_y_discrete(NULL, expand = c(0, 0)) +
  theme(axis.text.x = element_text(angle=60,vjust = 0.5, hjust = 0)) 

It's all looking good but I'm trying to figure out how to change the breaks so that it changes the heatmap and its "Freq" label accordingly.

The breaks would be c(0,1,5,10,50,100,500,100000) and would make the values look more or less like the following in terms of colour heat:

Which exact lines should I tweak in my geom_tile()?

Cameron

1 Answer

If I understand correctly, you want to divide your continuous scale into steps. I had some success using the scale_fill_steps() function:

library(tidyverse)

twitter_us_no_clit_matrix_nohave <- structure(
  c(NA, 52, 22, 52, 111, 0, 3, 16, 5, 0, 0, 80, NA, 14,
    243, 14, 41, 1, 177, 3, 0, 20, 26, 8, NA, 13, 21, 10, 0, 0, 0,
    0, 0, 59, 29, 7, NA, 15, 20, 0, 8, 8, 0, 0, 111, 9, 273, 1733,
    NA, 60, 3, 14, 164, 0, 8, 20, 171, 53, 932, 135, NA, 0, 33, 3,
    0, 144, 21, 0, 3, 1, 2, 0, NA, 1, 0, 0, 0, 23, 81, 14, 146, 20,
    19, 5, NA, 1, 1, 0, 11, 4, 4, 8, 11, 0, 2, 12, NA, 0, 0, 1, 0,
    4, 32, 0, 16, 0, 3, 0, NA, 0, 0, 40, 2, 1, 0, 0, 0, 1, 5, 0,
    NA),
  .Dim = c(11L, 11L),
  .Dimnames = list(
    c("WILL_", "WOULD_", "MAY_", "MIGHT_", "CAN_", "COULD_", "SHALL_",
      "SHOULD_", "MUST_", "OUGHT TO_", "USED TO_"),
    c("_WILL", "_WOULD", "_MAY", "_MIGHT", "_CAN", "_COULD", "_SHALL",
      "_SHOULD", "_MUST", "_OUGHT TO", "_USED TO")
  )
)

breaks <- c(0,1,5,10,50,100,500,100000)

reshape2::melt(twitter_us_no_clit_matrix_nohave, value.name = "Freq") %>%
  mutate(label = ifelse(is.na(Freq) | Freq == 0, "", as.character(Freq))) %>%
  ggplot(aes(Var2, fct_rev(Var1))) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = label), color = "black") +
  scale_fill_steps(low = "grey", high = "purple", breaks = breaks, na.value = "black") + 
  scale_x_discrete(NULL, expand = c(0, 0), position="top") +
  scale_y_discrete(NULL, expand = c(0, 0)) +
  theme(axis.text.x = element_text(angle=60,vjust = 0.5, hjust = 0)) 

However, because your breaks are so unevenly spaced and the fill scale is linear, most of the tiles end up gray. I don't know if you want to preserve that perspective.
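A rough way to see why this happens is to compute where each break sits along the range of the data. The sketch below is only an approximation of what the binned scale does internally; max_freq is the largest count in the matrix from the question, and the variable names are my own. A log-scaled version is included for comparison.

```r
breaks <- c(0, 1, 5, 10, 50, 100, 500, 100000)
max_freq <- 1733  # largest count in the question's matrix

# Keep only the breaks that fall inside the data range
inner <- breaks[breaks <= max_freq]

# Relative position of each break on a linear 0..max_freq scale
linear_pos <- inner / max_freq

# Relative position on a log scale (0 is dropped, since log(0) is -Inf)
log_pos <- log(inner[inner > 0]) / log(max_freq)

print(round(linear_pos, 3))
print(round(log_pos, 3))
```

On the linear scale every break up to 100 sits in the lowest few percent of the range, so those bins get nearly the same color; on the log scale the same breaks are spread far more evenly.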

To rescale the colors, change the scale_fill_steps() line to

scale_fill_steps(low = "grey", high = "purple", breaks = breaks, na.value = "grey", trans = "log")

which spreads the breaks much more evenly along the scale. I was unsuccessful at setting the colors manually. I also changed the color of the NA value: log(0) is not finite, so after the log transformation all your zeroes are drawn with the na.value color, just like the NAs.
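One possible way to assign a color to each break by hand is a different technique from scale_fill_steps(): bin the counts with cut() and treat the bins as a discrete fill with scale_fill_manual(). This is a minimal sketch, not something I have checked against the full plot. The small data frame is the top-left 3 x 3 corner of the matrix from the question, and the grey-to-purple ramp built with colorRampPalette() is an illustrative choice that you could replace with any vector of seven colors.

```r
library(ggplot2)

breaks <- c(0, 1, 5, 10, 50, 100, 500, 100000)

# Top-left 3 x 3 corner of the question's matrix, already in long format
df <- data.frame(
  Var1 = rep(c("WILL_", "WOULD_", "MAY_"), times = 3),
  Var2 = rep(c("_WILL", "_WOULD", "_MAY"), each = 3),
  Freq = c(NA, 52, 22, 80, NA, 14, 26, 8, NA)
)

# Bin the counts into left-closed intervals: [0,1), [1,5), ..., [500,100000]
df$bin <- cut(df$Freq, breaks = breaks, right = FALSE, include.lowest = TRUE)

# One color per bin, named by bin level (illustrative grey-to-purple ramp)
bin_colours <- setNames(
  colorRampPalette(c("grey", "purple"))(nlevels(df$bin)),
  levels(df$bin)
)

p <- ggplot(df, aes(Var2, Var1)) +
  geom_tile(aes(fill = bin), color = "black") +
  geom_text(aes(label = ifelse(is.na(Freq) | Freq == 0, "", as.character(Freq))),
            color = "black") +
  # drop = FALSE keeps empty bins in the legend
  scale_fill_manual("Freq", values = bin_colours, na.value = "black", drop = FALSE)
```

Because the fill is now discrete, zeroes get their own bin and color instead of turning into NAs, and every bin takes exactly the color you give it regardless of how far apart the breaks are.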

tlaus
  • That's already quite helpful, thanks ! How would you get around to setting a color for each break manually? Just in case that's useful, the second heatmap in my question is based on a colour palette called "Purples2" which seems to go from this light grey to a deep purple. I'm trying to find out what colours that palette is made of, to set those colours manually unless there's an automatic alternative – Cameron Jan 21 '21 at 15:15
  • I have added the code for log transformation of the colors, which will make the steps look even. – tlaus Jan 22 '21 at 13:37
  • Looks even better and is exactly what I'm looking for, thanks! When I run the exact code you provide I get a small error: it can't find the scale_fill_steps() function. Is this normal? – Cameron Jan 22 '21 at 13:41
  • I just tried it out in a fresh session and had no issues. The [scale_fill_steps](https://ggplot2.tidyverse.org/reference/scale_steps.html) is part of the ggplot2 package, I have version 3.3.3 – tlaus Jan 22 '21 at 13:49
  • Just updated to R 4.0 and it works. Thanks, looks great ! – Cameron Jan 22 '21 at 14:08
  • Yes, if you change all the Freq values that equal 0 to a small number (like 0.001) and then replace 0 in your breaks with that number. – tlaus Jan 22 '21 at 14:16
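The suggestion in the last comment could be sketched as follows. This is a minimal example under my own assumptions: eps and breaks_log are hypothetical names, and the data frame holds only the "_CAN" column of the question's matrix. The labels are built before the zeroes are replaced, so those tiles still show no number. In newer ggplot2 releases the trans argument is spelled transform, and trans may raise a deprecation warning.

```r
library(ggplot2)

breaks <- c(0, 1, 5, 10, 50, 100, 500, 100000)
eps <- 0.001  # small stand-in for zero, as suggested in the comment

# The "_CAN" column of the question's matrix, in long format
df <- data.frame(
  Var1 = c("WILL_", "WOULD_", "MAY_", "MIGHT_", "CAN_", "COULD_",
           "SHALL_", "SHOULD_", "MUST_", "OUGHT TO_", "USED TO_"),
  Var2 = "_CAN",
  Freq = c(111, 9, 273, 1733, NA, 60, 3, 14, 164, 0, 8)
)

# Build the labels first so that zero cells stay blank
df$label <- ifelse(is.na(df$Freq) | df$Freq == 0, "", as.character(df$Freq))

# Replace zeroes in the data and in the breaks; which() skips the NAs
df$Freq[which(df$Freq == 0)] <- eps
breaks_log <- replace(breaks, breaks == 0, eps)

p <- ggplot(df, aes(Var2, Var1)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = label), color = "black") +
  scale_fill_steps(low = "grey", high = "purple", breaks = breaks_log,
                   na.value = "black", trans = "log")
```

With the zeroes mapped to a finite value they fall into the lowest bin rather than being drawn as NA, so na.value can go back to black for the true NAs.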