0

For a heatmap made using ggplot and geom_tile, how would you "cross off" a tile based on a conditional value?

The heatmap shows counts of the number of times an animal performed a behavior between 1990-2020. Rows are animal IDs, columns are years.

Years go from 1990-2020 but not all animals are alive throughout that time frame (ie, some born later than 1990 or die earlier than 2020)

So I want to cross off any tiles where an animal isn't alive, or before it was born.

Data look like this (shortened to 5 rows for brevity):

data <- data.frame(date = structure(c(8243, 8243, 8243, 8248, 8947), class = "Date"), 
                       year = c("1992", "1992", "1992", "1992", "1994"), 
                       event.id = c(8L, 8L, 8L, 10L, 11L), 
                       id = c("L5", "L58", "L73", "L21", "L5"),
                       birth = c(1964L, 1980L, 1986L, 1950L, 1964L), 
                       death = c(2012L, 2003L, NA, NA, 2012L))

NA means the animal is still alive and it wouldn't be crossed off since before it was born.

Any help to create this is greatly appreciated!

Code looks like this:

heatmap <- data %>%
mutate(x = case_when(year %in% 1990:1999 ~ "1990-1999",
                   TRUE ~ year)) %>%
mutate(y = paste(id)) %>%
group_by(x, y, .drop  = FALSE) %>%
summarize(count = n()) %>%
arrange(y)

ggplot(data = heatmap, aes(x, y, fill = count)) +
geom_tile()

EDIT Current heat map. enter image description here

burphound
  • 161
  • 7
  • Visually, how would a "crossed-off" tile look like? – Maurits Evers Jun 24 '22 at 03:18
  • Maybe a simple diagonal or horizontal slash. I edited my question to add what the heatmap looks like now (still a work in progress). – burphound Jun 24 '22 at 04:19
  • I think crossed off tiles would be visually very distracting. I'd suggest using color: for example use plain white for tiles before birth and after death (get rid of the `theme_gray`), use a darker gray for the `NA` color indicating absence of data (`+ scale_fill_continuous(na.value = "gray60")`), and then let non-gray colors represent counts (as you have already... sort of... not sure what's wrong with your picture). – Gregor Thomas Jun 24 '22 at 04:43
  • Thanks Gregor. I like the idea of using colors instead of crossing-off. I would like to differentiate between 0 counts (most of the graph TBH) and absence of data (when an animal wasn't born or had died) - to do that, is that something I'd first have to transform the data beforehand? As my data structure is now, each animal that performs the behavior has its own row, so there is no row for "no" counts for an animal. Any help you can provide in sorting that out is appreciated! Oh yeah, and ignore all the tiles being red...meant to edit that out before posting. – burphound Jun 24 '22 at 04:53
  • Yes, data where the rows don't exist will not be plotted. Data with `NA` values will be plotted with fill color of the `na.value` arg in `scale_fill_continuous`. – Gregor Thomas Jun 27 '22 at 17:09
  • @burphound, were you about to figure this out? – Skaqqs Aug 22 '22 at 14:02

1 Answers1

1

Here's how you could use color to indicate NA, like suggested by @Gregor Thomas.

Transforming your data to "complete":

library(dplyr)
library(tidyr)
library(ggplot2)

hm <- dat %>%
  mutate(x = case_when(year < 1960 ~ "Pre-1960", 
                       year %in% 1960:1969 ~ "1960-1969",
                       year %in% 1970:1979 ~ "1970-1979",
                       year %in% 1980:1989 ~ "1980-1989",
                       year %in% 1990:1999 ~ "1990-1999",
                       TRUE ~ year)) %>%
  mutate(y = paste(matriline, id)) %>%
  group_by(x, y, .drop  = FALSE) %>%
  summarize(count = n()) %>%
  ungroup() %>%
  tidyr::complete(x, y) %>%
  arrange(y) %>%
  tidyr::separate(y, into = c("ym", "yid"), sep = " ", remove = FALSE)

Then define a color for NA:

ggplot(data = hm, aes(x, yid, fill = count)) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "red", na.value = "grey50") +
  scale_x_discrete(position = "top", drop = FALSE) +
  scale_y_discrete(limits=rev) +
  labs(x = NULL, y = NULL) +
  facet_wrap( ~ ym, strip.position = "left", dir = "v", ncol = 1) +
  theme(panel.spacing = unit(0.2, "lines"),
        strip.background = element_blank(),
        strip.placement = "outside",
        axis.text.x = element_text(angle = 45, hjust = -0.02))

enter image description here

Data:

ids <- c("J11", "J16", "J17", "J02", "J22", "J26", "J27", "J30")
matrilines <- c("J02","J04", "K11", "L20", "P90", "K100", "R22")

dat <- data.frame(year = as.character(sample(1960:2018, 1000, replace = TRUE)),
                  id = sample(ids, 1000, replace = TRUE),
                  matriline = sample(matrilines, 1000, replace = TRUE))
Skaqqs
  • 4,010
  • 1
  • 7
  • 21