I have categorical data that I'd like to map the frequency of using a heatmap (geom_tile), much like the example below:
library(ggplot2)
data("mtcars")
freq <- data.frame(xtabs(~cyl + gear, mtcars)) # count the number of 4-, 6-, and 8-cyl cars by gear
ggplot(freq, aes(cyl, gear)) +
  geom_tile(aes(fill = Freq)) +
  scale_fill_gradient(low = "white", high = "steelblue")
But I'd like to split each tile according to the proportion of significant or non-significant results (0-1 values). In this example, I would generate the same frequency count but differentiate between automatic and manual transmission (am):
freq_am <- data.frame(xtabs(~cyl + gear + am, mtcars))
print(freq_am)
#cyl gear am Freq
4 3 0 1
6 3 0 2
8 3 0 12
4 4 0 2
6 4 0 2
8 4 0 0
4 5 0 0
6 5 0 0
8 5 0 0
4 3 1 0
6 3 1 0
8 3 1 0
4 4 1 6
6 4 1 2
8 4 1 0
4 5 1 2
6 5 1 1
8 5 1 2
The resulting heatmap would use (for example) blue for values of am==0 and red for am==1. Each tile would be divided (along a diagonal?) according to the proportion of cars of that type that are automatic (am==0) or manual (am==1). The shades of blue and red would be proportional to the count, just as the gradient already reflects.
For example:
the top left tile (4,5) would be entirely light red, because all of the 4-cyl, 5-gear cars (count = 2) are manual
the middle left tile (4,4) would be 1/4 blue and 3/4 red, because 25% of the 4-cyl, 4-gear cars are automatic (count = 2) and 75% are manual (count = 6)
the bottom left tile (4,3) would be entirely the lightest blue, because all of the 4-cyl, 3-gear cars (count = 1) are automatic
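
To make the arithmetic in these examples concrete, here is a rough sketch of how the per-tile proportions could be computed and drawn. It is not the diagonal split described above: as a simpler stand-in it splits each tile vertically with geom_rect, putting the am==0 share on the left and the am==1 share on the right. The helper columns (total, prop, x, y, xmin, xmax), the choice of "firebrick" for red, and the use of alpha to show the count are all assumptions made for illustration.

```r
library(ggplot2)

data("mtcars")
freq_am <- data.frame(xtabs(~cyl + gear + am, mtcars))

# Total count per (cyl, gear) tile, and each am level's share of that total
freq_am$total <- ave(freq_am$Freq, freq_am$cyl, freq_am$gear, FUN = sum)
freq_am$prop  <- ifelse(freq_am$total > 0, freq_am$Freq / freq_am$total, 0)

# Tile centres: the factor levels sit at integer positions 1, 2, 3
freq_am$x <- as.integer(freq_am$cyl)
freq_am$y <- as.integer(freq_am$gear)

# am == 0 starts at the left edge of the tile;
# am == 1 fills the remainder up to the right edge
is_auto <- freq_am$am == "0"
freq_am$xmin <- freq_am$x - 0.5 + ifelse(is_auto, 0, 1 - freq_am$prop)
freq_am$xmax <- freq_am$xmin + freq_am$prop

# One rectangle per non-empty (cyl, gear, am) cell;
# fill encodes am, alpha encodes the count (assumed encoding)
p <- ggplot(subset(freq_am, Freq > 0)) +
  geom_rect(aes(xmin = xmin, xmax = xmax,
                ymin = y - 0.5, ymax = y + 0.5,
                fill = am, alpha = Freq)) +
  scale_x_continuous(breaks = 1:3, labels = levels(freq_am$cyl), name = "cyl") +
  scale_y_continuous(breaks = 1:3, labels = levels(freq_am$gear), name = "gear") +
  scale_fill_manual(values = c("0" = "steelblue", "1" = "firebrick"))

print(p)
```

With this layout the (4,4) tile comes out one quarter blue and three quarters red, matching the example above. A true diagonal split would presumably need geom_polygon with explicitly computed corner coordinates instead of geom_rect.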