I am trying to normalize the counts of the individual bins in a 2D histogram. Group 3 has a substantially higher number of inputs than the other groups, but I want to compare bins across groups. So I am trying to show the proportional y values in each bin, so that the bin counts of each group add up to, e.g., 100.
I reckon that this has to be done on the data frame beforehand. I have managed to normalize the values per group, but I haven't managed to rescale the counts so that they can be visualized this way with the 2D histogram function.
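library(dplyr)  # provides %>%, group_by() and mutate()
# Divide procntStad by its maximum within each class
perClassNormalized <- Variables %>%
  group_by(Class) %>%
  mutate(Nor = procntStad / max(procntStad))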
`Variables` is a data frame with about 10 variables (columns), each with a number of entries for each of 5 classes. The current total counts per class are: 1 = 639, 2 = 247, 3 = 9881, 4 = 1084, 5 = 823. So the number of inputs for class 3 is substantially higher than for the others.
Class | variable1 | variable2 |
---|---|---|
1 | 3 | 7 |
1 | 2 | 3 |
2 | 2 | 6 |
2 | 5 | 8 |
3 | 3 | 9 |
3 | 2 | 1 |
3 | 2 | 3 |
3 | 8 | 4 |
4 | 9 | 5 |
5 | 10 | 2 |
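For a reproducible example, the sample rows above can be built as a data frame. This is only a minimal sketch of the table as shown: the real data has about 10 columns and far more rows, and the column `procntStad` used in the code is not part of this sample.

```r
# Sample rows copied from the table above; the real data is much larger.
Variables <- data.frame(
  Class     = c(1, 1, 2, 2, 3, 3, 3, 3, 4, 5),
  variable1 = c(3, 2, 2, 5, 3, 2, 2, 8, 9, 10),
  variable2 = c(7, 3, 6, 8, 9, 1, 3, 4, 5, 2)
)

# Rows per class; class 3 dominates, as in the full data.
table(Variables$Class)
```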
Example of the image I currently have:
library(ggplot2)
# Legend breaks for the log-scaled fill
my_breaks <- c(2, 10, 50, 100, 5000)
procentStadVSKlasse <- ggplot(perClassNormalized, aes(x = Class, y = Nor)) +
  geom_bin2d(bins = 10) +
  ylab("Percentage bebouwd oppervlak") +  # "Percentage of built-up area"
  xlab("Norm klasse regionale kering") +  # "Norm class of regional flood defence"
  labs(title = "Bebouwd oppervlak") +     # "Built-up area"
  scale_fill_gradient(name = "count", trans = "log", breaks = my_breaks, labels = my_breaks,
                      low = '#55C667FF', high = '#FDE725FF') +
  theme_bw() +
  scale_x_discrete(limits = c(1, 2, 3, 4, 5)) +
  theme(plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
        axis.title.x = element_text(size = 14),
        axis.text.x = element_text(size = 12),
        axis.title.y = element_text(size = 14))
The new image should look similar, but with the counts normalized per class, so that differences between classes are hopefully easier to spot.
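To make the target concrete, here is a minimal sketch of the kind of per-class normalization meant above, on made-up data. The toy data frame, the bin edges, and the names `toy`, `binPercent` and `pct` are assumptions for illustration only. It is not the real data and not necessarily the best way to get this into the 2D histogram.

```r
library(dplyr)

# Made-up stand-in for perClassNormalized: class 3 has far more rows.
set.seed(1)
toy <- data.frame(
  Class = rep(1:3, times = c(20, 10, 200)),
  Nor   = runif(230)
)

# Cut Nor into 10 equal-width bins, count rows per class and bin,
# then rescale so that the bins of each class add up to 100.
binPercent <- toy %>%
  mutate(bin = cut(Nor, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)) %>%
  count(Class, bin) %>%
  group_by(Class) %>%
  mutate(pct = 100 * n / sum(n)) %>%
  ungroup()
```

A table like `binPercent` could presumably be drawn with `geom_tile()` using `pct` as the fill, instead of letting `geom_bin2d()` compute raw counts.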