I'm trying to create a figure that shows the error of machines against temperature and humidity. After reading a paper (see image below), it seems the best route is a hexagon or density plot to show these errors. My issue is that every time I create (1) a density plot, it produces a grey diagram that shows no data whatsoever, and (2) a hexagon plot, it only shows count data.

Example of a subset of my data (with only the temperature, humidity and PM data included, as that's what I want to display):

library(ggplot2)
ggplot(DM_EPA_1H) +
  geom_hex(aes(x = Relative.humidity, y = Temperature, color = Diff_PM1))

Image produced with the hex

The above image is along the lines of what I want, but it's difficult to interpret because it shows count data. I can't tell under what circumstances (temperature/humidity) we are seeing an error.

ggplot(DM_EPA_1H, aes(x=Relative.humidity,y=Temperature), na.rm = FALSE)+
stat_density_2d(aes(fill=Diff_PM1), geom = "polygon")+
scale_fill_viridis_c()

Image produced by stat_density

The above image isn't very interpretable, and I'm unsure what the next best route is to get the desired outcome.

Desired format for displaying data. Credit Lui et al., 2019 (Atmosphere, 10, 41)

Unfortunately, the paper does not include any source code for these figures, which makes them difficult to reproduce. It's possible they weren't even made in ggplot, but to me it looked like they were.

I appreciate the help. Let me know if any more clarification is needed.

1 Answer

Use stat_summary_hex and geom_density2d. With stat_summary_hex, you can specify what you want to calculate for each bin instead of the count; here I assumed you wanted the mean, but you can use essentially any function. Also, you made it a bit difficult by not providing any example data, so I generated some randomly.

library(tidyverse)

set.seed(0)
DM_EPA_1H <- tibble(
  Relative.humidity = rbeta(1000, 6, 1.3) * 100,
  Temperature = rnorm(1000, mean = 50, sd = 10),
  Diff_PM1 = rnorm(1000, mean = 0, sd = 5)
)

ggplot(DM_EPA_1H, mapping = aes(x = Relative.humidity, y = Temperature)) +
  stat_summary_hex(mapping = aes(z = Diff_PM1), fun = ~mean(.x)) +
  scale_fill_steps2(low = "#eb0000", mid = "#e0e0e0", high = "#1094c4") +
  geom_hex(stat = "identity") +
  geom_density2d(colour = "black") +
  geom_point(size = 0.5)

This roughly reproduces the original plot:

roughly reproduced original plot from Lui et al. 2019

Of course, if you want to use viridis as you indicated in your second code sample, you can do that as well with scale_fill_viridis_c instead of scale_fill_steps2.
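
As a rough sketch of that variant, the snippet below swaps in the median as the per-hexagon summary and uses scale_fill_viridis_c for the fill. The data are the same kind of synthetic stand-in as above, not real measurements, and bins = 15, the legend title, and the names p and hex_values are arbitrary choices made for illustration. It assumes the hexbin package is installed, which stat_summary_hex needs.

```r
library(ggplot2)

# Synthetic stand-in data (an assumption; replace with your own data frame).
set.seed(0)
DM_EPA_1H <- data.frame(
  Relative.humidity = rbeta(1000, 6, 1.3) * 100,
  Temperature = rnorm(1000, mean = 50, sd = 10),
  Diff_PM1 = rnorm(1000, mean = 0, sd = 5)
)

p <- ggplot(DM_EPA_1H, aes(x = Relative.humidity, y = Temperature)) +
  # Median of Diff_PM1 within each hexagon; bins = 15 is an arbitrary choice.
  stat_summary_hex(aes(z = Diff_PM1), fun = median, bins = 15) +
  scale_fill_viridis_c(name = "Median Diff_PM1") +
  geom_density2d(colour = "black")

print(p)

# The per-hexagon summaries behind the fill colours are in the `value` column
# of the first layer's computed data.
hex_values <- layer_data(p, 1)
head(hex_values[, c("x", "y", "value")])
```

Checking hex_values this way can help confirm that the colours reflect the summary of Diff_PM1 rather than counts. For a signed error, a diverging scale centred on zero may be easier to read than viridis.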

shizundeiku
Hi, my apologies: I pasted the data into the question, but it was uploaded as an image. This looks great, and I think it's producing what I need. I just have to figure out a color ramp and then we should be good to go! Thanks for your help! – Adam_Eire_2020 Oct 15 '20 at 09:31