I have a data frame of hourly weather data. I want to group the hours into days and assign values to the new daily data frame when a condition is met and only the mean of the rows that meet that condition: In the daily data frame I don't want to use 0 values or non-zero values that do not meet the condition HumH >50 | WetH>30
My data looks like this (sorry I couldn't use a smaller subset): I would like to have
for day 1 --> only the mean(of the rows 12:18)
for day 2 --> 0 (since neither condition is met)
structure(list(Day = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), Temp = c(12, 12, 12,
11, 10, 9, 9, 9, 9, 9, 8, 7, 6, 6, 6, 6, 6, 8, 11, 12, 13, 14,
15, 16, 16, 16, 16, 14, 13, 11, 10, 9, 8, 7, 7, 6, 4, 3, 3, 2,
4, 8, 11, 13, 14, 16, 17, 18), RH = c(69L, 68L, 71L, 73L, 76L,
81L, 78L, 77L, 76L, 78L, 80L, 83L, 88L, 90L, 91L, 94L, 93L, 91L,
78L, 67L, 62L, 56L, 49L, 39L, 36L, 34L, 34L, 36L, 38L, 42L, 48L,
50L, 53L, 58L, 59L, 63L, 66L, 69L, 71L, 73L, 71L, 60L, 50L, 43L,
41L, 36L, 34L, 31L), HumH = c(NA, 0, 0, 0, 0, 9, 0, 0, 0, 0,
8, 15, 21, 27, 33, 39, 45, 53, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0),
WetH = c(NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 12,
18, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA,
-48L), class = "data.frame")
Then the 10th and 11thn days are the following
I would like to have
for day 10 --> only the mean(of the rows 9:16)
for day 11 --> 0 (since neither condition is met)
structure(list(Day = c(10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11,
11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,
11, 11, 11, 11, 11, 12), Temp = c(18, 17, 16, 14, 13, 12, 11,
9, 8, 7, 7, 6, 6, 6, 7, 8, 8, 8, 8, 8, 9, 9, 9, 10, 10, 11, 10,
9, 9, 9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 8, 9, 10, 12, 13, 13, 14,
15, 15), RH = c(36L, 39L, 42L, 48L, 55L, 65L, 69L, 76L, 81L,
85L, 84L, 90L, 90L, 92L, 90L, 88L, 89L, 89L, 90L, 86L, 83L, 83L,
81L, 78L, 75L, 76L, 79L, 87L, 89L, 87L, 87L, 78L, 72L, 70L, 70L,
76L, 77L, 82L, 86L, 84L, 81L, 72L, 62L, 55L, 51L, 50L, 46L, 44L
), HumH = c(0, 0, 0, 0, 13, 0, 0, 0, 8, 15, 22, 28, 34, 40, 47,
55, 63, 71, 79, 87, 96, 105, 114, 0, 0, 0, 0, 9, 18, 27, 36,
0, 0, 0, 0, 0, 0, 7, 14, 22, 31, 0, 0, 0, 0, 0, 0, 0), WetH = c(0,
0, 0, 0, 14, 0, 0, 0, 0, 0, 0, 0, 0, 6, 12, 0, 0, 8, 16, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0)), row.names = c(NA, -48L), class = "data.frame")
And my expected output is:
structure(list(Day = c(1, 2, 3, 4, 5, 6), Temp = c(6.2, 0, 0,
0, 0, 0), RH = c(75.3333333333333, 49.8333333333333, 45.625,
51.6666666666667, 52.75, 56.2083333333333)), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame")
This is what I have done so far, but I cannot add the conditions:
weather_day <- weather_hour %>%
group_by(as.numeric(Day)) %>%
group_map(~head(.x,24L)) %>%
summarize(Temp_day=mean(Temp),RH_day=mean(RH))}
I want the values of Temp
and RH
of the daily data frame to be for each group. Here is some pseudocode:
if(HumH <50 | WetH < 30){
Temp= 0 & RH= 0
}
else{
Temp = mean(of consecutive non-zeros that lead to the condition) &
RH = mean(of consecutive non-zeros that lead to the condition)
}