I have a data frame, similar to the one below (see dput), recording responses of a variable to a treatment over time:
df <- structure(list( time = c(0, 0, 0, 0, 0, 0, 14, 14, 14, 14, 14, 14, 33, 33, 33, 33, 33, 33, 90, 90, 90, 90, 90, 90),
trt = structure(c(2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L),
.Label = c("1", "2"), class = "factor"),
A1 = c(6.301, 5.426, 5.6021, NA, NA, NA, 6.1663, 6.426, 6.8239, 2.301, 4.7047, 2.301, 5.8062, 4.97, 4.97, 2.301, 2.301, 2.301, 2.301, 2.301, 2.301, 2.301, 2.301, 2.301),
B1 = c(5.727, 5.727, 5.4472, NA, NA, NA, 6.6021, 7.028, 7.1249, 3.028, 3.1663, 3.6021, 5.727, 5.2711, 5.2389, 3.3554, 3.9031, 4.2389, 3.727, 3.6021, 3.6021, 3.8239, 3.727, 3.426)),
row.names = c(NA, -24L), class = c("tbl_df", "tbl", "data.frame"))
which looks like this:
    time trt      A1    B1
   <dbl> <fct> <dbl> <dbl>
 1     0 2      6.30  5.73
 2     0 2      5.43  5.73
 3     0 2      5.60  5.45
 4     0 1     NA    NA
 5     0 1     NA    NA
 6     0 1     NA    NA
 7    14 2      6.17  6.60
 8    14 2      6.43  7.03
 9    14 2      6.82  7.12
10    14 1      2.30  3.03
In our experiments, we don’t always record values for all treatments at time == 0. I want to replace any missing values (NA) when (and only when) time == 0 with the mean of the trt ‘2’ group at time == 0. So the NAs in A1 should all become 5.78, and those in B1 should become 5.63.
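As a sanity check on those two target numbers, here is a minimal sketch that recomputes them from the three trt ‘2’ rows at time == 0. The values are copied from the dput above; the names a1, b1, fill_A1 and fill_B1 are only illustrative.

```r
# Time-0 values for trt "2", copied from the dput above
a1 <- c(6.301, 5.426, 5.6021)
b1 <- c(5.727, 5.727, 5.4472)

# The means that should replace the NAs at time == 0
fill_A1 <- mean(a1)
fill_B1 <- mean(b1)

round(c(A1 = fill_A1, B1 = fill_B1), 2)
#>   A1   B1
#> 5.78 5.63
```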
Using answers from here and here, as well as some others, I have been able to come up with the following:
library(dplyr)

df %>%
  mutate_if(is.numeric, funs(if_else(is.na(.), if_else(time == 0, 0, .), .)))
This replaces NA at time == 0 with 0, which is useful for some of my variables where none of the treatments have data at time == 0, but it is not what I'm after here. I also tried this:
df %>%
  mutate_if(is.numeric, funs(if_else(is.na(.), if_else(time == 0, mean(., na.rm = TRUE), .), .)))
This is closer to what I want, but it averages the values from the whole column/variable. Can I make it average only the values from treatment ‘2’ at time == 0?
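For illustration, one way the restricted mean might be expressed is to subset the column inside the mean() call. This is only a sketch under some assumptions: it uses across(), which needs dplyr 1.0.0 or later, instead of mutate_if() and funs() (funs() has been deprecated since dplyr 0.8.0). It names A1 and B1 explicitly so that the numeric time column is left alone, and it rebuilds the data as a plain data.frame rather than the tibble from the dput. The name filled is illustrative.

```r
library(dplyr)

# Same values as the dput above, rebuilt as a plain data.frame for brevity
df <- data.frame(
  time = rep(c(0, 14, 33, 90), each = 6),
  trt  = factor(rep(rep(c("2", "1"), each = 3), times = 4), levels = c("1", "2")),
  A1   = c(6.301, 5.426, 5.6021, NA, NA, NA, 6.1663, 6.426, 6.8239, 2.301, 4.7047, 2.301,
           5.8062, 4.97, 4.97, 2.301, 2.301, 2.301, 2.301, 2.301, 2.301, 2.301, 2.301, 2.301),
  B1   = c(5.727, 5.727, 5.4472, NA, NA, NA, 6.6021, 7.028, 7.1249, 3.028, 3.1663, 3.6021,
           5.727, 5.2711, 5.2389, 3.3554, 3.9031, 4.2389, 3.727, 3.6021, 3.6021, 3.8239, 3.727, 3.426)
)

filled <- df %>%
  mutate(across(
    c(A1, B1),
    # Replace a value only where it is NA *and* time == 0;
    # the replacement is the mean of that column over trt "2" rows at time 0
    ~ if_else(
      is.na(.x) & time == 0,
      mean(.x[time == 0 & trt == "2"], na.rm = TRUE),
      .x
    )
  ))

head(filled)
```

With this data, rows 4 to 6 of A1 and B1 should come out as roughly 5.78 and 5.63, and every other value is passed through unchanged, including any NA that occurs at a later time point.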