The data below have an IndID field as well as three columns containing numbers, including NA in some instances, with a varying number of rows for each IndID.
library(dplyr)
n <- 10
set.seed(123)
dat <- data.frame(IndID = sample(c("AAA", "BBB", "CCC", "DDD"), n, replace = TRUE),
                  Num1 = c(2, 4, 2, 4, 4, 1, 3, 4, 3, 2),
                  Num2 = sample(c(1, 2, 5, 8, 7, 8, NA), n, replace = TRUE),
                  Num3 = sample(c(NA, NA, NA, 8, 7, 9, NA), n, replace = TRUE)) %>%
  arrange(IndID)
head(dat)
  IndID Num1 Num2 Num3
1   AAA    1   NA    7
2   BBB    2   NA   NA
3   BBB    2    7    7
4   BBB    2   NA   NA
5   CCC    3    2    8
6   CCC    3    5   NA
For each IndID, I would like to make a new column Max that contains the maximum value for Num1:Num3. In most instances this involves finding the max value across multiple rows and columns. Within dplyr I am missing the final step (below) and would appreciate any suggestions.
dat %>%
  group_by(IndID) %>%
  mutate(Max = "???")
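
For illustration, here is a minimal sketch of one way the missing step might be filled in. It assumes that NA values should be ignored when taking the maximum, which the description above does not state explicitly. The toy data frame and the names toy and res are hypothetical, chosen so the result can be checked by hand; they are not the data above.

```r
library(dplyr)

# Hypothetical toy data (not the data from the question)
toy <- data.frame(
  IndID = c("AAA", "BBB", "BBB", "CCC"),
  Num1  = c(1, 2, 2, 3),
  Num2  = c(NA, NA, 7, 2),
  Num3  = c(7, NA, 7, 8)
)

res <- toy %>%
  group_by(IndID) %>%
  # c() pools all values of the three columns within the current group;
  # na.rm = TRUE drops NA values (an assumption about the desired behaviour)
  mutate(Max = max(c(Num1, Num2, Num3), na.rm = TRUE)) %>%
  ungroup()

print(res)
```

Because max() returns a single value per group, mutate() repeats it on every row of that group. One caveat: if every value in a group were NA, max() with na.rm = TRUE would return -Inf with a warning, so that case may need separate handling.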