For each group, and for many columns, I want to find the non-missing value associated with the largest index value.
I have gotten fairly close by using summarize_all with which.max, but I am not sure how to remove the NAs from each vector before finding the latest value. I have read about using na.rm in summarize_all with functions like mean, but I am not sure how to get similar functionality when the function has no built-in na.rm argument. I have tried na.omit, but it doesn't give the result I'm looking for.
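library(dplyr)
# Example data: first 10 rows of iris, plus an index column and a grouping column
a <- head(iris, 10)
a$num <- 1:10
a$grp <- c("a","a","a","b","b","c","c","d","d","d")
# Make the latest Species value in group "d" missing
a[10, "Species"] <- NA
# My attempt: drop NAs from each column, then pick the value at the largest num
a %>%
  group_by(grp) %>%
  summarize_all(funs(na.omit(.)[which.max(num)]))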
  grp   Sepal.Length Sepal.Width Petal.Length Petal.Width Species   num
  <chr>        <dbl>       <dbl>        <dbl>       <dbl> <fct>   <int>
1 a             4.70        3.20         1.30       0.200 setosa      3
2 b             5.00        3.60         1.40       0.200 setosa      5
3 c             4.60        3.40         1.40       0.300 setosa      7
4 d             4.90        3.10         1.50       0.100 NA         10
I expect all the values in the Species column to be setosa; however, the last value (group d) is NA.
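To narrow down where the NA might come from, here is a minimal base R sketch that looks only at group d. It suggests that na.omit shortens the column while which.max(num) still returns a position in the full-length group, so the two no longer line up. The helper `last_non_na` is a hypothetical name introduced only for illustration; it is one possible way to keep the values and the index aligned, not necessarily the idiomatic one.

```r
# Rebuild the example data from the question (base R only).
a <- head(iris, 10)
a$num <- 1:10
a$grp <- c("a","a","a","b","b","c","c","d","d","d")
a[10, "Species"] <- NA

# Look at group "d" on its own (rows 8, 9, 10).
d <- a[a$grp == "d", ]

# na.omit() shortens Species to 2 values, but which.max(num) still
# returns a position within the full 3-row group.
short <- na.omit(d$Species)
pos <- which.max(d$num)
length(short)  # 2
pos            # 3
short[pos]     # NA, because position 3 is past the end of the shortened vector

# Hypothetical helper (not part of dplyr): apply the same non-NA mask
# to both the values and the index so that they stay aligned.
last_non_na <- function(x, idx) {
  keep <- !is.na(x)
  x[keep][which.max(idx[keep])]
}

last_non_na(d$Species, d$num)
```

If this reading is right, the same helper could presumably be dropped into the pipeline in place of the na.omit expression, e.g. `funs(last_non_na(., num))`. Note that a group whose column is entirely NA would yield a zero-length result, which this sketch does not handle.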