I'm trying to run a function on columns that contain NA values. When every observation in a column is NA, I would like the function to return NA for that column; when only some of the rows are NA, it should just apply na.rm = TRUE. I've seen a few posts showing how to do this (link_1, link_2, link_3), but none of them seem to work for my function, and I'm not sure where I'm going wrong.
library(dplyr)  # for %>%, group_by() and group_modify()

# data frame
species_1 <- c(NA, 10, 40)
species_2 <- c(NA, NA, 30)
species_3 <- c(NA, NA, NA)
group <- c(1, 1, 1)
df <- data.frame(species_1, species_2, species_3, group)

# function argument
y_true_test <- c(30, 20, 20)
# function
estimate <- function(df, y_true, na.rm = TRUE) {
  if (all(is.na(df))) df[NA_integer_] else
    sqrt(colSums((t(t(df) - y_true))^2, na.rm = na.rm) / 3) / y_true * 100
}
# run
final <- df %>%
  group_by(group) %>%
  group_modify(~ as.data.frame.list(estimate(., y_true_test)))
# species_3 returns 0 when it should be NA
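To be concrete about the behaviour I'm after, here is a minimal sketch for a single vector. The helper name `sum_or_na` and the two example vectors are made up for illustration and are not taken from the linked posts; it uses a plain sum rather than my actual formula.

```r
# Hypothetical helper (the name is made up for illustration):
# returns NA when every value is missing, otherwise sums the non-missing values.
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

partly_missing <- c(NA, 10, 40)
all_missing <- c(NA_real_, NA_real_, NA_real_)

sum_or_na(partly_missing)       # 50
sum_or_na(all_missing)          # NA
sum(all_missing, na.rm = TRUE)  # 0, which matches what I currently get for species_3
```

What I can't work out is how to get this per-column behaviour inside `estimate()` when it is called through `group_modify()`.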
Any help would be greatly appreciated.