I am working with a dataset where I need to evaluate around a hundred columns at a time to create new variables computed by row. I need three new variables. The first uses the "or" operator to flag whether there is any "yes" across the ~100 columns. The second counts how many "yes" values there are in total across those columns. The third is a constellation variable that lists the names of the variables holding a "yes" value. All three are computed row by row. I have code for the first two, but I am stuck on the third. I am using only a few variables here for example purposes; the real data has ~100 variables. My code is below:
# Making the data (the real dataset has ~100 variables)
test.data <- data.frame(var1 = c("yes", "no", "no", "N/A", NA, NA),
                        var2 = c(NA, NA, "yes", "no", "yes", NA),
                        var3 = c("yes", "yes", "yes", "no", "yes", "N/A"),
                        var4 = c("N/A", "yes", "no", "no", "yes", NA))
# Code for the first two variables, is.positive and number.pos.
# It is neither elegant nor efficient, since I need to work with ~100 variables.
library(dplyr)

final.data <- data.frame(test.data %>%
  mutate(is.positive = ifelse(var1 == "yes" | var2 == "yes" | var3 == "yes" | var4 == "yes", 1,
                       ifelse((is.na(var1) | var1 == "N/A") &
                              (is.na(var2) | var2 == "N/A") &
                              (is.na(var3) | var3 == "N/A") &
                              (is.na(var4) | var4 == "N/A"), NA, 0))) %>%
  rowwise() %>%
  mutate(number.pos = sum(c_across(c(var1, var2, var3, var4)) == "yes", na.rm = TRUE)))
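
For reference, here is a minimal base R sketch of what I am after for all three variables, without typing each column name. It is only one possible approach, and it makes several assumptions: the columns of interest can be selected by a name pattern (the `"^var"` pattern and the `vars` vector are hypothetical), `constellation` is just a placeholder name for the third variable, a row with no "yes" gets an empty string, and `is.positive` is `NA` only when every selected column is `NA` or "N/A". That last rule is not quite what my `ifelse()` code above does, since that code also returns `NA` whenever a row mixes `NA` with "no" values.

```r
# Same toy data as above
test.data <- data.frame(var1 = c("yes", "no", "no", "N/A", NA, NA),
                        var2 = c(NA, NA, "yes", "no", "yes", NA),
                        var3 = c("yes", "yes", "yes", "no", "yes", "N/A"),
                        var4 = c("N/A", "yes", "no", "no", "yes", NA))

# Assumption: the columns to evaluate share a name pattern (hypothetical)
vars <- grep("^var", names(test.data), value = TRUE)

# Character matrix of the selected columns, then two logical matrices
m          <- as.matrix(test.data[vars])
is_yes     <- !is.na(m) & m == "yes"     # TRUE only for an actual "yes"
is_missing <- is.na(m) | m == "N/A"      # TRUE for NA or the "N/A" string

final.data <- test.data

# Number of "yes" values per row
final.data$number.pos <- rowSums(is_yes)

# 1 if any "yes", NA if every column is missing or "N/A", otherwise 0
final.data$is.positive <- ifelse(final.data$number.pos > 0, 1,
                          ifelse(rowSums(is_missing) == length(vars), NA, 0))

# Names of the columns holding "yes", pasted together per row
# (rows without any "yes" end up as an empty string)
final.data$constellation <- apply(is_yes, 1, function(r) {
  paste(vars[r], collapse = ", ")
})

print(final.data)
```

Working on a logical matrix keeps everything vectorised apart from the final `apply()` call, which should matter less than `rowwise()` does with ~100 columns, though I have not benchmarked it.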