I am sure I am not the only person who has asked this but after hours of searching with no luck I need to ask the question myself.
I have a df (rp) like so:
rp <- structure(list(agec1 = c(7, 16, 11, 11, 17, 17),
agec2 = c(6, 12, 9, 9, 16, 15),
agec3 = c(2, 9, 9, 9, 14, NA),
agec4 = c(NA, 7, 9, 9, 13, NA),
agec5 = c(NA, 4, 7, 7, 10, NA),
agec6 = c(NA, NA, 6, 6, 9, NA),
agec7 = c(NA, NA, NA, NA, 7, NA),
agec8 = c(NA, NA, NA, NA, 5, NA),
row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
Where each obs in agecX refers to the age of a parent's children up to 8 children. I would like to create a new column "agec5_12" that contains the age of the oldest child aged 5-12. So my df would look like this:
rpage <- structure(list(agec1 = c(7, 16, 11, 11, 17, 17),
agec2 = c(6, 12, 9, 9, 16, 15),
agec3 = c(2, 9, 9, 9, 14, NA),
agec4 = c(NA, 7, 9, 9, 13, NA),
agec5 = c(NA, 4, 7, 7, 10, NA),
agec6 = c(NA, NA, 6, 6, 9, NA),
agec7 = c(NA, NA, NA, NA, 7, NA),
agec8 = c(NA, NA, NA, NA, 5, NA),
agec5_12 = c(7, 12, 11, 11, 10, NA))
row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
Notes about my data:
- Ages are not always in the same chronological order i.e. youngest to oldest or oldest to youngest
- It is possible for a row to have no children aged within this range (in which case I would like NA to be returned)
I have tried writing a function and applying it using rowwise
and mutate
:
fun.age5_12 <- function(x){
x[which(x == max(x[(x > 4) & (x < 13)], na.rm = TRUE))]
}
rpage <- rp %>%
select(-c(20:21, 199:200)) %>%
rowwise() %>%
mutate(agec5_12 = fun.age5_12(c(1:8)))
However, this returns all obs as "12". Ideally I would like to do this using dplyr. Any suggestions using mutate
or ifelse
and not necessarily with functions are fine.
Thank you