I have a nested data.frame, df_nested, in which one of the columns contains a df:
df <- tibble(ID_Value = 1:8,
xyz001 = c("text4", NA, NA, NA, NA, NA, NA, "text2"),
xyz002 = c(NA, NA, NA, "text3", "text1", NA, NA, NA),
xyz003 = c(NA, "text1", NA, NA, "text2", NA, "text2", NA))
I am looking for a way to mutate this df, using something like mutate(across(matches("\\d"), ...)), under these requirements:
- There are 4 cases, i.e. 4 levels ordered by priority: text4 > text3 > text2 > text1. In each column I need to keep only the values with the highest level present in that column. E.g. if a column contains text4, I want to replace text3, text2 and text1 with NA. If the highest level occurs several times in a column, all of those values should be kept (e.g. column xyz003).
- The conditions should be applied without specifying column names, because the number in a column name can be anything.
- If a column contains only NAs, do nothing.
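The requirements above could be sketched as a per-column helper passed to across(). This is only an illustration under assumptions: the helper name keep_highest and its priority argument are hypothetical, and the priority order is taken from the four levels listed above.

```r
library(dplyr)

# Hypothetical helper: keep only the highest-priority values in one column.
# `priority` lists the levels from lowest to highest (assumed from the question).
keep_highest <- function(x, priority = c("text1", "text2", "text3", "text4")) {
  rank <- match(x, priority)            # position in `priority`; NA for NA/unknown values
  if (all(is.na(rank))) return(x)       # all-NA column: leave it untouched
  top <- max(rank, na.rm = TRUE)        # highest level present in this column
  ifelse(!is.na(rank) & rank == top, x, NA_character_)
}

df <- tibble(ID_Value = 1:8,
             xyz001 = c("text4", NA, NA, NA, NA, NA, NA, "text2"),
             xyz002 = c(NA, NA, NA, "text3", "text1", NA, NA, NA),
             xyz003 = c(NA, "text1", NA, NA, "text2", NA, "text2", NA))

# matches("\\d") selects every column whose name contains a digit,
# so ID_Value is skipped and no column name has to be spelled out.
result <- df %>% mutate(across(matches("\\d"), keep_highest))
```

For the nested case, the same helper could presumably be used inside the map() call, with across() applied to each inner data frame.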
My attempt:
df_nested <- df_nested %>%
  mutate(df = map(data, ~ .x %>%
    mutate(across(matches("\\d"), function(x) {
      # conditions (ifelse, case_when or other)
      # ...
    }))))
Also, is it better to use across(), or is vars() still a good way to do it as well?
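For comparison, here are the two spellings side by side on a toy data frame. As far as I understand, mutate_at() with vars() still works but has been superseded by across() since dplyr 1.0.0. The toy data and the use of toupper() are placeholders for illustration only.

```r
library(dplyr)

# Toy data, made up for this comparison
toy <- tibble(ID_Value = 1:2,
              xyz001 = c("text4", NA),
              xyz002 = c(NA, "text1"))

# Current idiom (dplyr >= 1.0.0): across() inside mutate()
new_style <- toy %>% mutate(across(matches("\\d"), toupper))

# Older scoped verb: mutate_at() with vars(); superseded but still available
old_style <- toy %>% mutate_at(vars(matches("\\d")), toupper)

# Both select the same columns and apply the same function
same_result <- identical(new_style$xyz001, old_style$xyz001) &&
  identical(new_style$xyz002, old_style$xyz002)
```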
Thank you in advance.
Expected Output
df <- tibble(ID_Value = 1:8,
xyz001 = c("text4", NA, NA, NA, NA, NA, NA, NA),
xyz002 = c(NA, NA, NA, "text3", NA, NA, NA, NA),
xyz003 = c(NA, NA, NA, NA, "text2", NA, "text2", NA))