I have a data frame in long format that contains multiple rows per ID. It also has a condition column whose value is either "App condition", "Control condition" or NA. Each ID has at least one "App condition" or "Control condition" entry, but the rest of its rows are usually NA. I need to remove every row of any ID that belongs to the app condition. In other words: if condition == "App condition" in any row for ID 5, remove all rows of ID 5.
My df looks something like this:
ID | Condition | .... |
---|---|---|
A | App condition | |
A | NA | |
A | NA | |
B | Control condition | |
B | NA | |
B | Control condition | |
C | NA | |
C | App condition | |
D | NA | |
D | Control condition | |
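For anyone who wants to reproduce this, here is a minimal sketch of the example data as an R data frame. It assumes the columns are literally named `ID` and `Condition` as in the table (my real data uses lowercase names) and leaves out the other columns.

```r
# Reproducible version of the example table.
# Assumption: columns are named ID and Condition; the "...." columns are omitted.
df <- data.frame(
  ID = c("A", "A", "A", "B", "B", "B", "C", "C", "D", "D"),
  Condition = c(
    "App condition", NA, NA,                        # ID A
    "Control condition", NA, "Control condition",   # ID B
    NA, "App condition",                            # ID C
    NA, "Control condition"                         # ID D
  ),
  stringsAsFactors = FALSE
)
```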
I want to keep only the IDs that have at least one "Control condition" entry, including their NA rows. So basically something like this:
ID | Condition | .... |
---|---|---|
B | Control condition | |
B | NA | |
B | Control condition | |
D | NA | |
D | Control condition | |
My approach so far uses dplyr:
df <- df %>%
  group_by(id) %>%
  filter(any(condition != "App condition") | is.na(condition))
But that still returns IDs that belong to the app condition. It only removes the "App condition" rows themselves, so the NA rows of the same ID stay in the data frame.
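One direction I have been considering, as a rough sketch only: compute a single TRUE/FALSE per ID instead of a row-wise test, so the whole group is kept or dropped together. The example below rebuilds the sample data with the assumed column names `ID` and `Condition`, and `result` is just an illustrative name. It seems to give the desired output on the sample, but I am not sure it is the idiomatic way.

```r
library(dplyr)

# Sample data from the table above (assumed column names ID and Condition).
df <- data.frame(
  ID = c("A", "A", "A", "B", "B", "B", "C", "C", "D", "D"),
  Condition = c(
    "App condition", NA, NA,
    "Control condition", NA, "Control condition",
    NA, "App condition",
    NA, "Control condition"
  ),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(ID) %>%
  # any() yields one value per group; na.rm = TRUE makes it ignore the NA rows
  # instead of returning NA. Negating it drops every row of an "App" ID.
  filter(!any(Condition == "App condition", na.rm = TRUE)) %>%
  ungroup()

print(result)
```

Since every ID has either app or control entries, filtering with `any(Condition == "Control condition", na.rm = TRUE)` should be equivalent here.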
Can anybody help?
Thanks a lot!