My dataset has a large number of columns whose names start with "dis_".
The values in these columns are either 0 (without disease) or 1 (with disease). I would like to create a data frame of the observations that have 1 for a specific disease and 0 for every other disease.
I have tried the following:
istroke <- filter(onlyCRP, dis_ep0009 == 1 & grep("dis_" == 0))
and in combination with select:
istroke1 <- filter(onlyCRP, dis_ep0009 == 1 & select(contains("dis_") == 0))
As you'd guess, neither of them works.
I have looked at these posts:
filtering columns by regex in dataframe
Subset data based on partial match of column names
But they don't answer my question.
Please let me know if you require further clarifications.
Edit: I realized I needed to clarify what I want. Consider this table:
dis_ep0009 dis_epxxx dis_epxxx
0 0 0
0 1 0
0 0 1
1 0 1
0 0 0
0 0 0
1 1 1
I need another column, e.g. IS, defined by the following conditions on these 3 columns (I actually have 29 of these "dis_" columns):
If dis_ep0009 == 1, then IS == 1 (regardless of whether the other "dis_" columns are 0 or 1).
If dis_ep0009 == 0 and any other "dis_" column == 1, I want to drop the observation.
If dis_ep0009 == 0 and all other "dis_" columns == 0, I want to code IS == 0.
So the resulting table should look like this:
dis_ep0009 dis_epxxx dis_epxxx IS
0 0 0 0
0 1 0 drop
0 0 1 drop
1 0 1 1
0 0 0 0
0 0 0 0
1 1 1 1
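To make the rules concrete, here is a minimal base R sketch of the logic I have in mind, on toy data that mirrors the table above. The names dis_epxxx1 and dis_epxxx2 are hypothetical placeholders for the other "dis_" columns, and the sketch assumes the columns contain no NA values. It is only meant to illustrate the expected result, not a definitive solution.

```r
# Toy data mirroring the table above; dis_epxxx1 and dis_epxxx2 are
# hypothetical placeholder names for the other "dis_" columns.
df <- data.frame(
  dis_ep0009 = c(0, 0, 0, 1, 0, 0, 1),
  dis_epxxx1 = c(0, 1, 0, 0, 0, 0, 1),
  dis_epxxx2 = c(0, 0, 1, 1, 0, 0, 1)
)

target <- "dis_ep0009"

# All other columns whose names start with "dis_"
others <- setdiff(grep("^dis_", names(df), value = TRUE), target)

# TRUE where at least one of the other disease columns is 1
# (assumes no NA values in these columns)
other_disease <- rowSums(df[, others, drop = FALSE] == 1) > 0

# Keep rows that have the target disease, or that have no disease at all
keep <- df[[target]] == 1 | !other_disease
result <- df[keep, , drop = FALSE]

# IS is 1 for the target disease and 0 for the disease-free rows
result$IS <- as.integer(result[[target]] == 1)

print(result)
```

On the toy data this keeps rows 1, 4, 5, 6 and 7 and drops rows 2 and 3, which matches the table above.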
I have tried pairing filter (dplyr) with grep and ifelse statements but can't make heads or tails of it. In essence, it should be something simple like this (not meant to work):
istroke <- filter(df, ifelse(dis_ep0009 == 1, 1, ifelse(dis_ep0009 == 0 & grep("dis_", names(df)) == 0, 0, ifelse(dis_ep0009 == 0 & grep("dis_", names(df)) == 1, drop())))
Thanks in advance!