I apologize in advance if the title of this post isn't accurate. I know this is probably an easy question, and if I knew the correct terminology I could probably find a previous post about it.
So what I am attempting to do is filter my data using dplyr for a gene family. Here is an example so it makes a bit more sense.
I have a gene family called ADCY, which is made up of 10 separate genes. The family looks like this:
ADCY1
ADCY2
ADCY3
ADCY4
ADCY5
ADCY6
ADCY7
ADCY8
ADCY9
ADCY10
I know I can do something like this, but it is annoying to have to type out all 10 genes, especially when I have a bunch of other gene families I want to look at:
library(dplyr)
genes <- c("ADCY1", "ADCY2", "ADCY3", "ADCY4", "ADCY5", "ADCY6", "ADCY7",
           "ADCY8", "ADCY9", "ADCY10")
df_filtered <- df %>%
  filter(symbol %in% genes)
I was wondering if there is a way to use dplyr to filter on just the start of the gene name, if that makes sense. I know there is starts_with("ADCY"), but my R session crashes when I try to use it inside filter(). Does anyone have a solution?
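
For reference, here is a minimal sketch of one way a prefix filter like this could look. It rests on a few assumptions: the data frame below is made up for illustration, and symbol is taken to be a character column. starts_with() is a tidyselect helper meant for choosing columns by name, so it is not designed to match values inside filter(); the sketch uses base R's startsWith() and grepl() on the column values instead.

```r
library(dplyr)

# Hypothetical example data; `symbol` stands in for the real gene-symbol column.
df <- tibble(
  symbol = c("ADCY1", "ADCY10", "ADCYAP1", "TP53", "ADCY5"),
  value  = c(1.2, 0.4, 3.1, 2.2, 0.9)
)

# Keep rows whose symbol begins with "ADCY".
# startsWith() is base R and tests the values themselves.
by_prefix <- df %>%
  filter(startsWith(symbol, "ADCY"))

# Stricter variant: "ADCY" followed only by digits.
# This drops symbols such as ADCYAP1 that share the prefix
# but are not numbered ADCY genes.
by_pattern <- df %>%
  filter(grepl("^ADCY[0-9]+$", symbol))
```

The plain prefix match is the shortest option, but it also keeps any other symbol that happens to start with the same letters, so the anchored pattern may be the safer choice when family names overlap. For other gene families, the "ADCY" part of the prefix or pattern would be swapped out accordingly.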