Update
As @r_31415 noted in the comments, packages such as stringr provide functions that address this question better.
With str_subset(string, pattern, negate=FALSE), one can filter character vectors like this:
library(stringr)
# Strings that have at least one character that is neither "A" nor "B".
> c("AB", "BA", "ab", "CA") %>% str_subset("[^AB]")
[1] "ab" "CA"
# Strings that do not include characters "A" or "B".
> c("AB", "BA", "ab", "CA") %>% str_subset("[AB]", negate=TRUE)
[1] "ab"
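For readers who want to see what the filter is doing, here is a minimal sketch that expresses the same two filters through str_detect() and logical indexing. The stringr documentation describes str_subset() as a wrapper around this pattern; the variable names x, kept, and dropped are only illustrative.

```r
library(stringr)

x <- c("AB", "BA", "ab", "CA")

# Keep strings containing at least one character other than "A" or "B".
kept <- x[str_detect(x, "[^AB]")]

# Keep strings containing neither "A" nor "B" (the negate=TRUE case).
dropped <- x[!str_detect(x, "[AB]")]

print(kept)
print(dropped)
```

Writing the filter this way can be handy when the logical vector itself is needed, for example to combine several conditions with & or |.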
By default, the pattern is interpreted as a regular expression. Therefore, to search for literal patterns that contain special characters such as (, *, and ?, one can wrap the pattern string in the modifier function fixed(literal_string) instead of escaping with double backslashes or using a raw string (available since R 4.0.0):
# Escape special characters with "\\" (a backslash must itself be escaped in a string literal).
> c("(123.5)", "12345") %>% str_subset("\\(123\\.5\\)")
[1] "(123.5)"
# R 4.0.0 and later support raw strings, which are handy for regex patterns
> c("(123.5)", "12345") %>% str_subset(r"{\(123\.5\)}")
[1] "(123.5)"
# use the fixed() modifier
> c("(123.5)", "12345") %>% str_subset(fixed("(123.5)"))
[1] "(123.5)"
# Unexpected results without escaping or the fixed() modifier:
# as a regex, the parentheses form a group and "." matches any character
> c("(123.5)", "12345") %>% str_subset("(123.5)")
[1] "(123.5)" "12345"
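The same difference shows up with the other special characters mentioned above. The following is a small sketch with made-up input strings (the vector name strings and its contents are assumptions for illustration), comparing regex matching with fixed() for * and ?:

```r
library(stringr)

strings <- c("a*b", "aab", "what?", "wha")

# As a regex, "a*b" means "zero or more 'a', then 'b'",
# so both "a*b" and "aab" match.
regex_star <- str_subset(strings, "a*b")

# fixed() compares the characters literally, so only "a*b" matches.
literal_star <- str_subset(strings, fixed("a*b"))

# As a regex, "what?" means "wha" followed by an optional "t".
regex_qmark <- str_subset(strings, "what?")

# With fixed(), the question mark must actually be present.
literal_qmark <- str_subset(strings, fixed("what?"))

print(regex_star)
print(literal_star)
print(regex_qmark)
print(literal_qmark)
```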
Original Answer
Sorry for posting on a 5-month-old question, but I wanted to record a simpler solution.
The dplyr package can filter character vectors in the following ways:
> c("A", "B", "C", "D") %>% .[matches("[^AB]", vars=.)]
[1] "C" "D"
> c("A", "B", "C", "D") %>% .[.!="A"]
[1] "B" "C" "D"
The first approach lets you filter with a regular expression, and the second is shorter to write. It works because dplyr imports magrittr; although it masks some of magrittr's functions, such as extract, it does not mask the dot placeholder (.).
Details of the dot placeholder can be found in the help page of the forward-pipe operator %>%. The placeholder has three main uses:
- Using the dot for secondary purposes
- Using lambda expressions with %>%
- Using the dot-place holder as lhs
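Here we are taking advantage of the first of these: the dot supplies the vector being subset and is reused inside the index expression. (The third use, with the dot as lhs, instead builds a reusable function.)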
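To make the three uses concrete, here is a minimal sketch using magrittr directly. The variable names and the helper drop_a are hypothetical and chosen only for this illustration.

```r
library(magrittr)

x <- c("A", "B", "C", "D")

# Dot for secondary purposes: the dot is the object being subset
# and is reused inside the index expression.
not_a <- x %>% .[. != "A"]

# Lambda expression: braces on the right-hand side act as a
# one-off function of the dot.
not_ab <- x %>% { .[!(. %in% c("A", "B"))] }

# Dot as lhs: the pipeline is not evaluated immediately; it yields
# a reusable function (a functional sequence).
drop_a <- . %>% .[. != "A"]
also_not_a <- drop_a(x)

print(not_a)
print(not_ab)
print(also_not_a)
```

The functional-sequence form is mostly useful when the same filter has to be applied to several vectors.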