I'm looking to match mentions of foo in a username. I need to match strings that start with '@' and contain the word 'foo' anywhere within the username, where the username ends at either a space or punctuation.

I need to be able to match:

example1: @anycharacterhere_foo, anything else here

example2: @foo_anymorecharacters here

I'm looking to use the stringr library like so:

str_extract_all(x, perl("?<=@"))

What I don't understand is the match all function

Alexander
lmcshane

2 Answers

Assuming that your usernames won't have special characters:

library(stringr)

x <- "@anycharacterhere_foo, anything else here"
# \\w* matches zero or more word characters on either side of "foo"
username <- str_extract_all(x, "\\w*(foo)\\w*")

This returns a list whose first element holds your username. It will also pick up any other word containing foo later in the string; you can avoid that by using str_extract instead of str_extract_all, since str_extract returns only the first match. I am not certain whether you need every foo in the string or only the username, which in your example data is at the beginning. You can also restrict str_extract_all to usernames by including the @ in the pattern:

username <- str_extract_all(x, "\\@\\w*(foo)\\w*")
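
To make the difference between these variants concrete, here is a minimal sketch. The input string is a made-up example (not from the question) that contains one username plus a stray foo later in the text:

```r
library(stringr)

# Hypothetical input: one username, then a stray "foo" in the remaining text
x <- "@anycharacterhere_foo, some foo text here"

# Without the "@", every word containing "foo" is returned
all_foo <- str_extract_all(x, "\\w*foo\\w*")[[1]]
all_foo
# [1] "anycharacterhere_foo" "foo"

# str_extract() stops after the first match
first_foo <- str_extract(x, "\\w*foo\\w*")
first_foo
# [1] "anycharacterhere_foo"

# Requiring a leading "@" keeps only the username
at_foo <- str_extract_all(x, "@\\w*foo\\w*")[[1]]
at_foo
# [1] "@anycharacterhere_foo"
```

Note that the comma after the username is never part of the match, because \w covers only letters, digits, and the underscore.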
Shawn Mehan

You need to match "zero or more" word characters (\w*) before and after foo:

x <- '@anycharacterhere_foo @foo_anymorecharacters here anything else here'
str_extract_all(x, '@\\w*foo\\w*')[[1]]
# [1] "@anycharacterhere_foo"  "@foo_anymorecharacters"
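
The [[1]] is needed because str_extract_all returns a list with one element per input string. A small sketch of that behaviour, using a hypothetical two-element vector that is not part of the question:

```r
library(stringr)

# Hypothetical input: the second string contains no mentions at all
x <- c("@a_foo and @foo_b", "no mentions here")

res <- str_extract_all(x, "@\\w*foo\\w*")

# One list element per input string
res[[1]]
# [1] "@a_foo" "@foo_b"
res[[2]]
# character(0)

# Number of matches found in each input string
n_matches <- lengths(res)
n_matches
# [1] 2 0
```

If a flat character vector of all matches is more convenient, unlist(res) collapses the list.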

If you don't want to include the @ marker, use a lookbehind, which requires the @ to be present but leaves it out of the match:

str_extract_all(x, '(?<=@)\\w*foo\\w*')[[1]]
# [1] "anycharacterhere_foo"  "foo_anymorecharacters"
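
The question asks for usernames that end at a space or punctuation. As a rough check that the lookbehind pattern behaves that way, here is a sketch on a made-up string (an assumption for illustration) that mixes both kinds of terminator and includes one username without foo:

```r
library(stringr)

# Hypothetical input: usernames ended by a comma, a space, and a period,
# plus "@nofo", which does not contain "foo" and should be skipped
x <- "@bar_foo, @foo_baz hello @nofo @qux_foo."

mentions <- str_extract_all(x, "(?<=@)\\w*foo\\w*")[[1]]
mentions
# [1] "bar_foo" "foo_baz" "qux_foo"
```

Because \w matches only letters, digits, and the underscore, a username containing a dot or hyphen would be cut off at that character; the character class would need to be widened if such usernames are expected.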

You could also use rm_tag from the qdapRegex package for this. Note that it extracts every @tag in the string, not only those containing foo:

library(qdapRegex)
rm_tag(x, extract=TRUE)[[1]]
# [1] "@anycharacterhere_foo"  "@foo_anymorecharacters"
hwnd