st = list("amber johnson", "anhar link ari")
t = stringr::str_match_all(st, "(\\ba[a-z]+\\b)")
str(t)
# List of 2
#  $ : chr [1, 1:2] "amber" "amber"
#  $ : chr [1:2, 1:2] "anhar" "ari" "anhar" "ari"

Why are the results repeated like so?

smci
tnabdb

1 Answer


If you look at the Value section of ?str_match_all, it says:

For str_match, a character matrix. First column is the complete match, followed by one column for each capture group. For str_match_all, a list of character matrices.

Since your pattern contains a capture group, the result contains two columns: one for the complete match and one for the capture group. Here the parentheses wrap the entire pattern, so the two columns are identical. If you don't want the repeated column, you can remove the group parentheses from the pattern:

st = list("amber johnson", "anhar link ari")
t = stringr::str_match_all(st, "\\ba[a-z]+\\b")
str(t)

Which gives:

# List of 2
#  $ : chr [1, 1] "amber"
#  $ : chr [1:2, 1] "anhar" "ari"
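
To make the column layout easier to see, here is a minimal sketch with a capture group that covers only part of the match, so the two columns differ. The pattern `\\ba([a-z]+)\\b` and the variable names `m` and `groups` are illustrative assumptions, not part of the original question; the input is written as a character vector rather than a list.

```r
st <- c("amber johnson", "anhar link ari")

# Column 1 holds the complete match, column 2 holds the capture group
# (here: everything after the leading "a").
m <- stringr::str_match_all(st, "\\ba([a-z]+)\\b")
m[[1]]
#      [,1]    [,2]
# [1,] "amber" "mber"

# If only the captured text is wanted, take the second column of each matrix.
groups <- lapply(m, function(x) x[, 2])
groups[[2]]
# [1] "nhar" "ri"
```

With the pattern from the question, where the group wraps the whole match, column 2 is simply a copy of column 1.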
Psidom
  • If there are no capturing groups, there is no need for `str_match_all` at all; `str_extract_all` is enough, or you can even use `gregexpr` with `regmatches`. – Wiktor Stribiżew Jul 08 '16 at 20:00
  • @Psidom, Ah I see... I read the help page, but didn't understand that the result contains a complete match and capture group(s). Thanks. – tnabdb Jul 08 '16 at 20:22
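
The alternatives mentioned in the first comment can be sketched as follows. This is only an illustration under a few assumptions: the input is written as a character vector, the names `pattern`, `ext`, and `base_res` are made up for the example, and `perl = TRUE` is passed to `gregexpr` as a conservative choice for handling `\\b`.

```r
st <- c("amber johnson", "anhar link ari")
pattern <- "\\ba[a-z]+\\b"

# stringr: returns a list of character vectors, one per input string.
ext <- stringr::str_extract_all(st, pattern)

# Base R: gregexpr() finds the match positions, regmatches() extracts the text.
base_res <- regmatches(st, gregexpr(pattern, st, perl = TRUE))

str(ext)
# List of 2
#  $ : chr "amber"
#  $ : chr [1:2] "anhar" "ari"
```

Both approaches return plain character vectors instead of matrices, which is usually more convenient when the pattern has no capture groups.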