I have a lookup table of patterns and their replacements. Some patterns are substrings of others (for example, "ONE" is contained in "ONET"), and I want each pattern to match exactly rather than as part of a longer one.

library(dplyr)  # provides tibble(), mutate() and %>%

lookup <- tibble(
  pattern = c("ONE", "ONET", "ONETR"),
  replacement = c("one new", "this is 2", "for 3")
)
other_table <- tibble(
  strings = c(
    "I want to replace ONE",
    "Or ONET is what to change",
    "We can change ONE again",
    "ONETR also can be replaced"
  ),
  other_dat = 1:4
)

I've tried stringi::stri_replace_all_fixed(), but it doesn't work when the patterns contain one another: "ONE" is replaced first, including where it occurs inside "ONET" and "ONETR".

other_table %>%
  mutate(
    strings = stringi::stri_replace_all_fixed(
      strings, 
      pattern = lookup$pattern, 
      replacement = lookup$replacement,
      vectorize_all = FALSE)
    )

What function can I use to replace all the patterns found in other_table$strings with the matching lookup$replacement?

Desired Output:

  strings                        other_dat
  <chr>                              <int>
1 I want to replace one new              1
2 Or this is 2 is what to change         2
3 We can change one new again            3
4 for 3 also can be replaced             4

Any help appreciated!

MayaGans

1 Answer

Use a regex (not a fixed pattern) and wrap each pattern in word boundaries ("\\b"), so that "ONE" no longer matches inside "ONET" or "ONETR".

other_table %>%
  mutate(
    strings = stringi::stri_replace_all(
      strings,
      regex = paste0("\\b", lookup$pattern, "\\b"),  # match whole words only
      replacement = lookup$replacement,
      vectorize_all = FALSE
    )
  )
# # A tibble: 4 x 2
#   strings                        other_dat
#   <chr>                              <int>
# 1 I want to replace one new              1
# 2 Or this is 2 is what to change         2
# 3 We can change one new again            3
# 4 for 3 also can be replaced             4
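
If the patterns might contain regex metacharacters, or you would rather avoid the stringi dependency, the same word-boundary idea can be sketched in base R. This is only a rough sketch: replace_whole_words is a hypothetical helper name, and it assumes every pattern starts and ends with a word character (otherwise "\\b" will not anchor where you expect), that no pattern contains a literal "\E", and that the replacements contain no backslashes (gsub would interpret them).

```r
# Hypothetical helper (not part of stringi or base R): replaces each
# pattern only where it appears as a whole word.
replace_whole_words <- function(x, patterns, replacements) {
  for (i in seq_along(patterns)) {
    # \\Q...\\E makes PCRE treat the pattern literally;
    # \\b on both sides requires a word boundary.
    rx <- paste0("\\b\\Q", patterns[i], "\\E\\b")
    x <- gsub(rx, replacements[i], x, perl = TRUE)
  }
  x
}

patterns <- c("ONE", "ONET", "ONETR")
replacements <- c("one new", "this is 2", "for 3")
strings <- c(
  "I want to replace ONE",
  "Or ONET is what to change",
  "We can change ONE again",
  "ONETR also can be replaced"
)

# For contrast: a fixed match also hits "ONE" inside the longer "ONET".
naive <- gsub("ONE", "one new", strings[2], fixed = TRUE)
print(naive)
# "Or one newT is what to change"

# With word boundaries, each pattern only matches as a whole word.
result <- replace_whole_words(strings, patterns, replacements)
print(result)
```

The patterns are still applied one after another, so this also assumes that no replacement text contains another pattern as a whole word.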
r2evans