I have text strings like this:

u <- "she goes ~Wha::?~ and he's like ~↑Yeah believe me!~ and she's etc."

What I'd like to do is replace all characters occurring between pairs of ~ delimiters (including the delimiters themselves) with, say, X.

This gsub call replaces each substring between a pair of ~ delimiters with a single X:

gsub("~[^~]+~", "X", u)
[1] "she goes X and he's like X and she's etc."

However, what I'd really like to do is replace every single character between the delimiters (and the delimiters themselves) with X, so that each match keeps its original length. The desired output is this:

"she goes XXXXXXXX and he's like XXXXXXXXXXXXXXXXXXX and she's etc."

I've been experimenting with nchar, backreference, and paste as follows but the result is incorrect:

gsub("(~[^~]+~)", paste0("X{", nchar("\\1"),"}"), u)
[1] "she goes X{2} and he's like X{2} and she's etc."

Any help is appreciated.

Chris Ruehlemann

1 Answer

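The paste0("X{", nchar("\\1"),"}") code results in X{2} because R evaluates it before gsub runs: "\\1" is an ordinary string of length 2 (a backslash and the digit 1), so nchar returns 2. \1 is only expanded as a backreference when it reaches gsub inside the replacement string itself; passed to another function first, it is just literal text.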
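The evaluation order described above can be checked directly in base R. This is a small sketch; the variable name `replacement` is introduced here only for illustration.

```r
u <- "she goes ~Wha::?~ and he's like ~↑Yeah believe me!~ and she's etc."

# "\\1" is a plain two-character string: a backslash followed by the digit 1.
len <- nchar("\\1")

# paste0() is evaluated before gsub() is called, so the replacement
# argument is already a fixed string by the time any matching happens.
replacement <- paste0("X{", len, "}")

# The attempted call is therefore the same as passing that fixed string.
same <- identical(
  gsub("(~[^~]+~)", paste0("X{", nchar("\\1"), "}"), u),
  gsub("(~[^~]+~)", replacement, u)
)
```

Here `len` is 2 and `replacement` is the literal string X{2}, which is why every match ends up replaced by the same text regardless of its length.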

You can use the following solution based on stringr:

> library(stringr)
> u <- "she goes ~Wha::?~ and he's like ~↑Yeah believe me!~ and she's etc."
> str_replace_all(u, '~[^~]+~', function(x) str_dup("X", nchar(x)))
[1] "she goes XXXXXXXX and he's like XXXXXXXXXXXXXXXXXXX and she's etc."

Upon finding a match with ~[^~]+~, the matched text is passed to the anonymous function, and str_dup repeats X as many times as the match has characters, so the replacement has the same length as the match.
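If adding a stringr dependency is not an option, the same idea (compute the replacement from each match) can probably be expressed in base R with gregexpr and regmatches<-. This is a sketch of an alternative, not part of the solution above; the helper name `mask_delimited` and its `fill` argument are made up for illustration, and strrep requires R 3.3.0 or later.

```r
u <- "she goes ~Wha::?~ and he's like ~↑Yeah believe me!~ and she's etc."

# Hypothetical helper: replace every character of each match with `fill`.
mask_delimited <- function(x, pattern = "~[^~]+~", fill = "X") {
  m <- gregexpr(pattern, x)  # positions of all matches, one list element per string
  # For each input string, build one replacement per match,
  # each as long (in characters) as the match it replaces.
  regmatches(x, m) <- lapply(
    regmatches(x, m),
    function(hits) strrep(fill, nchar(hits))
  )
  x
}

masked <- mask_delimited(u)
```

Because every match is swapped for a run of X of equal length, nchar(masked) equals nchar(u) and no ~ remains in the result.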

Wiktor Stribiżew
  • Thanks. Works beautifully. Just out of curiosity: is a solution with a backreference possible? – Chris Ruehlemann Oct 14 '20 at 11:30
  • @ChrisRuehlemann Backreferences like `\1`, `\2` are only supported in *string* replacement patterns. When you use a function and pass a string containing backreference syntax, it is treated like an independent string, not a replacement pattern. After that function processes the string, the result becomes the string replacement pattern. To evaluate the backreference, you need to pass the match object / value to an anonymous function. – Wiktor Stribiżew Oct 14 '20 at 11:37
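The distinction drawn in the last comment can be illustrated with a short base R sketch. The variable names `bracketed` and `not_upper` are chosen here for illustration, and toupper merely stands in for any function applied to the backreference text.

```r
u <- "she goes ~Wha::?~ and he's like ~↑Yeah believe me!~ and she's etc."

# Backreference inside the replacement *string*: gsub() expands \\1
# for each match, so the captured text is wrapped in brackets.
bracketed <- gsub("~([^~]+)~", "[\\1]", u)

# Backreference passed through a function first: toupper() only ever
# sees the two characters "\\1" and returns them unchanged, so gsub()
# inserts the captured text as-is, not in upper case.
not_upper <- gsub("~([^~]+)~", toupper("\\1"), u)
```

In the second call the function runs once, before any matching, which is why a per-match computation such as the match length needs a function-based replacement instead.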