I am looking for a way to replace every _ (with, say, '') in each of the following character strings

x <- c('test_(match)','test_xMatchToo','test_a','test_b') 

if and only if the _ is followed by ( or x. The desired output is:

x <- c('test(match)','testxMatchToo','test_a','test_b') 

How can this be done (using any package is fine)?

statquant
  • Easiest way I can think of is just to replace `_(` and `_x` with `''` without using any regular expressions at all - It'll be faster and easier to read too. – John Bustos Nov 15 '16 at 17:58
  • Oh sorry, do this - Replace `_(` with `(` and `_x` with `x` – John Bustos Nov 15 '16 at 18:00
  • John's suggestion, `gsub("_([(x])", "\\1", x)`, seems generic enough to me, though this is not "without using any regex", so maybe I'm misunderstanding. – Frank Nov 15 '16 at 18:02
  • Yeah, not sure that's what he meant... but maybe I'm misunderstanding. I think he wanted to replace both strings with new strings. This was a minimal example; I can have many more characters to filter... – statquant Nov 15 '16 at 18:12

1 Answer

Using a lookahead:

_(?=[(x])

A lookahead asserts that its pattern matches at the current position without consuming any characters, so the text it looks ahead at is not part of the match. Here, the matched text consists of only the underscore, while the lookahead asserts that the underscore is followed by an x or a (.
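To see that only the underscore is matched, here is a small sketch that extracts the matched text with base R's regmatches() and gregexpr(). The vector x is taken from the question; the variable name matched is just illustrative.

```r
# Strings from the question
x <- c("test_(match)", "test_xMatchToo", "test_a", "test_b")

# Find every match of the lookahead pattern (perl = TRUE enables PCRE)
m <- regmatches(x, gregexpr("_(?=[(x])", x, perl = TRUE))

# Flatten the per-string results; strings without a match contribute nothing
matched <- unlist(m)
print(matched)
```

Only the first two strings produce a match, and in both cases the matched text is the single underscore, not the ( or x that follows it.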

Demo on Regex101

Your R code would look a bit like this (one arg per line for clarity):

gsub(
    "_(?=[(x])",                            # The regex
    "",                                     # Replacement text
    c("your_string", "your_(other)_string"), # Vector of strings
    perl=TRUE                               # Make sure to use PCRE
)
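
Applied to the vector from the question, this might look like the sketch below. It also runs the capture-group alternative from Frank's comment for comparison; the names lookahead, capture, and expected are only illustrative.

```r
# Input and desired output, both taken from the question
x <- c("test_(match)", "test_xMatchToo", "test_a", "test_b")
expected <- c("test(match)", "testxMatchToo", "test_a", "test_b")

# Lookahead version: match only the underscore, replace it with nothing
lookahead <- gsub("_(?=[(x])", "", x, perl = TRUE)

# Capture-group version from the comments: match "_(" or "_x",
# then put back the captured character
capture <- gsub("_([(x])", "\\1", x)

print(lookahead)
print(identical(lookahead, expected))
print(identical(capture, expected))
```

Both versions should give the same result here. The capture-group form does not need perl = TRUE, while the lookahead form keeps the replacement string empty; to filter on more characters, add them to the bracket expression [(x].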
Sebastian Lenartowicz