1

How can I make this work in R?

str_split("U.S. (California, San Luis Obispo County)",pattern=' (')

Error in gregexpr("(", "U.S. (California, San Luis Obispo County)", fixed = FALSE, : invalid regular expression '(', reason 'Missing ')''

gregexpr("(", "U.S. (California, San Luis Obispo County)")

Error in gregexpr("(", "U.S. (California, San Luis Obispo County)") : invalid regular expression '(', reason 'Missing ')''

gregexpr("(", "U.S. (California, San Luis Obispo County)",perl=T)

Error in gregexpr("(", "U.S. (California, San Luis Obispo County)", perl = T) : invalid regular expression '('

In addition, warning message:

In gregexpr("(", "U.S. (California, San Luis Obispo County)", perl = T) :
  PCRE pattern compilation error
    'missing )'
    at ''
Shmwel
yonicd

2 Answers

2

To split on a special character like "(", you have to escape it. In R, escaping a regex metacharacter takes a double backslash: one "\" for the R character string and another for the regular expression, as suggested by Hugh. Your pattern should therefore be "\\(". See the regex documentation (?regex in R) for more information.

The following code does the job

raw_string <- "U.S. (California, San Luis Obispo County)"
splitted_string <- strsplit(x=raw_string, split="\\(")

splitted_string

#[[1]]
#[1] "U.S. "                              
#[2] "California, San Luis Obispo County)"
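
If you would rather not escape anything, here is a minimal sketch of two alternatives in base R: passing fixed = TRUE so the split string is taken literally, or wrapping the parenthesis in a character class. The variable names parts_fixed and parts_class are only illustrative. Note that the pattern also includes the leading space, as in the question.

```r
raw_string <- "U.S. (California, San Luis Obispo County)"

# fixed = TRUE treats the split string literally, so "(" needs no escaping
parts_fixed <- strsplit(raw_string, split = " (", fixed = TRUE)[[1]]

# Inside a character class, "(" is also matched literally
parts_class <- strsplit(raw_string, split = " [(]")[[1]]

# Both vectors hold the same two pieces:
# "U.S." and "California, San Luis Obispo County)"
identical(parts_fixed, parts_class)
```

With stringr, the equivalent of fixed = TRUE should be wrapping the pattern in fixed(), i.e. str_split(raw_string, fixed(" (")).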

But I'm not sure that's what you want. If your goal is to remove the opening parenthesis from your character string, use gsub with an empty replacement string.

raw_string <- "U.S. (California, San Luis Obispo County)"
no_parenthesis_string <- gsub(pattern="\\(", replacement="", x= raw_string)
no_parenthesis_string 
# [1] "U.S. California, San Luis Obispo County)"
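
The result above still carries the closing parenthesis. Assuming the real goal is either to drop both parentheses or to keep only the text between them (a guess, since the question does not say), a sketch could look like this. The names no_parens and inside_parens are made up for the example.

```r
raw_string <- "U.S. (California, San Luis Obispo County)"

# A character class matches either parenthesis; inside [] they need no escaping
no_parens <- gsub(pattern = "[()]", replacement = "", x = raw_string)
# no_parens is "U.S. California, San Luis Obispo County"

# Capture whatever sits between "(" and the last ")" and keep only that group
inside_parens <- sub(pattern = ".*\\((.*)\\).*", replacement = "\\1", x = raw_string)
# inside_parens is "California, San Luis Obispo County"
```

The second pattern assumes the string contains exactly one parenthesised part; if there is no match, sub returns the input unchanged.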

Does it help?

jomuller
0
gsub("\\(","",c("U.S. (California, San Luis Obispo County)"))

or

paste0(strsplit(c("U.S. (California, San Luis Obispo County)"), "\\(")[[1]], collapse = "")

Answer: "U.S. California, San Luis Obispo County)"

Tunaki
陳奕錩