After updating R to version 3.2.0 (from 3.1.0, "Spring Dance"), I am getting an unexpected regex error.

I had the following code to find strings that contain X followed by four or more numerical digits.

library(stringr)
vec = c("X12345", "X12", "X235252", "X442")
str_detect(vec, "X[0-9]{4, }")

As I understood it, leaving the maximum empty after the comma means four-or-more. However, in R 3.2.0, the statement above yields an error.

Error in stri_detect_regex(string, pattern, opts_regex = attr(pattern,  : 
Error in {min,max} interval. (U_REGEX_BAD_INTERVAL)
Error during wrapup:

This is in fact caused by the empty space following the comma in the regex. However, to my knowledge, the statement above is valid, and it worked fine in the previous version.

Does anyone know whether the regex engine has changed, and can you offer a workaround other than putting a huge number after the comma? (That is, a proper regular expression that matches four or more digits.)

Alex A.
won782
  • What version of **stringr** are you using? – joran May 05 '15 at 19:16
  • ...the reason I ask is that nothing drastic has changed in R's implementation of regex, but **stringr** has just undergone a rather massive update. So you may simply have found a bug in the new version of **stringr**. – joran May 05 '15 at 19:21

1 Answer

With a space after the comma inside the curly braces, str_detect expects both a min and a max value in the regex. For just a min value, use {min,} as the repetition operator, with no space between the comma and the closing curly brace.

 library(stringr)
 vec = c("X12345", "X12", "X235252", "X442")
 str_detect(vec, "X[0-9]{4, }")

gives the error message

 Error in stri_detect_regex(string, pattern, opts_regex = attr(pattern,  : 
          Error in {min,max} interval. (U_REGEX_BAD_INTERVAL)

Without the space

 str_detect(vec, "X[0-9]{4,}")

returns

 [1]  TRUE FALSE  TRUE FALSE
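
As a hedged sketch of the workaround: judging by the `stri_detect_regex` call and the `U_REGEX_BAD_INTERVAL` code in the error, the pattern appears to be handled by stringi's ICU engine, which seems to be stricter about whitespace inside `{min,max}`. Writing the quantifier as `{4,}` should be portable across engines. The variable names `pattern`, `base_result`, and `perl_result` below are illustrative only, and the stringr check runs only if the package happens to be installed.

```r
vec <- c("X12345", "X12", "X235252", "X442")

# Open-ended quantifier with no whitespace: "X followed by four or more digits".
pattern <- "X[0-9]{4,}"

# Base R, default regex engine (no packages required).
base_result <- grepl(pattern, vec)

# Base R with perl = TRUE, using the \d shorthand for a digit.
perl_result <- grepl("X\\d{4,}", vec, perl = TRUE)

# Same pattern through stringr, only when the package is available.
if (requireNamespace("stringr", quietly = TRUE)) {
  stringr_result <- stringr::str_detect(vec, pattern)
  stopifnot(identical(base_result, stringr_result))
}

print(base_result)
```

All three calls should agree on which elements match, so the pattern does not depend on which regex engine is in use.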
  • FWIW, on R 3.1.2 and stringr 0.6.2, the version with the extra space really did work, so I think there was a change in behavior. – joran May 05 '15 at 19:48
  • @joran Thank you. So it is rather an update to stringr. Yes, I recently updated R to 3.2.0 and went through reinstalling the packages. – won782 May 05 '15 at 20:58