I am using `gsub` with a regex to extract the last two hyphen-separated parts of a string.

Example:

  • "Bla-text-01" - I want -> "text-01"
  • "Name-xpto-08" - I want -> "xpto-08"
  • "text-text-04" - I want -> "text-04"
  • "new-blaxpto-morexpto-07" - I want -> "morexpto-07"
  • "new-new-new-bla-ready-05" - I want -> "ready-05"

I wrote this code, which works for the first three cases, but I now need it to handle all five:

gsub(x = match$id,
     pattern = "(.*?-)(.*)",
     replacement = "\\2")

Can you help me?

user438383
  • Just match the regular expression `[a-z]+-\\d+$`. [Demo](https://regex101.com/r/81sd4U/1). You may need to change `[a-z]` to `[a-zA-Z]` or set the case-insensitive flag. Hover the cursor over each part of the expression at the link for an explanation of what it does. – Cary Swoveland Jan 27 '22 at 19:56
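The comment above proposes extracting the match rather than deleting what precedes it. A minimal sketch of that idea in base R, assuming the sample strings from the question and using `regexpr()` with `regmatches()` (these function choices are not part of the comment):

```r
x <- c("Bla-text-01", "Name-xpto-08", "text-text-04",
       "new-blaxpto-morexpto-07", "new-new-new-bla-ready-05")

# Locate the trailing "<lowercase letters>-<digits>" run in each string
m <- regexpr("[a-z]+-\\d+$", x)

# Pull out the matched substrings; strings without a match are dropped
last_part <- regmatches(x, m)
last_part
```

Because `regmatches()` silently drops non-matching elements, the result can be shorter than the input. This pattern also assumes the second-to-last part contains only lowercase letters and the last part only digits.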

2 Answers

x <- c("Bla-text-01",
       "Name-xpto-08", 
       "text-text-04", 
       "new-blaxpto-morexpto-07", 
       "new-new-new-bla-ready-05")

sub("^.*-([^-]*-[^-]*)$", "\\1", x)
## [1] "text-01"     "xpto-08"     "text-04"     "morexpto-07" "ready-05"
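As a rough illustration of why the pattern in the question stops working once there are more than three parts, the sketch below runs both patterns on two of the sample strings. The variable names are only for illustration.

```r
x <- c("Bla-text-01", "new-blaxpto-morexpto-07")

# The lazy `.*?-` stops at the FIRST hyphen, so only the first part is removed
first_dropped <- sub("(.*?-)(.*)", "\\2", x)
first_dropped
# "text-01" "blaxpto-morexpto-07"

# Anchored at `$`, `[^-]*-[^-]*` can only span the last two parts,
# and `^.*-` absorbs everything in front of them
last_two <- sub("^.*-([^-]*-[^-]*)$", "\\1", x)
last_two
# "text-01" "morexpto-07"
```

Using `[^-]*` instead of `.*` inside the capture group keeps the group from crossing a hyphen, which is what limits it to exactly two parts.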
Mikael Jagan

Try this regular expression:

sub(".*-(.*-.*)$", "\\1", x)
## [1] "text-01"     "xpto-08"     "text-04"     "morexpto-07" "ready-05"   

Other approaches would be:

# 2. use basename/dirname
xx <- gsub("-", "/", x)
paste(basename(dirname(xx)), basename(xx), sep = "-")
## [1] "text-01"     "xpto-08"     "text-04"     "morexpto-07" "ready-05"   

# 3. use scan
f <- function(x) {
  scan(text = x, what = "", sep = "-", quiet = TRUE) |>  
    tail(2) |>
    paste(collapse = "-")
}
sapply(x, f)
##              Bla-text-01             Name-xpto-08             text-text-04 
##                "text-01"                "xpto-08"                "text-04" 
##  new-blaxpto-morexpto-07 new-new-new-bla-ready-05 
##            "morexpto-07"               "ready-05" 

Note

Input in reproducible form:

x <- c("Bla-text-01", "Name-xpto-08", "text-text-04", "new-blaxpto-morexpto-07", 
"new-new-new-bla-ready-05")
G. Grothendieck
  • Might as well add `vapply(strsplit(x, "-"), function(x) paste(tail(x, 2L), collapse = "-"), "")`... – Mikael Jagan Jan 27 '22 at 18:49
  • The 2nd way is clever, but `.*-(.*-.*)$` may cause many backtracking steps on a long string. It's safer to write it more explicitly: `(?:[^-]*-)*([^-]*-[^-]*)$`. Or better, in the PCRE flavor: `^(?>[^-]*-)*([^-]*-[^-]*)$` – Casimir et Hippolyte Jan 27 '22 at 21:57
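The two comments above can be tried side by side. This is a sketch only, assuming the sample input from the question and assuming the capture group in the PCRE pattern is meant to be `([^-]*-[^-]*)`. Atomic groups such as `(?>...)` are a PCRE feature, so `perl = TRUE` is needed.

```r
x <- c("Bla-text-01", "Name-xpto-08", "text-text-04",
       "new-blaxpto-morexpto-07", "new-new-new-bla-ready-05")

# Split on literal "-" and rejoin the last two pieces
by_split <- vapply(strsplit(x, "-", fixed = TRUE),
                   function(parts) paste(tail(parts, 2L), collapse = "-"),
                   character(1))

# PCRE variant: each "part-" prefix is consumed by an atomic group
by_pcre <- sub("^(?>[^-]*-)*([^-]*-[^-]*)$", "\\1", x, perl = TRUE)

# Both approaches should agree on this input
identical(by_split, by_pcre)
```

The split-based version avoids regex matching on the full string, while the PCRE version keeps everything in a single `sub()` call. Which one is preferable likely depends on input length and readability needs.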