I am practicing with regular expressions in R. I would like to extract two consecutive upper-case letters from a string. I tried:

>str_extract("kjhdjkaYY,","[:upper:][:upper:]")
[1] "YY"

This works fine, but `str_extract` returns only the first match. How can I extract the last occurrence of the pattern instead? For example:

function("kKKjhdjkaYY,")
[1] "YY"

Thank you for your help

Marco De Virgilis
  • The regex cheatsheet from RStudio will really help you out here. [Here is a link](https://www.rstudio.com/resources/cheatsheets/). For this, you'll want to read about what `$` does. Hope this helps get you started. – Michael Bird Nov 13 '18 at 13:28

1 Answer

We can use `stri_extract_last_regex` from the stringi package:

library(stringi)
stri_extract_last_regex("AAkjhdjkaYY,","[:upper:][:upper:]")
#[1] "YY"

Or, if you want to stick with stringr, extract all matches of the pattern with `str_extract_all` and then take the last one with `tail`:

library(stringr)
tail(str_extract_all("AAkjhdjkaYY,","[:upper:][:upper:]")[[1]], 1)
#[1] "YY"
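
The same "find all matches, keep the last" idea can also be sketched in base R without loading a package. This is only an illustration: the helper name `extract_last_upper_pair` is hypothetical, and returning `NA` for strings with no match is an assumption chosen to mirror how the stringr/stringi functions behave.

```r
# Hypothetical helper (not part of stringr or stringi): return the last
# non-overlapping run of two upper-case letters in each element of x.
extract_last_upper_pair <- function(x) {
  # gregexpr() locates every match; regmatches() pulls them out as a list
  matches <- regmatches(x, gregexpr("[[:upper:]]{2}", x))
  # keep the last match per string, or NA when there is none (assumed convention)
  vapply(
    matches,
    function(m) if (length(m) > 0) m[length(m)] else NA_character_,
    character(1)
  )
}

extract_last_upper_pair("kKKjhdjkaYY,")
#[1] "YY"
extract_last_upper_pair(c("AAkjhdjkaYY,", "no match"))
#[1] "YY" NA
```

Because it works on the whole list returned by `regmatches`, this version is vectorised over `x`, whereas the `tail(...[[1]], 1)` call above handles a single string.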
Ronak Shah
  • Or you could have a more descriptive regex, for example, `str_extract("kKKjhdjkaYY,","[:upper:]{2}(?=[^[:upper:]{2,}]*$)")` – Michael Bird Nov 13 '18 at 13:33
  • @MichaelBird If you are going with such a regex then I'd recommend using `gsub`. The reason for loading an external package, `stringr` or `stringi`, is to avoid complex regex – Sotos Nov 13 '18 at 13:42
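
For illustration, the base R substitution route mentioned in the last comment might look roughly like the sketch below. It uses `sub` rather than `gsub`, since only one replacement per string is needed. The helper name `last_upper_pair_sub` is hypothetical, and returning `NA` when nothing matches is an assumption (plain `sub` would return the input unchanged in that case).

```r
# Hypothetical helper: the greedy ".*" at the start consumes as much as
# possible, so the capture group lands on the last pair of upper-case letters.
last_upper_pair_sub <- function(x) {
  pattern <- "^.*([[:upper:]]{2}).*$"
  ifelse(
    grepl(pattern, x, perl = TRUE),
    sub(pattern, "\\1", x, perl = TRUE),  # keep only the captured pair
    NA_character_                         # assumed convention for "no match"
  )
}

last_upper_pair_sub("kKKjhdjkaYY,")
#[1] "YY"
```

Note that on a run of three or more capitals this picks the final two letters (for "ABC" it yields "BC"), while the extract-all approaches return the last non-overlapping match ("AB"), so the two are not interchangeable on such input.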