I have this string:

mystring <- "HMSC-bm_in_ALL_CELLTYPES.distal"

I want to extract the two substrings marked by the brackets below:

[HMSC-bm]_in_ALL_CELLTYPES.[distal]

The result should be a vector with two values: HMSC-bm and distal. How can I do this? I tried the following, but it failed:

> stringr::str_extract(mystring,"\\([\\w-]+\\)_in_ALL_CELLTYPES\\.\\([\\w+]\\)")
[1] NA
littleworth

2 Answers

I'd use str_match:

library(stringr)
mymatch <- str_match(mystring, "^(.*?)_.*?\\.(.*?)$")
mymatch

     [,1]                              [,2]      [,3]    
[1,] "HMSC-bm_in_ALL_CELLTYPES.distal" "HMSC-bm" "distal"

mymatch[, 2]
[1] "HMSC-bm"

mymatch[, 3]
[1] "distal"
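
As for why the attempt in the question returned NA: in a regex, \\( matches a literal parenthesis, which the string does not contain, and [\\w+] matches only a single character. A stricter variant of the pattern above might look like the sketch below. It assumes the middle part is always the literal _in_ALL_CELLTYPES. and that the last part consists of word characters only; the names pattern and parts are just illustrative.

```r
library(stringr)

mystring <- "HMSC-bm_in_ALL_CELLTYPES.distal"

# Unescaped parentheses create capture groups; "\\." matches a literal dot.
pattern <- "^([\\w-]+)_in_ALL_CELLTYPES\\.(\\w+)$"

# Column 1 of the str_match() result is the full match;
# columns 2 and 3 hold the two captured groups.
parts <- str_match(mystring, pattern)[1, 2:3]
parts
```

Here parts is a plain character vector holding the two captured values, so there is no need to index the match matrix twice.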
neilfws

We can split the string on the separator "_in_ALL_CELLTYPES." (the trailing dot is part of the separator):

strsplit(mystring, split = "_in_ALL_CELLTYPES.")[[1]]
[1] "HMSC-bm" "distal" 
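
One caveat: strsplit() treats split as a regular expression by default, so the trailing "." above matches any character, not only a literal dot. If a literal match is what you want, a possible variant (a sketch, assuming the separator never changes) is to pass fixed = TRUE:

```r
mystring <- "HMSC-bm_in_ALL_CELLTYPES.distal"

# fixed = TRUE makes strsplit() treat the separator as a literal string,
# so "." is an actual dot rather than a regex wildcard.
parts <- strsplit(mystring, split = "_in_ALL_CELLTYPES.", fixed = TRUE)[[1]]
parts
```

For this input the result is the same two-element vector; the difference only shows up when something other than a dot follows _in_ALL_CELLTYPES.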
www