4

UPDATED: I need to extract the characters between curly braces { }.

For example,

a <- "{a,b}->{v}"

Expected output: a,b and v

Wiktor Stribiżew
anz
  • 1
    Have you tried anything yourself already? Why not share your efforts? – vaettchen Jul 25 '16 at 05:39
  • Is this related to - http://stackoverflow.com/questions/38559859/r-arules-extract-lhs-items-from-rules ? If so, I expect there is something better than trying to extract from text. `arules` probably has methods for doing exactly this. – thelatemail Jul 25 '16 at 05:39
  • Yep, I'm looking through the documentation. Can't find anything yet. – anz Jul 25 '16 at 05:45

6 Answers

3

You can use stringr's str_extract_all.

In the following expression, (?<=\\{) is a lookbehind that requires an opening curly brace just before the match, (?=\\}) is a lookahead that requires a closing brace just after it, and .+? lazily matches the text in between. Hence, the final expression becomes (?<=\\{).+?(?=\\})

str_extract_all returns a list with one element per input string, so [[1]] selects the matches for a:

str_extract_all(a, "(?<=\\{).+?(?=\\})")[[1]]

Here is another example:

> a <- "{a,b}->{v}{d}{c}{67}"
> str_extract_all(a, "(?<=\\{).+?(?=\\})")[[1]]
[1] "a,b" "v"   "d"   "c"   "67" 
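
The effect of the lazy quantifier can be checked with a small sketch. It uses base R's regmatches and gregexpr with perl = TRUE instead of stringr so that it runs without extra packages; the helper name extract_all is hypothetical and only serves this illustration.

```r
# Hypothetical helper: return all matches of a PCRE pattern in one string.
extract_all <- function(x, pattern) {
  regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
}

a <- "{a,b}->{v}"

# Lazy .+? stops at the first closing brace it can reach.
lazy <- extract_all(a, "(?<=\\{).+?(?=\\})")
print(lazy)    # "a,b" "v"

# Greedy .+ runs on to the last closing brace in the string.
greedy <- extract_all(a, "(?<=\\{).+(?=\\})")
print(greedy)  # "a,b}->{v"
```

With the greedy form, the whole span from the first { to the last } comes back as one match, which is why the ? after .+ matters here.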
Rishabh Ojha
1

If you need to match strings in between curly braces excluding the curly braces, you may use

a <- "{a,b}->{v}"
stringr::str_extract_all(a, "(?<=\\{)[^{}]+(?=\\})")           # With stringr library
# => [1] "a,b" "v"
regmatches(a, gregexpr("(?<=\\{)[^{}]+(?=\\})", a, perl=TRUE)) # Base R approach #1
# => [1] "a,b" "v"
regmatches(a, gregexpr("\\{\\K[^{}]+(?=\\})", a, perl=TRUE))   # Base R approach #2
# => [1] "a,b" "v"

See the regex #1 demo. Details:

  • (?<=\{) - a positive lookbehind that requires a { immediately to the left of the current location
  • [^{}]+ - 1 or more (due to the + quantifier) chars other than { and }. The [^...] construct is called a negated bracket expression in the TRE regex flavor that base R regex functions use by default, and a negated character class in NFA regex flavors such as PCRE (used with perl=TRUE) or the ICU regex used by the stringr package
  • (?=\}) - a positive lookahead that requires a } immediately to the right of the current location
  • \{\K means that after matching and consuming {, the text matched is discarded from the match value, so the { does not land in the results. See Keep The Text Matched So Far out of The Overall Regex Match for more details.
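
One practical consequence of [^{}]+ compared with a lazy .+? shows up when the input contains an empty pair of braces. The sketch below is only an illustration on a made-up input string (not taken from the question) and uses base R with perl = TRUE.

```r
# Made-up input with an empty {} pair (not from the question).
s <- "{}->{v}"

# Negated class: it cannot cross a brace, so the empty pair yields no match.
negated <- regmatches(s, gregexpr("(?<=\\{)[^{}]+(?=\\})", s, perl = TRUE))[[1]]
print(negated)   # "v"

# Lazy dot: . also matches braces, so the match spills into the next pair.
lazy_dot <- regmatches(s, gregexpr("(?<=\\{).+?(?=\\})", s, perl = TRUE))[[1]]
print(lazy_dot)  # "}->{v"

# The \K variant behaves like the lookbehind variant on this input.
with_k <- regmatches(s, gregexpr("\\{\\K[^{}]+(?=\\})", s, perl = TRUE))[[1]]
print(with_k)    # "v"
```

Note that both negated-class patterns skip empty braces entirely, because + requires at least one character; use * instead if empty strings should be reported.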

To match strings inside non-nested curly braces including the curly braces, you may use

a <- "{a,b}->{v}"
stringr::str_extract_all(a, "\\{[^{}]*\\}")  # With stringr library
regmatches(a, gregexpr("\\{[^{}]*}", a))     # Base R approach
# => [1] "{a,b}" "{v}" 

Here, \{[^{}]*\} matches all substrings starting with {, then 0+ chars other than { and } (with [^{}]*) and then ending with }.

See the R demo online.

Wiktor Stribiżew
0

Sorry for answering my own question, but this works for me:

j <- "{a,b}->{v}"
unlist(strsplit(j, split="[{}]"))

Apparently, braces and brackets have to be placed inside a character class [] to be matched literally.
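
Note that the split also returns the pieces outside the braces. A minimal sketch of how the brace contents could be picked out and then split into individual items, assuming the braces are balanced and not nested (the variable names parts, inside and items are just for illustration):

```r
j <- "{a,b}->{v}"

# Splitting on either brace keeps the text outside the braces too.
parts <- unlist(strsplit(j, split = "[{}]"))
print(parts)   # "" "a,b" "->" "v"

# Assumption: braces are balanced and not nested, so pieces alternate
# outside/inside and every second piece is the content of a brace pair.
inside <- parts[c(FALSE, TRUE)]
print(inside)  # "a,b" "v"

# Split each brace content on commas to get the individual items.
items <- strsplit(inside, split = ",", fixed = TRUE)
print(items)   # a list holding c("a", "b") and "v"
```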

anz
  • If you don't need to know what was in each bracket set your solution is fine. If you want to keep the information you can use grep with placeholders. Regex can be tricky but it's well worth the time. You could also split on -> if that's always present and remove everything before and after the brackets – Ulrik Jul 25 '16 at 06:36
  • Regex has always been "when I need black-box" , I realize learning it well will be a life-saver. As for the question, I could further split each { } splits with a "," split to get the individual items. Thanks. – anz Jul 25 '16 at 06:47
-1

Below is the code (JavaScript):

var a = "{a,b} xyz {v}";
a = a.split(" ");  // split on spaces
a[0];  // output: {a,b}
a[2];  // output: {v}
danish farhaj
  • Sorry, my question was a bit unclear. I'll update it. – anz Jul 25 '16 at 05:31
  • 1
    I see that you used space. The problem is there may not be space at some instance, I need to extract between braces. – anz Jul 25 '16 at 05:34
-1

Here is my solution

library(stringr)

a <- "{a,b}->{v}"

betw_curly <- function(a) { 
  str_sub(a, 
     str_locate_all(a, '\\{')[[1]][,1]+1, 
     str_locate_all(a, '\\}')[[1]][,1]-1)
}

betw_curly(a)

[1] "a,b" "v"

Yuriy Barvinchenko
-1

The function tools::delimMatch() is designed for just this purpose.

tx <- '\\caption{Groups are \\code{ctl} and \\code{trt}}.\\label{fig:gps}'
tools::delimMatch(tx, delim = c("{", "}"))
## [1] 9
## attr(,"match.length")
## [1] 38
substring(tx,9,9+38-1)
## "{Groups are \\code{ctl} and \\code{trt}}"

Note that a second match ({fig:gps}) was not captured.
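
To pick up later matches as well, one option is to call delimMatch() again on the rest of the string. The following is only a sketch: the helper delim_match_all is hypothetical (it is not part of the tools package), and it assumes delimMatch() returns -1 when nothing matches.

```r
# Hypothetical helper: collect every top-level delimited chunk in one string
# by calling tools::delimMatch() repeatedly on the unscanned remainder.
delim_match_all <- function(x, delim = c("{", "}")) {
  out <- character(0)
  while (nzchar(x)) {
    m <- tools::delimMatch(x, delim = delim)
    if (m[1] == -1) break          # assumed "no match" value
    len <- attr(m, "match.length")
    out <- c(out, substring(x, m[1], m[1] + len - 1))
    x <- substring(x, m[1] + len)  # continue after the match
  }
  out
}

tx <- "\\caption{Groups are \\code{ctl} and \\code{trt}}.\\label{fig:gps}"
chunks <- delim_match_all(tx)
print(chunks)
```

Each returned chunk still includes its outer braces, and nested braces stay inside the enclosing chunk rather than being reported separately.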