11

I have a string that needs to be split on each "," (comma), but I need to ignore any comma that occurs inside a pair of parentheses. For example, B2B,(A2C,AMM),(BNC,1NF),(106,A01),AAA,AX3 should be split into

B2B,
(A2C,AMM),
(BNC,1NF),
(106,A01),
AAA,
AX3
cdesrosiers
user1724501

4 Answers

6

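For non-nested parentheses:

,(?![^\(]*\))

For nested parentheses (parentheses inside parentheses):

(?<!\([^\)]*),(?![^\(]*\))

Note that the second pattern uses a variable-length lookbehind, so it only works in regex engines that support one (.NET, for example).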
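
As a rough illustration of the non-nested pattern above, here is a minimal Java sketch that passes it to String.split. The class and method names (SplitOutsideParens, split) are hypothetical and not part of the answer, and the sketch assumes the input never contains nested or unbalanced parentheses.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the names are illustrative only.
final class SplitOutsideParens {
    // Split on every comma that is NOT followed by a run of non-"(" characters
    // ending in ")". Such a comma sits outside a (non-nested) pair of parentheses.
    static List<String> split(String input) {
        return Arrays.asList(input.split(",(?![^(]*\\))"));
    }

    public static void main(String[] args) {
        String str = "B2B,(A2C,AMM),(BNC,1NF),(106,A01),AAA,AX3";
        for (String part : split(str)) {
            System.out.println(part);
        }
    }
}
```

Inside a Java string literal the backslash before ")" has to be doubled, and "(" needs no escaping inside a character class, which is why the pattern looks slightly different from the raw regex.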
Anirudha
2

Try the following:

var str = 'B2B,(A2C,AMM),(BNC,1NF),(106,A01),AAA,AX3';
console.log(str.match(/\([^)]*\)|[A-Z\d]+/g));
// gives you ["B2B", "(A2C,AMM)", "(BNC,1NF)", "(106,A01)", "AAA", "AX3"]

Java edition:

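import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "B2B,(A2C,AMM),(BNC,1NF),(106,A01),AAA,AX3";
// match either a parenthesized group or a run of uppercase letters/digits
Pattern p = Pattern.compile("\\([^)]*\\)|[A-Z\\d]+");
Matcher m = p.matcher(str);
List<String> matches = new ArrayList<String>();
while (m.find()) {
    matches.add(m.group());
}

for (String val : matches) {
    System.out.println(val);
}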
xdazz
2

A simple iteration will probably be a better option than any regex, especially if your data can contain parentheses nested inside parentheses. For example:

String data="Some,(data,(that),needs),to (be, splited) by, comma";
StringBuilder buffer=new StringBuilder();
int parenthesesCounter=0;
for (char c:data.toCharArray()){
    if (c=='(') parenthesesCounter++;
    if (c==')') parenthesesCounter--;
    if (c==',' && parenthesesCounter==0){
        // a top-level comma ends the current token, so do something with the buffer
        System.out.println(buffer);
        // now clear the buffer for the next token
        buffer.delete(0, buffer.length());
    } else {
        buffer.append(c);
    }
}
// don't forget the part after the last comma
System.out.println(buffer);

Output:

Some
(data,(that),needs)
to (be, splited) by
 comma
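
The same counter idea can be wrapped in a method that returns the tokens instead of printing them. This is only a sketch: the names DepthSplitter and splitTopLevel are made up for illustration, and it assumes the parentheses in the input are balanced.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper around the loop above; the names are illustrative only.
final class DepthSplitter {
    // Split on commas that occur at parenthesis depth 0 (assumes balanced parentheses).
    static List<String> splitTopLevel(String data) {
        List<String> tokens = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        int depth = 0;
        for (char c : data.toCharArray()) {
            if (c == '(') depth++;
            if (c == ')') depth--;
            if (c == ',' && depth == 0) {
                // a top-level comma closes the current token
                tokens.add(buffer.toString());
                buffer.setLength(0);
            } else {
                buffer.append(c);
            }
        }
        // the part after the last top-level comma
        tokens.add(buffer.toString());
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(splitTopLevel("B2B,(A2C,AMM),(BNC,1NF),(106,A01),AAA,AX3"));
    }
}
```

Because the counter tracks depth rather than just "inside or outside", the same method also handles input such as a,(b,(c),(d),e),f without changes.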
Pshemo
0

Try this:

\w{3}(?=,)|(?<=,)\(\w{3},\w{3}\)(?=,)|(?<=,)\w{3}

Explanation: there are three alternatives separated by OR (|):

  • \w{3}(?=,) - matches any three alphanumeric characters (including underscore), followed by a positive lookahead for a comma

  • (?<=,)\(\w{3},\w{3}\)(?=,) - matches a group of the form (ABC,E4R), with a positive lookbehind and a positive lookahead for a comma

  • (?<=,)\w{3} - matches any three alphanumeric characters (including underscore), preceded by a positive lookbehind for a comma
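
To see the three alternatives in action, the pattern can be run with a Matcher in Java. This is a sketch only; FixedWidthTokens and tokens are hypothetical names. It relies on the assumptions baked into the pattern: every token is exactly three word characters, and a parenthesized group is neither the first nor the last item, since that alternative requires a comma on both sides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class FixedWidthTokens {
    // The three alternatives from the answer, with backslashes doubled for Java.
    private static final Pattern TOKEN = Pattern.compile(
            "\\w{3}(?=,)|(?<=,)\\(\\w{3},\\w{3}\\)(?=,)|(?<=,)\\w{3}");

    // Collect every match of the pattern, scanning left to right.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("B2B,(A2C,AMM),(BNC,1NF),(106,A01),AAA,AX3"));
    }
}
```

If a parenthesized group appears at the very start or end of the string, the second alternative cannot match it and the pieces inside are picked up individually, so this pattern is best suited to input shaped exactly like the example.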

Brian Webster
inxss