3

I have played around on regex101 for a while now and can't figure this out and maybe it is because of the version of regex used in NetSuite. On regex101 using the string below it highlights the part I want as group 1, but results in nothing in NetSuite.

I have tried: (?:,[^,]+,?){2}(,[^,]+,?) to try and get the value between the 3rd and 4th commas out of this data set:

+000000006 06:23:15.291450,W-CHEVLPFULL-LP | ,+000000006 06:23:15.291450,W-CHEVUS | ,
+000000044 08:09:52.291450,W-ADITIVOSSM-SM | KM8014,+000000044 08:09:52.291450,W-CHEVLPFULL-LP | KM8014,
+000000125 00:53:18.291450,W-ADITHPSMFULL-HP-SM | ,+000000125 00:53:18.291450,W-ADITIVOSSM-SM | ,
+000000138 05:08:01.291450,W-ADITHPSMFULL-HP-SM | KM8512,+000000138 05:08:01.291450,W-EMPTYAVAILABLE | KM8512,
+000000138 05:20:45.291450,W-ADITIVOSSM-SM | ,+000000138 05:20:45.291450,W-EMPTYAVAILABLE |

Which would be:

W-CHEVUS | 
W-CHEVLPFULL-LP | KM8014
W-ADITIVOSSM-SM | 
W-EMPTYAVAILABLE | KM8512
W-EMPTYAVAILABLE |

I have tried other 'non capturing groups' and had strange results which leads me to think it isn't supported or I'm using them incorrectly.

Any help would be greatly appreciated. Thanks!

enfrost
  • 165
  • 1
  • 14
  • 1
    What is the regex flavor in Netsuite? Is there any method there that allows you access to groups? Try `(?<=^(?:[^,]*,){3})[^,]+` – Wiktor Stribiżew Jul 04 '19 at 06:37
  • I'm guessing it is whatever oracle database uses. I'm calling the REGEXP_SUBSTR( ) function. Searches lead me to here: https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions131.htm – enfrost Jul 04 '19 at 16:03
  • If it is Oracle, use `REGEXP_SUBSTR(col, '^([^,]*,){3}([^,]+)', 1, 1, NULL, 2)` – Wiktor Stribiżew Jul 04 '19 at 19:29
  • What is the context? Where is the data from? If this is happening in a suitescript then what type of script? I ask because the data you've pasted isn't any natural Netsuite data format that I am aware of. – bknights Jul 04 '19 at 20:04

4 Answers4

0

My guess is that this expression might return the desired data:

.+?,.+?,.+?,(.+?)(?:\s*,|\s*$)

DEMO

Emma
  • 27,428
  • 11
  • 44
  • 69
  • 1
    Thanks, this groups between the 2nd and 3rd comma not the 3rd and 4th so I switched it to .+?,.+?,.+?,(.+) The problem here is that it actually selects the inverse of what I want. Any thoughts? – enfrost Jul 04 '19 at 15:41
0

In the most complex case, you can use:

^([^,]*),([^,]*),([^,]*),([^,]*),?$

Then you can use \1, \2, \3, \4 to access any field that you want.

Test here.


If everything is on one line, then you need every fourth element from the string.

The updated regex would be:

[^,]*,[^,]*,[^,]*,([^,]*),?

Get your string with \1.

Test here.

Of course, I do not know the peculiarities of your specific tool, so I cannot "compensate" for them.

virolino
  • 2,073
  • 5
  • 21
  • I really like the idea of this solution and in the test the 4th group is exactly what I want. When I try \4 in the demo or in the application it gives me no results, so maybe this doesn't work for this flavor of regex? – enfrost Jul 04 '19 at 15:42
  • From more practicing, it looks like it is returning whatever the first "Full match" is, but adding different back references does change the results. Also I realized there was the ^ at the start and $ at the end, the actual string returned is one line, I made it multiline in my example for readability oops. – enfrost Jul 04 '19 at 16:00
0

Perhaps you could use a single capturing group and an anchor to assert the start of the string. Your value would be in the first capturing group

^[^,]+(?:,[^,]+){2},([^,]+)
  • ^ Start of string
  • [^,]+ Match 1+ times not a comma
  • (?:,[^,]+){2} Repeat 2 times matching a comma, then 1+ times not a comma
  • , Match the third comma
  • ([^,]+) Capture in group 1 matching 1+ times not a comma

Regex demo

The fourth bird
  • 154,723
  • 16
  • 55
  • 70
0

Since you are actually using Oracle behind the scenes you may use

REGEXP_SUBSTR(col, '^([^,]*,){3}([^,]+)', 1, 1, NULL, 2)

See the regex demo

Details

  • ^ - start of string
  • ([^,]*,){3} - three occurrences of Group 1 matching any 0 or more chars other than , and then a ,
  • ([^,]+) - Group 2 matching one or more chars other than ,.

The last 2 argument means that only Group 2 value is returned.

Wiktor Stribiżew
  • 607,720
  • 39
  • 448
  • 563