2

I have a sample source string like below, which was in pipe delimited format in that the value obr can be at anywhere. I need to get the second value of the pipe from the first occurrence of obr. So for the below source strings the expected would be,

Source string:

select 'asd|dfg|obr|1|value1|end' text from dual
union all
select 'a|brx|123|obr|2|value2|end' from dual
union all
select 'hfv|obr|3|value3|345|pre|end' from dual

Expected output:

value1
value2
value3

I have tried the below regexp in oracle sql, but it is not working fine properly.

with t as (
            select 'asd|dfg|obr|1|value1|end' text from dual
            union all
            select 'a|brx|123|obr|2|value2|end' from dual
            union all
            select 'hfv|obr|3|value3|345|pre|end' from dual
            )
            select text,to_char(regexp_replace(text,'*obr\|([^|]*\|)([^|]*).*$', '\2')) output from t;

It is working fine when the string starts with OBR, but when OBR is in the middle like the above samples it is not working fine.

Any help would be appreciated.

B. Clay Shannon-B. Crow Raven
  • 8,547
  • 144
  • 472
  • 862
arunb2w
  • 1,196
  • 9
  • 28
  • You said 'obr' can be anywhere in the list. What if 'obr' is second from last or last in the list? – Gary_W Nov 18 '15 at 19:01

2 Answers2

3

Not sure of how Oracle handles regular expressions, but starting with an asterisk usually implies that you're looking for zero or more null characters.

Have you tried '^.*obr\|([^|]*\|)([^|]*).*$' ?

Paurian
  • 1,372
  • 10
  • 18
  • its working good. But it also matches this kind of string too but technically it shouldn't happen. **'a|brx|123|obriem|2|value2|end'** – arunb2w Nov 18 '15 at 17:42
  • Add that to your test data then – Gary_W Nov 18 '15 at 18:54
  • Thinking about mainframe or unix editors (e.g. sed), did you perhaps use overwrite instead of insert, thus removing the "\|" search after the "obr"? Does your string still contain "...obr\|..." or is it now "...obr([..."? – Paurian Nov 18 '15 at 20:58
0

This handles null elements and is wrapped in a NVL() call which supplies a value if 'obr' is not found or occurs too far toward the end of a record so a value 2 away is not possible:

SQL> with t(id, text) as (
     select 1, 'asd|dfg|obr|1|value1|end'      from dual
     union
     select 2, 'a|brx|123|obr|2|value2|end'    from dual
     union
     select 3, 'hfv|obr|3|value3|345|pre|end'  from dual
     union
     select 4, 'hfv|obr||value4|345|pre|end'   from dual
     union
     select 5, 'a|brx|123|obriem|2|value5|end' from dual
     union
     select 6, 'a|brx|123|obriem|2|value6|obr' from dual
   )
   select
     id,
     nvl(regexp_substr(text, '\|obr\|[^|]*\|([^|]*)(\||$)', 1, 1, null, 1), 'value not found') value
   from t;

        ID VALUE
---------- -----------------------------
         1 value1
         2 value2
         3 value3
         4 value4
         5 value not found
         6 value not found

6 rows selected.

SQL>

The regex basically can be read as "look for a pattern of a pipe, followed by 'obr', followed by a pipe, followed by zero or more characters that are not a pipe, followed by a pipe, followed by zero or more characters that are not a pipe (remembered in a captured group), followed by a pipe or the end of the line". The regexp_substr() call then returns the 1st captured group which is the set of characters between the pipes 2 fields from the 'obr'.

Gary_W
  • 9,933
  • 1
  • 22
  • 40