0

I want to write a case statement which can extract value for a particular substring from a column named details which has multiple occurrences for [address] is it possible with REGEX along with case?

sample Data in the column:

[address]:kattengat 28
[address]:1012 SZ
[address]: Amsterdam

The below SQL only outputs:

kattengat 28

Select case when to_number(instr(details),'[address')>0 then substr(details,REGEXP_INSTR(details,'address',1,1)+8,instr(substr(details,REGEXP_INSTR(details,'address',1,1)+8,'[')-1)) else '' end from table_name;

Expected output is : 
kattengat 28 1012 SZ Amsterdam

Create table statement:

Create table test (id number(10), details clob);

Insert statement :

insert into test (id, details) values (1,to_clob ('[ADDRESS    ] kattengat 28
             [NAME       ] ALEX
             [PHONE      ] 65438
             [ADDRESS    ] 1012 SZ
             [DOB        ] 1st Jan 1998
             [ADDRESS    ] Amsterdam')):

Please note I don't want to concat and add statements rather looking for a solution which can extract values associated with the substring [address] based on the number of occurrences of the substring in a single line

Linnea
  • 65
  • 1
  • 6

1 Answers1

1

Here's one option:

SQL> with test (col) as
  2    (select '[address]:kattengat 28
  3  [address]:1012 SZ
  4  [address]: Amsterdam' from dual
  5    )
  6  select trim(replace(regexp_substr(replace(col, chr(10), '#'), '[^#]+', 1, column_value), '[address]:', '')) result
  7  from test cross join
  8    table(cast(multiset(select level from dual
  9                        connect by level <= regexp_count(col, ':')
 10                       ) as sys.odcinumberlist));

RESULT
--------------------------------------------------------------------------------
kattengat 28
1012 SZ
Amsterdam

SQL>

What does it do?

  • lines #1 - 5 - sample data
  • line #6:
    • regexp_substr part of code is responsible for splitting source column value into separate rows
    • it affects not the original value, but the one whose new line character (chr(10)) is replaced (the 2nd replace) by #, and that character is used as a separator for regexp_substr
  • the 1st replace removes [address]: from the source
  • trim removes leading/trailing empty strings (as the one in front of "Amsterdam"
  • lines #8 - 10 are here to remove duplicate values from the final result (if source table doesn't contain only one row, as in my example). If it actually does, then code can be somewhat simplified.

With sample data you posted later (by the way, are you sure there are spaces in front of [NAME] etc.? I guess NOT!):

SQL> select * from test;

        ID DETAILS
---------- --------------------------------------------------
         1 [ADDRESS    ] kattengat 28
           [NAME       ] ALEX
           [PHONE      ] 65438
           [ADDRESS    ] 1012 SZ
           [DOB        ] 1st Jan 1998
           [ADDRESS    ] Amsterdam

Code I previously posted, slightly modified because previously address was in lower case, there were NO spaces within square brackets, and there was a colon sign):

SQL> with temp as
  2    (select trim(replace(regexp_substr(replace(details, chr(10), '#'), '[^#]+', 1, column_value), '[ADDRESS    ]', '')) result
  3     from test cross join
  4     table(cast(multiset(select level from dual
  5                         connect by level <= regexp_count(details, '\[')
  6                        ) as sys.odcinumberlist))
  7    )
  8  select *
  9  from temp
 10  where instr(result, '[') = 0;

RESULT
--------------------------------------------------------------------------------
kattengat 28
1012 SZ
Amsterdam

SQL>

If you want to get result in one line, you could aggregate values returned by that query as

SQL> with temp as
  2    (select trim(replace(regexp_substr(replace(details, chr(10), '#'), '[^#]+', 1, column_value), '[ADDRESS    ]', '')) result,
  3            column_value cv
  4     from test cross join
  5     table(cast(multiset(select level from dual
  6                         connect by level <= regexp_count(details, '\[')
  7                        ) as sys.odcinumberlist))
  8    )
  9  select listagg (result, ', ') within group (order by cv) final_result
 10  from temp
 11  where instr(result, '[') = 0;

FINAL_RESULT
--------------------------------------------------------------------------------
kattengat 28, 1012 SZ, Amsterdam

SQL>
Littlefoot
  • 131,892
  • 15
  • 35
  • 57
  • thank you @Littlefoot The value of address would be anything and not a static as stated in sample – Linnea Apr 18 '22 at 18:36
  • 1
    As you didn't provide CREATE TABLE and INSERT INTO sample data, I just used what you posted in a CTE. My code doesn't mean that there's anything "static"; you'd run code that begins at line #6 against your table and its column. – Littlefoot Apr 18 '22 at 18:40
  • My bad, I will post the data in a while with inserts. – Linnea Apr 18 '22 at 19:04
  • You don't have to (maybe); run code (line #6 onwards) on your data and see what it returns. – Littlefoot Apr 18 '22 at 19:08
  • I ran but it does not only return address values it's returning other values as well which is not expected. Had updated the ask with insert and create – Linnea Apr 18 '22 at 19:18
  • Well, that's not really fair. Data you previously posted doesn't match what you posted now. See edited answer. – Littlefoot Apr 18 '22 at 19:32
  • Agree and I am sorry for that. You are great man !!! Genuineness indeed would you please explain it – Linnea Apr 18 '22 at 19:46
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/244007/discussion-between-linnea-and-littlefoot). – Linnea Apr 19 '22 at 08:15
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/244009/discussion-between-linnea-and-littlefoot). – Linnea Apr 19 '22 at 08:40
  • Can the output be presented in one line please instead of multiple rows – Linnea Apr 19 '22 at 08:41
  • Yes, I believe it can - by using LISTAGG, for example. See edited answer, please. – Littlefoot Apr 19 '22 at 18:30