redshift regexp_substr all occurrences

Question

I am trying to return all results that match a specific regex in Redshift database.

Sample query

WITH aa AS
(SELECT DISTINCT
id,
record,
regexp_substr(record, '(#{2})([A-Z]{2,3})',1,1),
REGEXP_COUNT(record, '(#{2})([A-Z]{2,3})')
FROM table_a)

SELECT
*
FROM aa

The above query returns the first occurrence and the total count of the occurrences.

Is there a way to return all occurrences? Maybe an iteration using a variable that is limited by the count?

regexp_substr(record, '(#{2})([A-Z]{2,3})',1,n)

There is a solution for Oracle with the CONNECT BY LEVEL, but that option seems to be unavailable for Redshift.

Can you provide sample data and desired results? – Gordon Linoff Mar 20 '21 at 12:36 — Gordon Linoff, Mar 20 '21 at 12:36

score -1 · Answer 1 · answered Mar 20 '21 at 12:41

-1

If I understand correctly, you can use split_part() instead:

split_part(record, '##', <n>)

Then you can attempt to generate numbers and get the results:

with n as (
      select row_number() over () as n
      from t
      limit 10
     )
select t.*, split_part(record, '##', n.n) 
from t join
     n
     on split_part(record, '##', n.n) <> ''

answered Mar 20 '21 at 12:41

Gordon Linoff

1,242,037
58
646
786

Thank you for your answer. I have a field that contains sentences like this : "I want to ##DG split ##AFI ##GFA the sentence #AA". I am trying to isolate the ## substrings and return a field like this : "##DG ##AFI ##GFA #AA". Not really sure whether your proposed solution identifies the regexp and collects them in a field. It seems to me that it identifies the ## only and does not identify regexp – Pericles Faliagas Mar 22 '21 at 10:06

redshift regexp_substr all occurrences

1 Answers1