0

I have a column bundle_availability_ids in Snowflake that has a value:

6891-HCB-ENG
6433C-MMW-ENG
6433C-MMC-ENG

I would like to use SQL to return the middle value in this case to return (MMC,HCB,MMW) To return what's in the middle of what's in between the pipe delimiter and what's in between the '-' delimiter into a separate column.

There are might be instances that the I could have more than 2 sets of pipe delimiters under the bundle_availability_ids, such as 6891-HCB-ENG|6433C-MMW-ENG|6433C-MMC-ENG|5675-HBC-ENG

Thank you for any assistance any of you can provide!

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360

2 Answers2

0
select column1
    ,split(column1,'|') as s
    ,array_size(s) as s_len
    ,floor((s_len+1)/2) as idx
    ,s[idx-1] as middle_item
    ,split(middle_item,'-') as sb
    ,array_size(sb) as sb_len
    ,floor((sb_len+1)/2) as b_idx
    ,sb[b_idx-1] as b_middle_item
from values
('6891-HCB-ENG|6433C-MMW-ENG|6433C-MMC-ENG|5675-HBC-ENG'),
('6891-HCB-ENG|6433C-MMW-ENG|6433C-MMC-ENG');

gives:

COLUMN1 S S_LEN IDX MIDDLE_ITEM SB SB_LEN B_IDX B_MIDDLE_ITEM
6891-HCB-ENG|6433C-MMW-ENG|6433C-MMC-ENG|5675-HBC-ENG [ "6891-HCB-ENG", "6433C-MMW-ENG", "6433C-MMC-ENG", "5675-HBC-ENG" ] 4 2 "6433C-MMW-ENG" [ "6433C", "MMW", "ENG" ] 3 2 "MMW"
6891-HCB-ENG|6433C-MMW-ENG|6433C-MMC-ENG [ "6891-HCB-ENG", "6433C-MMW-ENG", "6433C-MMC-ENG" ] 3 2 "6433C-MMW-ENG" [ "6433C", "MMW", "ENG" ] 3 2 "MMW"

which can be compacted to:

select column1
    ,split(column1,'|') as s
    ,s[floor((array_size(s)+1)/2)-1] as middle_item
    ,split(middle_item,'-') as sb
    ,sb[floor((array_size(sb)+1)/2)-1] as b_middle_item
from values
('6891-HCB-ENG|6433C-MMW-ENG|6433C-MMC-ENG|5675-HBC-ENG'),
('6891-HCB-ENG|6433C-MMW-ENG|6433C-MMC-ENG');
COLUMN1 S MIDDLE_ITEM SB B_MIDDLE_ITEM
6891-HCB-ENG|6433C-MMW-ENG|6433C-MMC-ENG|5675-HBC-ENG [ "6891-HCB-ENG", "6433C-MMW-ENG", "6433C-MMC-ENG", "5675-HBC-ENG" ] "6433C-MMW-ENG" [ "6433C", "MMW", "ENG" ] "MMW"
6891-HCB-ENG|6433C-MMW-ENG|6433C-MMC-ENG [ "6891-HCB-ENG", "6433C-MMW-ENG", "6433C-MMC-ENG" ] "6433C-MMW-ENG" [ "6433C", "MMW", "ENG" ] "MMW"

Taking all middle parts:

select 
    array_agg(split_part(s.value,'-',2)) within group(order by s.index) as middle_bits
from values
('6891-one-ENG|6433C-two-ENG|6433C-three-ENG|5675-four-ENG'),
('6891-one-ENG|6433C-two-ENG|6433C-three-ENG'),
('6891-one-ENG|6433C-two-ENG'),
('6891-one-ENG')
as t
,table(split_to_table(t.column1,'|')) as s
group by s.seq
order by s.seq
;
MIDDLE_BITS
[ "one", "two", "three", "four" ]
[ "one", "two", "three" ]
[ "one", "two" ]
[ "one" ]

using listagg and it becomes are string:

select 
    listagg(split_part(s.value,'-',2), ', ') within group(order by s.index) as middle_bits
from values
('6891-one-ENG|6433C-two-ENG|6433C-three-ENG|5675-four-ENG'),
('6891-one-ENG|6433C-two-ENG|6433C-three-ENG'),
('6891-one-ENG|6433C-two-ENG'),
('6891-one-ENG')
as t
,table(split_to_table(t.column1,'|')) as s
group by s.seq
order by s.seq
;
MIDDLE_BITS
one, two, three, four
one, two, three
one, two
one
Simeon Pilgrim
  • 22,906
  • 3
  • 32
  • 45
  • Thank you, Simeon - Based on your example, I only see MMW, I need to return all the three middle values (HCB, MMC, MMW) - How would it work when the pipe delimiter is in between such as this 6891-HCB-ENG|6433C-MMC-ENG|6433C-MMW-ENG - Thank you! – Catalina Tocasuchil Sep 23 '22 at 01:49
  • Thank you so much Simeon, it worked as index, is there a way that we could display without been index formatted, without the brackets? Thank you! – Catalina Tocasuchil Sep 23 '22 at 15:00
0

We could try a regex replacement approach:

SELECT val, REGEXP_REPLACE(val, '^.*?-|-.*$', '') AS middle
FROM yourTable;

Here is a regex demo showing that the replacement logic is working.

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360