I have a field that can have one or multiple states listed in it (callcenter.stateimpact). If the callcenter.stateimpact contains "OK","TX","AK","TN","NC","SC","GA","FL","AL","MS" or "LA" I need the output field of the SQL to say "South" and if not those, the output needs to say "North". If the callcenter.stateimpact has both South & North states, it needs to say "BOTH" in the output. How do I do this in the Select statement? The fields in this table are callcenter.callid, callcenter.stateimpact, callcenter.callstart and callcenter.callstop. You help is greatly appreciated.
-
1create a lookup table and perform a join. `case` is bad practice in this scenario - not scalable and will potentially require maintenance. – haki Jun 10 '13 at 19:24
-
maybe better to create a small table with state + region and join to it. – tbone Jun 11 '13 at 10:22
2 Answers
This is tough to explain, so there's a SQL Fiddle here that lays out the values involved.
The best approach I could come up with (other than normalizing the StateImpact
value) was to use REGEXP_REPLACE
to suck all the "South" states out of the string and then look at the length of what was left. First, here's what REGEXP_REPLACE(StateImpact, '(OK|TX|AK|TN|NC|SC|GA|FL|AL|MS|LA)')
will do to a few sample values:
StateImpact REGEXP_REPLACE(StateImpact, '(OK|TX|AK|TN|NC|SC|GA|FL|AL|MS|LA)')
----------------------------- -----------------------------------------------------------------
OK,TX,AK,TN,NC,SC,GA,FL,AL,MS ,,,,,,,,,
MI,MA MI,MA
TX null
TX,MI,MA ,MI,MA
So if you're left with all commas or with a null, all the states were South
. If you're left with the original string, all states were North
. Anything else and it's Both
. That makes for a pretty big and confusing CASE
statement no matter how you write it. I went with comparing lengths before and after, like so:
- Length after replace = 0 (or null):
South
- Length after replace = (length before + 1) * 3 - 1:
South
- Length after replace = length before replace:
North
- Anything else:
Both
The second one above is just some math to account for the fact that if (for example) there are five states in StateImpact
and they're all South
, you'll be left with four commas. Hard to explain but it works :)
Here's the query:
SELECT
StateImpact,
CASE NVL(LENGTH(REGEXP_REPLACE(StateImpact, '(OK|TX|AK|TN|NC|SC|GA|FL|AL|MS|LA)')), 0)
WHEN LENGTH(StateImpact) THEN 'North'
WHEN (LENGTH(StateImpact) + 1) / 3 - 1 THEN 'South'
ELSE 'Both'
END AS RegionImpact
FROM CallCenter
The SQL Fiddle referenced above also shows the length before and after the REGEXP_REPLACE
, which will hopefully help explain the calculations.

- 25,924
- 4
- 46
- 69
One of the ways to reach desired result is to use multiset operators.
But first we need to break string separated by ,
into rows. One of the way to do that is trick with connect by
:
-- Trick with building resultset from tokenized string
with dtest_string as (
select 'OK,TX,AK,TN,NC,SC,GA,FL,AL,MS' StateImpact from dual
)
select
level lvl,
substr( -- Extract part of source string
StateImpact,
-- from N-th occurence of separator
decode( level, 1, 1, instr(StateImpact,',',1,level-1)+1 ),
-- with length of substring from N-th to (N+1)-th occurence of separator or to the end.
decode( instr(StateImpact,',',1,level), 0, length(StateImpact)+1, instr(StateImpact,',',1,level) )
-
decode( level, 1, 1, instr(StateImpact,',',1,level-1)+1 )
) code
from test_string
start with
StateImpact is not null -- no entries for empty string
connect by
instr(StateImpact,',',1,level-1) > 0 -- continue if separator found on previous step
Just for fun: same trick with ANSI syntax on SQLFiddle
Next, we need to declare type which we can use to store collections:
create or replace type TCodeList as table of varchar2(100);
After that it's possible to build a query:
with all_south_list as (
-- prepare list of south states
select 'OK' as code from dual union all
select 'TX' as code from dual union all
select 'AK' as code from dual union all
select 'TN' as code from dual union all
select 'NC' as code from dual union all
select 'SC' as code from dual union all
select 'GA' as code from dual union all
select 'FL' as code from dual union all
select 'AL' as code from dual union all
select 'MS' as code from dual union all
select 'LA' as code from dual
)
select
StateImpact,
-- Make decision based on counts
case
when total_count = 0 then 'None'
when total_count = south_count then 'South'
when south_count = 0 then 'North'
else 'Both'
end RegionImpact,
total_count,
south_count,
north_count
from (
select
StateImpact,
-- count total number of states in StateImpact
cardinality(code_list) total_count,
-- count number of south states in StateImpact
cardinality(code_list multiset intersect south_list) south_count,
-- count number of non-south states in StateImpact
cardinality(code_list multiset except south_list) north_count
from (
select
StateImpact,
(
cast(multiset( -- Convert set of values into collection which acts like a nested table
select -- same trick as above
substr(
StateImpact,
decode( level, 1, 1, instr(StateImpact,',',1,level-1)+1 ),
decode( instr(StateImpact,',',1,level), 0, length(StateImpact)+1, instr(StateImpact,',',1,level) )
-
decode( level, 1, 1, instr(StateImpact,',',1,level-1)+1 )
) code
from dual
start with StateImpact is not null
connect by instr(StateImpact,',',1,level-1) > 0
) as TCodeList
)
) code_list,
-- Build collection from south states list
cast(multiset(select code from all_south_list) as TCodeList) south_list
from
CallCenter
)
)

- 6,725
- 24
- 33