6

After searching the forums I have come up with the following, but it's not working :/

I have a table with the following:

ID  | Strings
123 | abc fgh dwd
243 | dfs dfd dfg
353 | dfs
424 | dfd dfw
523 |
...

Please note that there are around 20,000 rows; my other option is to write a stored procedure to do this. Basically I need to split the strings up so there is a row for each one, like this:

ID  | Strings
123 | abc
123 | fgh
123 | dwd
243 | dfs

and so on...

This is what I have:

create table Temp AS
SELECT ID, strings
FROM mytable;

SELECT DISTINCT ID, trim(regexp_substr(str, '[^ ]+', 1, level)) str
FROM (SELECT ID, strings str FROM temp) t
CONNECT BY instr(str, ' ', 1, level - 1) > 0
ORDER BY ID;

Any help is appreciated

Grant McKinnon
  • The only ways I can think to do that would be very inefficient. Is there a reason you can't do this in a programming language after receiving the initial response from the DB? – Sam Oct 16 '14 at 15:57

3 Answers

10

This should do the trick:

SELECT DISTINCT ID, regexp_substr("Strings", '[^ ]+', 1, LEVEL)
FROM T
CONNECT BY regexp_substr("Strings", '[^ ]+', 1, LEVEL) IS NOT NULL
ORDER BY ID;

Notice how I used regexp_substr in the connect by clause too. This is to deal with the case of multiple spaces.
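
For example, with a doubled space the instr-based bound from the question allows one level too many and produces a spurious NULL row, while the regexp-based bound stops at the last real token. A quick single-row check (just an illustration, not part of the original answer):

-- instr-based bound, as in the question: returns 3 rows, the last token is NULL
WITH t AS (SELECT 'abc  fgh' AS str FROM dual)   -- note the double space
SELECT LEVEL, regexp_substr(str, '[^ ]+', 1, LEVEL) AS token
FROM t
CONNECT BY instr(str, ' ', 1, LEVEL - 1) > 0;

-- regexp-based bound, as in this answer: returns only the 2 real tokens
WITH t AS (SELECT 'abc  fgh' AS str FROM dual)
SELECT LEVEL, regexp_substr(str, '[^ ]+', 1, LEVEL) AS token
FROM t
CONNECT BY regexp_substr(str, '[^ ]+', 1, LEVEL) IS NOT NULL;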


If you have a predictable upper bound on the number of items per line, it might be worth comparing the performance of the recursive query above with a simple CROSS JOIN:

WITH N as (SELECT LEVEL POS FROM DUAL CONNECT BY LEVEL < 10)
--                                                       ^^
--                                                 up to 10 substrings
SELECT ID, regexp_substr("Strings", '[^ ]+', 1, POS)
FROM T CROSS JOIN N
WHERE regexp_substr("Strings", '[^ ]+', 1, POS) IS NOT NULL
ORDER BY ID;

See http://sqlfiddle.com/#!4/444e3/1 for a live demo
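
If you want to try it locally rather than on the fiddle, a minimal setup might look like the following. I don't know the exact DDL behind the fiddle, so this is an assumption; the key point is that the column is created with a quoted, mixed-case name, which is why the queries above must reference it as "Strings" in exactly that case.

CREATE TABLE T (ID NUMBER, "Strings" VARCHAR2(100));   -- quoted => case-sensitive column name

INSERT INTO T VALUES (123, 'abc fgh dwd');
INSERT INTO T VALUES (243, 'dfs dfd dfg');
INSERT INTO T VALUES (353, 'dfs');
INSERT INTO T VALUES (424, 'dfd dfw');
INSERT INTO T VALUES (523, NULL);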

Sylvain Leroux
  • This is how I thought to do it, but it doesn't seem to accept the attribute Strings in the " ". It tells me it's an invalid identifier – Grant McKinnon Oct 17 '14 at 00:45
  • @Grant _"the attribute Strings [...] It tells me it's an invalid identifier"_ FWIW, when quoted, identifiers are case sensitive. Maybe that's your issue? Or a simple typo? If not, that's very strange?!? If you can't fix it yourself, maybe worth _asking another question_ focusing on that issue only, and mentioning the exact structure of your table, your attempted query and the precise error code and message produced by your version of Oracle. – Sylvain Leroux Oct 17 '14 at 08:05
  • 1
    All good, I got this to work, thank you. It was a typo on my end – Grant McKinnon Oct 19 '14 at 04:21
  • @SylvainLeroux You could remove the dependency upon predictability of the number of items per line using `sys.odciNumberList` and a simple join. There are many other ways; however, performance might be a factor. You could have a look at [Split comma delimited strings in a table](http://lalitkumarb.wordpress.com/2015/03/04/split-comma-delimited-strings-in-a-table-using-oracle-sql/). I have added a few examples in my answer too. – Lalit Kumar B Jun 04 '15 at 05:31
8

A more flexible and better solution, one which:

  • doesn't depend upon the predictability of the number of items per line.
  • doesn't depend on the ID column; the solution gives the correct result irrespective of the number of columns.
  • doesn't even depend upon the DISTINCT keyword.

There are other examples using XMLTABLE and the MODEL clause; please read Split comma delimited strings in a table. A rough XMLTABLE sketch is shown after the examples below.

For example,

Without ID column:

SQL> WITH T AS
  2    (SELECT 'abc fgh dwd' AS text FROM dual
  3    UNION
  4    SELECT 'dfs dfd dfg' AS text FROM dual
  5    UNION
  6    SELECT 'dfs' AS text FROM Dual
  7    UNION
  8    SELECT 'dfd dfw' AS text FROM dual
  9    )
 10  SELECT trim(regexp_substr(t.text, '[^ ]+', 1, lines.column_value)) text
 11  FROM t,
 12    TABLE (CAST (MULTISET
 13    (SELECT LEVEL FROM dual CONNECT BY instr(t.text, ' ', 1, LEVEL - 1) > 0
 14    ) AS sys.odciNumberList )) lines
 15  /

TEXT
-----------
abc
fgh
dwd
dfd
dfw
dfs
dfs
dfd
dfg

9 rows selected.

With ID column:

SQL> WITH T AS
  2    (SELECT 123 AS id, 'abc fgh dwd' AS text FROM dual
  3    UNION
  4    SELECT 243 AS id, 'dfs dfd dfg' AS text FROM dual
  5    UNION
  6    SELECT 353 AS Id, 'dfs' AS text FROM Dual
  7    UNION
  8    SELECT 424 AS id, 'dfd dfw' AS text FROM dual
  9    )
 10  SELECT id, trim(regexp_substr(t.text, '[^ ]+', 1, lines.column_value)) text
 11  FROM t,
 12    TABLE (CAST (MULTISET
 13    (SELECT LEVEL FROM dual CONNECT BY instr(t.text, ' ', 1, LEVEL - 1) > 0
 14    ) AS sys.odciNumberList )) lines
 15   ORDER BY id
 16   /

        ID TEXT
---------- -----------
       123 abc
       123 fgh
       123 dwd
       243 dfs
       243 dfd
       243 dfg
       353 dfs
       424 dfd
       424 dfw

9 rows selected.

SQL>
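
The XMLTABLE variant mentioned above would look roughly like this. It is only a sketch adapted from the linked post: it assumes the real table is called mytable with columns id and strings, and it breaks if a string ever contains a double quote.

SELECT t.id,
       trim(COLUMN_VALUE) AS text
FROM   mytable t,
       xmltable(('"' || replace(t.strings, ' ', '","') || '"'))
WHERE  t.strings IS NOT NULL;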
Lalit Kumar B
-1
WITH t AS
      (SELECT 123 AS id, 'abc fgh dwd' AS strings FROM dual
       UNION
       SELECT 243 AS id, 'dfs dfd dfg' AS strings FROM dual
       UNION
       SELECT 353 AS id, 'dfs' AS strings FROM dual
       UNION
       SELECT 424 AS id, 'dfd dfw' AS strings FROM dual
      )
SELECT DISTINCT id,
       regexp_substr(replace(strings, ' ', ','), '[^,]+', 1, LEVEL) AS strings
FROM t
CONNECT BY LEVEL <= length(regexp_replace(replace(strings, ' ', ','), '[^,]*')) + 1
ORDER BY id, strings;


**********OUTPUT*************

        ID STRINGS
---------- -----------
       123 abc
       123 dwd
       123 fgh
       243 dfd
       243 dfg
       243 dfs
       353 dfs
       424 dfd
       424 dfw

9 rows selected
Taryn
Steve
  • There are actually around 20,000 rows, so I need a way to split the strings for all of them – Grant McKinnon Oct 17 '14 at 01:50
  • the "WITH T as(...) was just to create the test data you showed. So main query would be "Select Distinct.... From myTable..." that sort of thing. However the other answer is better honestly – Steve Oct 17 '14 at 12:55