0

I need to extract values delimited by semicolon in a text. I use regexp_substr which is very powerful but the string has a limit. I have sometimes a text with 10 000 characters so is not possible to use such string in this inbuild function.

the text looks like this - AU783 343; 3N9493 ;113 UN9 WE3 54 ; OI8343, ; 43U.783.3

just 0-9 and A-Z , if there is a colon or period then it should be deleted. the output must be as it is but withou leading and trailing spaces

AU783 343
3N9493
113 UN9 WE3 54
OI8343
43U7833

any suggestion how to avoid the max length limit?

Pato
  • 153
  • 6
  • Where is the string coming from - presumably you're currently cutting and pasting it from somewhere? If you can read it from a table or from a file as a CLOB then it won't be a literal any more, so it will be (practically) unlimited. Otherwise you'll have to chop your original value up into smaller chunks and stick them together to make a CLOB. – Alex Poole Mar 29 '22 at 17:15

1 Answers1

0

You don't need to use regular expressions. The simple string functions SUBSTR, INSTR and REPLACE are sufficient and work with CLOB values.

For example, given the table:

CREATE TABLE table_name (
  id    NUMBER
        GENERATED ALWAYS AS IDENTITY
        PRIMARY KEY,
  value CLOB
);

You can extract all the substrings using:

WITH bounds (id, value, s_pos, e_pos) AS (
  SELECT id,
         value,
         1,
         INSTR(value, ';', 1)
  FROM   table_name
UNION ALL
  SELECT id,
         value,
         e_pos + 1,
         INSTR(value, ';', e_pos + 1)
  FROM   bounds
  WHERE  e_pos > 0
)
SEARCH DEPTH FIRST BY id SET id_order
SELECT id,
       REPLACE(
         REPLACE(
           CASE e_pos
           WHEN 0
           THEN SUBSTR(value, s_pos)
           ELSE SUBSTR(value, s_pos, e_pos - s_pos)
           END,
           ':'
         ),
         '.'
       ) AS value
FROM   bounds;

See db<>fiddle for an example with very large strings.

MT0
  • 143,790
  • 11
  • 59
  • 117
  • MTO, thanks but this will not work in my case because the user copies the text from Excel and pastes in the input so the function with regexp will do the work in one single online without inserting the data into table. Is not an app ,we can not create procedure etc to fo the inserting then deleting the table value etc – Pato Mar 30 '22 at 06:15
  • 1
    @Pato A string literal is limited to 4000 bytes so your copy-paste method is fundamentally restricted to that limitation. What you need to do is use a `CLOB` data type to store the value and then work on that; however, there is not a `CLOB` literal so you cannot just copy-paste into it. You can look at putting the data into a file and loading it from there or using a VBA macro directly from Excel or some other method of creating a `CLOB` but you do need to use a `CLOB` data type unless you want to split your string up into 4000 byte chunks. – MT0 Mar 30 '22 at 07:32