Oracle SQL - how to split up a string into parts but not break up words

Question

I need to split a long description into 4 parts of up to 60 characters each, but I need each part to end with a space (not break up any words). What is the best method to do this? Below is the snippet of the code, however I need it to not break up words.

SELECT emplid as SHRQPNM_PIDM,
       substr(listagg(SSR_transcript_txt) WITHIN GROUP (ORDER BY text_seq_nbr), 1, 60),
       substr(listagg(SSR_transcript_txt) WITHIN GROUP (ORDER BY text_seq_nbr), 61, 60),
       substr(listagg(SSR_transcript_txt) WITHIN GROUP (ORDER BY text_seq_nbr), 121, 60),
       substr(listagg(SSR_transcript_txt) WITHIN GROUP (ORDER BY text_seq_nbr), 181, 60),
       to_char(sysdate, 'YYYYMMDD') SHRQPNM_ACTIVITY_D,
FROM   PS_TSCRPT_TEXT
GROUP BY emplid, print_loc_SEQ) 
order by emplid, print_loc_SEQ;

example of current output:

DOCTORAL THESIS -A MEASUREMENT OF DISTORTION PRODUCT **OTOACOU|STIC** EMISSIONS WHILE ATTENDINGTO FAMILIAR AND UNFAMILIAR **VIS|UAL** STIMULATION|

score 0 · Answer 1 · answered Jun 09 '22 at 14:42

If you want to strip the trailing space from the output then you can use:

SELECT emplid as SHRQPNM_PIDM,
       REGEXP_SUBSTR(
         listagg(SSR_transcript_txt) WITHIN GROUP (ORDER BY text_seq_nbr),
         '(.{1,60})( |$)',
         1,
         1,
         NULL,
         1
       ) AS part1,
       REGEXP_SUBSTR(
         listagg(SSR_transcript_txt) WITHIN GROUP (ORDER BY text_seq_nbr),
         '(.{1,60})( |$)',
         1,
         2,
         NULL,
         1
       ) AS part2,
       REGEXP_SUBSTR(
         listagg(SSR_transcript_txt) WITHIN GROUP (ORDER BY text_seq_nbr),
         '(.{1,60})( |$)',
         1,
         3,
         NULL,
         1
       ) AS part3,
       REGEXP_SUBSTR(
         listagg(SSR_transcript_txt) WITHIN GROUP (ORDER BY text_seq_nbr),
         '(.{1,60})( |$)',
         1,
         4,
         NULL,
         1
       ) AS part4,
       to_char(sysdate, 'YYYYMMDD') SHRQPNM_ACTIVITY_D
FROM   PS_TSCRPT_TEXT
GROUP BY emplid, print_loc_SEQ
order by emplid, print_loc_SEQ;

Which, for the sample data:

CREATE TABLE ps_tscrpt_text (emplid, ssr_transcript_txt, text_seq_nbr, print_loc_seq) AS
SELECT 1,
       'DOCTORAL THESIS',
       1,
       1
FROM   DUAL UNION ALL
SELECT 1,
       ' -A MEASUREMENT OF DISTORTION PRODUCT OTOACOUSTIC EMISSIONS WHILE ATTENDINGTO FAMILIAR AND UNFAMILIAR VISUAL STIMULATION|',
       2,
       1
FROM   DUAL;

Outputs:

SHRQPNM_PIDM PART1 PART2 PART3 PART4 SHRQPNM_ACTIVITY_D

1 DOCTORAL THESIS -A MEASUREMENT OF DISTORTION PRODUCT OTOACOUSTIC EMISSIONS WHILE ATTENDINGTO FAMILIAR AND UNFAMILIAR VISUAL STIMULATION null

If you want to include the trailing space in the output (and have a maximum of 60 characters) then:

SELECT emplid as SHRQPNM_PIDM,
       REGEXP_SUBSTR(
         listagg(SSR_transcript_txt) WITHIN GROUP (ORDER BY text_seq_nbr),
         '(.{1,59})( |$)',
         1,
         1
       ) AS part1,
       REGEXP_SUBSTR(
         listagg(SSR_transcript_txt) WITHIN GROUP (ORDER BY text_seq_nbr),
         '(.{1,59})( |$)',
         1,
         2
       ) AS part2,
       REGEXP_SUBSTR(
         listagg(SSR_transcript_txt) WITHIN GROUP (ORDER BY text_seq_nbr),
         '(.{1,59})( |$)',
         1,
         3
       ) AS part3,
       REGEXP_SUBSTR(
         listagg(SSR_transcript_txt) WITHIN GROUP (ORDER BY text_seq_nbr),
         '(.{1,59})( |$)',
         1,
         4
       ) AS part4,
       to_char(sysdate, 'YYYYMMDD') SHRQPNM_ACTIVITY_D
FROM   PS_TSCRPT_TEXT
GROUP BY emplid, print_loc_SEQ
order by emplid, print_loc_SEQ;

db<>fiddle here

Oracle SQL - how to split up a string into parts but not break up words

1 Answers1