1

Below is the CLOB column value in one of my table. This value contains many images paths.

<p><img alt="" src="/library/NATIONWIDE/CCS_SUPPORT/Emmy2.jpg" style="height:67px; width:50px" /></p>
<p><img alt="" src="/library/NATIONWIDE/CCS_SUPPORT/Emmy%201.jpg" style="height:133px; width:100px" />Please test this document</p>
<p>&nbsp;</p>
<p><img alt="" src="/library/NATIONWIDE/CCS_SUPPORT/Kenny.jpg" style="height:250px; width:200px" /></p>
<p>&nbsp;</p>

i wrote below query to find the all images paths from above column value but its returning only first image path but not others. How can i get other string values.

SELECT REGEXP_SUBSTR(contentdata.xml, '/library/NATIONWIDE/.* style') AS xml_substr
FROM contenttext 
WHERE 
    contenttext.xml LIKE '%img alt=%'
    AND contenttext.documentid LIKE 'SPT91%'
    AND contenttext.published = 'Y';

Output of my Query is:

/library/NATIONWIDE/CCS_SUPPORT/Emmy2.jpg" style

Can you guide me, how to get all the image paths ?

Thank You kishore Kuna

GMB
  • 216,147
  • 25
  • 84
  • 135
Kishore
  • 11
  • 2

2 Answers2

1

Oracle REGEXP_SUBSTR returns the first match in the string. You could use a CONNECT BY clause to parse the string recursively :

SELECT REGEXP_SUBSTR(x.xml, '/library/NATIONWIDE/.* style', 1, LEVEL) AS xml_substr
FROM ( 
    SELECT xml
    FROM contenttext 
    WHERE 
        contenttext.xml like '%img alt=%' 
        AND contenttext.documentid like 'SPT91%'
        AND contenttext.published = 'Y'
) x
CONNECT BY LEVEL <= LENGTH(REGEXP_REPLACE(x.xml, '/library/NATIONWIDE/.* style')) + 1;
GMB
  • 216,147
  • 25
  • 84
  • 135
0

This SQL doesn't require any connect by. By using the 'm' option, regexp_replace treats each line as as separate string. I supply two arguments separated by an 'or' (the vertical bar). The first string is the desired string, the second string is everything else.

Only the desired string is kept ('\1'), but the extra carriage returns are still there, hence the surrounding regexp_replace to remove multiple carriage returns.

-- Just setting up a test dataset
WITH aset AS( SELECT '<p><img alt="" src="/library/NATIONWIDE/CCS_SUPPORT/Emmy2.jpg" 
style="height:67px; width:50px" /></p>
<p><img alt="" src="/library/NATIONWIDE/CCS_SUPPORT/Emmy%201.jpg" 
style="height:133px; width:100px" />Please test this document</p>
<p>nbsp;</p>
<p><img alt="" src="/library/NATIONWIDE/CCS_SUPPORT/Kenny.jpg" style="height:250px; 
width:200px" /></p>
<p>nbsp;</p>' test FROM DUAL )
-- Magic starts here
SELECT test
     , regexp_replace( REGEXP_REPLACE( test
                                 , '(^.*/library/NATIONWIDE/.*$)|(.*$)'
                                 , '\1'
                                 , 1
                                 , 0
                                 , 'im' )
                 , CHR( 10 ) || '+'
                 , CHR( 10 ) )    one
  FROM aset

This resulted in the following:

<p><img alt="" src="/library/NATIONWIDE/CCS_SUPPORT/Emmy2.jpg" style="height:67px; width:50px" /></p>
<p><img alt="" src="/library/NATIONWIDE/CCS_SUPPORT/Emmy%201.jpg" style="height:133px; width:100px" />Please test this document</p>
<p><img alt="" src="/library/NATIONWIDE/CCS_SUPPORT/Kenny.jpg" style="height:250px; width:200px" /></p>
Brian Leach
  • 2,025
  • 1
  • 11
  • 14