5

I have an SQL table column (BINARYTEXT) populated with a CLOB. Within this CLOB is a number of attributes, e.g.,

CE.EffDate="20140106";
CE.CCY="EUR";
CE.TransactionType="STANDARDEUROPEAN";
CE.CAL="LON";

I need to extract only the value of the CE.TransactionType attribute contained between the double quotes so 'STANDARDEUROPEAN'. Note that the CLOB does not contain XML and only contains attributes as above with no start or end tags.

I have worked out how to do this using the REGEXP_SUBSTR function when I specify the string in the command:

select REGEXP_SUBSTR('CE.TransactionType="STANDARDEUROPEAN"', '="[^"]+') transtype
from DUAL

which returns: ="STANDARDEUROPEAN

I am unable to manipulate this into using the CLOB as the string. This does not work:

select REGEXP_SUBSTR(BINARYTEXT,'CE.TransactionType="STANDARDEUROPEAN"', '="[^"]+')  transtype
from DUAL

This comes up with a blank integer error, although I am unsure why it is expecting an integer.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Steve
  • 51
  • 1
  • 1
  • 2

5 Answers5

0

You can use the DBMS_LOB package to extract the string.

SELECT REGEXP_SUBSTR(DBMS_LOB.substr(BINARYTEXT,3000) ,'CE.TransactionType="STANDARDEUROPEAN"', '="[^"]+')  transtype
FROM YOURTABLENAME

The only thing you might have problems with is the buffer size, which is restricted to 32767 bytes (it's set to 3000 in my example). If your data is bigger than this, then you'd need to query it in chunks. See the documentation here

If you need to change the TransactionType you're searching for, you should probably create a function which takes that part of the string as a parameter and build the SQL content dynamically.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
StevieG
  • 8,639
  • 23
  • 31
  • Also note that the value within the double quotes will change so I will need to search on the CE.TransactionType=" as 'STANDARDEUROPEAN' is only an example as 'STANDARDNORTHAMERICAN' is another possibility. Thanks – Steve Jan 07 '14 at 09:31
0

Use:

select BINARYTEXT from your_table
where regexp_like(dat,'^(\CE.TransactionType=)?"[^"]+"?;$');
Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Sai
  • 659
  • 4
  • 12
  • 21
  • Hi Sai. Is this taking into account the fact that the information is in a single column CLOB for each record? E.g. Record 1 CLOB Record 2 CLOB Record 3 CLOB Also I am looking only for the value of TransactionType ignoring all other rows of data in the CLOB. The SQL you provided only searches on the double quote and not on the actual text I require. Thanks – Steve Jan 07 '14 at 12:48
  • so u want o/p like CE.TransactionType="STANDARDEUROPEAN" ? – Sai Jan 07 '14 at 13:09
  • Yes for the first part as CE.TransactionType=" never changes but the value between the quotes is what changes and is what I want to return using the query. – Steve Jan 07 '14 at 13:14
  • hope my edit helps u.. the query will return nulls for unmatched rows – Sai Jan 07 '14 at 13:28
  • I think there is an issue with the matching as all rows are coming up blank. – Steve Jan 07 '14 at 15:37
  • Hi Sai. This seems to be running but still returns zero rows. The actual data in the CLOB contains a semi colon after the last double quote. Does this need to be added to the expression to match the end of the line? e.g. data in CLOB is: CE.TransactionType="STANDARDEUROPEAN"; – Steve Jan 08 '14 at 15:58
  • sry..didn't get u.. you want to say at the time of data insert if you entered like CE.TransactionType="STANDARDEUROPEAN"; then you should mention ; in the expression.. – Sai Jan 09 '14 at 04:41
0

You can try the below SQL:

select substr ( REGEXP_SUBSTR(to_clob ('CE.EffDate="20140106"; CE.CCY="EUR";CE.TransactionType="STANDARDEUROPEAN";CE.CAL="LON";'), '="[^"]+', 1,3), 3)  transtype from DUAL
Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Younes
  • 46
  • 2
  • Re *"You can try"*: An explanation would be in order. E.g., what is the idea/gist? From [the Help Center](https://stackoverflow.com/help/promotion): *"...always explain why the solution you're presenting is appropriate and how it works"*. Please respond by [editing (changing) your answer](https://stackoverflow.com/posts/70595130/edit), not here in comments (but *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***[without](https://meta.stackexchange.com/a/131011)*** *** *** *** *** *** *** *** *** "Edit:", "Update:", or similar - the answer should appear as if it was written today). – Peter Mortensen Aug 15 '23 at 18:36
0

If your db does not have REGEXP, you can use the DBMS_LOB package (as mentioned by StevieG). Here is SQL using just the DBMS_LOB... I used the semicolon as a delimiter as it is in your CLOB string in the question.

WITH
    table_test As
        (
            Select
              TO_CLOB('... ... ...
                       Some other text within the CLOB field
                       ...
                       CE.EffDate="20140106";
                       CE.CCY="EUR";
                       CE.TransactionType="STANDARDEUROPEAN";
                       CE.CAL="LON";
                       and yet some more text;
                       and more ...
                       ... ... ...
                    ') "BINARYTEXT"
            From Dual
        )
Select
    DBMS_LOB.SubStr(
                      DBMS_LOB.SubStr(BINARYTEXT, 100, InStr(BINARYTEXT, 'CE.TransactionType=') + Length('CE.TransactionType=')+1),
                      InStr(DBMS_LOB.SubStr(BINARYTEXT, 100, InStr(BINARYTEXT, 'CE.TransactionType=') + Length('CE.TransactionType=')+1), ';')-2,
                      1

                    ) "TRANSACTION_TYPE"
From
    table_test
--
-- Result:
--
-- TRANSACTION_TYPE
-- STANDARDEUROPEAN
Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
d r
  • 3,848
  • 2
  • 4
  • 15
0

This is because the REGEXP_SUBSTR function expects a string as its first argument, and the BINARYTEXT column is a CLOB. To fix this, you can use the DBMS_LOB package to convert the CLOB to a string. The following query will work:

select REGEXP_SUBSTR(DBMS_LOB.substr(BINARYTEXT, 3000), 'CE.TransactionType="STANDARDEUROPEAN"', '="[^"]+')  transtype 
from DUAL

This query will first use the DBMS_LOB.substr function to convert the BINARYTEXT column to a string with a maximum length of 3000 characters. It will then use the REGEXP_SUBSTR function to extract the value of the CE.TransactionType attribute from the string. The result will be a string containing the value of the CE.TransactionType attribute, or NULL if the attribute is not found.

Here is an example of the output of this query:

| transtype |
|----------|
| STANDARDEUROPEAN |
Griffin
  • 7
  • 3
  • [Three](https://stackoverflow.com/a/76697548) [other](https://stackoverflow.com/questions/76694777/how-do-you-know-if-a-class-can-not-create-an-instant-object-in-java-without-read/76695382#76695382) [answers](https://stackoverflow.com/questions/76695072/is-it-possible-to-check-for-content-encoding-header-programmatically/76695257#76695257) posted within the same about 48 hours were likely generated by [ChatGPT](https://meta.stackoverflow.com/questions/421831/temporary-policy-chatgpt-is-banned), so this one was probably generated by ChatGPT as well. – Peter Mortensen Aug 15 '23 at 18:43