2

I'm creating a DataModel in Oracle Fusion Financials to match parties together, from supplier use and customer use. These parties have a code which is registered in their name. Searching for the table names on Google will find the schemas (eg. HZ_PARTIES), although it's not very necessary to see the schemas to tackle this issue.

Our data quality is not quite what we want it to be. To ensure I'm not missing records, I need to join on other parties who also have the code in the name.

This is what I have so far, which gives results.

SELECT
 RCTA.TRX_NUMBER
,RCTA.CT_REFERENCE 
,HP.PARTY_NAME  PARTY_NAME1
,HP2.PARTY_NAME PARTY_NAME2
,IEBC.IBAN CUSTOMER_IBAN

FROM
 HZ_PARTIES HP,
 HZ_PARTIES HP2,
 IBY_ACCOUNT_OWNERS IAO,
 IBY_EXT_BANK_ACCOUNTS IEBC,
 RA_CUSTOMER_TRX_ALL RCTA,
 HZ_CUST_ACCOUNTS HCA 

WHERE 1=1
  AND RCTA.BILL_TO_CUSTOMER_ID = HCA.CUST_ACCOUNT_ID (+)
  AND HCA.PARTY_ID = HP.PARTY_ID(+)
  AND REGEXP_SUBSTR(HP.PARTY_NAME,'([0-9]{2}[A-Z]{2}[0-9]{3})') in REGEXP_SUBSTR(HP2.PARTY_NAME,'([0-9]{2}[A-Z]{2}[0-9]{3})') -- Join on code found in party name.  
  AND IAO.ACCOUNT_OWNER_PARTY_ID (+) IN (HP2.PARTY_ID)
  AND IAO.EXT_BANK_ACCOUNT_ID = IEBC.EXT_BANK_ACCOUNT_ID (+)

However, this performs an inner join instead of the outer join I need.

I've tried the following, which gives a syntax error (missing parenthesis):

AND REGEXP_SUBSTR(HP.PARTY_NAME,'([0-9]{2}[A-Z]{2}[0-9]{3})') = REGEXP_SUBSTR(HP2.PARTY_NAME,'([0-9]{2}[A-Z]{2}[0-9]{3})') (+)

Also tried this, which makes the query run for way too long. Did not wait for results, because it's probably incorrect:

AND ( REGEXP_SUBSTR(HP.PARTY_NAME,'([0-9]{2}[A-Z]{2}[0-9]{3})') = REGEXP_SUBSTR(HP2.PARTY_NAME,'([0-9]{2}[A-Z]{2}[0-9]{3})') (+) -- Join on investor code found in party name.  
   OR NOT REGEXP_LIKE(HP.PARTY_NAME,'([0-9]{2}[A-Z]{2}[0-9]{3})') -- Escape to outer join in case there's no investor code in name
  )

If it's necessary to make this work I'm willing to rewrite the (+) joins to regular outer join syntax.

MaartenB
  • 31
  • 4
  • I'd re-write the query with explicit joins anyway. It's not hard to do and you end up with a readable query. You may even spot mistakes by doing so. – Thorsten Kettner Mar 20 '20 at 09:53
  • Your `IN` clauses surprise me by the way. It should be something like `IN (a,b,c)`, but you have `in REGEXP_SUBSTR ...` (no parentheses and only a single value to compare) and `IN (HP2.PARTY_ID)` (only a single value to compare, which is just `= HP2.PARTY_ID`). The lack of the parentheses in that first expression should result in a syntax error exception. – Thorsten Kettner Mar 20 '20 at 10:16

2 Answers2

1

You put outer join operator (+) to a wrong place. Should be something like this:

SQL> with
  2  hp (party_name) as
  3    (select '11AA111' from dual union all
  4     select '22BB222' from dual
  5    ),
  6  hp2 (party_name) as
  7    (select '11AA111' from dual union all
  8     select '33CC333' from dual
  9    )
 10  select hp.*
 11  from hp, hp2
 12  where regexp_substr(hp.party_name     , '([0-9]{2}[A-Z]{2}[0-9]{3})') =
 13        regexp_substr(hp2.party_name (+), '([0-9]{2}[A-Z]{2}[0-9]{3})')
 14  /                                  ---
                                        here
PARTY_N
-------
11AA111
22BB222

SQL>

As of proper joins ... well, yes - you could rewrite it if you want, but I don't think it'll help in this case. If query runs OK as is, I'd leave it as is and rewrite it if necessary.

Littlefoot
  • 131,892
  • 15
  • 35
  • 57
  • Thank you so much for the lightning fast answer Littlefoot. I had tried looking up documentation on where to put it, but I never found anything mentioning this syntax. – MaartenB Mar 20 '20 at 09:44
0

I suggest you add a virtual column to the hz_parties table and index it, if you are allowed to:

alter table hz_parties add code varchar2(7) as regexp_substr(party_name, '([0-9]{2}[A-Z]{2}[0-9]{3})');
create index idx_parties_code on hz_parties (code);

If you are not allowed to alter the table, then use a function index instead:

create index idx_parties_code on hz_parties (regexp_substr(party_name, '([0-9]{2}[A-Z]{2}[0-9]{3})'));

If you are not allowed to add an index on an existing table, then create a new table with an index, e.g.:

create table party_code
(
  party_id  number(10)   not null,
  code      varchar2(7)  not null,
  primary key (party_id)
);

insert into party_code (party_id, code)
select party_id, regexp_substr(party_name, '([0-9]{2}[A-Z]{2}[0-9]{3})')
from hz_parties;

create index idx_party_code on party_code (code, party_id);

In any of these cases you have the code pre-extracted and the join should be fast.

In order to find duplicates just group by code. E.g.:

select code, listagg(party_id, ', ') within group (order by party_id)
from party_code
group by code
having count(*) > 1;

Re-write your query to use explicit joins anyway to get it readable, fix the erroneous outer joins and spot possible other mistakes.

Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73