I'm working on a DWH supplying script, that deal with ~40M lines on an Oracle Exadata server.
I've got the following tables :
CREATE TABLE P.ARCHIVED_ID
(
ID_ARCHIVED VARCHAR2(31 BYTE) NOT NULL, -- CONSTRAINT UNIQUE / PRIMARY KEY
IS_DELETE CHAR(1 BYTE) DEFAULT null
);
CREATE TABLE P.DWH_tableX
(
ID VARCHAR2(31 BYTE) NOT NULL, -- CONSTRAINT UNIQUE / PRIMARY KEY
FIELD1 VARCHAR2(31 BYTE),
FIELD2 VARCHAR2(31 BYTE),
FIELD3 CHAR(2 BYTE),
FIELD4 CHAR(1 BYTE)
);
As you can see, I've got a IS_DELETE
flag that can be set to Y
(and is NULL by default).
This is the standard query I need to use all around :
SELECT
ID,
FIELD1,
FIELD2,
FIELD3,
FIELD4
FROM P.DWH_tableX A
WHERE EXISTS (
SELECT 1
FROM P.ARCHIVED_ID B
WHERE
A.ID = B.ID_ARCHIVED
AND IS_DELETE = 'Y'
);
Question, is there a better way to optimize this than the following index ?
CREATE BITMAP INDEX P.I_IS_DELETE
ON P.ARCHIVED_ID(IS_DELETE)
LOGGING
TABLESPACE TBS_P_01
NOPARALLEL;
A few more points :
- I'll never access
P.ARCHIVED_ID
withoutWHERE IS_DELETE = 'Y'
(or only once when I update this field at the beginning of the script) by default, Oracle don't registerFALSENULL
in index, so I useNULL
instead of"N"
, allowing me to make a partial="Y"
bitmap index, keeping it as small and useful as I can.- I use
EXISTS
, instead ofIN
, as Oracle recommend (ability to use index), - The part with
IS_DELETE = 'Y'
is for now 0%, but as it's a DWH, it's expected to only increase (as new lines are create everyday in the data source, and older are physically removed, setting them atIS_DELETE = 'Y'
in the DWH) - the
WHERE EXISTS
is used around 20 time in my script, as it's how we deal to froze the archived values (move them and back on a temporary table, withinsert append
hint)