I'm facing a problem where I have to select rows from a table ('CHILD') that should have a foreign key towards another table ('PARENT'). The problem is that the foreign key is broken (long story short, the table is partitioned and for some reason there are orphans) and I must clean the child table before restoring the foreign key. What I am trying to do is (roughly) :
SELECT child.ID
from CHILD child
WHERE child.PARENT_ID NOT IN
(
SELECT parent.ID FROM PARENT parent
);
This seems correct (judging by the results), however it is quite inefficient: there are 1M results, and the child table holds 100M+ rows.
Since I have to delete each row coming from the results of that query I am using pagination, but afaict this means that the NOT IN query is repeated each time.
For this reason I am wondering if there is any way to improve the performance of the query. I have tried joining the tables but I have realized it won't work because I should join on child.PARENT_ID = parent.ID
, so there would be no result.
So the question is : is there any way to rewrite the NOT IN query so that it is more performant?