We've got a system where we've got a collection of items (> 1 million), and several things processing it. Each processor should only process each item once, and the processors have a heirarchy.
Our current implementation is to have a 'processed' table keeping track of what each processor has already done:
CREATE TABLE items (id NUMBER PRIMARY KEY, ...)
CREATE TABLE itemsProcessed(
item NUMBER REFERENCES items(id),
processor NUMBER)
our query is this (itemsProcessed
has got relevant indexes on it) - we're using a NOT IN to filter out items that have already been processed by the current processor or it's ancestors:
SELECT ... FROM items i WHERE <additional queries on items>
AND id NOT IN (SELECT item FROM itemsProcessed WHERE processor IN (1, 2))
When the processed table gets very big, this query starts taking a long time (several seconds), as it has to do a lot of filtering before it starts returning the first item (the query plan is using a hash antijoin)
We need this query to return the first few items very quickly - ideally return the first items under 500ms. This means it can't iterate over items
and filter out those in itemsProcessed
. So we need some way of doing a negation index on the join of items
and itemsProcessed
(we've accomplished this on mongo, but oracle doens't seem to be able to do a similar thing)
Is this possible with Oracle?