0

I'm trying to extract a list of IDs from a hierarchical table (non-cyclic trees) where all of the entries in the tree are matching a condition for deletion.

The structure of the table is similar to this :

╔════╦═══════════╦═══════════╗
║ id ║ parent_id ║ condition ║
╚════╩═══════════╩═══════════╝

In this table some trees have all entries matching the condition, some have all entries not matching the condition, for those cases there is no problem.

But in some trees there are entries matching the condition and other are not.

I have a request returning all roots from trees where only some of the entries are matching :

select distinct(act.id) from acti_test act
where connect_by_isleaf = 1
start with id in (
select distinct(act1.id) from acti_test act1
    join acti_test act2 on act1.id = act2.PARENT_ID
    where act2.CONDITION = 0
    and act1.CONDITION = 1
    or act1.CONDITION = 0
    and act2.CONDITION = 1
)
connect by act.id = prior act.PARENT_ID;

This request is finding 62 elements in my data (on 915,102 entries) in about 8 seconds.

Then I want to use this result in a select to exclude all the element from the trees which the root have been find :

select * from acti_test act
where act.id not in (
            select act_not_to_delete.id from acti_test act_not_to_delete
            start with act_not_to_delete.id in (
                select distinct(act.id) from acti_test act
                where connect_by_isleaf = 1
                start with id in (
                select distinct(act1.id) from acti_test act1
                    join acti_test act2 on act1.id = act2.PARENT_ID
                    where act2.CONDITION = 0
                    and act1.CONDITION = 1
                    or act1.CONDITION = 0
                    and act2.CONDITION = 1
                )
                connect by act.id = prior act.PARENT_ID;
            )
            connect by prior act_not_to_delete.id = act_not_to_delete.PARENT_ID)
and act.CONDITION = 1;

But this request does not end.

Strangely the same request with the ids written in hard (as 'id1, id2, id3, ...') respond almost immediately :

select * from acti_test act
where act.id not in (
            select act_not_to_delete.id from acti_test act_not_to_delete
            start with act_not_to_delete.id in (
                1, 2, 3, ... , 62
            )
            connect by prior act_not_to_delete.id = act_not_to_delete.PARENT_ID)
and act.CONDITION = 1;

Is there any way to perform this select with only one request ?

Pilou
  • 1,398
  • 13
  • 24
  • The first part of your question was going well, but then you jumped straight to code too soon. I still don't understand what you are trying to do. I suppose by "trees" (as in "some trees in this table") you mean subtrees, starting at a node (ANY node) and including all the descendants? Then, the "condition" column is actually simply a "value" column, and the "condition" is a boolean (logical) condition on those values? If we can agree on the exact requirement, then we can think about the best algorithm to achieve what you need (before writing any code). –  Jul 18 '17 at 11:09
  • Yes by "tree" I mean a set of entries hierarchicaly linked (recursively) with a root having no parent. And the actual condition is way more complex but it can be simplified as a simple check on a column. – Pilou Jul 18 '17 at 11:29
  • Oh - so you are talking about "maximal" subtrees. So - your table has several roots, not one? Then it seems it would be best to find the roots of the subtrees that must be deleted, not the ones that must be kept. You should be able to use the same hierarchical query twice - both to find those roots and to "mark" the rows that must be deleted. –  Jul 18 '17 at 11:34

0 Answers0