12

Does anyone know why Oracle continues to follow a path beyond a cyclical loop when the cycle occurs at the top node (root node connected right back to root node)? More importantly, how to prevent it?

I have Oracle 11g Release 2 (11.2) and I have been exploring hierarchical queries. I will build my question around the tree structure in figure 9-1 of the Oracle Database SQL Language Reference page 9-4

I created a table structe for this tree using the concept of vendors and cusomers:

    create table t
     ( vendor       varchar2(3)
    , customer   varchar2(3)
    );
    insert into t values ( '1'  , '2'  ); 
    insert into t values ( '2'  , '3'  ); 
    insert into t values ( '2'  , '4'  ); 
    insert into t values ( '4'  , '5'  ); 
    insert into t values ( '4'  , '6'  ); 
    insert into t values ( '1'  , '7'  ); 
    insert into t values ( '7'  , '8'  ); 
    insert into t values ( '1'  , '9'  ); 
    insert into t values ( '9'  , '10' ); 
    insert into t values ( '10' , '11' ); 
    insert into t values ( '9'  , '12' ); 
    commit;

The following select query traverses the tree with no problems:

    select vendor, 
           customer, 
           level, 
           connect_by_isleaf as isleaf, 
           connect_by_iscycle as iscycle, 
           connect_by_root vendor||sys_connect_by_path(customer,' ~ ') as path 
    from t
    connect by nocycle
          vendor=prior customer
    start with vendor='1';

Giving the results:

Vendor Cust     Level   Isleaf Iscycle   Path
1        2        1        0        0   1 ~ 2
2        3        2        1        0   1 ~ 2 ~ 3
2        4        2        0        0   1 ~ 2 ~ 4
4        5        3        1        0   1 ~ 2 ~ 4 ~ 5
4        6        3        1        0   1 ~ 2 ~ 4 ~ 6
1        7        1        0        0   1 ~ 7
7        8        2        1        0   1 ~ 7 ~ 8
1        9        1        0        0   1 ~ 9
9        10       2        0        0   1 ~ 9 ~ 10
10       11       3        1        0   1 ~ 9 ~ 10 ~ 11
9        12       2        1        0   1 ~ 9 ~ 12

I then complicated things by adding cycles to the structure. First a record for a vendor who sells to themselves…

    --self cycle
    insert into t values ( '4'  , '4'  ); 

and one for a vendor whos customer is the vendor of their vendor…

    --ancestor cycle
    insert into t values ( '6'  , '2'  ); 

Reexecuting the select query above results in the same output as above except Iscycle is 1 for row 3 and row 5 (Paths 1 ~ 2 ~ 4 and 1 ~ 2 ~ 4 ~ 6). Note that the CONNECT BY nomenclature flags the parent record of a cycle not the child record actually completing the cycle. (So I know 4 and 6 both cycle back to an ancestor but I don’t know WHICH ancestor.)

Adding two more records creates a larger cycle across the branches of the original tree:

 --cycle crossing branches of tree
  insert into t values ( '6'  , '9'  ); 
  insert into t values ( '11' , '2'  );  

Reexecuting the select query again gives the following output:

Vendor Customer Level   Isleaf   Iscycle       Path
1        2        1        0        0    1 ~ 2
2        3        2        1        0    1 ~ 2 ~ 3
2        4        2        0        1    1 ~ 2 ~ 4
4        5        3        1        0    1 ~ 2 ~ 4 ~ 5
4        6        3        0        1    1 ~ 2 ~ 4 ~ 6
6        9        4        0        0    1 ~ 2 ~ 4 ~ 6 ~ 9
9       10        5        0        0    1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10
10      11        6        1        1    1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10 ~ 11
9       12        5        1        0    1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 12
1        7        1        0        0    1 ~ 7
7        8        2        1        0    1 ~ 7 ~ 8
1        9        1        0        0    1 ~ 9
9       10        2        0        0    1 ~ 9 ~ 10
10      11        3        0        0    1 ~ 9 ~ 10 ~ 11
11       2        4        0        0    1 ~ 9 ~ 10 ~ 11 ~ 2
2        3        5        1        0    1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 3
2        4        5        0        1    1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4
4        5        6        1        0    1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 5
4        6        6        1        1    1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 6
9       12        2        1        0    1 ~ 9 ~ 12

The output continues to be as expected. All cycles are flaged and the mapping stops when a cycle is encountered.

Now the problem child… Let’s add a self cycle to the root node which is exactly the same as the first cycle created above with node 4; just for node 1.

    insert into t values ( '1'  , '1'  );

This time Oracle detects the cycle at node 1, as expected (first row is flagged with Iscycle set to 1); HOWEVER, it continues past this cycle and builds out the entire tree structure twice. Rows 2 through 21 are a duplication of rows 22 through 41 with the cycle of node 1 prepended onto the front of the path.

Vendor Customer  Level Isleaf Iscycle    Path
1        1        1        0    1      1 ~ 1
1        2        2        0    0      1 ~ 1 ~ 2
2        3        3        1    0      1 ~ 1 ~ 2 ~ 3
2        4        3        0    1      1 ~ 1 ~ 2 ~ 4
4        5        4        1    0      1 ~ 1 ~ 2 ~ 4 ~ 5
4        6        4        0    1      1 ~ 1 ~ 2 ~ 4 ~ 6
6        9        5        0    0      1 ~ 1 ~ 2 ~ 4 ~ 6 ~ 9
9       10        6        0    0      1 ~ 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10
10      11        7        1    1      1 ~ 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10 ~ 11
9       12        6        1    0      1 ~ 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 12
1        7        2        0    0      1 ~ 1 ~ 7
7        8        3        1    0      1 ~ 1 ~ 7 ~ 8
1        9        2        0    0      1 ~ 1 ~ 9
9       10        3        0    0      1 ~ 1 ~ 9 ~ 10
10      11        4        0    0      1 ~ 1 ~ 9 ~ 10 ~ 11
11       2        5        0    0      1 ~ 1 ~ 9 ~ 10 ~ 11 ~ 2
2        3        6        1    0      1 ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 3
2        4        6        0    1      1 ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4
4        5        7        1    0      1 ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 5
4        6        7        1    1      1 ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 6
9       12        3        1    0      1 ~ 1 ~ 9 ~ 12
1        2        1        0    0      1 ~ 2
2        3        2        1    0      1 ~ 2 ~ 3
2        4        2        0    1      1 ~ 2 ~ 4
4        5        3        1    0      1 ~ 2 ~ 4 ~ 5
4        6        3        0    1      1 ~ 2 ~ 4 ~ 6
6        9        4        0    0      1 ~ 2 ~ 4 ~ 6 ~ 9
9       10        5        0    0      1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10
10      11        6        1    1      1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10 ~ 11
9       12        5        1    0      1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 12
1        7        1        0    0      1 ~ 7
7        8        2        1    0      1 ~ 7 ~ 8
1        9        1        0    0      1 ~ 9
9       10        2        0    0      1 ~ 9 ~ 10
10      11        3        0    0      1 ~ 9 ~ 10 ~ 11
11       2        4        0    0      1 ~ 9 ~ 10 ~ 11 ~ 2
2        3        5        1    0      1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 3
2        4        5        0    1      1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4
4        5        6        1    0      1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 5
4        6        6        1    1      1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 6
9       12        2        1    0      1 ~ 9 ~ 12

Why isn’t the 1-1 cycle treated the same as the 4-4 cycle? What am I missing?

To mitigate against this I added an additional condition on the CONNECT BY clause requiring that the customer not be ‘1’.

    select vendor, 
           customer, 
           level, 
           connect_by_isleaf as isleaf, 
           connect_by_iscycle as iscycle, 
           connect_by_root vendor||sys_connect_by_path(customer,' ~ ') as path 
    from t
    connect by nocycle
          vendor=prior customer
          and customer<>'1' 
    start with vendor='1';

Ironically, all this did was REMOVE the cycle flag from row one.

Any help would be appreciated.

Larry May
  • 121
  • 1
  • 1
  • 4
  • Maybe its because 1 is level 1 and 4 isn't. Also why not just add [`WHERE VENDOR <> CUSTOMER`](http://sqlfiddle.com/#!4/34b20/6) – Conrad Frix Sep 20 '13 at 21:39
  • I want to be notified when their is a cycle so I don't want to use where vendor <> customer. Some of my actual tree structures involve 10 to 20 levels and thousands of nodes. having level one replicated causes extensive duplication. According to all documentation I can find it shouldn't matter that the cycle occurs at level one. ...but it does... :-( – Larry May Sep 21 '13 at 13:06

4 Answers4

1

Oracle selects the root row(s) of the hierarchy (those rows that satisfy the START WITH condition.) Oracle selects the child rows of each root row. Each child row must satisfy the condition of the CONNECT BY condition with respect to one of the root rows.

To find the children of a parent row, Oracle evaluates the PRIOR expression of the CONNECT BY condition for the parent row and the other expression for each row in the table. Rows for which the condition is true are the children of the parent. The CONNECT BY condition can contain other conditions to further filter the rows selected by the query.

A root row is the highest row within an inverted tree. 

If you try with same parent as child (22 or 33 or 44) it will work since they are not root rows and just parents Since 1 is the root and also a child with 1, the LEVEL is set to be cycle due to CONNECT_BY_ROOT clause

The duplication in output occurs since the connect by works on root which is duplicated as well.

Oracle is not able to restrict the uniqueness since Oracle can't give preference to one of the other

Either make your data set unique or code them such that oracle can work on preference in hierarchy

FOLLOW UP: SOLUTION FOR OP's problem

SELECT
      VENDOR,
      CUSTOMER,
      LEVEL,
      CONNECT_BY_ISLEAF AS ISLEAF,
      CONNECT_BY_ISCYCLE AS ISCYCLE,
      CONNECT_BY_ROOT VENDOR
      || SYS_CONNECT_BY_PATH ( CUSTOMER,
                          ' ~ ' )
          AS PATH
FROM
      (SELECT
            VENDOR,
            CUSTOMER
       FROM
            T
       WHERE
            CUSTOMER <> '1')
CONNECT BY
      NOCYCLE VENDOR = PRIOR CUSTOMER
START WITH
      VENDOR = '1';

Results:

VENDOR CUSTOMER      LEVEL     ISLEAF    ISCYCLE PATH                                                                            
------ -------- ---------- ---------- ------------------------------------------------------------------------------------------
1      2                 1          0          0 1 ~ 2                                                                           
2      3                 2          1          0 1 ~ 2 ~ 3                                                                       
2      4                 2          0          1 1 ~ 2 ~ 4                                                                       
4      5                 3          1          0 1 ~ 2 ~ 4 ~ 5                                                                   
4      6                 3          0          1 1 ~ 2 ~ 4 ~ 6                                                                   
6      9                 4          0          0 1 ~ 2 ~ 4 ~ 6 ~ 9                                                               
9      10                5          0          0 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10                                                          
10     11                6          1          1 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10 ~ 11                                                     
9      12                5          1          0 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 12                                                          
1      7                 1          0          0 1 ~ 7                                                                           
7      8                 2          1          0 1 ~ 7 ~ 8                                                                       
1      9                 1          0          0 1 ~ 9                                                                           
9      10                2          0          0 1 ~ 9 ~ 10                                                                      
10     11                3          0          0 1 ~ 9 ~ 10 ~ 11                                                                 
11     2                 4          0          0 1 ~ 9 ~ 10 ~ 11 ~ 2                                                             
2      3                 5          1          0 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 3                                                         
2      4                 5          0          1 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4                                                         
4      5                 6          1          0 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 5                                                     
4      6                 6          1          1 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 6                                                     
9      12                2          1          0 1 ~ 9 ~ 12                                                                      

20 rows selected
Srini V
  • 11,045
  • 14
  • 66
  • 89
  • @reaispirituais Thanks for the response. I'm not quite sure I follow what you mean when you say `Since 1 is the root and also a child with 1, the LEVEL is set to be cycle due to CONNECT_BY_ROOT clause` Are you saying tht if I take out the CONNECT_BY_ROOT function my issue would go away? Your proposed code fails to notify my of the cycle that exists at node 4 back to node 4. :-( Also: I subsequently found out that ANY reference back to the root node fails to trigger the cycle flag on the first pass. If I eliminate any childs equal to root I have issues of destroying legitamate linkages. – Larry May Sep 27 '13 at 21:22
  • (continue) when more than one root exists; for example `START WITH VENDOR IN (1, 2)` {along with a where clause inthe subquery you propose of `WHERE CUSTOMER NOT IN (1, 2)` } would destroy an entire branch of the tree structure for Vendor 1. – Larry May Sep 27 '13 at 21:23
  • Yes, you are correct. my point was "Oracle is not able to restrict the uniqueness since Oracle can't give preference to one of the other" due to self cycle root node. What I gave is a demo for reasoning your problem and not a solution. Still need to work on that – Srini V Sep 29 '13 at 22:33
  • @LarryMay Updated the SQL for your problem – Srini V Sep 30 '13 at 14:19
1

I would agree with @realspirituals's initial part of explanation about how Oracle deals with hierarchical data. In my vision the first step taken is to find root elements of the trees specified by START WITH clause. This might be rephrased to following query:

select * from t where vendor = '1';
VENDOR  CUSTOMER
------------------
1   2
1   7
1   9
1   1

So actually we have 4 root nodes and 4 separate trees. Next steps are to iteratively evaluate CONNECT BY clause. Imagine that we take above list of CUSTOMER values and seeking for their descendants:

select * from t where vendor in ('2', '7', '9', '1');
VENDOR  CUSTOMER
------------------
1   2
2   3
2   4
1   7
7   8
1   9
9   10
9   12
1   1 --This one is loop and is not taken to final resultset

As soon as we specified NOCYCLE, detected loops are are thrown away and previous row that led us to the loop record is marked as CONNECT_BY_ISCYCLE = 1.

The third step:

select * from t where vendor in ('2', '3', '4', '7', '8', '9', '10', '12');
VENDOR  CUSTOMER
------------------
2   3
2   4
4   5
4   6
7   8
9   10
10  11
9   12
4   4 --This one is loop

So it goes until there is at least one record in output. It takes some time and patience but results returned by your query are fully reproduceable and seems absolutely legitimate to me. That is the way Oracle's algorythm works so everyone just have to keep it in mind when writing queries.

How can we avoid cycle on the top level node? I would suggest to add virtual record that will make our top level node not the top one. Consider this:

insert into t values(null, '1');

select vendor, 
       customer, 
       level, 
       connect_by_isleaf as isleaf, 
       connect_by_iscycle as iscycle, 
       connect_by_root vendor||sys_connect_by_path(customer,' ~ ') as path 
from t
connect by nocycle
      vendor=prior customer
start with vendor is null; --Note the changed condition

Vendor Customer Level   Isleaf  Iscycle  Path
------------------------------------------------------------
        1       1       0       1        ~ 1
1       2       2       0       0        ~ 1 ~ 2
2       3       3       1       0        ~ 1 ~ 2 ~ 3
2       4       3       0       1        ~ 1 ~ 2 ~ 4
4       5       4       1       0        ~ 1 ~ 2 ~ 4 ~ 5
4       6       4       0       1        ~ 1 ~ 2 ~ 4 ~ 6
6       9       5       0       0        ~ 1 ~ 2 ~ 4 ~ 6 ~ 9
9       10      6       0       0        ~ 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10
10      11      7       1       1        ~ 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 10 ~ 11
9       12      6       1       0        ~ 1 ~ 2 ~ 4 ~ 6 ~ 9 ~ 12
1       7       2       0       0        ~ 1 ~ 7
7       8       3       1       0        ~ 1 ~ 7 ~ 8
1       9       2       0       0        ~ 1 ~ 9
9       10      3       0       0        ~ 1 ~ 9 ~ 10
10      11      4       0       0        ~ 1 ~ 9 ~ 10 ~ 11
11      2       5       0       0        ~ 1 ~ 9 ~ 10 ~ 11 ~ 2
2       3       6       1       0        ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 3
2       4       6       0       1        ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4
4       5       7       1       0        ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 5
4       6       7       1       1        ~ 1 ~ 9 ~ 10 ~ 11 ~ 2 ~ 4 ~ 6
9       12      3       1       0        ~ 1 ~ 9 ~ 12

Of course it mignt be not appropriate to add new records to production database. Rather combine query to real table with some query that dynamically determines top level nodes. Something like that (giving the same output as above):

delete from t where vendor is null; --Removing previosly inserted record

select vendor, 
       customer, 
       level, 
       connect_by_isleaf as isleaf, 
       connect_by_iscycle as iscycle, 
       connect_by_root vendor||sys_connect_by_path(customer,' ~ ') as path 
from (select vendor, customer from t
      union all
      select distinct null, vendor from t
      where vendor = 1) --Here is your START WITH condition
connect by nocycle
      vendor=prior customer
start with vendor is null;
Yaroslav Shabalin
  • 1,645
  • 3
  • 17
  • 29
-1

Starting from a node and connecting one node to another is not the same thing. ISCYCLE looks for customer ~ vendor connections and only connects it once per path it takes. If you tell oracle to

START WITH vendor = '1'

it actually starts at 4 points simultaneousely:

1 ~ 1
1 ~ 2
1 ~ 7
1 ~ 9

Those path-searches are executed in parallel, and every path tries not to cycle with its own path. Each path does not know anything about the others. So the path starting with 1 ~ 1 does not know why it should stop continuing to 2, 7 and 9, because it hasn't been there before. The NOCYCLE just forbids to look into 1 once more. So you can either

START WITH (vendor='1' AND customer !='1')

to avoid too many starting points and/or ignore any connections where vendor and customer are the same:

CONNECT BY NOCYCLE ( vendor = PRIOR customer AND vendor != customer )
Swen Vermeul
  • 104
  • 6
  • Customer !=1 will remove the cycle flag on 1, nothing more. CONNECT BY NOCYCLE ( vendor = PRIOR customer AND vendor != customer ) will also produce no desired result. – Srini V Sep 30 '13 at 12:56
  • @Swen Vermeul This helped provide insight by highlighting that the CONNECT BY clause and the START WITH clause operate independently. CONNECT BY limits the records for connection INDEPENTANTLY of the initilal records being selected by the START WITH clause. The 'customer !='1'` in the START WITH clause prevents the entire tree duplication; however, 1 ~ 1 is not flagged as a cycle (since it does not exists) and I'd like to know about the cycle. the `vendor != customer` in the CONNECT BY clause causes the 4 ~ 4 cycle to not be flagged as well. – Larry May Sep 30 '13 at 13:47
  • I like vendor != customer in the START WITH clause and customer!= 1 in the CONNECT BY clause -- would actually prefer customer != CONNECT_BY_ROOT vendor but that is not supported -- which prevents all cycles that go through the root BUT it also fails to flag these as cycles. I want to know that these cycles exist. – Larry May Sep 30 '13 at 13:56
-1

The nocycle actually allows your query to have cycles, without that keyword Oracle would stop as soon as cycle is detected (ORA-01436: CONNECT BY loop in user data). It also allows you to use "CONNECT_BY_ISCYCLE" to detect places where chidren are making the cycle, but filtering the query on that result would remove valid rows. So, maybe, you could use your connect by nocycle vendor=prior customer AND connect_by_iscycle = 0 in the loop condition to avoid all loops after parents of cycles are detected? (I do not have things to test it). This would stop recursivity on the first 1~1 path.

regilero
  • 29,806
  • 6
  • 60
  • 99
  • CONNECT_BY_ISCYCLE=0 will not produce the desired result, rather partial ones. – Srini V Sep 30 '13 at 12:54
  • @realspirituals. this does not deserve a -1. I think this would remove the full tree duplication. – regilero Sep 30 '13 at 12:59
  • Sorry for being rude on that.. I was looking for a credible answer than an untested snippet. Hence a -1 on this. When you test that, you can see the results. (I do read your comment that it is not tested) – Srini V Sep 30 '13 at 13:11