The most common cause for explosions is a badly formed START WITH / CONNECT BY clause. Here's an extract from an article I wrote on it. It assumes we're starting with the standard 14 rows in SCOTT.EMP
Filtering versus Navigation
The definition of how to navigate the hierarchy is quite distinct from the filtering that comes with WHERE clause predicates in the SQL. To demonstrate this, the 14 rows in the EMP table have been duplicated to be a set of rows for multiple geographic regions as shown in Listing 7. There are now 42 rows in the table, the standard set of 14 rows replicated twice for EAST, NORTH and SOUTH regions.
Listing 7
SQL> alter table emp add region varchar2(10);
Table altered.
SQL> update emp set region = 'NORTH';
14 rows updated.
SQL>
SQL> insert into emp
2 select
3 empno, ename, job, mgr, hiredate, sal, comm, deptno,
4 decode(r,1,'SOUTH',2,'EAST') region
5 from emp,
6 ( select rownum r from dual connect by level <= 2 )
7 where region = 'NORTH';
28 rows created.
SQL>
SQL> select region, count(*)
2 from emp
3 group by region;
REGION COUNT(*)
---------- ----------
NORTH 14
SOUTH 14
EAST 14
Let's assume a requirement of obtaining the hierarchy for the 14 rows for just the NORTH region. It would seem intuitive to use the identical query as seen in Listing 2 but with an additional predicate to restrict the REGION to NORTH. Listing 8 shows the result.
Listing 8
SQL> select empno, rpad(' ',level*3)||ename ename, mgr
2 from emp
3 where region = 'NORTH'
4 connect by prior empno = mgr
5 start with mgr is null;
EMPNO ENAME MGR
---------- ---------------------------------------- ----------
7839 KING
7566 JONES 7839
7788 SCOTT 7566
7876 ADAMS 7788
...
...
7782 CLARK 7839
7934 MILLER 7782
7934 MILLER 7782
7934 MILLER 7782
136 rows selected.
Notice that the output returned is not 14 rows but 136 rows, even though a predicate to restrict the results to just the NORTH region was applied.
To understand the cause of this problem, consider the very first NORTH region row that is being processed. The START WITH clause is MANAGER IS NULL, so three King rows, one for each region, will belocated, and then the CONNECT BY operation will commence. The CONNECT BY clause is PRIOR EMPNO = MANAGER, so for each KING record , three related employees, one for each region, will be connected to, which rapidly increases the number of rows returned as th query traverses throughout the hierarchy. Only once the hierarchy relationship is established is the WHERE clause predicate applied, so even though the resultset only contains rows for REGION= NORTH, a large amount of redundant work has performed doing erroneous connections to the other regions because CONNECT BY syntax and START WITH syntax did not
• restrict themselves to starting with only the NORTH region, and
• did not make sure that when connecting to the next entry in the hierarchy,
that the row maps from the same region to the subsequent row's region.
Listing 9 shows the corrected syntax for this query. It starts with managers that are null but also only in the NORTH region. It then ensures that when connecting from one manager to a subsequent employee, the definition of connecting also includes a common region. This is synonymous to the way the ANSI join syntax is processed, namely, there is a distinct difference between the hierarchical connecting condition and the WHERE clause predicates that will be applied subsequently as filters.
This gives rise to a common correct criticism of hierarchy queries, that they are running too slowly or consume too many resources. Commonly the cause is that the CONNECT BY information is incomplete, yielding far many more rows being processed than need to be.
Listing 9
SQL> select empno, rpad(' ',level*3)||ename ename, mgr
2 from emp
3 connect by prior empno = mgr
4 and region = 'NORTH'
5 start with mgr is null
6 and region = 'NORTH';
EMPNO ENAME MGR
---------- ---------------------------------------- ----------
7839 KING
7566 JONES 7839
7788 SCOTT 7566
7876 ADAMS 7788
7902 FORD 7566
7369 SMITH 7902
7698 BLAKE 7839
7499 ALLEN 7698
7521 WARD 7698
7654 MARTIN 7698
7844 TURNER 7698
7900 JAMES 7698
7782 CLARK 7839
7934 MILLER 7782
14 rows selected.
You might ask "What then is the use of a WHERE clause in a hierarchical query?" The presence of a WHERE clause is still valid, not to define the hierarchy, but to define filtering of the data. For example, once the NORTH region hierarchy has been instantiated, the requirement might be to only those included employees above a certain SALARY.