1

I am trying to remove duplicate subtree returned by "connect by prior" clause. I wish to check top level node of tree hierarchy, where user can enter child IDs which are already part of subtree. Take a look at following example:

SELECT * FROM (
With test_hierarchy as(
       SELECT 'a' parent, 'b' child FROM dual UNION ALL
       SELECT 'b','c' FROM dual UNION ALL
       SELECT 'd','e' FROM dual UNION ALL
       SELECT 'e','f' FROM dual UNION ALL
       SELECT 'f','g' FROM dual UNION ALL
       SELECT 'f','h' FROM dual)
SELECT
    parent,
    child,
    CONNECT_BY_ROOT child AS init_child,
    LEVEL,
    CONNECT_BY_ISLEAF,
    MAX(LEVEL) OVER(
        PARTITION BY parent
    ) AS max_level
FROM
    test_hierarchy
WHERE
    CONNECT_BY_ISLEAF = 1
START WITH
    child IN (
        'c', 'b', 'e', 'f', 'h', 'g'
    )
CONNECT BY
    PRIOR parent = child);

This query returns result as:

P C I      LEVEL CONNECT_BY_ISLEAF  MAX_LEVEL
- - - ---------- ----------------- ----------
a b b          1                 1          2
a b c          2                 1          2
d e g          3                 1          3
d e f          2                 1          3
d e h          3                 1          3
d e e          1                 1          3

I wish to return only those top level nodes with level = max_level. I.e. my query should return result as:

P C I      LEVEL CONNECT_BY_ISLEAF  MAX_LEVEL
- - - ---------- ----------------- ----------
a b c          2                 1          2
d e g          3                 1          3
d e h          3                 1          3

If I try to filter out results using WHERE clause as "WHERE level = max_level", Oracle complains:

ORA-01788: CONNECT BY clause required in this query block
01788. 00000 -  "CONNECT BY clause required in this query block"

Please let me know if you have an idea on how to do it. Thanks,

user613114
  • 2,731
  • 11
  • 47
  • 73

2 Answers2

1

Wrap your recursive query in another CTE and filter on that:

WITH 
  test_hierarchy AS (
    SELECT 'a' parent, 'b' child FROM dual UNION ALL
    SELECT 'b','c' FROM dual UNION ALL
    SELECT 'd','e' FROM dual UNION ALL
    SELECT 'e','f' FROM dual UNION ALL
    SELECT 'f','g' FROM dual UNION ALL
    SELECT 'f','h' FROM dual
  ),
  recursion AS (
    SELECT
      parent,
      child,
      CONNECT_BY_ROOT child AS init_child,
      LEVEL AS lvl,
      CONNECT_BY_ISLEAF AS isleaf,
      MAX(LEVEL) OVER(
        PARTITION BY parent
      ) AS max_level
    FROM
      test_hierarchy
    START WITH child IN ('c', 'b', 'e', 'f', 'h', 'g')
    CONNECT BY PRIOR parent = child
  )
SELECT *
FROM recursion
WHERE isleaf = 1 AND lvl = max_level
Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509
1

Your logik works, but is sort of brute force approach, i.e. examine all possibilities and select the valid only.

One alternative is simple to limit your START WITH to consider the leave nodes only.

This can be done by excluding all nodes that are parent nodes:

START WITH
    child IN ( 'c', 'b', 'e', 'f', 'h', 'g') and
    child not in (select parent from test_hierarchy)

The final query have better performance because of limited START list and you need no WINDOWS SORT to get max level:

With test_hierarchy as(
       SELECT 'a' parent, 'b' child FROM dual UNION ALL
       SELECT 'b','c' FROM dual UNION ALL
       SELECT 'd','e' FROM dual UNION ALL
       SELECT 'e','f' FROM dual UNION ALL
       SELECT 'f','g' FROM dual UNION ALL
       SELECT 'f','h' FROM dual)
SELECT
    parent,
    child,
    CONNECT_BY_ROOT child AS init_child,
    LEVEL,
    CONNECT_BY_ISLEAF
FROM
    test_hierarchy
WHERE
    CONNECT_BY_ISLEAF = 1 
START WITH
    child IN ( 'c', 'b', 'e', 'f', 'h', 'g') and
    child not in (select parent from test_hierarchy)
CONNECT BY
    PRIOR parent = child;

P C I      LEVEL CONNECT_BY_ISLEAF
- - - ---------- -----------------
a b c          2                 1
d e g          3                 1
d e h          3                 1
Marmite Bomber
  • 19,886
  • 4
  • 26
  • 53
  • Marmite, Kudos to your performance efficient query. It has slightly better performance compared to my analytical function approach. So I will use your query in my solution. Cheers! I will still keep @Lukas Eder's query as the accepted answer as he answered exactly what was asked in the question :) – user613114 Feb 01 '19 at 08:46
  • @user613114 thanks, I've no problem with answering something a bit different than what you have asked;) – Marmite Bomber Feb 01 '19 at 09:47