6

So I have an SQL query to retrieve all the children of a given node in an adjacency list.

WITH    RECURSIVE
        q AS
        (
        SELECT  id, name
        FROM    categories h
        WHERE   id = 11846801
        UNION ALL
        SELECT  hc.id, hc.name
        FROM    q
        JOIN    categories hc
        ON      hc.parent = q.id
        )
SELECT  name
FROM    q

Is there a way to modify this query to return me just the bottom level of nodes? I can't just specify a given level as each path may have a different depth.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
OverlordQ
  • 172
  • 1
  • 1
  • 11

3 Answers3

3

Interpretation 1

"All leave nodes with the longest path from the start."

One way would be to count the levels on your way down and only return members of the bottom one:

WITH RECURSIVE q AS (
   SELECT  id, name, 0 AS lvl
   FROM    categories
   WHERE   id = 11846801

   UNION ALL
   SELECT  c.id, c.name, q.lvl + 1
   FROM    q
   JOIN    categories c ON c.parent = q.id
   )
SELECT  id, name
FROM    q
WHERE   lvl = (SELECT max(lvl) FROM q);

Interpretation 2

"All leave nodes."

WITH RECURSIVE q AS (
   SELECT  id, name, parent
   FROM    categories
   WHERE   id = 11846801

   UNION ALL
   SELECT  c.id, c.name, c.parent
   FROM    q
   JOIN    categories c ON c.parent = q.id
   )
SELECT  id, name
FROM    q
WHERE   NOT EXISTS (SELECT FROM q q1 WHERE q1.parent = q.id);

It should be faster to check on q than on the base table - unless q is very big, in which case indexes on the main table may be faster.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
2

The things at the bottom are never parents. So, you can add the following where clause:

where id not in (select parent from categories)

Actually, in Postgres, not in may not be the most efficient method. So, this might be more efficient:

where not exists (select 1 from categories c where c.parent = q.id)
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

Just to iterate Gordon Linoff's answer. His query will work is great idea, but only if there are no NULL values. Thus to counter this, an additional where clause is needed.

SELECT id WHERE id NOT IN (SELECT parent FROM categories WHERE parent IS NOT NULL);

I only found out about this today. (and not enough rep to comment)

To comply with the SQL standard, IN returns NULL not only if the expression on the left hand side is NULL, but also if no match is found in the list and one of the expressions in the list is NULL.

Reference http://dev.mysql.com/doc/refman/5.6/en/comparison-operators.html#function_in

Community
  • 1
  • 1
Joshua Graham
  • 268
  • 1
  • 5