3

Given the following PostgreSQL table:

items
  integer id
  integer parent_id
  string name

unique key on [parent_id, name]

parent_id is null for all root nodes

Currently I build the sql query manually, doing a join for every path element. But is seems quite ugly to me and of course it limits the possible depth.

Example:

path: holiday,images,spain
SELECT i3.* 
FROM items AS i1
   , items AS i2
   , items AS i3
WHERE i1.parent_id IS NULL AND i1.name = 'holiday'
  AND i2.parent_id=i1.id AND i2.name = 'images'
  AND i3.parent_id=i2.id AND i3.name = 'spain'

I wonder if there's a better way, probably using CTE?

You can see how my current code works and what the expected output is here:
http://sqlfiddle.com/#!1/4537c/2

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
gucki
  • 4,582
  • 7
  • 44
  • 56
  • The question is not very clear. What is "last node" ? What do you need to search if you already have the "full path" to it ? Your table definition is not very helpful either. Maybe add some sample data to the question? – wildplasser Aug 24 '13 at 12:52
  • Extra questions: 1) can there be any other _unmatched_ nodes above i1 ? Between i1 and i2, between i2 and i3 ? below i3 ? If so: do you want these to be retrieved, too? 2) is the _order_ of {holiday,images,spain} in the path important , or could these occur in any order ? BTW: unique index on `parent_id` is impossible (and wrong) – wildplasser Aug 24 '13 at 13:16
  • I just added an sqlfiddle. Last node is the last item of the path (a,b,c,d -> d). I only have the names, not any ids. The name of the node is not globally unique, that's why I have to check all parents. I'm doing this using joins right now, but the number of joins is equal to the number of path elements. Which is bad. – gucki Aug 24 '13 at 13:18
  • In my query, i1 is always a root node as the parent_id is null. The patch is complete, so no missing elements. I only need the last node (id would be enough). The order of the path elements is important. Think of it as a filesystem's directory structure. – gucki Aug 24 '13 at 13:20

3 Answers3

2

update2 here's a function, it peforms well, because search goes only within the path, starting from parent:

create or replace function get_item(path text[])
returns items
as
$$
    with recursive cte as (
        select i.id, i.name, i.parent_id, 1 as level
        from items as i
        where i.parent_id is null and i.name = $1[1]

        union all

        select i.id, i.name, i.parent_id, c.level + 1
        from items as i
            inner join cte as c on c.id = i.parent_id
        where i.name = $1[level + 1]
    )
    select c.id, c.parent_id, c.name
    from cte as c
    where c.level = array_length($1, 1)
$$
language sql;

sql fiddle demo

update I think you can do recursive traversal. I've written sql version of this, so it's a bit messy because of cte, but it's possible to write a function:

with recursive cte_path as (
    select array['holiday', 'spain', '2013'] as arr
), cte as (
    select i.id, i.name, i.parent_id, 1 as level
    from items as i
        cross join cte_path as p
    where i.parent_id is null and name = p.arr[1]

    union all

    select i.id, i.name, i.parent_id, c.level + 1
    from items as i
        inner join cte as c on c.id = i.parent_id
        cross join cte_path as p
    where i.name = p.arr[level + 1]
)
select c.*
from cte as c
    cross join cte_path as p
where level = array_length(p.arr, 1)

sql fiddle demo

or you can build path for all of the elements using recursive cte for that and accumuate your path into array or string:

with recursive cte as (
    select i.id, i.name, i.parent_id, i.name::text as path
    from items as i
    where i.parent_id is null

    union all

    select i.id, i.name, i.parent_id, c.path || '->' || i.name::text as path
    from items as i
        inner join cte as c on c.id = i.parent_id
)
select *
from cte
where path = 'holiday->spain->2013';

or

with recursive cte as (
    select i.id, i.name, i.parent_id, array[i.name::text] as path
    from items as i
    where i.parent_id is null

    union all

    select i.id, i.name, i.parent_id, c.path || array[i.name::text] as path
    from items as i
        inner join cte as c on c.id = i.parent_id
)
select *
from cte
where path = array['holiday', 'spain', '2013']

sql fiddle demo

Roman Pekar
  • 107,110
  • 28
  • 195
  • 197
1

This should perform very well, as it eliminates impossible paths immediately:

WITH RECURSIVE cte AS (
   SELECT id, parent_id, name
         ,'{holiday,spain,2013}'::text[] AS path  -- provide path as array here
         ,2 AS lvl                                -- next level
   FROM   items
   WHERE  parent_id IS NULL
   AND    name = 'holiday'                        -- being path[1]

   UNION ALL
   SELECT i.id, i.parent_id, i.name
         ,cte.path, cte.lvl + 1
   FROM   cte 
   JOIN   items i ON i.parent_id = cte.id AND i.name = path[lvl]
)
SELECT id, parent_id, name
FROM   cte
ORDER  BY lvl DESC
LIMIT  1;

Assuming you provide a unique path (only 1 result).

->SQLfiddle demo

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • `where path = array...` is redundant here – Roman Pekar Aug 24 '13 at 13:56
  • Nice work! BTW: I don't like the `ORDER BY ... LIMIT 1` ; I'd prefer `lvl >=3` (I think the OP wants the offspring of the path. not sure) – wildplasser Aug 24 '13 at 14:06
  • @wildplasser: It depends on the exact requirements. if we know there are always three levels, it would have to be `lvl > 3` or `lvl = 4` in my example, since each row carries the *next* level (one less computation). – Erwin Brandstetter Aug 24 '13 at 14:10
  • Ah! my bad (off-by-one is my middle name) The problem gets more complex if intermediate (unmatched) nodes were allowed, or the ordering were not imposed; see my comment on the OQ. – wildplasser Aug 24 '13 at 14:12
  • Great! Thank you very much for your effort! – gucki Aug 24 '13 at 16:45
0

Too late to post my answer (very equivalent to Roman's and Erwin's) But an improvement on the table definition instead:

CREATE TABLE items
        ( id integer NOT NULL PRIMARY KEY
        , parent_id integer REFERENCES items(id)
        , name varchar
        , UNIQUE (parent_id,name) -- I don't actually like this one
        );                        -- ; UNIQUE on a NULLable column ...

INSERT INTO items (id, parent_id, name) values
        (1, null, 'holiday')
        , (2, 1, 'spain'), (3, 2, '2013')
        , (4, 1, 'usa'), (5, 4, '2013')
        ;
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
wildplasser
  • 43,142
  • 8
  • 66
  • 109