3

EDIT: I provided additinal information by introducing location entity to make it clear why I tried to use subqueries

In oracle 11g database I have hierarchical structured table of elements which will eventually contain few millions of rows. Each row has indexed foreign key that points to its parent and no cycles are allowed. Element also has name and type. Besides that, there is another entity - location, which is similar to element (hierarchical, has foreign key that points to parent + name). Top element (root of you will) can be in location (they are connected by LocationId). So there are 2 entities:

Location:

  • Id [NUMBER(9,0), PK]
  • ParentId [NUMBER(9,0), FK]
  • Name [VARCHAR2(200)]

Element:

  • Id [NUMBER(9,0), PK]
  • LocationId [NUMBER(9,0), FK]
  • ParentId [NUMBER(9,0), FK]
  • TypeId [NUMBER(9,0), FK]
  • Name [VARCHAR2(200)]

Now lets say tables contain following data for example:

Location:

Id   | ParentId | Name
----------------------------------
100  |   null   | TopLocation
101  |   100    | Level1Location
102  |   101    | Level2Location    

Element:

Id | LocationId | ParentId | TypeId | Name
----------------------------------------------------
1  |    102     |   null   |    10  | TopParent
2  |   null     |     1    |    11  | Level1Child
3  |   null     |     2    |    11  | Level2Child

What I have to do is write query for elements, which besides basic 4 element columns also returns full paths of parent ids, names and type ids + full path of top element location ids and names. So if I fetched element with Id 3 (this condition can also be complex by multiple columns which are not specified here) query would have to return this:

Id | ParentId | TypeId | Name        | IdsPath | TypeIdsPath | NamesPath                           | LocIdsPath   | LocNamesPath
---------------------------------------------------------------------------------------------------------------------------------------------------------------
3  |    2     |    11  | Level2Child | /3/2/1  |   /11/11/10 |  /Level2Child/Level1Child/TopParent | /102/101/100 | /Level2Location/Level1Location/TopLocation

First I wrote oracle hierarchical query which returns desired paths for location and element

Location

select
    SYS_CONNECT_BY_PATH(Id, '/') IdsPath,
    SYS_CONNECT_BY_PATH(Name, '/') NamesPath
from 
    loc
where
     connect_by_isleaf = 1
CONNECT BY PRIOR ParentId = e.Id
start with Id = 102

Element

select
    SYS_CONNECT_BY_PATH(Id, '/') IdsPath,
    SYS_CONNECT_BY_PATH(TypeId, '/') TypeIdsPath,
    SYS_CONNECT_BY_PATH(Name, '/') NamesPath
from 
    ele
where
     connect_by_isleaf = 1
CONNECT BY PRIOR ParentId = e.Id
start with Id = 3

The problems started when I wanted to use these queries as subqueries joined in basic select - one cannot replace start with condition with join condition because hierarchical queries than do full table scans:

select
    e.*,
    elePath.IdsPath,
    elePath.TypeIdsPath,
    elePath.NamesPath,
    locPath.IdsPath as LocIdsPath,
    locPath.NamesPath as LocNamesPath
from
    ele e
    left join (
        --full table scan!
        select
            CONNECT_BY_ROOT(Id) Id,
            Id as TopEleId,
            SYS_CONNECT_BY_PATH(Id, '/') IdsPath,
            SYS_CONNECT_BY_PATH(TypeId, '/') TypeIdsPath,
            SYS_CONNECT_BY_PATH(Name, '/') NamesPath
        from ele
        where
             connect_by_isleaf = 1
        CONNECT BY PRIOR ParentId = e.Id
    ) elePath on elePath.Id = e.Id
    left join (
        --full table scan!
        select
            CONNECT_BY_ROOT(Id) Id,
            SYS_CONNECT_BY_PATH(Id, '/') IdsPath,
            SYS_CONNECT_BY_PATH(Name, '/') NamesPath
        from loc
        where
             connect_by_isleaf = 1
        CONNECT BY PRIOR ParentId = e.Id
    ) locPath on locPath.Id = elePath.TopEleId
where
    e.Id = 3

I also cant do scalar subquery since query has to return multiple paths, not just one. Any suggestions? Am I even going in right direction or should I rather add a few fields to element table and cache all paths I need? (they will not be frequently updated)

Thanks!

Pavle Gartner
  • 659
  • 1
  • 7
  • 21

1 Answers1

3

You traverse hierarchical structure in reverse direction, simply use connect_by_root() operator to get column values of the root row.

clear screen;
column IdPath format a11;
column TypeIdPathformat a11
column NamePath format a35;

with t1(id1, parent_id, type_id, Name1) as(
  select 1, null, 10, 'TopParent' from dual union all
  select 2, 1   , 11, 'Level1Child' from dual union all
  select 3, 2   , 11, 'Level2Child' from dual
)
select connect_by_root(id1)                as id1
     , connect_by_root(parent_id)          as ParentId
     , connect_by_root(type_id)            as Typeid
     , connect_by_root(name1)              as name1
     , sys_connect_by_path(id1, '/')       as IdPath
     , sys_connect_by_path(type_id, '/')   as TypeIdPath
     , sys_connect_by_path(name1, '/')     as NamePath 
 from t1
where connect_by_isleaf = 1
start with id1 = 3
connect by id1 = prior parent_id

Result:

 id1 ParentId TypeId  Name1        IdPath  TypeIdPath NamePath                         
 ---------------------------------------------------------------------------       
 3        2      11   Level2Child /3/2/1  /11/11/10  /Level2Child/Level1Child/TopParent

Edit #1

One way to get the desired output is to use scalar sub-query:

with Locations(Id1, ParentId, Name1) as(
  select 100,  null, 'TopLocation' from dual union all
  select 101,  100 , 'Level1Location' from dual union all
  select 102,  101 , 'Level2Location' from dual
),
elements(id1, LocationId, parent_id, type_id, Name1) as(
  select 1, 102,  null, 10, 'TopParent' from dual union all
  select 2, null, 1   , 11, 'Level1Child' from dual union all
  select 3, null, 2   , 11, 'Level2Child' from dual
)
select e.*
     , (select sys_connect_by_path(l.id1, '/')
          from locations l
          where connect_by_isleaf = 1
          start with l.id1 = e.locationid
          connect by l.id1 = prior parentid)        as LocIdPath
     , (select sys_connect_by_path(l.name1, '/')
          from locations l
          where connect_by_isleaf = 1
          start with l.id1 = e.locationid
          connect by l.id1 = prior parentid)        as LocNamePath
  from ( select connect_by_root(id1)                as id1
              , connect_by_root(parent_id)          as ParentId
              , connect_by_root(type_id)            as Typeid
              , connect_by_root(name1)              as name1
              , sys_connect_by_path(id1, '/')       as IdPath
              , sys_connect_by_path(type_id, '/')   as TypeIdPath
              , sys_connect_by_path(name1, '/')     as NamePath  
              , locationid
          from elements
         where connect_by_isleaf = 1
         start with id1 = 3
       connect by id1 = prior parent_id ) e 

Result:

ID1   PARENTID     TYPEID NAME1       IDPATH      TYPEIDPATH  NAMEPATH                            LOCATIONID LOCIDPATH     LOCNAMEPATH                               
---------- ---------- ----------- ----------- ----------- ----------------------------------- ---------- ------------- -------------------------------------------
  3          2         11 Level2Child /3/2/1      /11/11/10   /Level2Child/Level1Child/TopParent         102 /102/101/100  /Level2Location/Level1Location/TopLocation  
Nick Krasnov
  • 26,886
  • 6
  • 61
  • 78
  • Thanks, for answer, I have thought of that but maybe this example is too simple - In actual case I have to get paths for current entity (element in this case) + few entities that are related to the element by foreign keys - e.g. every element has also foreign key LocationId and locations are also hierarchically structured - so I also need those paths and thats why I asked how to do it with subquery – Pavle Gartner Mar 02 '14 at 13:02
  • @PavleGartner Then you need to provide more information - sample of master\detail data and want exact output you are after. It can be done in sub-query if such a need arises, but more accurate information is needed. In your example you are left outer joining result-set returned by the in-line view to the table that in-line view is based. There no `LocationId` column as well as table that contains that column being mentioned in your question. – Nick Krasnov Mar 02 '14 at 13:17
  • Thanks for the effort - I added concept of location to original question to make it more clear. – Pavle Gartner Mar 02 '14 at 14:04
  • Thanks, that could work - do you maybe know how would corelated subqueries for location work performance wise - is optimiser smart enough not to walk the hierarchy twice in your query? – Pavle Gartner Mar 02 '14 at 15:14
  • @PavleGartner In fact, because of backward traversing of hierarchical structure, the `Locations` table will be, in this, situation, accessed four times(extra nested loops will be required), and `Elements` table will be accessed twice. But there should not be any major performance penalties if there are appropriate indexes created. Create an index on foreign key if there isn't one. – Nick Krasnov Mar 03 '14 at 05:45