0

I have a simple query:

SELECT DISTINCT 
dep.DEP_ID,
dep.DEP_NAME,
dep.PARENT_DEP_ID,
emp.SITE_LOCATION
from DEPARTMENT dep
INNER JOIN EMPLOYEE emp ON dep.DEP_ID = emp.DEP_ID
ORDER BY dep.DEP_ID;

query

This returns information about departments with some rows having the same DEP_ID, DEP_NAME and PARENT_DEP_NAME. This is the expected result because some employees belong to the same department but have a different SITE_LOCATION

What I'd like is to add another column MASTER_PARENT_ID with the DEP_ID of the MASTER parent.

What I call master parent is the one which is referenced as PARENT_DEP_ID but actually doesn't exist (missing value in the DEP_ID column).

So all these rows should actually have a MASTER_PARENT_ID value equals to DEP_2000. This is just a sample data but in reality many rows would have a different MASTER_PARENT_ID. Not all of them would have the same value.

To do that, for each row, I need to execute a recursive query to go all the way trough the tree until I find a PARENT_DEP_ID value that doesn't have any matching DEP_ID.

I'm trying to read and understand the Oracle documentation and examples but I can't find what's working in my case. Should I use something like CONNECT BY PRIOR to perform such recursive function?

SQL in general isn't my cup of tea and even less Oracle. I can't find how to solve this problem.

Thank you

Jérôme MEVEL
  • 7,031
  • 6
  • 46
  • 78
  • A parent field like that is usually a foreign key to the related department; so is there no dept 2000 at all, or just not in your result set? And could there be more than one 'master' parent in the same result set? – Alex Poole Nov 20 '18 at 11:43
  • @Alex Poole It's complicated, I have to write some code that will run in another company but this company is crazy paranoid about security rules. I don't have access to the full DB schema and I don't even know how the real data looks like. I just created some sample data based on the information I have. – Jérôme MEVEL Nov 20 '18 at 14:33
  • There there will be probably multiple master parents but obviously one (department + site location) tuple only has one master parent – Jérôme MEVEL Nov 20 '18 at 14:35

2 Answers2

1

You can use the CONNECT_BY_ISLEAF pseudo-column to find the leaves of the hierarchy tree and then use the CONNECT_BY_ROOT( ... ) function to get values when you started navigating the tree:

SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE DEPARTMENT( DEP_ID, DEP_NAME, PARENT_DEP_ID ) AS
SELECT 'DEP_2000', 'Dep0', NULL       FROM DUAL UNION ALL
SELECT 'DEP_2400', 'Dep1', 'DEP_2000' FROM DUAL UNION ALL
SELECT 'DEP_2410', 'Dep2', 'DEP_2400' FROM DUAL UNION ALL
SELECT 'DEP_2420', 'Dep3', 'DEP_2400' FROM DUAL;

Query 1:

SELECT CONNECT_BY_ROOT( DEP_ID )        AS DEP_ID,
       CONNECT_BY_ROOT( DEP_NAME )      AS DEP_NAME,
       CONNECT_BY_ROOT( PARENT_DEP_ID ) AS PARENT_DEP_ID,
       DEP_ID                           AS MASTER_PARENT_DEP_ID
FROM   DEPARTMENT
WHERE  CONNECT_BY_ISLEAF = 1
CONNECT BY PRIOR PARENT_DEP_ID = DEP_ID

Results:

|   DEP_ID | DEP_NAME | PARENT_DEP_ID | MASTER_PARENT_DEP_ID |
|----------|----------|---------------|----------------------|
| DEP_2000 |     Dep0 |        (null) |             DEP_2000 |
| DEP_2400 |     Dep1 |      DEP_2000 |             DEP_2000 |
| DEP_2410 |     Dep2 |      DEP_2400 |             DEP_2000 |
| DEP_2420 |     Dep3 |      DEP_2400 |             DEP_2000 |

Note: By traversing the tree from each element up to the root, rather than the reverse, you do not need a START WITH clause nor a separate query to find the roots.

This will even work if there is no DEP_2000 row (just change DEP_ID to PARENT_DEP_ID):

SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE DEPARTMENT( DEP_ID, DEP_NAME, PARENT_DEP_ID ) AS
--SELECT 'DEP_2000', 'Dep0', NULL       FROM DUAL UNION ALL
SELECT 'DEP_2400', 'Dep1', 'DEP_2000' FROM DUAL UNION ALL
SELECT 'DEP_2410', 'Dep2', 'DEP_2400' FROM DUAL UNION ALL
SELECT 'DEP_2420', 'Dep3', 'DEP_2400' FROM DUAL;

Query 1:

SELECT CONNECT_BY_ROOT( DEP_ID )        AS DEP_ID,
       CONNECT_BY_ROOT( DEP_NAME )      AS DEP_NAME,
       CONNECT_BY_ROOT( PARENT_DEP_ID ) AS PARENT_DEP_ID,
       PARENT_DEP_ID                    AS MASTER_PARENT_DEP_ID
FROM   DEPARTMENT
WHERE  CONNECT_BY_ISLEAF = 1
CONNECT BY PRIOR PARENT_DEP_ID = DEP_ID

Results:

|   DEP_ID | DEP_NAME | PARENT_DEP_ID | MASTER_PARENT_DEP_ID |
|----------|----------|---------------|----------------------|
| DEP_2400 |     Dep1 |      DEP_2000 |             DEP_2000 |
| DEP_2410 |     Dep2 |      DEP_2400 |             DEP_2000 |
| DEP_2420 |     Dep3 |      DEP_2400 |             DEP_2000 |
MT0
  • 143,790
  • 11
  • 59
  • 117
  • Thanks for your answer. It works with the sample data I gave but for some reason, not with my real data. Right now I don't really have time to understand why, we already passed the deadline. – Jérôme MEVEL Nov 21 '18 at 03:06
0

If the master department ID doesn't exist at all in the department table - rather than it just not appearing in the results due to haveing no direct employees, for instance - then you could use a hierarchical query to get the root parent for each deparement:

select dep_id, dep_name, parent_dep_id,
  connect_by_root(parent_dep_id) as master_parent_id
from department
connect by parent_dep_id = prior dep_id
start with parent_dep_id in (
  select parent_dep_id from department d1
  where not exists (
    select *
    from department d2
    where d2.dep_id = d1.parent_dep_id
  )
);

DEP_ID   DEP_NAME        PARENT_D MASTER_P
-------- --------------- -------- --------
DEP_2400 Department 2400 DEP_2000 DEP_2000
DEP_2410 Department 2410 DEP_2400 DEP_2000
DEP_2420 Department 2420 DEP_2400 DEP_2000

and then use that as an inline view in your main query, instead of referring to the table directly:

SELECT DISTINCT 
dep.DEP_ID,
dep.DEP_NAME,
dep.PARENT_DEP_ID,
emp.SITE_LOCATION,
dep.MASTER_PARENT_ID
from (
  select dep_id, dep_name, parent_dep_id,
    connect_by_root(parent_dep_id) as master_parent_id
  from department
  connect by parent_dep_id = prior dep_id
  start with parent_dep_id in (
    select parent_dep_id from department d1
    where not exists (
      select *
      from department d2
      where d2.dep_id = d1.parent_dep_id
    )
  )
) dep
INNER JOIN EMPLOYEE emp ON dep.DEP_ID = emp.DEP_ID
ORDER BY dep.DEP_ID;

DEP_ID   DEP_NAME        PARENT_D SITE_LO MASTER_P
-------- --------------- -------- ------- --------
DEP_2400 Department 2400 DEP_2000 SITE_01 DEP_2000
DEP_2400 Department 2400 DEP_2000 SITE_02 DEP_2000
DEP_2410 Department 2410 DEP_2400 SITE_01 DEP_2000
DEP_2410 Department 2410 DEP_2400 SITE_02 DEP_2000
DEP_2420 Department 2420 DEP_2400 SITE_01 DEP_2000
DEP_2420 Department 2420 DEP_2400 SITE_02 DEP_2000

It would seem more natural for DEP_2000 to actually exist though, and for it to have no parent; if that is actually the case then the inline view is simpler:

SELECT DISTINCT 
dep.DEP_ID,
dep.DEP_NAME,
dep.PARENT_DEP_ID,
emp.SITE_LOCATION,
dep.MASTER_PARENT_ID
from (
  select dep_id, dep_name, parent_dep_id,
    connect_by_root(dep_id) as master_parent_id
  from department
  connect by parent_dep_id = prior dep_id
  start with parent_dep_id is null
) dep
INNER JOIN EMPLOYEE emp ON dep.DEP_ID = emp.DEP_ID
ORDER BY dep.DEP_ID;

Notice that as well as the start with clause just looking for a null parent, the connect_by_root() is now looking at the dep_id instead of parent_dep_id (which would be null).

Alex Poole
  • 183,384
  • 11
  • 179
  • 318