0

My hierarchical query looks like this:

select ID, PARENT_ID, START_TIME
FROM PROCESS
START WITH ID='X'
CONNECT BY ID=PRIOR PARENT_ID

My problem is that there are rows with the same ID 'X' and I only want the most recent one returned (when a process is duplicated, its ID is the same - can't change that). Additionally, I only want it to connect by prior parent ID for the row that is most recent as well.

From the docs, it appears that I can't do a query in the connect by to get just the latest prior parent_id - will I need to filter in code or is there a way to do this within the query?

Mike
  • 609
  • 12
  • 36

1 Answers1

0

Use ROW_NUMBER() analytic function to get the latest row for each ID:

SELECT ID,
       PARENT_ID,
       START_TIME
FROM   (
  SELECT p.*,
         ROW_NUMBER() OVER (
           PARTITION BY id
           ORDER BY start_time DESC
         ) AS rn
  FROM   process p
)
WHERE rn = 1
START WITH ID = 'X'
CONNECT BY ID = PRIOR PARENT_ID;
MT0
  • 143,790
  • 11
  • 59
  • 117
  • Thanks MT0. This is better, but it only gets the first level of latest process. If A is parent of B and B is parent of C and B was duplicated, those duplicates are showing up. The duplicates of C are not. It seems that the connect by ID = PRIOR PARENT_ID will always match more rows when they exist? – Mike May 04 '18 at 02:48