1

I have a database where from_node and to_node are the columns. Trying to find all the reachable nodes from from_node to to_node. There are cycles. Few from_node using connect by nocycle, is generating millions of children nodes. How can this be resolved?

SELECT from_node,
       to_node,
       level
FROM PATH
START WITH from_node = input_var
CONNECT BY NOCYCLE PRIOR to_node= from_node;
Raghav
  • 11
  • 4
  • What is wrong with the results? Are the million records wrong? Are you seeing loops? Are you seeing duplicates? You only have one input variable, but you mention finding all nodes between from_node and to_node. Are you just trying to find all nodes which are reachable from the input source node? – Del Mar 30 '20 at 15:45
  • Sample data would help. – Littlefoot Mar 30 '20 at 15:49
  • I have only 5k records in my table. Yes I am trying to find all reachable nodes from the input source node.Loops and duplicates are seen – Raghav Mar 30 '20 at 15:59
  • Unless you specifically want a stored procedure to solve this problem, no PL/SQL is needed and the `plsql` tag is not unnecessary. –  Mar 30 '20 at 20:00

1 Answers1

1

Based on your description, I might look into using a recursive subquery factoring clause instead of a hierarchical query. This will more easily allow you to get rid of the cycles. This is because NOCYCLE doesn't actual stop cycles, but prevents Oracle from caring about them. This is an extract from the Oracle Documentation:

The NOCYCLE parameter instructs Oracle Database to return rows from a query even if a CONNECT BY loop exists in the data. Use this parameter along with the CONNECT_BY_ISCYCLE pseudocolumn to see which rows contain the loop. Refer to CONNECT_BY_ISCYCLE Pseudocolumn for more information.

Something like this recursive subquery may work better for you:

WITH path_search (from_node, to_node, path_level) AS
(
  SELECT p.from_node, p.to_node, 0 AS PATH_LEVEL
  FROM paths p
  WHERE p.from_node = 'A'
  UNION ALL
  SELECT p.from_node, p.to_node, ps.path_level+1 AS path_level
  FROM path_search ps
  INNER JOIN paths p ON ps.to_node = p.from_node
)
SEARCH DEPTH FIRST BY to_node SET order_col
CYCLE from_node SET is_cycle TO 'Y' DEFAULT 'N'
SELECT ps.from_node, ps.to_node, MIN(ps.path_level) AS MIN_DISTANCE
FROM path_search ps
WHERE is_cycle = 'N'
GROUP BY ps.from_node, ps.to_node
ORDER BY min_distance;

Here is an SQLFiddle with this solution (Link).

This will allow you to stop the recursion once a cycle is detected in a given path. However, as there still may be duplicate from and to nodes as the same path segment can be found on different paths, I added the grouping to only show the minimum level as it would be the best way to get to a given node.

desertnaut
  • 57,590
  • 26
  • 140
  • 166
Del
  • 1,529
  • 1
  • 9
  • 18
  • This logic doesnot work for something like from-to A-B, B-C , C-D, D-E,E-F,F-G,G-H WITH path_search (from_node, to_node)AS ( SELECT p.from_node, p.to_node FROM path_temp p WHERE p.from_node = 'A' UNION ALL SELECT p.from_node, p.to_node FROM path_search ps INNER JOIN path_temp p ON ps.from_node = p.to_node ) SEARCH DEPTH FIRST BY to_node SET order_col CYCLE from_node SET is_cycle TO 'Y' DEFAULT 'N' SELECT ps.from_node, ps.to_node FROM path_search ps WHERE is_cycle = 'N' --GROUP BY ps.from_node, ps.to_node ORDER BY order_col; – Raghav Apr 01 '20 at 11:48
  • @Raghav Apologies, I had a few typos in the query (I flipped the join column names). I've updated the query and added an SQLFiddle so you can take a look. – Del Apr 01 '20 at 14:25
  • @Raghav you just want the with portion assigned to a cursor? Sure just have your cursor definition be the WITH clause (including the search clause) and `SELECT * FROM path_search` – Del Apr 02 '20 at 14:00
  • I have around 5k records in my table. for every DISTINCTt from_node , I have to find all the reachable nodes.So first , all the distinct from_nodes are fetched.In a loop WITH clause is run for every from_node. procedure is not terminating. After inserting few records, it becomes stagnant – Raghav Apr 02 '20 at 19:04
  • There is a scenario where above query goes in to the infinite loop and doesnt terminate. Tried to add it in SQL Fiddle ,since there are 182 rows not able to add. I am not able to minimize the testcase – Raghav Apr 03 '20 at 12:06
  • @Raghav If you wanted to change the above query to calculate for your entire dataset, you don't need to put it in a loop. Just remove the `WHERE from_node = 'A'` from the query. That will in turn have every path be the first path in the recursion. It will definitely take a while to run that, but it should not be infinite. I made a few other changes and put them into an updated [SQLFiddle](http://sqlfiddle.com/#!4/02852d/7/0) – Del Apr 03 '20 at 17:17
  • @Raghav, The updated Fiddle calculates the minimum distance from any start node to a given node. – Del Apr 03 '20 at 17:22