I have a recursive SQL query that traverses a tree-like structure using a table variable. The query works correctly, but I'm facing performance issues. The idea to pull hierarchical parents till the root parent without traversing redundant paths(assuming it improves performance)
Here's the current structure of my SQL query:
DECLARE @lvl AS INT
DECLARE @rows AS INT
DECLARE @foo AS TABLE(
parent_id INT,
child_id INT,
lvl INT)
--Get anchor condition
INSERT @foo (parent_id, child_id, lvl)
SELECT parent_id, child_id, 0
FROM bar WHERE child_id IN (SELECT child_id from bar WHERE child_id BETWEEN 50 AND 150)
SET @rows=@@ROWCOUNT
SET @lvl=0
--Do recursion
WHILE @rows > 0
BEGIN
SET @lvl = @lvl + 1
INSERT @foo (parent_id, child_id, lvl)
SELECT DISTINCT b.parent_id, b.child_id, @lvl
FROM bar b
INNER JOIN @foo f ON b.child_id = f.parent_id
-- avoid revisiting nodes: do not insert a node if its parent_id is already in the table
LEFT JOIN @foo dup ON dup.child_id = b.parent_id
WHERE f.lvl = @lvl-1 AND dup.parent_id IS NULL
SET @rows=@@ROWCOUNT
END
SELECT * FROM @foo ORDER BY lvl DESC
Used a loop-based approach combined with the use of a temporary table variable (@foo) to gradually build the hierarchical parent-child relationships, ensuring that redundant paths are avoided during the traversal process.
Expected Output:
Each child brings in its parent and its parent brings in its own parent.Until the root parent
Code snippet to create the table
CREATE TABLE bar(
parentUId INT,
childUid INT
);
WHILE @counter <= 10000
BEGIN
DECLARE @parent_id INT;
DECLARE @child_id INT;
-- Generate child_id
SET @child_id = @counter;
-- Generate parent_id
SET @parent_id = FLOOR((@counter - 1) / 10) + 1;
INSERT INTO bar (parent_id, child_id)
VALUES (@parent_id, @child_id);
SET @counter = @counter + 1;
END;
The current query produces the desired output, but it takes a significant amount of time to execute, especially when the data set grows larger. I have a hierarchical structure that consists of 12 levels. The provided example is simplified.
I'm looking for suggestions on how to optimize this SQL query to improve its performance. Are there any modifications I can make to the query to achieve better performance? Any insights or alternative approaches would be greatly appreciated.
I already tried recursive CTE.