Assuming from the sample data provided in question, that there is no loops like A->B->C->A
in data, Following is the query that will return the desired output from nodes
table.
WITH RECURSIVE NodeCluster (node1,node2,Cluster1) AS
(SELECT node1,
node2,
Rank() Over (
ORDER BY node1)
FROM nodes AS n1
WHERE NOT EXISTS
(SELECT *
FROM nodes AS n2
WHERE n1.node1 = n2.node2)
UNION ALL SELECT N1.node1,
N1.node2,
NodeCluster.Cluster1
FROM nodes n1,
NodeCluster
WHERE NodeCluster.node2=n1.node1 )
SELECT *
FROM NodeCluster
ORDER BY Cluster1,
node1,
node2;
In seed query all the starting nodes are selected and ranked
in asc
to assign Cluster number to data in ascending order.
On the data provided in question, following is the output.
Node1 | Node2 | Cluster1
-------------------------
A B 1
B C 1
D E 2
For re-assurance, more data has been added to the sample data as below.
Node1 | Node2
-------------
A B
B C
D E
E F
F G
H I
I J
J K
L M
The query resulted in below output.
Node1 | Node2 | Cluster1
-------------------------
A B 1
B C 1
D E 2
E F 2
F G 2
H I 3
I J 3
J K 3
L M 4
The solution query is successfully tested both with Teradata SQL Assistant
and bteq
in teradata
and ANSI
modes.
Hope this will help.