1

I'm just trying to understand CTE and recursion to solve an issue that I would previously have used a cursor for.

create table ##ACC (
AccNo int,
Property char
)

Insert into ##ACC 
VALUES (1,'A'),(1,'B'),(2,'A'),(2,'C'),(3,'C'),(4,'D')

What I'm trying to achieve is to get a list of all AccNo's, and all AccNo's they're related to via Property. So my expected results are

PrimaryAccNo | LinkedAccNo
1 | 1
1 | 2
1 | 3
2 | 1
2 | 2
2 | 3
3 | 1
3 | 2
3 | 3
4 | 4

I've attempted the following code and variations but I either get 4 results (PrimaryAccNo=LinkedAccNo) only or I hit 100 recursions.

WITH Groups(PrimaryAccNo, LinkedAccNo)
AS
(
Select distinct AccNo, AccNo from ##ACC

UNION ALL
Select g.PrimaryAccNo, p.AccNo from
##ACC p inner join Groups g on p.AccNo=g.LinkedAccNo
inner join ##ACC pp on p.Property=pp.Property
where p.AccNo<> pp.AccNo
)
Select PrimaryAccNo,LinkedAccNo 
from Groups

What am I doing wrong?

TT.
  • 15,774
  • 6
  • 47
  • 88
GavinP
  • 677
  • 1
  • 5
  • 18
  • 1
    Debugging tip: Add `0 as Depth` to the anchor in your CTE and `Depth + 1` to the recursive `select`. You can stop the recursion at any point by adding to the recursive `where` clause, e.g. `and Depth < 42`, and see what is going on. – HABO Feb 08 '17 at 18:58
  • @HABO thanks. I will have to look at this. Right now I'm not seeing how this helps me. – GavinP Feb 09 '17 at 09:14
  • I've had a look. Unfortunately I don't know how many recursions I need to do for my data set, so in the example above I can make it run nice and quick with a `Depth<2` but on my real data set my query has been running for 13 minutes now with a `Depth<4`.... The issues appears to be around duplicates being added, rather than only unique new matches. I think I might just revert to the cursor based approach I would have gone with to begin with. – GavinP Feb 09 '17 at 10:11
  • Sorry no one could help you. – HABO Feb 27 '17 at 03:55

2 Answers2

0

You're running into an infinite loop caused by cycles within your data, e.g.: 1 > 2 > 3 > 2 > ... . The solution is to keep track of the rows that have already been "consumed". Due to limitations in CTEs, this has to be done by including the history within each CTE row, e.g. by assembling the path followed to arrive at each row. You can uncomment the , Path on the final select to see what is going on.

-- Sample data.
declare @ACC as Table ( AccNo Int, Property Char );
insert into @ACC values
  ( 1, 'A' ), ( 1, 'B' ), ( 2, 'A' ), ( 2, 'C' ), ( 3, 'C' ), ( 4, 'D' );
select * from @ACC;

-- Recursive CTE.
with Groups as (
  select distinct AccNo, AccNo as LinkedAccNo,
    Cast( '|' + Cast( AccNo as VarChar(10) ) + '|' as VarChar(1024) ) as Path
    from @ACC
  union all
  select G.AccNo, A.AccNo, Cast( Path + Cast( A.AccNo as VarChar(10) ) + '|' as VarChar(1024) )
    from Groups as G inner join -- Take the latest round of new rows ...
      @ACC as AP on AP.AccNo = G.LinkedAccNo inner join -- ... and get the   Property   for each ...
      @ACC as A on A.Property = AP.Property -- ... to find new linked rows.
      where G.Path not like '%|' + Cast( A.AccNo as VarChar(10) ) + '|%' )
  select AccNo, LinkedAccNo -- , Path
    from Groups
    order by AccNo, LinkedAccNo;
HABO
  • 15,314
  • 5
  • 39
  • 57
0

Another approach similar to yours but differs in the following:

  1. The property value is included in the recursive CTE so that it can be used later
  2. The < is used to prevent duplicates and the resulting infinite recursion
  3. Another CTE is added AccGroups to provide the mirror of the relations

A demo fiddle has been included below:

CREATE TABLE ##ACC (
    AccNo int,
    Property char
);

INSERT INTO ##ACC 
VALUES (1,'A'),(1,'B'),(2,'A'),(2,'C'),(3,'C'),(4,'D');

WITH Groups(PrimaryAccNo, LinkedAccNo, Property) AS (
    SELECT AccNo, AccNo, Property FROM ##ACC
    UNION ALL
    SELECT g.PrimaryAccNo, pp.AccNo, pp.Property
    FROM Groups g 
    INNER JOIN ##ACC p ON g.Property=p.Property AND
                          g.LinkedAccNo < p.AccNo
    INNER JOIN ##ACC pp ON p.AccNo = pp.AccNo 
),
AccGroups AS (
    SELECT DISTINCT * FROM (
        SELECT PrimaryAccNo, LinkedAccNo FROM Groups
        UNION ALL
        SELECT LinkedAccNo, PrimaryAccNo FROM Groups
    ) t
)
SELECT * FROM AccGroups
ORDER BY PrimaryAccNo,LinkedAccNo
GO
PrimaryAccNo | LinkedAccNo
-----------: | ----------:
           1 |           1
           1 |           2
           1 |           3
           2 |           1
           2 |           2
           2 |           3
           3 |           1
           3 |           2
           3 |           3
           4 |           4

db<>fiddle here

ggordon
  • 9,790
  • 2
  • 14
  • 27