2

I'm trying to calculate number of cluster when I know all members of each cluster.

I need a function in SqlServer2008 to solve this problem.

My table has more than 14 million distinct records like this:

CREATE TABLE Test
(
   F1 varchar(5),
   F2 varchar(5)
)
INSERT INTO TEST ( F1, F2) VALUES ( 'A', 'B')
INSERT INTO TEST (  F1, F2 ) VALUES ( 'A', 'K')
INSERT INTO TEST ( F1, F2) VALUES ( 'C', 'H')
INSERT INTO TEST (  F1, F2 ) VALUES ( 'D', 'B')
INSERT INTO TEST (  F1, F2 ) VALUES ( 'F', 'I')
INSERT INTO TEST (  F1, F2 ) VALUES ( 'F', 'B')
INSERT INTO TEST (  F1, F2 ) VALUES ( 'D', 'H' )
INSERT INTO TEST (  F1, F2 ) VALUES ( 'E', 'G' )
INSERT INTO TEST (  F1, F2 ) VALUES ( 'G', 'L' )

note that:

If A=B AND A=K THEN B=K 
IF E=G AND G=L THEN E=L

Now the output of the function should be as below:

ClusterNumber--- point
1---A
1---B
1---k
1---D
1---F
1---I
1---H
1---C
2---E
2---G
2---L

I guess the solution would be a recursive function, but does not know it!

ARZ
  • 2,461
  • 3
  • 34
  • 56
jozi
  • 2,833
  • 6
  • 28
  • 41

2 Answers2

2

I solve it in C# using EntityFrameWork and LINQ in a Recursive Function!

-Note that I add a new column Clus for cluster label.

So place this loop in your main code:

    foreach (var item1 in db.Test.ToList())
    {
        FirstFunc(item1, item1.F1);
    }

and use these functions:

private void FirstFunc(Test item1,string cc)
{
    if (item1.Clus == null)
    {
        item1.Clus = cc;
        db.SaveChanges();

        RecFunc(item1, cc);
    }
}


private void RecFunc(Test item1,string cc)
{
        var t1 = db.Test.Where(x => (x.F1 == item1.F1 | x.F2 == item1.F1 | x.F1 == item1.F2| x.F2 == item1.F2) & x.Clus == null).ToList();
        foreach (var item2 in t1)
        {
            item2.Clus = cc;
            db.SaveChanges();
            RecFunc(item2,cc);
        }
}

and the result is: enter image description here

and then:

 var ClusterCount = db.Test.Select(x => x.Clus).Distinct().Count();

Hope this help!

ARZ
  • 2,461
  • 3
  • 34
  • 56
0

Transitive closures cannot be computed in plain SQL. They are a key example for using PL/SQL and similar languages. You might want to look into related questions such as:

mySQL transitive closure table

Anyway, the search term you need is closure. You have a relation, but you want to get the closure of the relation.

Community
  • 1
  • 1
Has QUIT--Anony-Mousse
  • 76,138
  • 12
  • 138
  • 194