2

I've seen similar questions posted here but either I don't get the answers or they don't apply... here is what I need and thought it would be really simple: I have a set of items and each item has a set of subitems. The number of subitems changes per item. E.G.:

Item 1
   SubItem 1-1
   SubItem 1-2
   SubItem 1-3
Item 2
   SubItem 2-1
Item 3
   SubItem 3-1
   SubItem 3-2

For a very specific use, the want is to add a comment for every possible combination of subitems on each item, plus a boolean property on each subitem, so it ends up like this:

Item 1   Subitem 1-1 = True, Subitem 1-2 = True, Subitem 1-3 = True
Item 1   Subitem 1-1 = True, Subitem 1-2 = True, Subitem 1-3 = False
Item 1   Subitem 1-1 = True, Subitem 1-2 = False, Subitem 1-3 = True
Item 1   Subitem 1-1 = True, Subitem 1-2 = False, Subitem 1-3 = False
Item 1   Subitem 1-1 = False, Subitem 1-2 = True, Subitem 1-3 = True
... (the rest of Item 1 possible combinations)
Item 2   Subitem 2-1 = True
Item 2   Subitem 2-1 = False
Item 3   Subitem 3-1 = True, Subitem 3-2 = True
Item 3   Subitem 3-1 = True, Subitem 3-2 = False
Item 3   Subitem 3-1 = False, Subitem 3-2 = True
Item 3   Subitem 3-1 = False, Subitem 3-2 = False

I've tried a varietè of inner joins and cross joins but could not make it work. I think the boolean part can be added using a cross join to a table with two rows that have the values True and False, and I also think I need to do a "FOR XML" subquery to get the subitems in a single line, but I'm failing to get the subitems combinations

This is what I have so far:

-- Schema creation and data filling
DECLARE @Item TABLE (ItemId int, Name varchar(50))
DECLARE @Item_SubItem TABLE (ItemId int, SubitemId int)
DECLARE @SubItem TABLE (SubitemId int, Name varchar(50))

INSERT INTO @Item values (1, 'Item 1')
INSERT INTO @Item values (2, 'Item 2')
INSERT INTO @Item values (3, 'Item 3')
INSERT INTO @SubItem values (1, 'SubItem 1-1')
INSERT INTO @SubItem values (2, 'SubItem 1-2')
INSERT INTO @SubItem values (3, 'SubItem 1-3')
INSERT INTO @SubItem values (4, 'SubItem 2-1')
INSERT INTO @SubItem values (5, 'SubItem 3-1')
INSERT INTO @SubItem values (6, 'SubItem 3-2')
INSERT INTO @Item_SubItem values (1, 1)
INSERT INTO @Item_SubItem values (1, 2)
INSERT INTO @Item_SubItem values (1, 3)
INSERT INTO @Item_SubItem values (2, 4)
INSERT INTO @Item_SubItem values (3, 5)
INSERT INTO @Item_SubItem values (3, 6)

select I.Name, SI.Name
  from @Item I
       inner join @Item_SubItem ISI on ISI.ItemId = I.ItemId
       INNER JOIN @SubItem SI on SI.SubitemId = ISI.SubitemId
 order by I.Name, SI.Name


-- Actual query
SELECT ItemName = M.name, (SELECT iC.name + '=' + CASE AuxCode WHEN 1 THEN 'True' WHEN 0 THEN 'False' END + ' '
                                   FROM Item_subitem AS iCGM
                                        INNER JOIN Subitem AS iC ON iC.SubitemId = iCGM.SubitemId
                                        CROSS JOIN (SELECT AuxCode = 1 UNION SELECT AuxCode = 0) Aux
                                  WHERE iCGM.ItemId = M.ItemId
                                  ORDER BY iC.name
                                    FOR XML PATH(''))
  FROM Item M

So, it's the subquery what's failing for me. Any help would be much appreciated!

Damien_The_Unbeliever
  • 234,701
  • 27
  • 340
  • 448
Alejandro B.
  • 4,807
  • 2
  • 33
  • 61
  • have a look here: http://stackoverflow.com/questions/3621494/the-most-elegant-way-to-generate-permutations-in-sql-server – epoch Sep 06 '13 at 13:16
  • @epoch Thanks, but I don't think it's my same case... I need all combinations of subitems per Item, and the asker needs all against all... – Alejandro B. Sep 06 '13 at 13:30
  • What is the maximum number of subitems for an item? Your result could potentially explode, as the number of combinations for n items would be 2^n. For example, 10 subitems would result in 1024 combinations... – Dan Sep 06 '13 at 13:40
  • @Dan Well, as of today there is an Item that has 7 subitems, but usually there are 2~3 subitems per item – Alejandro B. Sep 06 '13 at 13:46
  • 1
    +1 for a fun and challenging query, but I suspect you will get better performance if you generate these permutations in a client-side language like C# than you could ever possibly hope to obtain in SQL Server. – Aaron Bertrand Sep 06 '13 at 22:32

1 Answers1

3

Here's how to get up to 4 levels (your sample data only requires 3, but I wanted to make sure this worked beyond that). You should be able to follow the pattern to go up to 7, 10, what have you. Oh, and don't expect this to be fast.

;WITH z AS 
(
  SELECT i,inm,si,snm,truth,c FROM 
  (
    SELECT i = i.ItemId, inm = i.Name, si = isi.SubItemId, snm = s.Name,
      c = COUNT(isi.SubItemId) OVER (PARTITION BY i.ItemId)
    FROM @Item_SubItem AS isi
    INNER JOIN @Item AS i    ON isi.ItemId    = i.ItemId
    INNER JOIN @SubItem AS s ON isi.SubItemId = s.SubItemId
  ) AS y
  CROSS JOIN (VALUES('true'),('false')) AS t(truth)
)
SELECT Item = z1.inm, 
  SubItems = COALESCE(       z1.snm + ' = ' + z1.truth,'')
           + COALESCE(', ' + z2.snm + ' = ' + z2.truth,'')
           + COALESCE(', ' + z3.snm + ' = ' + z3.truth,'')
           + COALESCE(', ' + z4.snm + ' = ' + z4.truth,'')
FROM z AS z1
  LEFT OUTER JOIN z AS z2 
    ON z1.i = z2.i AND z1.si < z2.si
  LEFT OUTER JOIN z AS z3 
    ON z2.i = z3.i AND z2.si < z3.si 
  LEFT OUTER JOIN z AS z4 
    ON z3.i = z4.i AND z3.si < z4.si 
  WHERE (z1.c = 1) 
    OR (z1.c = 2 AND z2.i IS NOT NULL)
    OR (z1.c = 3 AND z3.i IS NOT NULL)
    OR (z1.c = 4 AND z4.i IS NOT NULL);

Results given the sample data:

Item        SubItems
------      ---------------------------------------------------------------
Item 1      SubItem 1-1 = true, SubItem 1-2 = true, SubItem 1-3 = true
Item 1      SubItem 1-1 = true, SubItem 1-2 = true, SubItem 1-3 = false
Item 1      SubItem 1-1 = true, SubItem 1-2 = false, SubItem 1-3 = true
Item 1      SubItem 1-1 = true, SubItem 1-2 = false, SubItem 1-3 = false
Item 1      SubItem 1-1 = false, SubItem 1-2 = true, SubItem 1-3 = true
Item 1      SubItem 1-1 = false, SubItem 1-2 = true, SubItem 1-3 = false
Item 1      SubItem 1-1 = false, SubItem 1-2 = false, SubItem 1-3 = true
Item 1      SubItem 1-1 = false, SubItem 1-2 = false, SubItem 1-3 = false
Item 2      SubItem 2-1 = true
Item 2      SubItem 2-1 = false
Item 3      SubItem 3-1 = true, SubItem 3-2 = true
Item 3      SubItem 3-1 = true, SubItem 3-2 = false
Item 3      SubItem 3-1 = false, SubItem 3-2 = true
Item 3      SubItem 3-1 = false, SubItem 3-2 = false

Edit after having thought about this a bit, I tested this compared to dumping a bunch of info to a #temp table first, and this seems to optimize better, though the order comes out different (still ordered by ItemId but the false values sort higher):

SELECT c.i, c.inm, c.si, c.snm, c.c, t.truth 
INTO #x
FROM 
(
  SELECT 
     i = i.ItemId, inm = i.Name, si = isi.SubItemId, snm = s.Name,
     c = COUNT(isi.SubItemId) OVER (PARTITION BY i.ItemId)
  FROM @Item_SubItem AS isi
  INNER JOIN @Item AS i    ON isi.ItemId    = i.ItemId
  INNER JOIN @SubItem AS s ON isi.SubItemId = s.SubItemId
) AS c
CROSS JOIN (VALUES('true'),('false')) AS t(truth);

CREATE UNIQUE CLUSTERED INDEX x ON #x(i,si,truth);

SELECT 
  Item = z1.inm, 
  SubItems = COALESCE(       z1.snm + ' = ' + z1.truth,'')
           + COALESCE(', ' + z2.snm + ' = ' + z2.truth,'')
           + COALESCE(', ' + z3.snm + ' = ' + z3.truth,'')
           + COALESCE(', ' + z4.snm + ' = ' + z4.truth,'')
FROM #x AS z1
  LEFT OUTER JOIN #x AS z2 ON z1.i = z2.i AND z1.si < z2.si
  LEFT OUTER JOIN #x AS z3 ON z2.i = z3.i AND z2.si < z3.si
  LEFT OUTER JOIN #x AS z4 ON z3.i = z4.i AND z3.si < z4.si
WHERE (z1.c = 1) 
  OR (z1.c = 2 AND z2.i IS NOT NULL)
  OR (z1.c = 3 AND z3.i IS NOT NULL)
  OR (z1.c = 4 AND z4.i IS NOT NULL);

DROP TABLE #x;

The original version is much more favorable if the underlying tables are indexed, e.g.

DECLARE @Item TABLE (ItemId int PRIMARY KEY, Name varchar(50));

DECLARE @Item_SubItem TABLE (ItemId int, SubitemId int, 
  PRIMARY KEY (ItemId,SubItemId));

DECLARE @SubItem TABLE (SubitemId int PRIMARY KEY, Name varchar(50));

You should probably test both variations against your actual data/schema.

Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490