1

Problem Statement: I have used the below algorithm for generating all unique combinations for N elements in SQL Server via SQL procedure. I am unable to generate unique combination if N>35, even if I am using BIGINT(2^-63 to (2^63)-1) for storing the value and iterating through the loop. Pseudo-code

N items, indexed 1-N

for i=1 to 2^N-1     
  for each bit in i
      if bit is set, output item[i]

Example for N=4:

N = 4, 2^4 = 16

i = 1:  binary = 00000001 -> output I1 
i = 2:  binary = 00000010 -> output I2 
i = 3:  binary = 00000011 -> output I1, I2 
i = 4:  binary = 00000100 -> output I3 
i = 5:  binary = 00000101 -> output I1, I3 
i = 6:  binary = 00000110 -> output I2, I3
i = 7:  binary = 00000111 -> output I1, I2, I3 
i = 8:  binary = 00001000 -> output I4 
i = 9:  binary = 00001001 -> output I1, I4 
i = 10: binary = 00001010 -> output I2, I4 
i = 11: binary = 00001011 -> output I1, I2, I4 
i = 12: binary = 00001100 -> output I3, I4 
i = 13: binary = 00001011 -> output I1, I2, I4 
i = 14: binary = 00001110 -> output I2, I2, I4 
i = 15: binary = 00001111 -> output I1, I2, I3, I4

Below is SQL Code written for the same algorithm.

CREATE PROCEDURE [dbo].[GENERATE_UNIQUE_COMBINATIONS_DIMENSION]
    -- Add the parameters for the stored procedure here

AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    DECLARE @NO_ITEMS_INPUT INT=(SELECT COUNT(*) FROM TBL_INPUT_DIMENSION)
    DECLARE @NO_ITEMS_OUTPUT BIGINT = (SELECT POWER(2,@NO_ITEMS_INPUT)-1)
    DECLARE @I BIGINT =1;
    DECLARE @BINARY_VALUE VARCHAR(MAX)
    DECLARE @BIN_VALUE_REQD VARCHAR(MAX)

    --SET @I=@NO_ITEMS_OUTPUT
    WHILE (@I<=@NO_ITEMS_OUTPUT)
    BEGIN
        SET @BINARY_VALUE=[dbo].[CONVERT_INT_BINARY](@I)
        SET @BIN_VALUE_REQD = RIGHT(@BINARY_VALUE,@NO_ITEMS_INPUT)
        DECLARE @TEMP_STRING VARCHAR(MAX)=''
        DECLARE @J BIGINT=1;
        WHILE(@J<=@NO_ITEMS_INPUT)
        BEGIN
            DECLARE @BIT_VALUE CHAR(2)= (SELECT SUBSTRING(@BIN_VALUE_REQD,@J,1))
            IF(@BIT_VALUE=1)
            BEGIN
                DECLARE @INDEX BIGINT= @J
                DECLARE @INDEX_VALUE VARCHAR(MAX) = (SELECT INPUT FROM TBL_INPUT_DIMENSION WHERE REF_NO= @J)
                SET @TEMP_STRING =@TEMP_STRING+','+@INDEX_VALUE
            END
        SET @J=@J+1
        END
        SET @TEMP_STRING = (SELECT SUBSTRING(REPLACE(@TEMP_STRING,' ',''),2,LEN(@TEMP_STRING)))
        PRINT @TEMP_STRING

        INSERT INTO OUTPUT_DIMENSION VALUES(@I,@TEMP_STRING)
        SET @I=@I+1
    END

    -- Insert statements for procedure here

END
Pavel
  • 1
  • 3
  • 17
  • 51
nikn8
  • 1,016
  • 8
  • 23
  • 1
    See this http://stackoverflow.com/questions/127116/sql-server-convert-integer-to-binary-string which references this http://mitch-wheat.blogspot.com.au/2006/10/t-sql-way-converting-integers-to.html which is still a pretty clever way of converting ints to binary – Liesel Mar 29 '16 at 08:16
  • can you post table TBL_INPUT_DIMENSION with sample data ? – Squirrel Mar 29 '16 at 08:20
  • 1
    I think you are unable to perform the procedure because of unacceptable duration of execution of the procedure. Each time you add a new dimension the running time doubles. Time complexity is O(n^35 * 35). 2^35 = 34359738368 which is very big number. Are you satisfied by performance when you have 34, 33, 32 dimensions? I expect that 30 dimensions will have noticeable execution time. I expect the reason to be very big time complexity of the algo. – Ivan Gritsenko Mar 29 '16 at 08:24
  • @Squirrel you can take sample data like D1,D2, D3....D100. – nikn8 Mar 29 '16 at 08:52
  • @IvanGritsenko I know about time complexity but here I am not concerned about it. The major part is, I need to generate unique combination for approx 150 elements and I am not able to figure out how to do using the mentioned algo or we have some other algo for the same. – nikn8 Mar 29 '16 at 08:55
  • 150 elements is 150 rows of result or 150 digits ? – Squirrel Mar 29 '16 at 08:56
  • @Squirrel 150 is N(no of distinct items.) – nikn8 Mar 29 '16 at 09:01
  • @user2746878, I think it is impossible to generate such large number of rows not only because of the large amount of time needed but also because of the lack of memory where you have to store your results. `INSERT INTO OUTPUT_DIMENSION VALUES(@I,@TEMP_STRING)` Here you append rows. Take a look at `34359738368` number, let each row take 10 bytes of memory, that is `343 Gb` in total! I think such large volumes are not feasible. – Ivan Gritsenko Mar 29 '16 at 09:02
  • I would like to ask if you are interested in **partial** view on these combinations, for example show only 25 rows from 100000 to 100025. I think that I have solution for such a problem. – Ivan Gritsenko Mar 29 '16 at 09:05
  • 2 to the power of 150 ? i don't even know how many digits is that . . . maybe after 10 - 20 years later you will be able to do it :) – Squirrel Mar 29 '16 at 09:07
  • @IvanGritsenko anything is fine. Thank u so much. – nikn8 Mar 29 '16 at 09:10
  • @Squirrel Even I think like that. :). That's y I was asking, Is there any other way to generate unique combination for given N elements.. – nikn8 Mar 29 '16 at 09:19
  • @NeelKamal, the real question is why do you need to generate these billions of combinations. What are you going to do with `2^35 = 34,359,738,368` rows? Maybe it is not necessary to generate **all** these combinations to solve the actual practical problem? – Vladimir Baranov Mar 29 '16 at 10:46

0 Answers0