7

I have a SQL Server table which have records like this

ID | Value

1   |   100
2   |   150
3   |   250
4   |   600
5   |   1550
6   |   50
7   |   300

I need to select random records, but the only condition is that the total sum of this records value achieve a specific number or percentage i define.

let's say i need a total value of 300 or 10%, so here are the chances

1   |   100
2   |   150
6   |   50

or

3   |   250
6   |   50

or

7   |   300

can any one help me to do this.

  • do you required all possible combination or only one ? – Squirrel Nov 08 '18 at 09:54
  • I'd suggest having a look at `SUM()` and `ANY` functions. – Jean-Marc Zimmer Nov 08 '18 at 09:56
  • 5
    What you are asking may seem simple to you, but you actually have to design an algorithm that does that and the implement it in sql. That is harder than 90% of the questions I've seen here. – George Menoutis Nov 08 '18 at 10:09
  • What George said. I've once built some routine to find every valid combination of available options in Oracle. I think it was a similar issue to this, and it was a hell of a job. – GolezTrol Nov 08 '18 at 10:26
  • get the powerset like this https://stackoverflow.com/questions/36859361/powersetall-combinations-of-a-resultset-in-t-sql, check which ones have the right sum, randomly pick one. It won't be fast... – AakashM Nov 08 '18 at 10:33
  • also see https://stackoverflow.com/questions/24175724/combinations-power-sets-no-idea-where-to-even-start/24179688#24179688 – AakashM Nov 08 '18 at 10:34
  • So you are asking both an algorithm, and an implementation in T-SQL? That's two separate questions... and thus too broad for Stack Overflow. Perhaps try asking about an algorithm first (tag with `algorithm`) and once you have that, and you have problems implementing this in T-SQL, ask a new question about that. – TT. Nov 09 '18 at 05:12

2 Answers2

4

Think this recursive CTE works, no idea what the performance will be like though once you get past a trivial amount of rows:

DECLARE @Test TABLE
(
    ID INT NOT NULL,
    VAL INT NOT NULL
);

INSERT INTO @Test
VALUES  (1,100),
        (2,150),
        (3,250),
        (4,600),
        (5,1550),
        (6,50),
        (7,300);

DECLARE @SumValue INT = 300,
        @Percentage INT = 10;

WITH GetSums
AS
(
    SELECT  T.ID, 
            T.Val,
            CAST(T.ID AS VARCHAR(MAX)) AS IDs
    FROM    @Test AS T

    UNION ALL

    SELECT  T1.ID, 
            T1.Val + GS.Val AS Val,
            CAST(T1.ID AS VARCHAR(MAX)) + ',' + GS.IDs AS IDs
    FROM    @Test AS T1
    INNER
    JOIN    GetSums AS GS 
            ON  T1.ID > GS.ID
)
SELECT  GS.IDs,
        GS.Val
FROM    GetSums AS GS
WHERE   (GS.Val = @SumValue OR GS.VAL = (SELECT SUM(Val) FROM @Test AS T) / @Percentage)
OPTION  (MAXRECURSION 50);

Similar found here:

find all combination where Total sum is around a number

Dohsan
  • 361
  • 2
  • 12
  • i think its not right because we are getting 300 all the time..we need to get random values to obtain the result ,right? (: – DeadCat Nov 08 '18 at 11:26
  • The IDs are the combination that are reaching 300, you'd just need to pick one of the lines at random. Depends if the op just needs to know the combination of return the actual rows – Dohsan Nov 08 '18 at 11:33
  • Could this be sped up with a where value<= 300 clause? – Kevin Nov 08 '18 at 17:33
2

Try this...we will get the correct answer if the 6th value is 250...

     SELECT          1   ID,  100 Value
     INTO #Temp_1
     UNION ALL SELECT 2   ,  150
     UNION ALL SELECT 2   ,  150
     UNION ALL SELECT 3   ,  250
     UNION ALL SELECT 4   ,  600
     UNION ALL SELECT 5   ,  1550
     UNION ALL SELECT 6   ,  250
     UNION ALL SELECT 7   ,  300


     CREATE TABLE #Temp_IDs
        (
            ID Int,
            Value Numeric(18,2)
        )


        DELETE
        FROM    #Temp_IDs

        DECLARE @ID     Int,
                @Vale   Numeric(18,2),
                @ContinueYN Char(1)

        SET @ContinueYN =   'Y'

        IF  EXISTS (SELECT TOP 1 1 FROM #Temp_1
                                                    WHERE Value <= 300
                                                        AND ID NOT IN (SELECT ID FROM #Temp_IDs )
                                                        AND Value <= (SELECT 300 - ISNULL( SUM(Value),0) FROM #Temp_IDs)
                                                    ORDER BY NEWID())

        BEGIN
                WHILE (@ContinueYN  =   'Y')
                BEGIN

                        SELECT  @ID     =   ID,
                                @Vale   =   Value
                        FROM #Temp_1
                        WHERE Value <= 300
                            AND ID NOT IN (SELECT ID FROM #Temp_IDs )
                            AND Value <= (SELECT 300 - ISNULL( SUM(Value),0) FROM #Temp_IDs)
                        ORDER BY NEWID()

                        INSERT INTO #Temp_IDs
                        SELECT @ID,@Vale

                        IF (SELECT  SUM(Value) FROM #Temp_IDs) = 300
                        BREAK
                        ELSE IF @ID IS NULL
                        BEGIN 

                            DELETE FROM #Temp_IDs

                        END

                        SET  @ID = NULL
                        SET  @Vale = NULL


                END

        END

        SELECT *
        FROM #Temp_IDs

        DROP TABLE #Temp_IDs
        DROP TABLE #Temp_1
Elby Joy
  • 231
  • 1
  • 7