-1

I have a table that has the below data.

COUNTRY LEVEL   NUM_OF_DUPLICATES
  US    9           6
  US    8           24
  US    7           12
  US    6           20
  US    5           39
  US    4           81
  US    3           80
  US    2           430
  US    1           178
  US    0           430

I wrote a query that will calculate the sum of cumulative rows and got the below output .

COUNTRY LEVEL   NUM_OF_DUPLICATES      POOL
  US    9           6                   6
  US    8           24                  30
  US    7           12                  42
  US    6           20                  62
  US    5           39                  101
  US    4           81                  182
  US    3           80                  262
  US    2           130                 392
  US    1           178                 570
  US    0           254                 824

Now I want to to filter the data and take only where the POOL <=300, if the POOL field does not have the value 300 then I should take the first value after 300. So, in the above example we do not have the value 300 in the field POOL, so we take the next immediate value after 300 which is 392. So I need a query so that I can pull the records POOL <= 392(as per the example above) which will yield me the output as

COUNTRY LEVEL   NUM_OF_DUPLICATES      POOL
  US    9           6                   6
  US    8           24                  30
  US    7           12                  42
  US    6           20                  62
  US    5           39                  101
  US    4           81                  182
  US    3           80                  262
  US    2           130                 392

Please let me know your thoughts. Thanks in advance.

Dale K
  • 25,246
  • 15
  • 42
  • 71
Vinay
  • 3
  • 2
  • How would we order the rows (remember tables have no inherent ordering)? Do you want to order them in increasing amounts of `NUM_OF_DUPLICATES`? Something else? – Charlieface May 23 '21 at 00:38

3 Answers3

1
declare @t table(Country varchar(5), Level int, Num_of_Duplicates int)
insert into @t(Country, Level, Num_of_Duplicates)
values
('US', 9, 6),
('US', 8, 24),
('US', 7, 12),
('US', 6, 20),
('US', 5, 39),
('US', 4, 81),
('US', 3, 80),
('US', 2, 130/*-92*/),
('US', 1, 178),
('US', 0, 430);


select *, sum(Num_of_Duplicates) over(partition by country order by Level desc),
(sum(Num_of_Duplicates) over(partition by country order by Level desc)-Num_of_Duplicates) / 300 as flag,--any row which starts before 300 will have flag=0
--or
case when sum(Num_of_Duplicates) over(partition by country order by Level desc)-Num_of_Duplicates < 300 then 1 else 0 end as startsbefore300
from @t;

select *
from
   (
    select *, sum(Num_of_Duplicates) over(partition by country order by Level desc) as Pool
    from @t
) as t
where Pool - Num_of_Duplicates < 300 ;
lptr
  • 1
  • 2
  • 6
  • 16
0

The logic here is quite simple:

  • Calculate the running sum POOL value up to the current row.
  • Filter rows so that the previous row's total is < 300, you can either subtract the current row's value, or use a second sum
  • If the total up to the current row is exactly 300, the previous row will be less, so this row will be included
  • If the current row's total is more than 300, but the previous row is less then it will also be included
  • All higher rows are excluded

It's unclear what ordering you want. I've used NUM_OF_DUPLICATES column ascending, but you may want something else

SELECT
    COUNTRY,
    LEVEL,
    NUM_OF_DUPLICATES,
    POOL
FROM (
    SELECT *,
        POOL = SUM(NUM_OF_DUPLICATES) OVER (ORDER BY NUM_OF_DUPLICATES ROWS UNBOUNDED PRECEDING)
        -- alternative calculation
        -- ,POOLPrev = SUM(NUM_OF_DUPLICATES) OVER (ORDER BY NUM_OF_DUPLICATES ROWS UNBOUNDED PRECEDING AND 1 PRECEDING)
    FROM YourTable
) t
WHERE POOL - NUM_OF_DUPLICATES < 300;
-- you could also use POOLPrev above
Charlieface
  • 52,284
  • 6
  • 19
  • 43
-1

I used two temp tables to get the answer.

 DECLARE @t TABLE(Country VARCHAR(5), [Level] INT, Num_of_Duplicates INT)
 INSERT INTO @t(Country, Level, Num_of_Duplicates)
  VALUES ('US', 9, 6),
      ('US', 8, 24),
      ('US', 7, 12),
      ('US', 6, 20),
      ('US', 5, 39),
      ('US', 4, 81),
      ('US', 3, 80),
      ('US', 2, 130),
      ('US', 1, 178),
      ('US', 0, 254);


SELECT 
    Country
    ,Level
    , Num_of_Duplicates
    , SUM (Num_of_Duplicates) OVER (ORDER BY id) AS [POOL]
 INTO #temp_table
FROM
   (
    SELECT
        Country,
        level,
        Num_of_Duplicates,
        ROW_NUMBER() OVER (ORDER BY country) AS id
    FROM    @t
 ) AS A 


SELECT  
    [POOL],
    ROW_NUMBER() OVER (ORDER BY [POOL] )   AS [rank]
INTO #Temp_2
FROM  #temp_table 
WHERE [POOL] >= 300


SELECT * 

FROM #temp_table WHERE 
[POOL]   <= (SELECT [POOL] FROM #Temp_2 WHERE [rank] = 1 ) 

    
DROP TABLE #temp_table
DROP TABLE #Temp_2

Result

Gudwlk
  • 1,177
  • 11
  • 11