5

I have a table with id, cost, and priority columns:

create table a_test_table (id number(4,0), cost number(15,2), priority number(4,0));

insert into a_test_table (id, cost, priority) values (1, 1000000, 10);
insert into a_test_table (id, cost, priority) values (2, 10000000, 9);
insert into a_test_table (id, cost, priority) values (3, 5000000, 8);
insert into a_test_table (id, cost, priority) values (4, 19000000, 7);
insert into a_test_table (id, cost, priority) values (5, 20000000, 6);
insert into a_test_table (id, cost, priority) values (6, 15000000, 5);
insert into a_test_table (id, cost, priority) values (7, 2000000, 4);
insert into a_test_table (id, cost, priority) values (8, 3000000, 3);
insert into a_test_table (id, cost, priority) values (9, 3000000, 2);
insert into a_test_table (id, cost, priority) values (10, 8000000, 1);
commit;

select 
    id,
    to_char(cost, '$999,999,999') as cost,
    priority
from 
    a_test_table;
        ID COST            PRIORITY
---------- ------------- ----------
         1    $1,000,000         10
         2   $10,000,000          9
         3    $5,000,000          8
         4   $19,000,000          7
         5   $20,000,000          6
         6   $15,000,000          5
         7    $2,000,000          4
         8    $3,000,000          3
         9    $3,000,000          2
        10    $8,000,000          1

Starting with the highest priority (descending), I want to select the rows where the cost adds up to less than (or equal to) $20,000,000.

The result would look like this:

       ID COST            PRIORITY
---------- ------------- ----------
         1    $1,000,000         10
         2   $10,000,000          9
         3    $5,000,000          8
         7    $2,000,000          4

      Total: $18,000,000

How can I do this with Oracle SQL?

User1974
  • 276
  • 1
  • 17
  • 63
  • 1
    Why isn't it 16,000,000 (10+9+8+7 exceeds 20,000,000) – Salman A Feb 08 '19 at 19:53
  • Can you use PL/SQL or does it have to be a SQL query? And if it does have to be SQL then why? – Bobby Durrett Feb 08 '19 at 20:15
  • @SalmanA : Good question. The values in the `priority` column represent a **benefit**. The benefit/priority is more important than squeezing every last dollar out of the budget. Does that answer your question? – User1974 Feb 08 '19 at 20:29

3 Answers3

7

Here is a way to do it in pure SQL. I won't swear there isn't a better way.

Basically, it uses a recursive common table expression (i.e., WITH costed...) to compute every possible combination of elements totaling less than 20,000,000.

Then it gets the first full path from that result.

Then, it gets all the rows in that path.

NOTE: the logic assumes that no id is longer than 5 digits. That's the LPAD(id,5,'0') stuff.

WITH costed (id, cost, priority, running_cost, path) as 
( SELECT id, cost, priority, cost running_cost, lpad(id,5,'0') path
  FROM   a_test_table
  WHERE  cost <= 20000000
  UNION ALL 
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_Cost, costed.path || '|' || lpad(a.id,5,'0')
  FROM   costed, a_test_table a 
  WHERE  a.priority < costed.priority
  AND    a.cost + costed.running_cost <= 20000000),
best_path as (  
SELECT *
FROM   costed c 
where not exists ( SELECT 'longer path' FROM costed c2 WHERE c2.path like c.path || '|%' )
order by path
fetch first 1 row only )
SELECT att.* 
FROM best_path cross join a_test_table att
WHERE best_path.path like '%' || lpad(att.id,5,'0') || '%'
order by att.priority desc;
+----+----------+----------+
| ID |   COST   | PRIORITY |
+----+----------+----------+
|  1 |  1000000 |       10 |
|  2 | 10000000 |        9 |
|  3 |  5000000 |        8 |
|  7 |  2000000 |        4 |
+----+----------+----------+

UPDATE - Shorter version

This version uses MATCH_RECOGNIZE to find all the rows in the best group following the recursive CTE:

WITH costed (id, cost, priority, running_cost, path) as 
( SELECT id, cost, priority, cost running_cost, lpad(id,5,'0') path
  FROM   a_test_table
  WHERE  cost <= 20000000
  UNION ALL 
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_Cost, costed.path || '|' || lpad(a.id,5,'0')
  FROM   costed, a_test_table a 
  WHERE  a.priority < costed.priority
  AND    a.cost + costed.running_cost <= 20000000)
  search depth first by priority desc set ord
SELECT id, cost, priority
FROM   costed c 
MATCH_RECOGNIZE (
  ORDER BY path
  MEASURES
    MATCH_NUMBER() AS mno
  ALL ROWS PER MATCH
  PATTERN (STRT ADDON*)
  DEFINE
    ADDON AS ADDON.PATH = PREV(ADDON.PATH) || '|' || LPAD(ADDON.ID,5,'0')
    )
WHERE mno = 1
ORDER BY priority DESC;

UPDATE -- Even shorter version, using clever idea from the SQL*Server link the OP posted

*Edit: removed use of ROWNUM=1 in anchor part of recursive CTE, since it depended on the arbitrary order in which rows would be returned. I'm surprised no one dinged me on that. *

WITH costed (id, cost, priority, running_cost) as 
( SELECT id, cost, priority, cost running_cost
  FROM   ( SELECT * FROM a_test_table
           WHERE  cost <= 20000000
           ORDER BY priority desc
           FETCH FIRST 1 ROW ONLY )
  UNION ALL 
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_Cost
  FROM   costed CROSS APPLY ( SELECT b.*
                              FROM   a_test_table b 
                              WHERE  b.priority < costed.priority
                              AND    b.cost + costed.running_cost <= 20000000
                              FETCH FIRST 1 ROW ONLY
                              ) a
)
CYCLE id SET is_cycle TO 'Y' DEFAULT 'N'
select id, cost, priority from costed
order by priority desc
Matthew McPeak
  • 17,705
  • 2
  • 27
  • 59
3

I'm too stupid to do it in plain SQL, so I tried PL/SQL - a function that return s a table. Here's how: looping through all rows in a table, I'm calculating the sum; if it is lower than a limit, fine - add row's ID into an array and go on.

SQL> create or replace function f_pri (par_limit in number)
  2    return sys.odcinumberlist
  3  is
  4    l_sum   number := 0;
  5    l_arr   sys.odcinumberlist := sys.odcinumberlist();
  6  begin
  7    for cur_r in (select id, cost, priority
  8                  from a_test_table
  9                  order by priority desc
 10                 )
 11    loop
 12      l_sum := l_sum + cur_r.cost;
 13      if l_sum <= par_limit then
 14         l_arr.extend;
 15         l_arr(l_arr.last) := cur_r.id;
 16      else
 17         l_sum := l_sum - cur_r.cost;
 18      end if;
 19    end loop;
 20    return (l_arr);
 21  end;
 22  /

Function created.

Preparing SQL*Plus environment so that the output looks prettier:

SQL> break on report
SQL> compute sum of cost on report
SQL> set ver off

Testing:

SQL> select t.id, t.cost, t.priority
  2  from table(f_pri(&par_limit)) x join a_test_table t on t.id = x.column_value
  3  order by t.priority desc;
Enter value for par_limit: 20000000

        ID       COST   PRIORITY
---------- ---------- ----------
         1    1000000         10
         2   10000000          9
         3    5000000          8
         7    2000000          4
           ----------
sum          18000000

SQL> /
Enter value for par_limit: 30000000

        ID       COST   PRIORITY
---------- ---------- ----------
         1    1000000         10
         2   10000000          9
         3    5000000          8
         7    2000000          4
         8    3000000          3
         9    3000000          2
           ----------
sum          24000000

6 rows selected.

SQL>
Littlefoot
  • 131,892
  • 15
  • 35
  • 57
2

@ypercubeᵀᴹ on the DBA-SE chat posted this solution. It's pretty consise.

with  rt (id, cost, running_total, priority) as
(
    (
    select 
        id,
        cost,
        cost as running_total,
        priority
    from 
        a_test_table
    where cost <= 20000000 
    order by priority desc
    fetch first 1 rows only
    )

    union all

        select 
            t.id,
            t.cost,
            t.cost + rt.running_total,
            t.priority
        from a_test_table  t
             join rt 
             on t.priority < rt.priority      -- redundant but throws
                                              -- "cycle detected error" if omitted

             and t.priority =                             -- needed 
                 ( select max(tm.priority) from a_test_table tm
                   where tm.priority < rt.priority
                     and tm.cost + rt.running_total <= 20000000 )
    )
    select *
    from rt ;

(@ypercubeᵀᴹ is not interested in posting it himself.)

User1974
  • 276
  • 1
  • 17
  • 63