1

I have a table of user batches. I only want to select until my amount total reaches a certain amount.

id  | user_id | balance | batch_id 
----|---------|-------- |--------
 1  | 1       |   2     | 1
 2  | 2       |   15    | 2
 3  | 1       |   8     | 3
 4  | 1       |   5     | 4 
 5  | 2       |   7     | 5
 6  | 1       |   1     | 6
 7  | 2       |   5     | 7

Consider the following query:

SELECT * FROM tb_batch_user WHERE user_id = 1 ORDER BY batch_id asc

The result of query is:

    id  | user_id | balance | batch_id 
    ----|---------|-------- |--------
     1  | 1       |   2     | 1
     3  | 1       |   8     | 3
     4  | 1       |   5     | 4 
     6  | 1       |   1     | 6

I want to do a select on the table until the balance total is 6. Then only ids 1, 2 should be returned:

    id  | user_id | balance | batch_id 
    ----|---------|-------- |--------
     1  | 1       |   2     | 1
     3  | 1       |   8     | 3

Another example with balance total 1. Then only ids 1 should be returned:

    id  | user_id | balance | batch_id 
    ----|---------|-------- |--------
     1  | 1       |   2     | 1

Example with balance total 11. Only ids 1,3,4 should be returned:

    id  | user_id | balance | batch_id 
    ----|---------|-------- |--------
     1  | 1       |   2     | 1
     3  | 1       |   8     | 3
     4  | 1       |   5     | 4

So, after that I need to lock those lines with FOR UPDATE ex:

     SELECT * FROM tb_batch_user WHERE user_id = 1 ORDER BY batch_id asc FOR UPDATE

I tried with window function, but it doesn't allow lock (FOR UPDATE). Thanks for any help.

Kelvin Santiago
  • 305
  • 1
  • 10

4 Answers4

1

are you looking for this?

with w0 as (
  select id, user_id, balance, batch_id,
     coalesce(lag(running_balance) over (partition by user_id order by batch_id asc), 0) running_balance 
  from (
      SELECT t.* ,
        sum(balance) over (partition by user_id order by batch_id asc) running_balance
      FROM tb_batch_user t 
      --where t.user_id = 1
  ) x 
)
select * from w0
where running_balance < 6

PS: you can add user_id as where clause. see comment

for locking,

select * from tb_batch_user tb
where tb.id in (select w0.id from w0 where running_balance < 6)
for update 
Derviş Kayımbaşıoğlu
  • 28,492
  • 4
  • 50
  • 72
  • Hmm, that doesn't seem to lock anything, when `FOR UPDATE` is added. It doesn't throw an error though. – sticky bit Sep 23 '20 at 20:48
  • for update and query is working. I'll evaluate the performance of this query with explain. – Kelvin Santiago Sep 23 '20 at 22:59
  • @Derviş Kayımbaşıoğlu there is some way not to go through the entire table, that is, stop until the value found. In query SELECT t.* , sum(balance) over (partition by user_id order by batch_id asc) running_balance FROM tb_batch_user t – Kelvin Santiago Sep 24 '20 at 13:35
1

Here is an approach using window functions:

select id, balance, user_id, batch_id
from (
    select t.*, 
        sum(balance) over(partition by user_id order by id) sum_balance
    from mytable t
    where user_id = 1
) t
where sum_balance - balance < 6

You want a cumulative balance until the first one that equals or exceeds the threshold. For this, you can just use a window sum().

You can change the inequality condition to the threshold value that you like. You can also change (or remove) the filtering on user_id in the subquery.

We can easily implement the same logic with a subquery, that would support for update:

select *
from mytable t
where user_id = 1 and (
    select coalesce(sum(balance), 0)
    from mytable t1
    where t1.user_id = t.user_id and t1.id < t.id
) < 6
for update

Demo on DB Fiddle:

id | balance | user_id
-: | ------: | ------:
 1 |       2 |       1
 3 |       8 |       1
GMB
  • 216,147
  • 25
  • 84
  • 135
  • Adding `FOR UPDATE` to it throws the error "FOR UPDATE is not allowed with window functions". – sticky bit Sep 23 '20 at 20:46
  • @stickybit: good point, thank you. I updated my answer with the same solution as a subquery. – GMB Sep 23 '20 at 20:55
  • And I can confirm that your edit seems to do the locking right. – sticky bit Sep 23 '20 at 20:56
  • Hmm but another thing: Hmm, [if the `balance` of the row with `id` `3` was `4` it includes the row with the `id` of `4` and `6`](https://dbfiddle.uk/?rdbms=postgres_12&fiddle=bef8308b981a5c95545e84ed37d58cc2)... The row with `id` `4` might be intended by the OP in such a case, but I cannot imagine they then also want the row with `id` `6`. – sticky bit Sep 23 '20 at 21:12
  • @GMB not workimg, try with amount 11, only ids 1,3,4 should be returned – Kelvin Santiago Sep 23 '20 at 23:20
  • @KelvinSantiago: ok, this clarifies the logic you want (you might want to add that information to your question as well). I updated my answer. – GMB Sep 23 '20 at 23:28
  • @KelvinSantiago: thanks. Do my revised queries work for you? – GMB Sep 23 '20 at 23:38
  • @GMB Yes! Working, now i'll try with order by, and test performance. Do you see problem with this query in table with million of records? – Kelvin Santiago Sep 24 '20 at 00:50
  • @GMB I made an insert with more than 1 million records, and it is very slow. I waited more than 10min and did not return, even with index. – Kelvin Santiago Sep 24 '20 at 02:49
1

I am able to select. . . for update using window functions:

with inparms as (
  select 1 as user_id, 6 as target
), rtotal as (
  select t.id, i.target,
         sum(t.balance) over (partition by t.user_id
                                  order by t.id
                              rows between unbounded preceding
                                       and 1 preceding) as runbalance
    from tb_batch_user t
         join inparms i 
           on i.user_id = t.user_id
)
select t.*
  from rtotal r
       join tb_batch_user t
         on t.id = r.id
 where coalesce(r.runbalance, 0) < r.target
for update of t;

Fiddle here

Mike Organek
  • 11,647
  • 3
  • 11
  • 26
  • Hmm, [if the `balance` of the row with `id` `3` was `4` it includes the row with the `id` of `4`](https://dbfiddle.uk/?rdbms=postgres_12&fiddle=878052ce2ddf7614ec858c8ffa6627bf)... is that intended @KelvinSantiago? – sticky bit Sep 23 '20 at 21:07
  • Yes, it does. And I would imagine that means not to include the row with id `4` when the rows with `id` `1` and id `4` already add up to a sum `6`. But maybe the OP indeed wanted the "next" one row in such a case too. – sticky bit Sep 23 '20 at 21:16
  • @stickybit You are correct. That should be a straight `<`. Thank you for catching that. – Mike Organek Sep 23 '20 at 21:16
  • Working. Would it have performance problems with a table with millions of records? – Kelvin Santiago Sep 23 '20 at 22:42
  • @KelvinSantiago You need to run it to find out. Provided that `user_id` is indexed, this should perform pretty well. If it doesn't then you can change the `rtotal` CTE into a subquery. – Mike Organek Sep 23 '20 at 22:58
  • @MikeOrganek Where I could do the sorting (order by) with distribution_id, and filter by other columns (where) – Kelvin Santiago Sep 24 '20 at 02:56
  • @KelvinSantiago I do not see a `distribution_id` in your original question. Do you want to `order by distribution_id` before or after you filter by the running `balance`? – Mike Organek Sep 24 '20 at 10:59
  • @MikeOrganek Sorry, I meant batch_id. Example. SELECT * FROM tb_batch_user WHERE user_id = 1 and **another_field_1 = 10** ORDER BY batch_id asc – Kelvin Santiago Sep 24 '20 at 12:16
  • @KelvinSantiago Add the `where` clause to the `rtotal` CTE. Then change the `rtotal` CTE so that the window for `sum(balance)` is `order by t.batch_id` instead of `order by t.id`. – Mike Organek Sep 24 '20 at 14:05
0

Assuming that (user_id, batch_id) is a key, you can use correlated subqueries to avoid window functions. The outer subquery gets the minimal batch_id where the sum of balance reaches or exceeds 6 for the given user id. That sum is fetched in the inner one.

SELECT *
       FROM tb_batch_user bu1
            WHERE bu1.user_id = 1
                  AND bu1.batch_id <= (SELECT min(bu2.batch_id) batch_id
                                              FROM tb_batch_user bu2
                                              WHERE bu2.user_id = bu1.user_id
                                                    AND (SELECT sum(bu3.balance)
                                                                FROM tb_batch_user bu3
                                                                WHERE bu3.user_id = bu2.user_id
                                                                      AND bu3.batch_id <= bu2.batch_id) >= 6)
       FOR UPDATE;

With the pgrowlocks extension installed, we can check that the right rows are locked.

SELECT *
       FROM pgrowlocks('tb_batch_user');

returns:

 locked_row | locker   | multi | xids       | modes          | pids
------------+----------+-------+------------+----------------+---------
 (0,1)      | 10847645 | f     | {10847645} | {"For Update"} | {11996}
 (0,3)      | 10847645 | f     | {10847645} | {"For Update"} | {11996}
sticky bit
  • 36,626
  • 12
  • 31
  • 42