
I have a database that stores the amount and date of each payment a user makes. Some users make payments on the same day, and I want to show the cumulative sum of those payments only once per day in a pivot table that I am building in Amazon QuickSight.

I have gone through the following questions, but their answers return the cumulative value on every row. I can't find a way to partition on the date alone, with the sum taken over the payments made.

Calculating Cumulative Sum in PostgreSQL

Calculating cumulative sum with date filtering in PostgreSQL

Calculating Cumulative daily sum in PostgreSQL

PostgreSQL, renumber and cumulative sum at once

How to conditional sum two columns in PostgreSQL 9.3

My query looks like this:

SELECT
    s.id,
    s.first_name,
    s.last_name,
    s.birth_date,
    s.card,
    p.datetime,
    p.amount,
    Sum(p.amount) OVER (PARTITION BY p.datetime ORDER BY p.datetime) AS "Daily Amount"

FROM payments AS p

LEFT JOIN users AS s
ON p.s_h_uuid = s.h_uuid

ORDER BY p.datetime DESC

The Sum() Over() is computed on this line:

Sum(p.amount) OVER (PARTITION BY p.datetime ORDER BY p.datetime) AS "Daily Amount"

My tables contain the following data:

Users:

| id | first_name | last_name | birth_date | card |
| 2  | first_nam2 | last_nam2 | 1990-02-01 | M    |
| 3  | first_nam3 | last_nam3 | 1987-07-23 | M    |
| 1  | first_nam1 | last_nam1 | 1954-11-15 | A    |
| 4  | first_nam4 | last_nam4 | 1968-05-07 | V    |

Payments:

| p_uuid | datetime   | amount |
| 2      | 2021-05-01 | 100.00 |
| 3      | 2021-05-01 | 100.00 |
| 2      | 2021-05-02 | 100.00 |
| 1      | 2021-05-03 | 100.00 |
| 3      | 2021-05-03 | 100.00 |
| 4      | 2021-05-03 | 100.00 |
| 2      | 2021-05-05 | 100.00 |
| 1      | 2021-05-05 | 100.00 |
| 4      | 2021-05-06 | 100.00 |

In the output I want, the "Daily Amount" is shown only once per date. If several rows share the same date, the other rows should be blank or display something like "NA":

| p.datetime | id | first_name | last_name | birth_date | card | p.amount  | "Daily Amount" |
| 2021-05-01 | 2  | first_nam2 | last_nam2 | 1990-02-01 | M    | 100.00    | 200.00         |
| 2021-05-01 | 3  | first_nam3 | last_nam3 | 1987-07-23 | M    | 100.00    |                |
| 2021-05-02 | 2  | first_nam2 | last_nam2 | 1990-02-01 | M    | 100.00    | 100.00         |
| 2021-05-03 | 1  | first_nam1 | last_nam1 | 1954-11-15 | A    | 100.00    | 300.00         |   
| 2021-05-03 | 3  | first_nam3 | last_nam3 | 1987-07-23 | M    | 100.00    |                |
| 2021-05-03 | 4  | first_nam4 | last_nam4 | 1968-05-07 | V    | 100.00    |                |
| 2021-05-05 | 2  | first_nam2 | last_nam2 | 1990-02-01 | M    | 100.00    | 200.00         |
| 2021-05-05 | 1  | first_nam1 | last_nam1 | 1954-11-15 | A    | 100.00    |                |
| 2021-05-06 | 4  | first_nam4 | last_nam4 | 1968-05-07 | V    | 100.00    | 100.00         |   

Is it possible to get this output with a PostgreSQL query?

gagneet
  • Make it easy to assist you - simplify! [mcve] – jarlh Jun 01 '21 at 06:37
  • I have provided the different ways in which the output can come for a given date, hence the 9 rows in the output. I had put in less rows, but the context goes away then. – gagneet Jun 01 '21 at 06:38

2 Answers


It looks like your sum() over() computes the wrong amount. Try:

 Sum(p.amount) OVER(partition BY s.id, p.datetime) AS "Daily Amount",

EDIT: If you want to format the output (cumulative amount only once per date), use row_number() to detect the first row in each group. Make sure the over() clause is in sync with the ORDER BY of the query.

SELECT 
        id,
        first_name,
        last_name,
        birth_date,
        card,
        datetime,
        amount,
        case when rn=1 then "Daily Amount" end "Daily Amount" 
FROM (
    SELECT
        s.id,
        s.first_name,
        s.last_name,
        s.birth_date,
        s.card,
        p.datetime,
        p.amount,
        Sum(p.amount) OVER(partition BY s.id, p.datetime) AS "Daily Amount",
        row_number() OVER(partition BY s.id, p.datetime ORDER BY p.amount) AS rn
    FROM payments AS p
    LEFT JOIN users AS s ON p.s_h_uuid = s.h_uuid
) t
ORDER BY datetime DESC, id, amount
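
As the comments under this answer point out, if datetime carries a time of day, the partition has to be on the date part, and the desired output in the question totals across all users for a day. The following is only a minimal sketch of that variant, not the query above: the user_id column name and the times of day are made up for illustration, and the rows are a subset of the question's sample data.

```sql
-- Hypothetical sample data: user_id and the times of day are assumptions.
WITH payments (user_id, datetime, amount) AS (
    VALUES
        (2, TIMESTAMP '2021-05-01 09:00', 100.00),
        (3, TIMESTAMP '2021-05-01 17:30', 100.00),
        (2, TIMESTAMP '2021-05-02 12:00', 100.00)
),
numbered AS (
    SELECT
        user_id,
        datetime::date AS pay_date,
        amount,
        -- Total of all payments made on the same calendar day.
        SUM(amount) OVER (PARTITION BY datetime::date) AS daily_amount,
        -- Numbers the rows within each day; rn = 1 marks the first one.
        ROW_NUMBER() OVER (PARTITION BY datetime::date ORDER BY user_id) AS rn
    FROM payments
)
SELECT
    pay_date,
    user_id,
    amount,
    -- Show the total on the first row of each day only; other rows get NULL.
    CASE WHEN rn = 1 THEN daily_amount END AS "Daily Amount"
FROM numbered
-- Same ordering inside each day as in ROW_NUMBER(), so rn = 1 is displayed first.
ORDER BY pay_date, user_id;
```

With these three rows, the first 2021-05-01 row should carry 200.00, the second should be NULL, and the 2021-05-02 row should carry 100.00.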
Serg
  • Thanks for this answer, but it still displays the amount once for each row. I want to avoid it to be displayed multiple times for the same "Date". So the desired output is to have it displayed as a cumulative amount only once per date. If the date is present in another column, I want the "Daily Amount" to be blank or some other value like "NA" – gagneet Jun 01 '21 at 07:03
  • Then first you need a more specific `ORDER BY` in the query to define the order of rows in the output. For example `ORDER BY s.id, p.datetime, p.amount` . And use it to compute row_number () and show only first one amount. What is the desired order of rows in the output? – Serg Jun 01 '21 at 07:11
  • Okay, will try with a tuple in the ORDER BY clause. The order I want is by the Date field. I was looking at the IF and WHERE clauses, but am unable to formulate the required SQL using those, as the query is a row by row one and for what I want, it might require the whole query to be generated and then a sub-query, which takes the required columns and sorts it in that manner? – gagneet Jun 01 '21 at 08:04
  • Thanks a lot for this. I think just a small change there, as we are looking at the date, would need to convert the datetime to just a date(), else if the time is different, it will start as a new 'rn' again. – gagneet Jun 01 '21 at 11:42

If you want the value only once per date, then use row_number():

select p.*,
       (case when 1 = row_number() over (partition by p.datetime::date order by p.p_uuid)
             then sum(p.amount) over (partition by p.datetime::date)
        end) as day_payments
from payments p
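
If the other rows should read "NA" rather than be empty, as the question mentions, one possible way is to cast the total to text and wrap the expression in COALESCE. This is a sketch under assumptions: pay_date is a hypothetical date-only column, and the rows are a subset of the question's sample payments.

```sql
-- Hypothetical sample data; pay_date is an assumed date-only column.
WITH payments (p_uuid, pay_date, amount) AS (
    VALUES
        (1, DATE '2021-05-03', 100.00),
        (3, DATE '2021-05-03', 100.00),
        (4, DATE '2021-05-03', 100.00),
        (4, DATE '2021-05-06', 100.00)
)
SELECT
    p.pay_date,
    p.p_uuid,
    p.amount,
    COALESCE(
        -- Daily total on the first row of each date, NULL on the others.
        (CASE WHEN ROW_NUMBER() OVER (PARTITION BY p.pay_date ORDER BY p.p_uuid) = 1
              THEN SUM(p.amount) OVER (PARTITION BY p.pay_date)
         END)::text,
        -- The NULL rows are shown as 'NA'.
        'NA'
    ) AS day_payments
FROM payments AS p
ORDER BY p.pay_date, p.p_uuid;
```

The trade-off is that day_payments becomes a text column, so a tool reading the result can no longer aggregate it as a number. Leaving the value NULL keeps it numeric.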
Gordon Linoff