0

I have postgres query, and I want to represent it using Django QuerySet builder

I have a table: history_events

date -------------------------- amount
2019-03-16 16:03:11.49294+05    250.00
2019-03-18 14:56:30.224846+05   250.00
2019-03-18 15:07:30.579531+05   250.00
2019-03-18 20:52:53.581835+05   5.00
2019-03-18 22:33:21.598517+05   1000.00
2019-03-18 22:50:57.157465+05   1.00
2019-03-18 22:51:44.058534+05   2.00
2019-03-18 23:11:29.531447+05   255.00
2019-03-18 23:43:43.143171+05   250.00
2019-03-18 23:44:47.445534+05   500.00
2019-03-18 23:59:23.007685+05   250.00
2019-03-19 00:01:05.103574+05   255.00
2019-03-19 00:01:05.107682+05   250.00
2019-03-19 00:01:05.11454+05    500.00
2019-03-19 00:03:48.182851+05   255.00

and I need to build graphic using this data with step-by step incrementing amount sum by dates

This SQL collects correct data:

with data as (
  select
    date(date) as day,
    sum(amount) as day_sum
  from history_event
  group by day
)
select
  day,
  day_sum,
  sum(day_sum) over (order by day asc rows between unbounded preceding and current row)
from data

But I can not understand how to build correct Queryset expression for this

Another problem - there is no data for some days, and they do not appear on my graph

1 Answers1

4

Nested queries like yours cannot be easily defined in ORM syntax. Subquery is limited to correlated subqueries returning a single value. This often results in contorted and inefficient ORM workarounds for queries that you can easily express in SQL.

In this case, you can use two Window functions combined with a distinct clause.

result = (Event.objects
    .values('date', 'amount')
    .annotate(day_sum=Window(
        expression=Sum('amount'),
        partition_by=[F('date')],
    ))
    .annotate(total=Window(
        expression=Sum('amount'),
        frame=RowRange(start=None, end=0),
        order_by=F('date').asc(),
    ))
    .distinct('date')       
    .order_by('date', '-total')
)

You need to order by '-total' as otherwise distinct discards the wrong rows, leaving you with less than the correct amounts in total.

As to the missing days; SQL has no inherent concept of calendars (and therefore of missing dates) and unless you have lots of data, it should be easier to add missing days in a Python loop. In SQL, you would do it with a calendar table.

Endre Both
  • 5,540
  • 1
  • 26
  • 31