I'm trying to aggregate values down to a specific level with Django, but it does not seem straighforward : aggregation occurs at the full queryset or at the object level, but not 'in-between'.
Here is my query : I need to compute a total activity time by year from multiple event entries.
events.annotate( # Extracting the year
year = ExtractYear('dtstart')
).annotate( # Computing duration...
time_span= ExpressionWrapper(
F('dtend') - F('dtstart'),
output_field=IntegerField()
)
).values( # ... then aggregating by year
'year', 'time_span' #!!!! ... BUT ALSO unfortunately
# by 'time_span' !!!!!!!!!!!!!!!
).annotate( # otherwise 'time_span' would not
total_span=Sum('time_span') # be available for a new computation.
).values( # Therefore, the result is not
'year', 'total_span' # grouped by year, but by the
).order_by( # distinct combination of 'year'
'year' # AND 'total_span'
)
As a consequence, I get this result :
<QuerySet [
{'total_span': 1800000000, 'year': 2016},
{'total_span': 7200000000, 'year': 2016},
{'total_span': 2700000000, 'year': 2016},
{'total_span': 14400000000, 'year': 2016},
{'total_span': 8100000000, 'year': 2017}, ...>
instead of something like this :
<QuerySet [
{'total_span': 16700000000, 'year': 2016},
{'total_span': 19800000000, 'year': 2017}, ...>
In order to set the aggregation level, I would need something like :
aggregate(total_span=Sum('time_span'), group_by='year')
I guess I should turn to subquery or custom aggregation class, but this is beyond my skills so far. Any hint would be warmly appreciated...