This is related to this question, however it is slightly different.
For simplicity, say I have the models (using Django 1.7):
class TimesheetEntry(models.Model):
date = models.DateField()
employee = models.CharField(max_length=255)
hours = models.FloatField()
class OnCallEntry(models.Model)
timesheet = models.ForeignKey(TimesheetEntry)
minutes = models.FloatField()
comment = models.CharField(max_length=255)
Where for a particular TimesheetEntry/day there can be multiple OnCallEntries, each with the amount of minutes spent on call. I want to query for the sum of all hours the employee worked, as well as the total minutes spent on call.
To do this I have the query:
query = TimesheetEntry.objects.all().filter(employee="John Smith").aggregate(
total_hours = Sum('hours'),
total_oncall_minutes = Sum('oncallentry__minutes')
)
However, when there are more than one OnCallEntry for a given TimesheetEntry the hours for an individual TimesheetEntry multiplies by the number of OnCallEntries, causing the total_hours Sum to get skewed. (From what I read this is due to the way django handles joins). And when I dont include the total_oncall_minutes part, it works as expected.
For the total_hours part, I only want to get the Sum for distinct TimesheetEntries. And I would prefer to do it all in one query, as in reality I have even more aggregate functions I want to perform, with other related models.
I have tried all variations on adding the distinct keyword in, ex:
...Sum('hours', distinct=True)..
...Sum('oncallentry__minutes', distinct=True)..
...filter(employee="John Smith").distinct().aggregate(...
But these didnt seem to work. Any help would be appreciated!
Thanks