10

How can I do a GROUP BY in Django without invoking an aggregate function like Max, Sum, Avg, etc?

In my case, I have a table that has columns like { local_path, parent, grandparent }, and I want to do SELECT local_path FROM ... WHERE grandparent = "foo" GROUP BY parent. There will be multiple rows with a grandparent of "foo" and the same parent, but I just want one (any one) of their local_path's.

As you can see I don't want to take an aggregate value of anything. And I couldn't get distinct() to work because I want to find non-distinct local_paths.

I've searched and read the documentation without any luck. Thanks!

adam smith
  • 764
  • 7
  • 16

3 Answers3

7

Here's a trick that works to get Django to execute a GROUP BY query without running an aggregate function. We make a custom Func subclass that is treated as an aggregate function by Django but actually just evaluates to NULL in a SQL query.

from django.db.models import Func, CharField


class NullAgg(Func):
    """Annotation that causes GROUP BY without aggregating.

    A fake aggregate Func class that can be used in an annotation to cause
    a query to perform a GROUP BY without also performing an aggregate
    operation that would require the server to enumerate all rows in every
    group.

    Takes no constructor arguments and produces a value of NULL.

    Example:
        ContentType.objects.values('app_label').annotate(na=NullAgg())
    """
    template = 'NULL'
    contains_aggregate = True
    window_compatible = False
    arity = 0
    output_field = CharField()
Steve Jorgensen
  • 11,725
  • 1
  • 33
  • 43
1

You could use order_by('parent') then in the template use {% ifchanged %} to only show one of each. https://docs.djangoproject.com/en/dev/ref/templates/builtins/?from=olddocs#ifchanged

Ted
  • 12,122
  • 4
  • 31
  • 39
0

Does the SQL query work if you GROUP BY parent with a WHERE grandparent = and then you SELECT local_path? Shouldn't it complain that you are only allowed to SELECT aggregates of the "grouped-by" clause?

I want to find non-distinct local_paths

You don't say, so let me assume that local_path is a string, and that parent and grandparent are ForeignKeys in a Model called Person.

from collections import defaultdict
paths = defaultdict(list)  # parent_id -> list of paths to grandpa 'foo'

for parent_id in User.objects.filter(grandparent='foo').only('parent__id'):
    for path in User.objects.filter(grandparent='foo', parent__id=parent_id) \
        .only('local_path'):
        paths[parent__id] = path.local_path

paths will be a dict holding the values of local_path for different "parents" with grandparent='foo'

Roshan Mathews
  • 5,788
  • 2
  • 26
  • 36
  • Yes the SQL query in the original post actually works. I'm not a SQL expert but if you do a GROUP BY without an aggregate function, it just returns one matching row for each value in the GROUP BY clause. Just by looking at your code I think it would work. I was just hoping there'd be an easy way to do it with one DB query. I suppose I'll just go with raw SQL. – adam smith Aug 17 '11 at 01:54