9

On the face of it, this might seem like a weird thing to request but I need a comma-separated string of related items for a model. Taking the Author/Book model example from any tutorial, here's what I'm currently doing:

authors = Authors.objects.all().prefetch_related('books')
for author in authors:
    author.book_titles = ', '.join([book.title for book in author.books.all()])

It's by no means heavy, but it feels redundant. Like the database could be doing it. In an ideal world, I feel like I should able to annotate this on with one of these fancy new database functions. Here's a fantasy example, using a made-up function called Joiner(..):

Authors.objects.annotate(book_titles=Joiner('books__title', separator=', ')

Is that possible? If so, how?

Oli
  • 235,628
  • 64
  • 220
  • 299

5 Answers5

23
from django.db.models import Aggregate, CharField, Value

class GroupConcat(Aggregate):
    function = 'GROUP_CONCAT'
    template = '%(function)s(%(expressions)s)'

    def __init__(self, expression, delimiter, **extra):
        output_field = extra.pop('output_field', CharField())
        delimiter = Value(delimiter)
        super(GroupConcat, self).__init__(
            expression, delimiter, output_field=output_field, **extra)

    def as_postgresql(self, compiler, connection):
        self.function = 'STRING_AGG'
        return super(GroupConcat, self).as_sql(compiler, connection)

Usage:

Author.objects.annotate(book_titles=GroupConcat('book__title', ', '))

Custom aggregates. This should work on SQLite, MySQL and PostgreSQL.

djvg
  • 11,722
  • 5
  • 72
  • 103
Vladimir Danilov
  • 2,338
  • 14
  • 15
  • 4
    Wowee. That's scary (and awesome) looking. – Oli Jun 24 '16 at 16:08
  • 2
    I've pushed a few little changes to make this work, but yeah, this now works REALLY well (2s→400ms). Thank you. – Oli Jun 24 '16 at 16:21
  • 6
    There is also a `StringAgg` function built into `django.contrib.postgres.aggregates` which handles this now. The above approach is still better if you need to support more than one database. Maybe import the below answer as a `as_mysql` function if you need many databases to use the same code. – Oli Apr 25 '17 at 11:56
  • @Oli would. you kindly publish your "few little changes" here, as well, please? – M.Ionut Sep 07 '22 at 11:30
  • 1
    @M.Ionut I did, it was the second revision. Missing import, missing self. – Oli Sep 07 '22 at 12:55
1

the accepted answer doesn't pass the separator correctly in MySQL

this lets you set the separator and also specify DISTINCT:

class GroupConcat(Aggregate):
    function = 'GROUP_CONCAT'
    template = '%(function)s(%(distinct)s%(expressions)s%(separator)s)'

    def __init__(self, expression, distinct=False, separator=None, **extra):
        output_field = extra.pop('output_field', CharField())
        distinct = 'DISTINCT ' if distinct else ''
        separator = " SEPARATOR '{}'".format(separator) if separator is not None else ''
        super(GroupConcat, self).__init__(
        expression, separator=separator, distinct=distinct, output_field=output_field, **extra)

see also: GROUP_CONCAT equivalent in Django

Community
  • 1
  • 1
aoxborrow
  • 48
  • 5
  • 1
    sorry, but actually your solution is the one that raises an error... I get `TypeError: GroupConcat does not allow distinct.` Or maybe we need to add `allow_distinct = True`? as per @Ali Dabour 's comment here: https://stackoverflow.com/a/31337612/10632656? – M.Ionut Sep 07 '22 at 12:47
0

I don't have the rep to post a comment, but I should add that while both answers will work if using MySQL (just pass a delimiter of "" for the accepted answer to work), the group_concat_max_len is set to 1024 by default. This means your comma-separated string will be truncated at 1024 characters. To increase this value, use the myql command:

SET GLOBAL group_concat_max_len = 1000000;

This will set all the group_concat_max_len of all your databases to 1 million characters, which should be enough in most cases.

vtbiggi
  • 183
  • 1
  • 2
  • 12
0

Not sure if this would be helpful to anyone but I needed it to be able to handle distinct values (only include each item once in the aggregate list). I do not know if this will work for anything but postgres.

class GroupConcat(Aggregate):
    function = 'GROUP_CONCAT'

    def __init__(self, expression, delimiter=', ', distinct=False, **extra):
        output_field = extra.pop('output_field', CharField())
        delimiter = Value(delimiter)
        self.template = '%(function)s(DISTINCT %(expressions)s)' if distinct else '%(function)s(%(expressions)s)'
        super(GroupConcat, self).__init__(
            expression, delimiter, output_field=output_field, **extra)

    def as_postgresql(self, compiler, connection):
        self.function = 'STRING_AGG'
        return super(GroupConcat, self).as_sql(compiler, connection)
Ashley H.
  • 689
  • 4
  • 11
0

Thi is a pretty old question but I've scoured the internet and cannot find a GroupConcat that works for different databases. If you are like me and use sqlite for testing and something else for production. Or, if you simply want things to work regardless of the backend you are using give this a try.

Confirmed working in sqlite3. Should work in MySQL. May work in Postgresql.

from django.db.models import Aggregate, CharField, Value
from django.conf import settings


class GroupConcat(Aggregate):
    function = "GROUP_CONCAT"
    template = ""

    def __init__(self, expression=None, delimiter=None, **extra):
        db_engine = settings.DATABASES["default"]["ENGINE"]
        if db_engine == "django.db.backends.sqlite3":
            self.as_sqlite3(expression, delimiter, extra)
        elif db_engine == "django.db.backends.mysql":
            self.as_mysql(expression, extra)
        elif db_engine == "django.db.backends.postgresql":
            return self.as_postgresql(extra.pop("compiler"), extra.pop("connection"))
        else:
            raise NotImplementedError(
                f"DB Engine {db_engine!r} not supported for {self.function!r}"
            )

    def as_mysql(self, expression, extra):
        self.template = "%(function)s(%(distinct)s%(expressions)s%(separator)s)"
        output_field = extra.pop("output_field", CharField())
        distinct = "DISTINCT " if distinct else ""
        separator = " SEPARATOR '{}'".format(separator) if separator is not None else ""
        super(GroupConcat, self).__init__(
            expression,
            separator=separator,
            distinct=distinct,
            output_field=output_field,
            **extra,
        )

    def as_sqlite3(self, expression, delimiter, extra):
        self.template = "%(function)s(%(expressions)s)"
        output_field = extra.pop("output_field", CharField())
        delimiter = Value(delimiter)
        super(GroupConcat, self).__init__(
            expression, delimiter, output_field=output_field, **extra
        )

    def as_postgresql(self, compiler, connection):
        self.function = "STRING_AGG"
        return super(GroupConcat, self).as_sql(compiler, connection)
Matt Henry
  • 31
  • 5