2

Let's imagine I have two models: User and Event:

class User(models.Model):
    email = models.EmailField(_('Email'), max_length=255, unique=True, db_index=True)

class Event(models.Model):
    class Type(models.IntegerChoices):
        ONE = 1, 'One'
        TWO = 2, 'Two'

    type = models.PositiveSmallIntegerField('Type', default=Type.ONE, choices=Type.choices, null=False, blank=False)
    when = models.DateTimeField(_('When?'), blank=True, null=True)

Now I have a queryset that is defined like below, to filter on users whose number of events are a multiple of 5. Using the db-side function Mod().

cycle_size = 5

users = User.objects.annotate(
    event_count=Coalesce(Subquery(
        Event.objects.filter(
            user__pk=OuterRef('pk')
        ).order_by(
            'user__pk'
        ).annotate(
            count=Count('pk', distinct=True)
        ).values(
            'count'
        ),
        output_field=models.IntegerField()
    ), 0),
    event_cycle_mod=Mod(F('event_count'), Value(cycle_size ))
).filter(
    event_count__gt=0,
    event_cycle_mod=0
)

It works.

But

The resulting SQL query that is generated looks like the following:

SELECT
  `user`.`id`,
FROM
  `user_user`
WHERE
  (
    `user_user`.`id` IN (
      SELECT
        V0.`id`
      FROM
        `user_user` V0
      WHERE
        (
          COALESCE(
            (
              SELECT
                COUNT(DISTINCT U0.`id`) AS `count`
              FROM
                `event_event` U0
              WHERE
                (
                  U0.`user_id` = V0.`id`
                )
              GROUP BY
                U0.`user_id`
              ORDER BY
                NULL
            ),
            0
          ) > 0
          AND MOD(
            COALESCE(
              (
                SELECT
                  COUNT(DISTINCT U0.`id`) AS `count`
                FROM
                  `event_event` U0
                WHERE
                  (
                    U0.`user_id` = V0.`id`
                  )
                GROUP BY
                  U0.`user_id`
                ORDER BY
                  NULL
              ),
              0
            ),
            5
          ) = 0.0
        )
    )
  )

The question is the following: is it normal that the whole COALESCE() and COUNT() function calls get repeated? Would it be normal to expect a AS COUNT and a HAVING or similar, instead of that duplication of code? Is there a way to avoid that duplication that I am not aware of? :-)

I have noticed this "bug" that seems to be related, but it has been flagged as fixed for a long while: https://code.djangoproject.com/ticket/26658

If useful: running django 3.1.3 and Mysql 5.7.xx.

(please ignore the nesting of bits like user_user.id IN (SELECT V0.id FROM user_user V0 -- they are the result of a dynamic building of the queryset).

Thanks!

Hal
  • 537
  • 4
  • 13

0 Answers0