11

Let's say I'm using Django to manage a database about athletes:

class Player(models.Model):
    name = models.CharField()
    weight = models.DecimalField()
    team = models.ForeignKey('Team')

class Team(models.Model):
    name = models.CharField()
    sport = models.ForeignKey('Sport')

class Sport(models.Model):
    name = models.CharField()

Let's say I wanted to compute the average weight of the players on each team. I think I'd do:

Team.objects.annotate(avg_weight=Avg(player__weight))

But now say that I want to compute the variance of team weights within each sport. Is there a way to do that using the Django ORM? How about using the extra() method on a QuerySet? Any advice is much appreciated.

8one6
  • 13,078
  • 12
  • 62
  • 84
  • Could you give us an example of the sort of calculation you'd want to do? I'm assuming you mean avg_weight - specific_player.weight. – schillingt Oct 25 '13 at 21:25
  • 1
    You can't reference a field created by an annotate call in the extra call. However, you could try something like this one: http://stackoverflow.com/a/9284364/1637351 – schillingt Oct 25 '13 at 21:38
  • The specific example from the original post (variance) is the one I had in mind. But just for variety, I might also want to know, for each sport, what is the Maximum average weight of each team in that sport. Does that make sense? – 8one6 Oct 26 '13 at 00:55

2 Answers2

0

you can use query like this :

class SumSubquery(Subquery):
    template = "(SELECT SUM(`%(field)s`) From (%(subquery)s _sum))"
    output_field = models.Floatfield()
    def as_sql(self, compiler, connection, template=None, **extra_context):
        connection.ops.check_expression_support(self)
        template_params = {**self.extra, **extra_context}
        template_params['subquery'], sql_params = self.queryset.query.get_compiler(connection=connection).as_sql()
        template_params["field"] = list(self.queryset.query.annontation_select_mask)[0]
        sql = template % template_params
        return sql, sql_params



Team.objects.all().values("sport__name").annotate(variance=SumSubquery(Player.objects.filter(team__sport_id=OuterRef("sport_id")).annotate(sum_pow=ExpressionWrapper((Avg("team__players__weight") - F("weight"))**2,output_field=models.Floatfield())).values("sum_pow"))/(Count("players", output_field=models.FloatField())-1))

and add related name to model like this:

class Player(models.Model):
    name = models.CharField()
    weight = models.DecimalField()
    team = models.ForeignKey('Team', related_name="players")
0

I'm going to assume (perhaps incorrectly) that you mean by 'variance' the difference between maximum and minimum weights. If so, you can generate more than one aggregate with a single query, like so:

from django.db.models import Avg, Max, Min

Team.objects.aggregate(Avg('player__weight'), Max('player__weight'), Min('player__weight'))

This is taken from the django docs on generating aggregation over a queryset.

Milo Persic
  • 985
  • 1
  • 7
  • 17