0

Relevant models are:

class Score(models.Model):
    chouette = models.ForeignKey(
        Chouette, on_delete=models.CASCADE, related_name="scores"
    )
    game = models.IntegerField(blank=True, null=True)
    player = models.ForeignKey(User, on_delete=models.CASCADE, related_name="scores")
    position = models.CharField(max_length=10, blank=True, null=True)
    score = models.DecimalField(max_digits=10, decimal_places=3, blank=True, null=True)

    objects = ScoreManager()

class User(AbstractUser):
    name = CharField(_("Name of User"), blank=True, max_length=255)
    handle = CharField(max_length=10)
    xp = IntegerField(default=0)
    objects = CustomUserManager()

I have a custom manager for User that builds a queryset with a lot of annotations, but the one I am struggling with relates to Score. I am trying to calculate a total_winnings annotation that does a subquery on each User and multiplies the score field in Score for each Score instance associated with a user by a field in the Chouette model (not shown--just a decimal field called stake). I have tried several variations of this as my subquery:

def build_chwinnings(self):
    total_winnings = self.filter(pk=OuterRef("pk")).annotate(
        total_winnings = Sum(F("scores__score") * F("scores__chouette__stake")
        )
    )
    return self.annotate(total_winnings=Subquery(total_winnings.values("total_winnings"))

But they all result in the following exception:

django.db.utils.ProgrammingError: more than one row returned by a subquery used as an expression

The "easiest" way to do this would be to add the calculated value (scores__score * scores__chouette__stake) as a field to the Score model, but all the boo birds hate duplicating data within the database. Adding it as a property to Score or to the queryset for Score don't work, because those aren't accessible through the subquery.

Is there a pretty way to do this, or does it make the most sense to add a field in Score and update it on every save? (The stake value in Chouette doesn't typically get changed after the fact, so I'm unlikely to have to worry about any signals, and in reality, the Score instances are similarly "fixed" after their creation.)

sputman
  • 13
  • 6
  • In your code I think you are using `total_winnings` and `total_score` inconsistently. – schillingt Apr 14 '21 at 22:45
  • Have you verified that doing the annotation `Sum(F("scores_score") * F("scores__chouette__stake") )` doesn't cause duplicates in the query? And then after that I would try using `Subquery(total_score.values('total_score')[:1])` – schillingt Apr 14 '21 at 22:47
  • I don't understand why you even need to use that subquery there? `self.annotate(total_score=Sum(F("scores_score") * F("scores__chouette__stake")))` should be enough? – Abdul Aziz Barkat Apr 15 '21 at 04:26
  • @schillingt Fixed the code typos, thanks--actual code was in disarray, so that was a copy-paste error to SO. Subscripting worked--thanks! – sputman Apr 15 '21 at 05:05
  • @AbdulAzizBarkat This issue may be resolved in the ORM by this point, but the queryset has a long chain of annotations, and at the time I initially wrote it, chaining annotations could (and did, in my case) result in strange results in the later annotations. See [this post](https://stackoverflow.com/questions/12108710/multiple-annotate-sum-terms-yields-inflated-answer?noredirect=1&lq=1), for example. – sputman Apr 15 '21 at 05:10
  • @sputman that's informative, unfortunately looking at the ticket it is still not solved (after 12 years!), it is even present in the [documentation](https://docs.djangoproject.com/en/3.2/topics/db/aggregation/#combining-multiple-aggregations) – Abdul Aziz Barkat Apr 15 '21 at 05:31
  • Subscripting may get rid of the exception, but give an incorrect result. If there are multiple rows in the result set, they probably all need to be summed, not take only the first one. – Brad Martsberger Apr 16 '21 at 00:06

1 Answers1

1

You are probably missing the group by on the player column in the subquery. Here is an example that should work. You can adapt this to work in your Manager.

from django.db.models import Subquery, Sum, F, OuterRef

tw_subquery = Subquery(Score.objects.values(
    'player'  # Required to group the annotation by the player
).annotate(
    total_winnings = Sum(F("score") * F("chouette__stake")
).filter(
    player_id=OuterRef('id')
).values(
    'total_winnings'  # required to select only one column
))

users = User.objects.annotate(total_winnings=tw_subquery)

This kind of subquery is handled for you by the django-sql-utils package. This will allow you to do:

from sql_util import SubquerySum

tw_subquery = SubquerySum(F("scores__score") * F("scores__chouette__stake"))
users = User.objects.annotate(total_winnings=tw_subquery)

As a totally separate issue, it looks to me like in your build_chwinnings method that

total_winnings = self.filter(...)

is not right. This queryset should be on the Score model, but self is going to be a queryset on the User model if this is a Manager for the User model. But I'm not sure I'm seeing enough of the code to be certain of this.

Brad Martsberger
  • 1,747
  • 13
  • 7
  • Thanks; this works as well, and django-sql-utils looks very cool. One question for my benefit: is there a performance (or other) difference between running the Subquery on the `Score` model (grouping by `player` and then multiplying `score` and `chouette__stake`) vs running it on `User` (with `scores__score` and `scores__chouette__stake`)? It works both ways, and I got substantially identical query times . . . I gather your package is rewriting the example query using `SubquerySum` above to use the `Score` model? – sputman Apr 15 '21 at 23:14
  • Doing the subquery on the `Score` instead of the `User` is one fewer join. In almost all cases, you'll never notice a performance difference. But one fewer is one fewer and I don't think it should ever be worse. Database query optimizers are pretty good, and may even do the exact same work in both cases. But I'm certain the database will never do more work in the case with fewer joins. Yes, SubquerySum will try to use the least number of joins. – Brad Martsberger Apr 16 '21 at 00:04
  • Very slick--look forward to giving it a try; thanks again. – sputman Apr 16 '21 at 00:54