2

Say I have three models:

from django.db import models    

class X(models.Model):
    y = models.ForeignKey(Y, on_delete=models.PROTECT)
    z = models.ForeignKey(Z, on_delete=models.PROTECT)
    a = models.DecimalField(...)
    b = models.DecimalField(...)
    ...some fields...

class Y(models.Model):
   ...some fields...

class Z(models.Model):
   ...some fields...

I would like to use the ORM to do an aggregate query using the X model, but I don't want to group by X.id (I want to group by X.a) and I want the resulting objects to have the Y and Z related models pre-populated in them (to avoid repeated queries for the related data).

As far as I am aware, Django's aggregrate querying only allows you to group by the model's ID (kind of useless in my opinion), or to group by another field and then collapse the results into a single record/object.

There is no middle ground allowing you to group by a field other than the ID of the model and not collapse the results into a single record/object.

I realise this is most likely because this middle ground doesn't fit in nicely with the concept of models (because if you group multiple model records together, they really aren't model instances anymore), but it would be great to be able to utilise the power of the ORM and achieve this middle ground.

Currently I had to sidestep the ORM and roll my own custom model implementation and use raw queries, but it's a bit hacky and I was hoping with the release of Django 1.8 and it's new querying capabilities I might be able to move back to the ORM, but it seems like it's not the case.

The closest I can get is like so:

X.objects.all().values('a', 'y', 'z').annotate(temp=Sum('b'))

...which translates to this SQL:

SELECT app_x.a, app_x.y_id, app_x.z_id, SUM(app_x.b) AS temp
FROM app_x LEFT OUTER JOIN app_y ON (app_x.y_id = app_y.id)
GROUP BY app_x.a, app_x.y_id, app_x.z_id

The SQL I want from the ORM is something like this:

SELECT app_x.a, app_y.*, app_z.*, SUM(app_x.b) AS temp
FROM app_x LEFT OUTER JOIN app_y ON (app_x.y_id = app_y.id)
   LEFT OUTER JOIN app_z ON (app_x.z_id = app_z.id)
GROUP BY app_x.a, app_y.*, app_z.*

...so that I don't just get back related IDs, but full model instances which can be used right away and avoid future queries.

Does the ORM allow for this?

The only solution I can see is to manually define all fields from the related models in the .values() call, but that doesn't give back a queryset and it's a bit tedious to have to manually list all the necessary related fields (as opposed to just doing a .select_related() call).

Edit 1

This would be easy if the .values() call returned actual related model objects instead of just their ID values.

I.e. I think a query like this:

X.objects.all.values('y')

...should return a list of dictionaries that contain Y objects, not Y IDs.

A query like so:

X.objects.all.values('y_id')

...should return a list of dictionaries containing IDs, because it is what you ask for.

Does anyone agree with me?

pleasedesktop
  • 1,395
  • 3
  • 14
  • 25

1 Answers1

1

Does X.objects.annotate(temp=Sum('b')).select_related('y','z') work for you?

select_related docs

edit: Just seen that you ruled that out in your question. I've not tried it, but could values_list() maybe return model instances for foreign key fields? If not, and the reason you're using values() is a lot of unneeded field on model X, maybe look into using defer/only

skolsuper
  • 629
  • 1
  • 6
  • 21
  • Yes and no. Yes it does return related model objects, but no because it groups X by "id", which I want to avoid. Using .values() seems to be the only way to avoid grouping by the model's "id" field. – pleasedesktop May 08 '15 at 01:32
  • See my edit. Failing that, you could also fall back to filtering on the Y and Z models, e.g. `related_zs = Z.objects.filter(x_set__in=x_values)`. It's 2 extra queries but then you can group them however you need. – skolsuper May 08 '15 at 01:39
  • .values_list() just returns a list of lists, as opposed to a list of dicts. Your related solution definitely looks plausible, but I was wondering if there is a way to do it all in the one query. If not, then I will go with your suggestion. – pleasedesktop May 08 '15 at 02:50