8

I'm using Django 1.6 with PostgreSQL and have following model:

# models.py
class Game(AbstractContentModel, AbstractScoreModel):
    name = models.CharField(_("name"), max_length=100, blank=True)
    developer = models.CharField(_('Developer'), max_length=255)
    distributor = models.CharField(_('Distributor'), max_length=255, blank=True)
    # ...
    reviews = models.ManyToManyField(Review, related_name="games", blank=True, verbose_name=_("Reviews"))
    videos = models.ManyToManyField(Video, related_name="games", blank=True, verbose_name=_("Videos"))
    images = models.ManyToManyField(Image, related_name="games", blank=True, verbose_name=_("Gallery"))

I'm trying to get all games and to each of them add count of related videos, reviews and images as follows:

# views.py
qs = Game.objects.all()
qs = qs.annotate(video_count=models.Count('videos'))
qs = qs.annotate(review_count=models.Count('reviews'))
qs = qs.annotate(image_count=models.Count('images'))

Result query is:

SELECT 
"content_game"."id", 
"content_game"."name",
"content_game"."developer", 
"content_game"."distributor",
COUNT("content_game_videos"."video_id") AS "video_count",
COUNT("content_game_reviews"."review_id") AS "review_count", 
COUNT("content_game_images"."image_id") AS "image_count" 
FROM "content_game" 
LEFT OUTER JOIN "content_game_videos" ON ( "content_game"."id" = "content_game_videos"."game_id" )
LEFT OUTER JOIN "content_game_reviews" ON ( "content_game"."id" = "content_game_reviews"."game_id" ) 
LEFT OUTER JOIN "content_game_images" ON ( "content_game"."id" = "content_game_images"."game_id" ) 
GROUP BY 
"content_game"."id", 
"content_game"."name",
"content_game"."developer", 
"content_game"."distributor";

My question is - why there are all selected fields in GROUP BY clause? And most importantly, how can I get rid of them besides raw SQL?

I gathered that it would be good to use .values(), but I want every field in result query. If I just use GROUP BY "content_game"."id", the result is the same, but I don't know how to use it like this with Django ORM.

Flaiming
  • 413
  • 2
  • 14
  • 1
    `My question is - why there are all selected fields in GROUP BY clause? And most importantly, how can I get rid of them besides raw SQL?` Postgresql is not MySQL [more_info](http://stackoverflow.com/questions/33629168/group-by-clause-in-mysql-and-postgresql-why-the-error-in-postgresql) Just try to execute raw query with only "content_game"."id" in the `GROUP BY` – Lukasz Szozda Dec 15 '15 at 08:58
  • @lad2025 Thanks, the issue is really with MySQL vs PostgreSQL. But I don't want to use raw SQL if I don't really have to. I found out that this bug was fixed by https://code.djangoproject.com/ticket/19259 , nevertheless I don't see the solution to it. Btw Django version is 1.6, I will add it to my question. – Flaiming Dec 15 '15 at 09:51
  • At the end I solved it by NOT counting related items in query. I have created fields images_count and videos_count on Game model and I'm updating those fields on save and delete. At least this way it will be always optimal for viewing :) – Flaiming Dec 19 '15 at 18:59
  • This might be helpful: https://docs.djangoproject.com/en/1.11/topics/db/aggregation/#order-of-annotate-and-values-clauses – wm3ndez Jan 26 '18 at 14:01

1 Answers1

2

I'm too late to answer but I have kind of studied answer. First of all, the behavior what OP said, is correct. And it should be for all kind of relational database, but as @lad2025 said in the comment the only MySQL (MariaDB as well) was excusing to this behavior. So instead of getting rid of it we must learn it. As this is the only correct way to use it / understand it and get used to with it.

Currently OP has found solution that is workaround of keep updating the table fields with counts of rows somewhere and populate later instead of retrieving it data on the fly by query.

As whatever you add into GROUP BY must be included into SELECT and vice versa while using aggregative functions.

"content_game"."id", "content_game"."name", "content_game"."developer",  "content_game"."distributor"

Above fields must come into GROUP BY because if I try to print id or any field from the selection which row's field (single) value this should return? either last or first row from the group bunch which is incorrect (which MySQL does). So better keep all into GROUP BY.

Why? Because of ambiguity - group by produces a single record for each group of records. Group By columns contain the same value for all records in the group and any aggregative function produces the single value for a group, but "content_game"."developer" can be different for "content_game"."distributor" group "content_game"."id" will be different for all rows as it's primary key. Same for all fields.

Good explanation can be found here

And I would like people to share (edit my post) more statement in more nicer way. So other people can found this useful in the future.

Anup Yadav
  • 2,825
  • 3
  • 21
  • 30