In the book that I am reading, Count from django.db.models is used to count the number of tags that multiple posts have in common. Tags are handled with taggit. But I am quite confused about how this functionality works:

post_tags_ids = post.tags.values_list('id', flat=True)
similar_posts = Post.published.filter(tags__in=post_tags_ids).exclude(id=post.id)
similar_posts = similar_posts.annotate(same_tags=Count('tags')).order_by('-same_tags', '-publish')[:4]

In my case, I have several posts and each has two tags. Each post shares only one tag with the other posts. First I get the list of ids of the tags attached to the post I am interested in. Then I use that list to select the other posts that have any of those tags, excluding the post itself. Finally I annotate each of those posts with a same_tags field that counts its tags.

My confusion is here: as I said, each post has two tags, but magically same_tags comes out as one, which is the number of tags that the post shares with the post I am interested in. How is this happening? Here is my test in the shell:

>>> posts
[<Post: This is the other one >, <Post: Some content >, <Post: Other Post>, <Post: Temurs learning curv>]
>>> post = Post.objects.get(pk=1) 
>>> post_tags_ids = post.tags.values_list('id', flat=True)
>>> post_tags_ids
[2, 3]
>>> similar_posts = Post.published.filter(tags__in=post_tags_ids).exclude(id=post.id)
>>> similar_posts[0].tags.count()
2
>>> similar_posts = similar_posts.annotate(same_tags=Count('tags')).order_by('-same_tags','-publish')[:4]
>>> similar_posts[0].tags.all()
[<Tag: jazz>, <Tag: temur>]
>>> similar_posts[0].same_tags
1

How am I getting 1 from Count() when the post has two tags?

  • In the book, this was explained as: "We use the Count aggregation function to generate a calculated field same_tags that contains the number of tags shared with all the tags queried." But I didn't get the point of it. – R.Temur Nov 21 '18 at 07:20
  • Do a `print(similar_posts.query)` in order to inspect the SQL being generated. I guess you don't have all the tags counted for the post, because you have filtering on tags `(tags__in=post_tags_ids)` before the Count. If you do the annotation first, and then filter, then you should get all the tags counted properly. – Todor Nov 21 '18 at 07:22

1 Answer

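As Todor pointed out, it is relatively straightforward to inspect the query. The generated SQL shows that COUNT only counts the tag_id values that are among the tags of the post being shown. The WHERE clause filters the joined taggit_taggeditem rows before GROUP BY runs, so tags that the two posts do not share never reach the COUNT: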

SELECT "blog_post"."id", "blog_post"."title", "blog_post"."slug", "blog_post"."author_id", "blog_post"."body", "blog_post"."publish", "blog_post"."created", "blog_post"."updated", "blog_post"."status", COUNT("taggit_taggeditem"."tag_id") AS "same_tags"
FROM "blog_post"
INNER JOIN "taggit_taggeditem" ON ("blog_post"."id" = "taggit_taggeditem"."object_id" AND ("taggit_taggeditem"."content_type_id" = 7))
WHERE ("blog_post"."status" = published
   AND "taggit_taggeditem"."tag_id" IN (SELECT DISTINCT U0."id" FROM "taggit_tag" U0
                                        INNER JOIN "taggit_taggeditem" U1 ON (U0."id" = U1."tag_id")
                                        INNER JOIN "django_content_type" U2 ON (U1."content_type_id" = U2."id")
                                        WHERE (U2."app_label" = blog AND U2."model" = post AND U1."object_id" = 8))
   AND NOT ("blog_post"."id" = 8))
GROUP BY "blog_post"."id", "blog_post"."title", "blog_post"."slug", "blog_post"."author_id", "blog_post"."body", "blog_post"."publish", "blog_post"."created", "blog_post"."updated", "blog_post"."status"
ORDER BY "same_tags" DESC, "blog_post"."publish" DESC LIMIT 4
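
To see the same effect outside Django, here is a minimal sketch using Python's built-in sqlite3 module. The post and tagged_item tables and their rows are made up for illustration and are much simpler than the real blog_post and taggit_taggeditem tables. The first query has the same shape as the one above: a single join that the WHERE clause filters, so it counts shared tags. The second query is only an assumption about what "count all tags" would need in plain SQL: a second, unfiltered join.

```python
import sqlite3

# Hypothetical, simplified schema: one row in tagged_item per (post, tag) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tagged_item (post_id INTEGER, tag_id INTEGER);
    INSERT INTO post VALUES (1, 'target'), (2, 'other');
    -- The target post has tags 2 and 3; the other post has tags 3 and 4.
    INSERT INTO tagged_item VALUES (1, 2), (1, 3), (2, 3), (2, 4);
""")

target_id = 1

# One join, filtered by WHERE before GROUP BY: only rows whose tag
# also belongs to the target post are left to be counted.
shared = conn.execute("""
    SELECT post.id, COUNT(tagged_item.tag_id) AS same_tags
    FROM post
    INNER JOIN tagged_item ON post.id = tagged_item.post_id
    WHERE tagged_item.tag_id IN
          (SELECT tag_id FROM tagged_item WHERE post_id = ?)
      AND post.id != ?
    GROUP BY post.id
""", (target_id, target_id)).fetchall()

# Two joins: "matched" is filtered, "all_tags" is not, so every tag of
# the post is still available to COUNT. DISTINCT guards against the
# row multiplication that the second join causes.
total = conn.execute("""
    SELECT post.id, COUNT(DISTINCT all_tags.tag_id) AS total_tags
    FROM post
    INNER JOIN tagged_item AS matched ON post.id = matched.post_id
    INNER JOIN tagged_item AS all_tags ON post.id = all_tags.post_id
    WHERE matched.tag_id IN
          (SELECT tag_id FROM tagged_item WHERE post_id = ?)
      AND post.id != ?
    GROUP BY post.id
""", (target_id, target_id)).fetchall()

print(shared)  # [(2, 1)]: post 2 shares one tag with the target
print(total)   # [(2, 2)]: post 2 has two tags overall
```

So same_tags is 1 because the post has only one tagged-item row that survives the tags__in filter, even though post.tags.count() runs a separate, unfiltered query and reports 2.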