0

These are the models i have in my small project

class Tag(models.Model):
    name = models.CharField(max_length=200)

class Post(models.Model):
    content = models.CharField(max_length=200)
    tags = models.ManyToManyField(Tag)
    location = models.PointField(null=True, blank=True)

A list of tags are given to me (a maximum of 3). I need to find the number of times tag each of these tags appear on Post. lets say ["funny","geek","tech"]

For getting tag count i am doing

    for tag in tags:
        tag_data[tag] = Post.objects.filter(
            tags__name=tag,
            location__distance_lte=(
                pnt, D(km=distance))).count()

Then i fetch all post data using the following query.

    posts = Post.objects.filter(
             tags__name__in=tags,
             location__distance_lte=(pnt, D(km=distance)))

I am already looping on posts at the end of function.

for post in posts:
   #some logic

So i know i have the tags data there. No need to do seperate data queries like above to find count. But i figured out that if i want to find count of tags inside the for loop i need 1 more for loop inside that for loop.

for post in posts:
   #some logic
   for tag in post.tags:
      #some logic

So my question which one is more efficient in here. Fetching using .count() as i did or nested for loops.

qwertyui90
  • 191
  • 2
  • 14
  • What is `em_id` in `tag_data[em_id]`? – Ravi Kumar Apr 06 '17 at 08:32
  • That was a typo. Updated – qwertyui90 Apr 06 '17 at 08:39
  • As I understand it, you want to count the number of `Tag` that have been tagged in a specific `Post`. Correct? Or you want to count how many `Post` a `Tag` have been used in? What's the actual thing you want to count? – Marcus Lind Apr 06 '17 at 08:43
  • I wan to know in how many posts a tag has been appeared. – qwertyui90 Apr 06 '17 at 08:46
  • Don't mind the location field, i am restricting the count to a particular area – qwertyui90 Apr 06 '17 at 08:47
  • Your second approach would be very slow because each call to `tag.name` in the inner loop will fire a new query. For your first approach there will be one query for each tag. You should use one aggregation query(as suggested by @MarcusLind) and one for getting all posts. – Ravi Kumar Apr 06 '17 at 09:46

1 Answers1

2

In the Django ORM there's something called Aggregation that is using the .annotate() and .aggregate() methods to add the functionality you're looking for to the queries. (Docs on Aggregation).

There are plenty of examples of how you can do things such as getting a Count, Avg, Sum and more.This can also be done throughout Many-To-Many relationships just like in the case of yours.

There's even an example in the documentation that is very similar to your case.

For example, if you are retrieving a list of books, you may want to know how many authors contributed to each book. Each Book has a many-to-many relationship with the Author; we want to summarize this relationship for each book in the QuerySet.

To count the amount of Post a Tag has appeared in you would just do this:

# Add any arguments you want to the filter() to narrow down the Tag query set.
tags = Tags.objects.filter().annotate(Count('post_set'))
# You access it as post_set__count

EDIT:

Based on the comments that you supplied, you want to add conditions to the annotation and the query, to be able to only count posts that are within a radius. To do this you use what Django calls Conditional Expressions.

With the help of Conditional Expressions you can add conditions to your .annotate().

Example:

tags = Tags.objects.filter().annotate(Count(
        Case(When(post_set__location="1", then=1), output_field=IntegerField())
))

Please read the Documentation to find the proper way to use it in your case. You can see the Conditional Expressions with Aggregations.

Marcus Lind
  • 10,374
  • 7
  • 58
  • 112
  • I understood. But i am confused where to apply filter on Posts. As you see i need only the posts who satisfy the location condition(*location__distance_lte=(pnt, D(km=distance))*). – qwertyui90 Apr 06 '17 at 10:26
  • You previously wrote that we should ignore the location field... But if I understand you correctly, you want to count the number of `Post` that a `Tag` exists in, but only count `Post` that are inside a radius? So you don't want the total `Tag` count, you want to limit it by `Post` location? – Marcus Lind Apr 06 '17 at 10:34
  • Yes. And sorry for misdirecting. When i said Ignore i meant just consider it as filter field. I want Tag count and limit it by Post location – qwertyui90 Apr 06 '17 at 10:38