3

Django recommends not using null on CharField, however annotate includes empty strings in the count. Is there a way to avoid that without excluding rows with empty string from the query?

My question isn't simly how to achieve my query, but fundamentally, should Annotate/Aggregate count include empty fields or not. Django consider empty as a replacement for NULL for string based fields.

My model :

class Book(models.Model):
    name = models.CharField(...)

class Review(models.Model):
    book = models.ForeignKey()
    category = models.ForeignKey()
    review = models.CharField(max_length=200, default='', blank=True)

To count non-empty reviews & group by category, I use

Review.objects.values('category').annotate(count=Count('review'))

This doesn't work because annotate counts empty values also (if the entry was NULL, it wouldn't have done so). I could filter out empty strings before the annotate call but my Query is more complex and I need all empty & non-empty objects.

Is there a smarter way to use annotate and skip empty values from count or should I change the model from

review = models.CharField(max_length=200, default='', blank=True)

to

review = models.CharField(max_length=200, default=None, blank=True, null=True)
user
  • 17,781
  • 20
  • 98
  • 124
  • 4
    `Book.objects.filter(review != '').values('category').annotate(count=Count('review'))` – Sławek Kabik May 25 '14 at 08:06
  • From the question description : `I could filter out empty strings before the annotate call but my Query is more complex and I need all empty & non-empty objects` – user May 25 '14 at 08:08
  • I'm afraid that adding `null=True` won't help you. You should use raw sql query https://docs.djangoproject.com/en/dev/topics/db/sql/ – Sławek Kabik May 25 '14 at 08:21
  • can you use the order_by method, followed by distinct? However, check the django docs, they warn about such a case: https://docs.djangoproject.com/en/dev/ref/models/querysets/#distinct – vlad-ardelean May 25 '14 at 08:54

2 Answers2

1

I faced a very similar situation. I solved it using Conditional Expressions:

review_count = Case(
    When(review='', then=0),
    default=1,
    output_field=IntegerField(),
)
Review.objects.values('category').annotate(count=review_count)
Mauricio
  • 87
  • 2
  • 2
0

...and I need all empty & non-empty objects.

This doesn't make any sense when using values. Instead of actual objects, you'll get a list of dictionaries containing just the category and count keys. Apart from a different number in count, you'll see no difference between filtering out empty review values or not. On top of that, you filter for a single book (id=2) and somehow expect that there can be more than one review.

You need to seriously rethink what you are exactly trying to do, and how your model definition fits into that.

knbk
  • 52,111
  • 9
  • 124
  • 122
  • 2
    The query will have other calculations (average/max/min) on other fields of the model. Those fields are independent of `review`. I want to achieve that in a single query. I had written a simplified model to focus on what I was asking. Added more detail to modelto answer some of your questions. – user May 25 '14 at 08:15