2

Let's say I have following models:

class Invoice(models.Model):
    ...

class Note(models.Model):
    invoice = models.ForeignKey(Invoice, related_name='notes', on_delete=models.CASCADE)
    text = models.TextField()

and I want to select Invoices that have some notes. I would write it using annotate/Exists like this:

Invoice.objects.annotate(
    has_notes=Exists(Note.objects.filter(invoice_id=OuterRef('pk')))
).filter(has_notes=True)

This works well enough, filters only Invoices with notes. However, this method results in the field being present in the query result, which I don't need and means worse performance (SQL has to execute the subquery 2 times).

I realize I could write this using extra(where=) like this:

Invoice.objects.extra(where=['EXISTS(SELECT 1 FROM note WHERE invoice_id=invoice.id)'])

which would result in the ideal SQL, but in general it is discouraged to use extra / raw SQL. Is there a better way to do this?

Lord Elrond
  • 13,430
  • 7
  • 40
  • 80
radoh
  • 4,554
  • 5
  • 30
  • 45

5 Answers5

3

You can remove annotations from the SELECT clause using .values() query set method. The trouble with .values() is that you have to enumerate all names you want to keep instead of names you want to skip, and .values() returns dictionaries instead of model instances.

Django internaly keeps the track of removed annotations in QuerySet.query.annotation_select_mask. So you can use it to tell Django, which annotations to skip even wihout .values():

class YourQuerySet(QuerySet):
    def mask_annotations(self, *names):
        if self.query.annotation_select_mask is None:
            self.query.set_annotation_mask(set(self.query.annotations.keys()) - set(names))
        else:
            self.query.set_annotation_mask(self.query.annotation_select_mask - set(names))
        return self

Then you can write:

invoices = (Invoice.objects
  .annotate(has_notes=Exists(Note.objects.filter(invoice_id=OuterRef('pk'))))
  .filter(has_notes=True)
  .mask_annotations('has_notes')
)

to skip has_notes from the SELECT clause and still geting filtered invoice instances. The resulting SQL query will be something like:

SELECT invoice.id, invoice.foo FROM invoice
WHERE EXISTS(SELECT note.id, note.bar FROM notes WHERE note.invoice_id = invoice.id) = True

Just note that annotation_select_mask is internal Django API that can change in future versions without a warning.

martin.macko.47
  • 888
  • 5
  • 9
  • I knew about `values`, but I needed the queryset to return model instances. But the other trick with `annotation_select_mask` is pretty cool and works perfectly! Nice find! – radoh Feb 04 '20 at 08:25
2

Ok, I've just noticed in Django 3.0 docs, that they've updated how Exists works and can be used directly in filter:

Invoice.objects.filter(Exists(Note.objects.filter(invoice_id=OuterRef('pk'))))

This will ensure that the subquery will not be added to the SELECT columns, which may result in a better performance.

Changed in Django 3.0:

In previous versions of Django, it was necessary to first annotate and then filter against the annotation. This resulted in the annotated value always being present in the query result, and often resulted in a query that took more time to execute.

Still, if someone knows a better way for Django 1.11, I would appreciate it. We really need to upgrade :(

radoh
  • 4,554
  • 5
  • 30
  • 45
2

We can filter for Invoices that have, when we perform a LEFT OUTER JOIN, no NULL as Note, and make the query distinct (to avoid returning the same Invoice twice).

Invoice.objects.filter(notes__isnull=False).distinct()
radoh
  • 4,554
  • 5
  • 30
  • 45
Willem Van Onsem
  • 443,496
  • 30
  • 428
  • 555
  • ahh right, I actually use this pattern somewhere else, but I couldn't use in my real case - I oversimplified the example and in reality I use `select_for_update` and some additional filters in the exists subquery. However since I didn't mention this in my original question - I'll accept this as the answer. – radoh Jan 24 '20 at 11:44
0

This is best optimize code if you want to get data from another table which primary key reference stored in another table Invoice.objects.filter(note__invoice_id=OuterRef('pk'),)

0

We should be able to clear the annotated field using the below method.

Invoice.objects.annotate(
    has_notes=Exists(Note.objects.filter(invoice_id=OuterRef('pk')))
).filter(has_notes=True).query.annotations.clear()
rathourarv
  • 126
  • 1
  • 7