22

I have such a Book model:

class Book(models.Model):
    authors = models.ManyToManyField(Author, ...)
    ...

In short:

I'd like to retrieve the books whose authors are strictly equal to a given set of authors. I'm not sure if there is a single query that does it, but any suggestions will be helpful.

In long:

Here is what I tried, (that failed to run getting an AttributeError)

# A sample set of authors
target_authors = set((author_1, author_2))

# To reduce the search space, 
# first retrieve those books with just 2 authors.
candidate_books = Book.objects.annotate(c=Count('authors')).filter(c=len(target_authors))

final_books = QuerySet()
for author in target_authors:
    temp_books = candidate_books.filter(authors__in=[author])
    final_books = final_books and temp_books

... and here is what I got:

AttributeError: 'NoneType' object has no attribute '_meta'

In general, how should I query a model with the constraint that its ManyToMany field contains a set of given objects as in my case?

ps: I found some relevant SO questions but couldn't get a clear answer. Any good pointer will be helpful as well. Thanks.

iuysal
  • 655
  • 1
  • 5
  • 15

4 Answers4

18

Similar to @goliney's approach, I found a solution. However, I think the efficiency could be improved.

# A sample set of authors
target_authors = set((author_1, author_2))

# To reduce the search space, first retrieve those books with just 2 authors.
candidate_books = Book.objects.annotate(c=Count('authors')).filter(c=len(target_authors))

# In each iteration, we filter out those books which don't contain one of the 
# required authors - the instance on the iteration.
for author in target_authors:
    candidate_books = candidate_books.filter(authors=author)

final_books = candidate_books
iuysal
  • 655
  • 1
  • 5
  • 15
  • Your solutions do the same. Filter kwargs are "AND"ed – Serhii Holinei Nov 07 '12 at 19:01
  • You're right, they look the same. However, I think there is an execution difference. As far as I understood, in your approach, the `author` in the `authors` field which matches `author_1` is also expected to match with `author_2`. On the other hand, iterative filtering doesn't enforce such a constraint. If I'm wrong, please correct me. I'm here to learn. Thanks again! – iuysal Nov 07 '12 at 20:56
5

You can use complex lookups with Q objects

from django.db.models import Q
...
target_authors = set((author_1, author_2))
q = Q()
for author in target_authors:
    q &= Q(authors=author)
Books.objects.annotate(c=Count('authors')).filter(c=len(target_authors)).filter(q)
Serhii Holinei
  • 5,758
  • 2
  • 32
  • 46
  • 5
    Thanks @goliney, although the approach is neat and inspiring, I guess it's not doing what I'm looking for. If there is just one author, it works fine, but when there are multiple authors, the AND process is probably leading to an impossible constraint like (where a=x AND a=y). – iuysal Nov 07 '12 at 17:25
  • 4
    As it turned out, there is something unclear in Q() behaviour for me. According to [this relative question](http://stackoverflow.com/a/5542966/1065780) I find on SO, `Q() & Q()` is not equal to `.filter().filter()`. Thank you for your question – Serhii Holinei Nov 07 '12 at 21:24
  • 2
    Thanks for the link and sharing your time. – iuysal Nov 08 '12 at 10:37
2

Q() & Q() is not equal to .filter().filter(). Their raw SQLs are different where by using Q with &, its SQL just add a condition like WHERE "book"."author" = "author_1" and "book"."author" = "author_2". it should return empty result.

The only solution is just by chaining filter to form a SQL with inner join on same table: ... ON ("author"."id" = "author_book"."author_id") INNER JOIN "author_book" T4 ON ("author"."id" = T4."author_id") WHERE ("author_book"."author_id" = "author_1" AND T4."author_id" = "author_1")

Ed Chai
  • 21
  • 2
0

I came across the same problem and came to the same conclusion as iuysal, untill i had to do a medium sized search (with 1000 records with 150 filters my request would time out).

In my particular case the search would result in no records since the chance that a single record will align with ALL 150 filters is very rare, you can get around the performance issues by verifying that there are records in the QuerySet before applying more filters to save time.

# In each iteration, we filter out those books which don't contain one of the 
# required authors - the instance on the iteration.
for author in target_authors:
   if candidate_books.count() > 0:
      candidate_books = candidate_books.filter(authors=author)

For some reason Django applies filters to empty QuerySets. But if optimization is to be applied correctly however, using a prepared QuerySet and correctly applied indexes are necessary.

Eaton Emmerich
  • 452
  • 4
  • 17