The Models
Here is the basic models setup we have.
A List has many Items, and an Item can be in many Lists. For a given Item, if any of its Lists are good (i.e., list.bad == False
) then the Item is good. If an Item doesn't appear in any good Lists, then it is bad.
We have a custom QuerySet for Items, with a method for returning only good Items, and a method for returning only bad Items.
class Item(models.Model):
objects = ItemQuerySet.as_manager()
name = models.CharField(max_length=255, unique=True)
class List(models.Model):
name = models.CharField(max_length=255, unique=True)
bad = models.BooleanField(default=True)
items = models.ManyToManyField(Item, related_name='lists')
class ItemQuerySet(models.QuerySet):
def bad(self):
return self.exclude(lists__bad=False)
def good(self):
return self.filter(lists__bad=False)
The Scenario
Here's an example of a scenario we're having trouble with: one bad List, one good List, and two Items.
BadList: GoodList:
- Item1 - Item1
- Item2
Since Item1 appears in at least one good list, it should come up in Item.objects.good()
, and not in Item.objects.bad()
.
Since Item2 does not appear in any good list, it should come up in Item.objects.bad()
, and not in Item.objects.good()
.
We can set up the scenario like so:
# Create the two lists.
>>> goodlist = List.objects.create(name='goodlist', bad=False)
>>> badlist = List.objects.create(name='badlist', bad=True)
# Create the two items.
>>> item1 = Item.objects.create(name='item1')
>>> item2 = Item.objects.create(name='item2')
# Item1 goes in both lists
>>> goodlist.items.add(item1)
>>> badlist.items.add(item1)
# Item2 only in badlist
>>> badlist.items.add(item2)
And, indeed, Item.objects.good()
and Item.objects.bad()
work as we expect:
>>> Item.objects.bad() # This returns what we want! Good!
<QuerySet [<Item: item2>]>
>>> Item.objects.good() # This returns what we want! Good!
<QuerySet [<Item: item1>]>
The Problem
Thanks for bearing with me. Here's where our custom QuerySet goes wrong. If we access the good()
and bad()
custom QuerySet methods through a single List's Items, we get incorrect results.
>>> badlist.items.bad() # WRONG! We want to ONLY see item2 here!
<QuerySet [<Item: item1>, <Item: item2>]
>>> badlist.items.good() # WRONG! We want to see item1 here!
<QuerySet []>
It seems like, when we do badlist.items.bad()
, the query only considers badlist
in determining if the Items are bad, instead of considering all Lists that the Items are in. But I'm confused as to why that would be the case.
My thought is that, in the ItemQuerySet.bad
method, I want something like self.exclude(any__lists__bad=False)
instead of just self.exclude(lists__bad=False)
. But of course that any__
keyword doesn't actually exist, and I'm not sure how to express that logic correctly in a Django QuerySet. It seems that using Q
objects might be the way forward, but I'm still not quite sure how to express a query like this with Q
objects.
In our actual database, there are less than 100 Lists, but millions of Items. So, for performance reasons it's ideal to do this with one query, rather than a property or multiple queries.
Cheers!