So I have a booking system. Agents (the people and organisations submitting bookings) are only allowed to make booking in the categories we assign them. Many agents can assign to the same categories. It's a simple many-to-many. Here's an idea of what the models look like:
class Category(models.Model):
pass
class Agent(models.Model):
categories = models.ManyToManyField('Category')
class Booking(models.Model):
agent = models.ForeignKey('Agent')
category = models.ForeignKey('Category')
So when a booking comes in, we dynamically allocate the category based on which are available to the agent. The agent usually doesn't specify.
Can I select Bookings where Booking.category isn't in Booking.agent.categories?
We have just noticed that —by the grace of a silly admin mistake— some agents were allowed submit Bookings to any category. It has left us with thousands of bookings in the wrong place.
I can fix this but the I can only get it to work by nesting lookups:
for agent in Agent.objects.all():
for booking in Booking.objects.filter(agent=agent):
if booking.category not in agent.categories.all():
# go through the automated allocation logic again
This works but it's super-slow. It's a lot of data flying between database and Django. This isn't a one-off either. I want to periodically audit new bookings to make sure they are in the correct place. It doesn't seem impossible that another admin issue will occur so after checking the Agent database, I want to query for Bookings that aren't in their agent's categories.
Again, nested queries will work not but as our datasets grow into millions (and beyond) I'd like to do this more efficiently..
I feel like it should be possible to do this with a F()
lookup, something like this:
from django.db.models import F
bad = Booking.objects.exclude(category__in=F('agent__categories'))
But this doesn't work: TypeError: 'Col' object is not iterable
I've also tried .exclude(category=F('agent__categories'))
and while it's happier with the syntax there, it doesn't exclude the "correct" bookings.
What's the secret formula for doing this sort of F()
query on a M2M?
To help nail down exactly what I'm after I've set up a Github repo with these models (and some data). Please use them to write the query. The current sole answer hits and issue I was seeing on my "real" data too.
git clone https://github.com/oliwarner/djangorelquerytest.git
cd djangorelquerytest
python3 -m venv venv
. ./venv/bin/activate
pip install ipython Django==1.9a1
./manage.py migrate
./manage.py shell
And in the shell, fire in:
from django.db.models import F
from querytest.models import Category, Agent, Booking
Booking.objects.exclude(agent__categories=F('category'))
Is that a bug? Is there a proper way to achieve this?