3

Here is a simplified version of my models:

class Airport(models.Model):
    iata = models.CharField()
    name = models.CharField()
    latitude = models.FloatField()
    longitude = models.FloatField()

class Flight(models.Model):
    origin = models.ForeignKey('Airport', related_name='origins')
    destination = models.ForeignKey('Airport', related_name='destinations')
    owner = models.ForeignKey(User)

Given a User, I would like to create a list of all the Airport objects that appear in either the origin or destination fields of the Flight objects he owns, each annotated with the corresponding number of Flight objects.

For example, suppose a user has been on 3 flights: LAX-LHR, LHR-CDG, and CDG-JFK. Then I would like a query which returns the following object:

[LHR, id__count=2}, {CDG, id__count=2}, {LAX, id__count=1}, {JFK, id__count=1}]

In the above, the three letter codes stand for Airport objects or all their fields.

Generally, there may be thousands of Users and tens of thousands of Airports and Flights, so I am seeking something more efficient than the obvious solution with for loops and if statements, preferably in a single database query.

My current progress is this query:

Airport.objects.filter(
    Q(origins__owner=user) | Q(destinations__owner=user)
)
.distinct()
.annotate(
    id__count=Count('origins', distinct=True) + Count('destinations', distinct=True)
).order_by('-id__count')

This works perfectly with only one user, because the initial filter only keeps those airports which appear somewhere in his flights. But it clearly fails when their are multiple users, because the counts include every user's flights. I need some way to only Count those Flight objects which obey a certain property, namely owner=user where user is a certain User object.


Edit: after reading this page in the Djnago documentation, it seems that putting the filter first should make this work as needed. But it doesn't, at least when I use Q objects. I've found the following highly confusing result.

When I use this query, i.e. only looking at the origins, then it works, and the num_origins field counts only those flights belonging to the specified user:

Airport.objects.filter(origins__owner=user).annotate(num_origins=Count('origins'))

(This isn't exactly what I need because the counts only include flights whose origin is a certain Airport, but it does filter the Users correctly.)

But, when I do nothing but replace a single filter with two Q objects combined with or, i.e.

Airport.objects.filter(Q(origins__owner=user) | Q(destinations__owner=user)).annotate(num_origins=Count('origins'))

now it counts flights belonging to every user! It seems that the annotate "forgets" about the filter when it uses Q objects. What is going on here?

jc315
  • 107
  • 1
  • 9
  • Why you didn't add a related name for owner key on Flight object? Then you can get all of users' flights and apply distinct to them so that only distinct flight counts. – Cagatay Barin Dec 15 '17 at 07:27
  • @ÇağatayBarın I've added a related name for the owner field but I still can't see how to use it. Can you explain a little more? Thanks. – jc315 Dec 16 '17 at 22:43

2 Answers2

5

I think you can achieve this with conditional expressions:

from django.db.models import Case, When

Airport.objects.filter(
    Q(origins__owner=user) | Q(destinations__owner=user)
).annotate(
    num_origins=Count(
        Case(When(Q(origin__owner=user), then=1),output_field=CharField()),
    ),
    num_destinations=Count(
        Case(When(Q(destination__owner=user), then=1),output_field=CharField()),
    )
)

Note that the When clause is repeating the same filter that you do initially. It might actually be more efficient to do this instead (you probably need to inspect the resulting SQL query to find out):

Airport.objects.annotate(
    num_origins=Count(
        Case(When(Q(origin__owner=user), then=1), output_field=CharField()),
    ),
    num_destinations=Count(
        Case(When(Q(destination__owner=user), then=1),output_field=CharField()),
    )
).filter(Q(num_origins__gt=0) | Q(num_destinations__gt=0))

i.e., annotate all flights, and then filter out the ones where the count was 0.

You can then add up num_origins and num_destinations in Python.

If you are using Django 2, then it is simpler still because you can pass a filter argument to Count:

Airport.objects.annotate(
    num_origins=Count('origins', filter=Q(origin__owner=user), distinct=True),
    num_destinations=Count('destinations', filter=Q(destination__owner=user), disctinct=True)
).filter(Q(num_origins__gt=0) | Q(num_destinations__gt=0))
Mohammed Shareef C
  • 3,829
  • 25
  • 35
solarissmoke
  • 30,039
  • 14
  • 71
  • 73
  • Thanks for the answer, but I can't get the first option to work. I think you actually meant `default=0` instead of `else=0`, but even then, the counts are wrong and seem to ignore the user filter. Could this be a database issue? I'm currently using SQLite and I was intending to stay with it, but I could move to MySQL if it would help. In any case, next I will try upgrading to Django 2.0 and using your second option. – jc315 Dec 20 '17 at 01:14
  • I upgraded to Django 2.0 and your second option works! There are a couple of typos, though: the first arguments of `Count` should be `'origins'` and `'destinations'`, and we have to also include `distinct=True`. If you correct these mistakes I will accept your answer. – jc315 Dec 23 '17 at 16:28
  • I've edited based on your comments. Not too sure why the first approach didn't work. Note that SQLite is not recommended for use in production, even if your code works. Better to use Postgres or MySQL. – solarissmoke Dec 24 '17 at 04:07
0

Can you try like this? I didn't test it on shell, so I'm not sure about 'distinct_flights' list structure but you'll get the idea.

# This is all of the distinct flights of your users.
distinct_flights = Flight.objects.filter(owner__in=[user1.id, user2.id]).distinct().values_list('origin','destination')

# This is all of the airports included in the flights above. 
Airport.objects.filter(
    Q(origins__in=distinct_flights['origin'])||
    Q(destination__in=distinct_flights['destination'])
)

# The rest is annotation from those airports as you did before. You can annotate it on the above query again.
Cagatay Barin
  • 3,428
  • 2
  • 24
  • 43
  • Thanks, but this sort of structure is counting every user's flights. When I don't use Q objects in the filter, it works, and the annotation only counts the specified user's flights. But when I do use Q objects combined with or, the annotate counts every user's flights. See my edit. – jc315 Dec 16 '17 at 22:43