10

How can I apply annotations and filters from a custom manager queryset when filtering via a related field? Here's some code to demonstrate what I mean.

Manager and models

from django.db.models import Value, BooleanField

class OtherModelManager(Manager):
    def get_queryset(self):
        return super(OtherModelManager, self).get_queryset().annotate(
            some_flag=Value(True, output_field=BooleanField())
        ).filter(
            disabled=False
        )

class MyModel(Model):
    other_model = ForeignKey(OtherModel)

class OtherModel(Model):
    disabled = BooleanField()

    objects = OtherModelManager()

Attempting to filter the related field using the manager

# This should only give me MyModel objects with related 
# OtherModel objects that have the some_flag annotation 
# set to True and disabled=False
my_model = MyModel.objects.filter(some_flag=True)

If you try the above code you will get the following error:

TypeError: Related Field got invalid lookup: some_flag

To further clarify, essentially the same question was reported as a bug with no response on how to actually achieve this: https://code.djangoproject.com/ticket/26393.

I'm aware that this can be achieved by simply using the filter and annotation from the manager directly in the MyModel filter, however the point is to keep this DRY and ensure this behaviour is repeated everywhere this model is accessed (unless explicitly instructed not to).

Ben
  • 885
  • 1
  • 12
  • 25
  • What you mean by `=True` after an arbitrary name "some_flag"? Write an example how you can achieve the right result manually, because I get the error `AttributeError: 'bool' object has no attribute 'resolve_expression'` in the latest Django due to the simple value `True`. It is not clear what you want? (Btw, it seems in the link above that it is not related to you and that the original poster accepted that it is not a bug for bug report, but only misunderstanding.) – hynekcer Jun 22 '17 at 18:20
  • This is a contrived example to demonstrate the problem; regardless I have updated the annotation to now work correctly. The example you ask for is obvious, just move the code from the manager to the query `MyModel.objects.annotate(other_model__some_flag=Value(True, output_field=BooleanField())).filter(other_model__some_flag=True, other_model__disabled=False)`. The link is exactly the same problem, it was reported as a bug because the author believed a related Django feature would solve it, which it does not. No response was posted on how to solve the problem. – Ben Jun 22 '17 at 20:50
  • No, I want to see the code that you claim works correctly for you, but it is not DRY enough and I hope that I can do it dry. Your "obvious" example is surely incorrect because it is compiled internally to `SELECT app_mymodel.id,... , True AS "other_model__some_flag" FROM ...` The SQL can be verified easily by `any_queryset.query.get_compiler('default').as_sql()`. Maybe explain clearly e.g. by SQL what do you want to get. Yes, simplified examples are useful, but verified examples that it is without additional unrelated bugs by simplifying, e.g. the missing `return` after `def get_queryset`. – hynekcer Jun 22 '17 at 22:39
  • It is nice that you explained your effort, but you should also explain the problem by a terminology that you are sure you safely know in order to be possible to be possible to get a relevant answer. (It seems it is not Django ORM terminology for you yet.) 1) Annotation is usually a Count or Sum, Min, Max or average of some field (or expression by field) by some groups. There is nothing to do with a constant True. 2) It is usually better to filter before annotate than after it, if the order is otherwise unrelated. That can not be decided without more information. V – hynekcer Jun 24 '17 at 10:06
  • 3) A base manager should not be overwritten this way. Read why users are [discouraged from it](https://docs.djangoproject.com/en/1.11/topics/db/managers/#base-managers). --- Expected that Django ORM is powerful enough, but the right solution could be totally different, you could early reformulate the question, but now it could be to late. – hynekcer Jun 24 '17 at 10:49
  • 1
    @hynekcer Sorry to disappoint, but you're confusing annotation and aggregation. Annotation's common workload is to add computed field at the row level (such as `amount=F('price')+F('quantity')` (simplified). Aggregation as you say, works on a set of rows. –  Jun 24 '17 at 12:34
  • I do not appreciate your condescending responses @hynekcer and they do not add to this discussion. The example is clear and as already mentioned is contrived, you are worrying about the minutia of annotations which is essentially irrelevant to the question. – Ben Jun 26 '17 at 21:01
  • Excuse. I only did not understand the question, frustrating even that I understand half of django.db source code. I tried to reproduce the error message etc. and finally I regretted the lost time because I still don't understand and waited in vain for a context. – hynekcer Jun 27 '17 at 07:36
  • I'm sorry you don't understand the question @hynekcer, i've tried to explain it in several ways. I am simply trying to avoid copy-pasting filterings on related models that should *always* be applied, and similarly annotate fields that should always be available on said related models (unless explicitly instructed otherwise). Custom managers achieve this but are not used for related models when using `.filter()`. – Ben Jun 28 '17 at 01:47

3 Answers3

4

How about running nested queries (or two queries, in case your backend is MySQL; performance).

The first to fetch the pk of the related OtherModel objects.

The second to filter the Model objects on the fetched pks.

other_model_pks = OtherModel.objects.filter(some_flag=...).values_list('pk', flat=True)
my_model = MyModel.objects.filter(other_model__in=other_model_pks)
# use (...__in=list(other_model_pks)) for MySQL to avoid a nested query.
Moses Koledoye
  • 77,341
  • 8
  • 133
  • 139
2

I don't think what you want is possible.

1) I think you are miss-understanding what annotations do.

Generating aggregates for each item in a QuerySet

The second way to generate summary values is to generate an independent summary for each object in a QuerySet. For example, if you are retrieving a list of books, you may want to know how many authors contributed to each book. Each Book has a many-to-many relationship with the Author; we want to summarize this relationship for each book in the QuerySet.

Per-object summaries can be generated using the annotate() clause. When an annotate() clause is specified, each object in the QuerySet will be annotated with the specified values.

The syntax for these annotations is identical to that used for the aggregate() clause. Each argument to annotate() describes an aggregate that is to be calculated.

So when you say:

MyModel.objects.annotate(other_model__some_flag=Value(True, output_field=BooleanField()))

You are not annotation some_flag over other_model.
i.e. you won't have: mymodel.other_model.some_flag

You are annotating other_model__some_flag over mymodel.
i.e. you will have: mymodel.other_model__some_flag

2) I'm not sure how familiar SQL is for you, but in order to preserve MyModel.objects.filter(other_model__some_flag=True) possible, i.e. to keep the annotation when doing JOINS, the ORM would have to do a JOIN over subquery, something like:

INNER JOIN 
(
    SELECT other_model.id, /* more fields,*/ 1 as some_flag
    FROM other_model
) as sub on mymodel.other_model_id = sub.id

which would be super slow and I'm not surprised they are not doing it.

Possible solution

don't annotate your field, but add it as a regular field in your model.

Todor
  • 15,307
  • 5
  • 55
  • 62
  • Actually I think we would have `mymodel.some_flag` not `mymodel.other_model__some_flag` after the annotation, I made this mistake in the example. Also the SQL you postulate would depend on what is being annotated - for instance if it's a field reference via `F("some_field")` there is no need for a subquery; same for an aggregation. Joins and subqueries are not super slow either, this is just dogma; either way it's irrelevant to the Django developers - the ORM is here to do what I need it to do, including the ability to shoot myself in the foot if desired :) – Ben Jun 26 '17 at 20:48
  • Regardless, your solution would obviously work for removing any annotations, but this does not address the implicit filtering on the `disabled` field. I fear you may be right about the "it's not possible", at least without hacking the ORM or doing some horrible `.extra()`. – Ben Jun 26 '17 at 20:50
  • @Ben I did not address the implicit filtering, because it will be kept across relations, only the annotation won't. – Todor Jun 27 '17 at 04:45
  • I don't believe that is the case. Check the documentation on managers [here](https://docs.djangoproject.com/en/1.11/topics/db/managers/#base-managers): `Manager’s aren’t used when querying on related models. For example, if the Question model from the tutorial had a deleted field and a base manager that filters out instances with deleted=True, a queryset like Choice.objects.filter(question__name__startswith='What') would include choices related to deleted questions.` – Ben Jun 28 '17 at 01:43
  • Ah, you are correct, this seems to be a new behavior coming from [1.10](https://docs.djangoproject.com/en/1.11/releases/1.10/#manager-use-for-related-fields-and-inheritance-changes), I still use 1.8 so I've missed that one. I did some testing, seem like there is no way to restore the the old behavior. Well after all, [explicit is better than implicit](https://www.python.org/dev/peps/pep-0020/#id3) ;) – Todor Jun 28 '17 at 14:17
1

The simplified answer is that models are authoritative on the field collection and Managers are authoritative on collections of models. In your efforts to make it DRY you made it WET, cause you alter the field collection in your manager.

In order to fix it, you would have to teach the model about the lookup and need to do that using the Lookup API.

Now I'm assuming that you're not actually annotating with a fixed value, so if that annotation is in fact reducible to fields, then you may just get it done, because in the end it needs to be mapped to database representation.