
I have a simple model with a generic foreign key:

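from django.contrib.contenttypes.fields import GenericForeignKey
from django.contrib.contenttypes.models import ContentType
from django.db import models

class Generic(models.Model):
    content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
    object_id = models.PositiveIntegerField()
    content_object = GenericForeignKey('content_type', 'object_id')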

I would like to select all entries in this table that have a non-null content_object, i.e. filter out all instances of Generic whose content objects no longer exist:

Generic.objects.filter(~Q(content_object=None))

This doesn't work, giving the exception:

django.core.exceptions.FieldError: Field 'content_object' does not generate an automatic reverse relation and therefore cannot be used for reverse querying. If it is a GenericForeignKey, consider adding a GenericRelation.

Adding GenericRelation to the referenced content type models makes no difference.

Any help on how to achieve this would be appreciated, many thanks.

EDIT: I realise I could cascade the delete, however this is not an option in my situation (I wish to retain the data).

Alex Morozov
Ben

1 Answer


If you want to filter some records out, it's often better to use the exclude() method:

Generic.objects.exclude(object_id__isnull=True)

Note, though, that your model as currently defined doesn't allow empty content_object fields. To change this behaviour, pass null=True to both the object_id and content_type fields.

Update

Okay, since the question has shifted from filtering out null records to detecting broken references without the help of the RDBMS itself, I'd suggest a (rather slow and memory-hungry) workaround:

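from django.contrib.contenttypes.models import ContentType

broken_items = []
for ct in ContentType.objects.all():
    model = ct.model_class()
    if model is None:
        # Stale content type (its model no longer exists); rows pointing at it are not checked here.
        continue
    broken_items.extend(
        Generic.objects
        .filter(content_type=ct)
        .exclude(object_id__in=model.objects.values_list('pk', flat=True))
        .values_list('pk', flat=True))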

This would work as a one-time script, but not as a robust solution. If you absolutely want to retain the data, the only fast way I can think of is to add an is_deleted boolean flag to your Generic model and set it in a (post|pre)_delete signal.

medmunds
Alex Morozov
  • Another guy commented this same solution then deleted it because it doesn't work, except yours is invalid as this will check if the `object_id` field is null, not the related object referenced by that id. Either way if you fix that it still gives the same exception as posted in the original question. Filter or exclude makes no difference here. FWIW checking `= None` is the same as `isnull=True`. – Ben Jan 26 '16 at 16:08
  • Ben, but what does a "non-null" `content_object` mean then? It's just a (thin enough) wrapper around two actual fields in your database. If you want to exclude the `Generic`s that aren't referencing any other model, the query I've given above is the one that works. Or please refine your definition of the "null object" in the answer. – Alex Morozov Jan 26 '16 at 16:17
  • Hi Alex, I've clarified the question. Non-null means the actual object that you could retrieve from the table defined by `content_type_id` using the id defined by `object_id`. For example if you had a generic relation to an object that has now been deleted. – Ben Jan 26 '16 at 16:20
  • Okay, now I see. Check out my updated thoughts on it. – Alex Morozov Jan 26 '16 at 17:18
  • Thanks Alex, that's not a bad workaround. I'm planning on having deleted flags in the future so that will probably be the final solution. Appreciate the answer! – Ben Jan 26 '16 at 17:26
  • In the second version, shouldn't the exclude be on `object_id` rather than `pk`: `.exclude(object_id__in=ct.model_class().objects.all())`? (Because `pk` would refer to `Generic`'s own id, not the id of the related model you want to check.) – medmunds Apr 17 '18 at 17:46
  • Correct @medmunds. To be honest with you though, I would suggest avoiding generic FKs in Django altogether as they are a headache to deal with in all kinds of ways; better to refactor your schema to not need them if you can. – Ben Apr 18 '18 at 02:05
  • @Ben agreed—there are a lot of [reasons to avoid generic foreign keys](https://lukeplant.me.uk/blog/posts/avoid-django-genericforeignkey/) in Django. (I inherited some code that uses them all over the place, and your question and Alex's answer are helping clean up the data during refactoring.) – medmunds Apr 18 '18 at 18:20