0

Simplified Setup

class System(models.Model):
    fields...
    vulnerabilities = models.ManyToManyField('Vulnerability', through='SystemVulnerability')

class Vulnerablity(models.Model):
    name = ...
    risk = CharField(choices=['H','M','L','I'])
    ...

class SystemVulnerability(models.Model):
    vulnerability = ForeignKey
    system = ForeignKey
    fields...

sv_list = [199026, 199036, 199046, 199048, ....]
other_filter = Q(... lots of stuff...) #not including this for brevity

System.objects.filter(other_filter & Q(systemvulnerability__in=sv_list)).query.__str__()
u'SELECT "system"."id", "system"."ip", "system"."org_id", "system"."dnsname", "system"."ipownercode", "system"."securityplan" 
    FROM "system" 
    INNER JOIN "systemvulnerability" ON ( "system"."id" = "systemvulnerability"."system_id" ) 
    WHERE "systemvulnerability"."id" IN (199026, 199036, 199046, 199048, ....)'

That looks good. What if we negate it?

System.objects.filter(other_filter & ~Q(systemvulnerability__in=sv_list)).query.__str__()
u'SELECT "system"."id", "system"."ip", "system"."org_id", "system"."dnsname", "system"."ipownercode", "system"."securityplan" 
    FROM "system" 
    WHERE NOT ("system"."id" IN (SELECT U1."system_id" FROM "systemvulnerability" U1 WHERE U1."id" IN (199026, 199036, 199046, 199048, ....)))'

What? I would expect the resulting query to be:

u'SELECT "system"."id", "system"."ip", "system"."org_id", "system"."dnsname", "system"."ipownercode", "system"."securityplan" 
    FROM "system" 
    INNER JOIN "systemvulnerability" ON ( "system"."id" = "systemvulnerability"."system_id" ) 
    WHERE "systemvulnerability"."id" NOT IN (199026, 199036, 199046, 199048, ....)'

How do I go about modifying my statement so the resulting query is more like I expect?

Example data

System:
s1
s2
s3

Vulnerability:
v1
v2
v3

SystemVulnerability:
sv1 = s1, v1
sv2 = s1, v2
sv3 = s2, v2
sv4 = s3, v2
sv5 = s3, v3

System.filter(Q(systemvulnerability__in=[sv1, sv4]))
Get back s1, s3: Good!

System.filter(~Q(systemvulnerability__in=[sv1, sv2]))
Get back: s2, s3: Good!

System.filter(~Q(systemvulnerability__in=[sv1, sv4, sv5]))
Get back s2: Bad!
But I wanted s1, s2

Obviously I could do the query based on SystemVulnerability but my filter is actually a lot more complicated than this so I'd like to keep it based on System.

I suppose a System.objects.filter(pk__in=SystemVulnerability.filter(~Q(systemvulnerability__in=sv_list)).values_list('system')) could work but the resulting IN clause in the SQL is rather large and slow, especially as my data grows.

EDIT: I went and tried to do it with the IN clause on the previous line. It works but ends up negating a lot of my other filters. I really just need the ~Q() to work properly for relationships!

Tomasz Jakub Rup
  • 10,502
  • 7
  • 48
  • 49

1 Answers1

0

It's not really an answer but after a few days I managed to think about how to rewrite this without the exclusion. The SQL dev in me is pained by what I did but it works far better in django than doing a .extra(where=[]) which is what I started doing. It would have made me rewrite too many queries that way.