I want to find objects with duplicate values for a specific field, but only if they are related to the same secondary object.
For example, with the following models:
class Blog(models.Model):
    name = models.CharField(max_length=100)
    ...

class Entry(models.Model):
    blog = models.ForeignKey(Blog, on_delete=models.CASCADE)
    headline = models.CharField(max_length=255)
    pub_date = models.DateField()
    ...
How do I select all Entry objects that share a headline, but only if they are published in the same Blog? Headlines shared across different blogs should not be included. (Bonus: an extra filter that keeps only duplicate headlines with the same publication date, still within the same blog.)
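To make the expected result concrete, here is the same rule expressed in plain Python over invented sample rows (blog names and dates are illustrative only):

```python
from collections import Counter

# Invented sample rows: (blog, headline, pub_date)
rows = [
    ("b1", "Hello", "2023-01-01"),
    ("b1", "Hello", "2023-02-01"),  # duplicate of the row above (same blog)
    ("b2", "Hello", "2023-01-01"),  # same headline, different blog -> NOT a duplicate
    ("b2", "World", "2023-01-01"),
]

# A row is a duplicate iff its (blog, headline) pair occurs more than once.
pair_counts = Counter((blog, headline) for blog, headline, _ in rows)
dupes = [r for r in rows if pair_counts[(r[0], r[1])] > 1]
print(dupes)  # only the two "Hello" rows from b1
```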
The closest solution I have is based on this answer: https://stackoverflow.com/a/8989348/20488494
The first part, obtaining duplicate values, is successful:
from django.db.models import Count

dupe_entries_vals = (
    Entry.objects
    .values('blog', 'headline', 'pub_date')
    .order_by()
    .annotate(count=Count('headline'))
    .filter(count__gt=1)
)
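Note that `.values(...).annotate(...)` groups by every field named in `values()`, so with `pub_date` listed, the grouping key here is already (blog, headline, pub_date). A plain-Python emulation over invented sample rows shows the shape of the dicts this queryset yields:

```python
from collections import Counter

# Invented sample rows standing in for Entry values: (blog, headline, pub_date)
rows = [
    (1, "Hello", "2023-01-01"),
    (1, "Hello", "2023-01-01"),
    (1, "Hello", "2023-02-01"),  # same blog and headline, but a different date
    (2, "World", "2023-01-01"),
]

# The grouping key is the full (blog, headline, pub_date) triple.
counts = Counter(rows)
dupe_vals = [
    {"blog": b, "headline": h, "pub_date": d, "count": c}
    for (b, h, d), c in counts.items()
    if c > 1
]
print(dupe_vals)
```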
But then I can't use the second part of that answer, because it loses the related-object filtering. The primary keys are also lost by the `values()` call (necessarily, since otherwise the duplicate grouping fails), so I can't filter on those either.
# No longer restricted to related blog
dupe_entries = Entry.objects.filter(headline__in=[e['headline'] for e in dupe_entries_vals])
# KeyError (id not in `dupe_entries_vals`)
dupe_entries = Entry.objects.filter(pk__in=[e['id'] for e in dupe_entries_vals])
So is there a nice solution for this? Either Django-native, or a direct SQL query that works on most database backends? (If this can't be constructed with Django, please explain why.)
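For reference, this is the shape of SQL I would expect: a self-join against the duplicate (blog_id, headline) groups. A sketch below, demonstrated against an in-memory SQLite database (the table and column names are simplified stand-ins, not the names Django would actually generate, and I've only tried this on SQLite):

```python
import sqlite3

# Simplified stand-in for the Entry table (real Django table names differ).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE entry (id INTEGER PRIMARY KEY, blog_id INTEGER, headline TEXT, pub_date TEXT);
INSERT INTO entry (blog_id, headline, pub_date) VALUES
  (1, 'Hello', '2023-01-01'),
  (1, 'Hello', '2023-02-01'),
  (2, 'Hello', '2023-01-01'),
  (2, 'World', '2023-01-01');
""")

# Join each entry back onto the (blog_id, headline) groups that occur
# more than once, so only same-blog duplicates survive.
sql = """
SELECT e.id, e.blog_id, e.headline
FROM entry e
JOIN (
    SELECT blog_id, headline
    FROM entry
    GROUP BY blog_id, headline
    HAVING COUNT(*) > 1
) d ON d.blog_id = e.blog_id AND d.headline = e.headline
ORDER BY e.id
"""
rows = con.execute(sql).fetchall()
print(rows)  # [(1, 1, 'Hello'), (2, 1, 'Hello')]
```

For the bonus variant, I assume `pub_date` would be added to both the `GROUP BY` and the join condition, but I'd prefer something Django-native if it exists.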