1

I want to select rows in Django which have multiple rows with duplicated values.

For example I have a model which looks like this

class User:
    first_name = models.CharField(...)
    last_name = models.CharField(...)
    ...

and a database looking like this

first_name | last_name | ...
---------- | --------- | ...
Paul       | Adams     | ...
John       | Smith     | ...
Tom        | Mueller   | ...
John       | Smith     | ...
John       | Adams     | ...
Paul       | Adams     | ...

and I want to get the first_name and last_name values which exist multiple times in the database. The other values must not be equal of the rows. In this example I want to get "John Smith" and "Paul Adams".

leona
  • 423
  • 3
  • 16
  • Is [this](https://stackoverflow.com/questions/8989221/django-select-only-rows-with-duplicate-field-values) an answer to your question? – Brambor Nov 26 '20 at 10:11
  • @Brambor No it isn't. I found this question already, but it only works if the value of one field is duplicated, not of multiple fields. – leona Nov 26 '20 at 10:24
  • Can't you just use the answer twice to filter by two columns? – Brambor Nov 26 '20 at 10:26
  • I tried it and it didn't work, maybe I did something wrong. I can try again later. – leona Nov 26 '20 at 10:28
  • try doing something like `.values("first_field", "second_field")` – Alvi15 Nov 26 '20 at 10:29

2 Answers2

2

You can do this in views.py or in your django shell using py manage.py shell

from django.db.models import Count
#import your User model 

users_output = User.objects.values('first_name','last_name').annotate(first_name_count=Count('first_name'),last_name_count = Count('last_name')).filter(first_name_count__gt=1,last_name_count__gt=1)

the above query will return a query set something like this

<QuerySet
 [{'first_name': 'John', 'last_name': 'Smith', 'first_name_count': 2, 'last_name_count': 2},
 {'first_name': 'Paul', 'last_name': 'Adams', 'first_name_count': 2, 'last_name_count': 2}]
>
2

First of all, you need to concat fields.

from django.db.models import Count, Value
from django.db.models.functions import Concat

queryset = User.objects.annotate(full_name=Concat('first_name', Value(' '), 'last_name', output_field=CharField())

then you need to group and calculate the count for each unique full_name

queryset = queryset.values('full_name').annotate(full_name_count=Count('full_name'))

and just filter those with count bigger than one.

queryset = queryset.filter(full_name_count__gt=1)
Davit Tovmasyan
  • 3,238
  • 2
  • 20
  • 36