2

I've this name field in my database and quite a few of the names are duplicates. I want to have them unique. I know I can set the unique = True but that would only help with future entries. I want to know all the current entries with duplicate names. Is there an easy way to print out all the names that are duplicate in the doctor model?

class Doctor(models.Model):
   name = models.CharField(max_length=1300)
James L.
  • 1,143
  • 22
  • 52
  • 1
    Possible duplicate of [Django select only rows with duplicate field values](http://stackoverflow.com/questions/8989221/django-select-only-rows-with-duplicate-field-values) – sobolevn Oct 11 '15 at 18:29

2 Answers2

3

To get rid of all duplicates from your database, you must ask yourself a question first - what to do with them? Remove? Merge somehow? Change name of each duplicate?

After answering that question, simply construct data migration (with RunPython migration) that will do desired operation on each duplicated entry.

To find all duplicates, you can do:

from django.db.models import Count
with_duplicates = Doctor.objects.annotate(count=Count('id')).order_by('id').distinct('name').filter(count__gt=1)

That query will fetch from database first (by id) record from duplicates group (for example if you have 3 doctors named "who", it will fetch first of them and it will fetch only doctors with duplicates).

Having that, for each doctor that have duplicates, you can get list of that duplicates:

with_duplicates = Doctor.objects.annotate(count=Count('id')).order_by('id').distinct('name').filter(count__gt=1)
for doctor in with_duplicates:
    duplicates = Doctor.objects.filter(name=doctor.name).exclude(id=doctor.id)

And do something with them.

alexislg
  • 1,018
  • 13
  • 20
GwynBleidD
  • 20,081
  • 5
  • 46
  • 77
-7

class Doctor(models.Model): name = models.CharField(max_length=1300, unique = True)

Ishwar
  • 1
  • 3