1

I want to set 'unique_together' on my DB (postgres). The problem is that I may already have duplicates on DB, so migration would probably not work. So as I see it - before the deployment I need to run some script to remove all duplications (leave only one of each). I prefer doing it with Django Custom Command.

The table 'mapping' looks something like - Id, user_id, user_role, project_id, user_type. I want to set 'unique_together' for all of them. And the script I use to retrieve duplicated rows is-

duplicates = (Mapping.objects.values('project_id', 'user_id', 'user_type', 'user_role').annotate(
        count=Count('id')).values('project_id', 'user_id', 'user_type', 'user_role').order_by().
           filter(count__gt=1))

It returns list of objects that contains the duplicated attributes. for example:

QuerySet [{'user_id': '2222', 'user_type': '1', 'user_role': '1', 'project_id': UUID('c02bda0e-5488-4519-8f34-96b7f3d36fd6')}, {'user_id': '44444', 'user_type': '1', 'user_role': '1', 'project_id': UUID('8c088f57-ad0c-411b-bc2f-398972872324')}]>

Is there a way to retrieve the Ids directly? Is there a better way?

Zoe
  • 27,060
  • 21
  • 118
  • 148
user2880391
  • 2,683
  • 7
  • 38
  • 77
  • Probably this will help to find and remove duplicate https://stackoverflow.com/questions/23908971/find-duplicate-entries-in-database-for-specific-fields – user8845619 Oct 27 '17 at 20:05

1 Answers1

2

You can try it:

Mapping.objects.values(
    'project_id', 'user_id', 'user_type', 'user_role'
).annotate(count=Count('id')
).annotate(max_id=Max('id')
).values('max_id').order_by().filter(count__gt=1)
Brown Bear
  • 19,655
  • 10
  • 58
  • 76
  • I'm getting the following error, since 'id' is a UUID - LINE 1: SELECT MAX("mapping"."id") AS "max_id" FROM ... ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. But as I understand, it suppose to return only on row, is that correct? – user2880391 Oct 28 '17 at 19:19
  • in your case the trouble is in the id type, you use uuid and database couldn't apply max function to this type of field, you can specify custom function for uuid as example you can look here: [PostgreSQL create UUID max aggregate function ](https://gist.github.com/devodo/8b39748d65e8185fbd89), sorry but now i couldn't give your more help – Brown Bear Oct 30 '17 at 07:52