I want to add a 'unique_together' constraint to a table in my DB (Postgres). The problem is that the table may already contain duplicates, so the migration would probably fail. As I see it, before deploying I need to run a script that removes all duplicates (keeping only one row of each). I'd prefer to do this with a custom Django management command.
The 'mapping' table looks something like: Id, user_id, user_role, project_id, user_type. I want to set 'unique_together' on all four non-Id columns. The query I use to retrieve the duplicated rows is:
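from django.db.models import Count
duplicates = (
    Mapping.objects.values('project_id', 'user_id', 'user_type', 'user_role')
    .annotate(count=Count('id'))  # one row per combination, with its row count
    .values('project_id', 'user_id', 'user_type', 'user_role')
    .order_by()  # clear any default ordering so it does not affect the grouping
    .filter(count__gt=1)  # keep only combinations that occur more than once
)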
It returns a list of dicts containing the duplicated attribute values, for example:
<QuerySet [{'user_id': '2222', 'user_type': '1', 'user_role': '1', 'project_id': UUID('c02bda0e-5488-4519-8f34-96b7f3d36fd6')}, {'user_id': '44444', 'user_type': '1', 'user_role': '1', 'project_id': UUID('8c088f57-ad0c-411b-bc2f-398972872324')}]>
Is there a way to retrieve the Ids directly? Is there a better way?
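To make the goal concrete, here is a minimal sketch of the end result I am after, written at the SQL level against an in-memory SQLite table so that it runs on its own rather than against my Postgres DB. The sample rows, the integer Id, the text column types and the index name mapping_uniq are all made up for illustration, and it assumes that keeping the lowest Id of each group is acceptable.

```python
import sqlite3

# Hypothetical stand-in for the 'mapping' table; SQLite is used only so the
# sketch is runnable without a Postgres server. All sample values are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mapping ("
    "id INTEGER PRIMARY KEY, user_id TEXT, user_role TEXT, "
    "project_id TEXT, user_type TEXT)"
)
rows = [
    ("2222", "1", "p1", "1"),
    ("2222", "1", "p1", "1"),   # duplicate of the first row
    ("2222", "1", "p1", "1"),   # another duplicate
    ("44444", "1", "p2", "1"),
    ("44444", "1", "p2", "1"),  # duplicate
    ("55555", "2", "p3", "1"),  # unique row
]
conn.executemany(
    "INSERT INTO mapping (user_id, user_role, project_id, user_type) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

GROUP_COLS = "project_id, user_id, user_type, user_role"

# Ids of every row that is NOT the lowest-id row of its group,
# i.e. the surplus copies that would have to be removed.
duplicate_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM mapping WHERE id NOT IN "
        f"(SELECT MIN(id) FROM mapping GROUP BY {GROUP_COLS}) "
        "ORDER BY id"
    )
]

# Delete the surplus rows, keeping exactly one row per group.
conn.executemany(
    "DELETE FROM mapping WHERE id = ?", [(i,) for i in duplicate_ids]
)

# With the duplicates gone, a unique index over the four columns can be built.
conn.execute(f"CREATE UNIQUE INDEX mapping_uniq ON mapping ({GROUP_COLS})")

remaining = conn.execute("SELECT COUNT(*) FROM mapping").fetchone()[0]
print(duplicate_ids, remaining)
```

Ideally the management command would do the equivalent through the ORM: collect the Ids of the surplus rows, delete them, and only then apply the migration that adds the constraint.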