4

I want to get all duplicates based on a case insensitive field value.

Basically to rewrite this SQL query

SELECT count(*), lower(name)
FROM manufacturer
GROUP BY lower(name)
HAVING count(*) > 1;

with Django ORM. I was hoping something like this would do the trick

from django.db.models import Count
from django.db.models.functions import Lower

from myapp.models import Manufacturer


qs = Manufacturer.objects.annotate(
    name_lower=Lower('name'),
    cnt=Count('name_lower')
).filter('cnt__gt'=1)

but of course it didn't work.

Any idea how to do this?

Dušan Maďar
  • 9,269
  • 5
  • 49
  • 64

1 Answers1

9

you can try it:

qs = Manufacturer.objects.annotate(lname=Lower('name')
     ).values('lname').annotate(cnt=Count(Lower('name'))
     ).values('lname', 'cnt').filter(cnt__gt=1).order_by('lname', 'cnt')

why should add the order_by ordering-or-order-by:

the sql query looks like:

SELECT 
    LOWER("products_manufacturer"."name") AS "lname",
    COUNT(LOWER("products_manufacturer"."name")) AS "cnt"
FROM "products_manufacturer"
GROUP BY LOWER("products_manufacturer"."name")
HAVING COUNT(LOWER("products_manufacturer"."name")) > 1
ORDER BY "lname" ASC, "cnt" ASC
Brown Bear
  • 19,655
  • 10
  • 58
  • 76
  • This works, thank you! Though, it doesn't work if `order_by('lname', 'cnt')` isn't part of the query. Do you know why? Also, `annotate(cnt=Count(Lower('name'))` can be simplified to `annotate(cnt=Count('lname')`. – Dušan Maďar Oct 25 '17 at 14:13
  • Order in which `annotate()` and `values()` clauses are applied to a query is significant: https://docs.djangoproject.com/en/1.11/topics/db/aggregation/#order-of-annotate-and-values-clauses – Dušan Maďar Oct 25 '17 at 14:24