
I am trying to clear out and reload a table in my Django model, but delete() does not remove all of the rows:

>>> models.PuzzleSum.objects.all().count()
2644
>>> models.PuzzleSum.objects.all().delete()
>>> models.PuzzleSum.objects.all().count()
2535

... wtf? Only 109 rows get deleted, and it is always that same magic number. I know I could just go into the database and delete them by hand (or loop until they're all gone), but I'm curious why this happens.

(Django 1.3.1 on Mac OS X Lion btw)

AlanL
  • dunno, maybe PuzzleSum's base QuerySet got set to a custom Manager? e.g. https://docs.djangoproject.com/en/dev/topics/db/managers/#modifying-initial-manager-querysets – David Lam Jul 30 '12 at 19:14
  • Good idea, but no. It's a data load script that runs from a django shell, and the (attempted) delete is the first thing I do after importing the models. – AlanL Jul 30 '12 at 19:32
  • Could you add the code for PuzzleSum and any related models? – Chris Lawlor Jul 30 '12 at 20:11
  • How did your model get into models (as in models.PuzzleSum.objects.all())? Your model should be a subclass of django.db.models.Model. Is it? – erikvw Jul 30 '12 at 20:54
  • Got it! Yes, my model is a subclass of models.Model, but I have overridden __hash__ and __eq__ because of a particular definition of "duplicate" that I want to use. And guess what: I have 109 distinct hash values. Django must be using a set of objects somewhere internally in its delete logic. – AlanL Jul 30 '12 at 21:12

2 Answers


Yes, Django collects the objects to delete in a dict keyed by model, and then deletes them by primary key. The instances are de-duplicated as they are collected, which is why only the ones that count as unique under your __eq__ and __hash__ get deleted. This is from the Django Collector class, which collects the model instances for deletion:

self.data = SortedDict([(model, self.data[model])
                        for model in sorted_models])

and then:

# delete instances
for model, instances in self.data.iteritems():
    query = sql.DeleteQuery(model)
    pk_list = [obj.pk for obj in instances]
    query.delete_batch(pk_list, self.using)

Since you've overridden the __hash__ (and __eq__) of your model, instances that compare equal collapse into one when they are stored in self.data, so only one instance per distinct value is collected and then deleted.
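The effect can be reproduced without Django. The following is a minimal sketch, not Django's actual code: `PuzzleSumLike`, its `digits` field, and `collect_pks` are hypothetical names invented for illustration, and the set stands in for whatever de-duplicating container the collector uses.

```python
class PuzzleSumLike(object):
    """Hypothetical stand-in for a model that overrides equality."""

    def __init__(self, pk, digits):
        self.pk = pk
        self.digits = digits  # hypothetical field that defines a "duplicate"

    def __eq__(self, other):
        return isinstance(other, PuzzleSumLike) and self.digits == other.digits

    def __hash__(self):
        return hash(self.digits)


def collect_pks(instances):
    """Mimic a collector that de-duplicates instances before deleting."""
    unique = set(instances)  # objects that compare equal collapse into one
    return sorted(obj.pk for obj in unique)


# Ten rows with distinct primary keys, but only three distinct "digits" values.
rows = [PuzzleSumLike(pk, digits=pk % 3) for pk in range(1, 11)]
pks_to_delete = collect_pks(rows)
print(len(rows), len(pks_to_delete))  # prints: 10 3
```

Only three primary keys reach the delete step even though ten rows exist, which mirrors 109 rows being deleted out of 2644.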

Tisho

Converting my comment above into an answer to the question:

I have overridden __hash__ and __eq__ in PuzzleSum because of a particular definition of "duplicate" that I want to use. And guess what: I have 109 distinct hash values. Django must be using a set of objects somewhere internally in its delete logic.
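A quick way to check this kind of suspicion is to compare the number of objects with the number that survive being put into a set. This is a rough sketch with hypothetical names (`duplicate_report`, `KeyedRow`); in a Django shell the argument would presumably be something like `models.PuzzleSum.objects.all()`.

```python
def duplicate_report(objs):
    """Return (total, distinct, collapsed) for objects with custom equality."""
    objs = list(objs)
    total = len(objs)
    distinct = len(set(objs))  # relies on the objects' __eq__ and __hash__
    return total, distinct, total - distinct


class KeyedRow(object):
    """Hypothetical object whose equality ignores its primary key."""

    def __init__(self, pk, key):
        self.pk = pk
        self.key = key

    def __eq__(self, other):
        return isinstance(other, KeyedRow) and self.key == other.key

    def __hash__(self):
        return hash(self.key)


sample = [KeyedRow(pk, key) for pk, key in enumerate([1, 1, 2, 2, 2, 3], start=1)]
report = duplicate_report(sample)
print(report)  # prints: (6, 3, 3)
```

With the numbers from the question, the same report would read 2644 total, 109 distinct, and 2535 left behind after the delete.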

AlanL