
tl;dr: Why does Django's uniqueness check on INSERT require a SELECT query, and can I permanently disable it?

I'm working on heavily optimizing a Django app that writes to a PostgreSQL database. My model has a UUID column that serves as its primary key. It is the only unique field in the model.

id = models.UUIDField(
    primary_key=True,
    default=uuid.uuid4,
    editable=False,
    null=False,
    blank=False,
    help_text='The unique identifier of the Node.',
)

The issue I'm encountering is that when attempting to save a new item, Django automatically performs a uniqueness check query prior to the insert:

SELECT (1) AS "a" FROM "customer" WHERE "customer"."id" = '1271a1c8-5f6d-4961-b2e9-5e93e450fd4e'::uuid LIMIT 1

This results in an extra round trip to the database. I know that the field must be unique, but Django has already configured this constraint at the database level, so an attempt to insert a row with a duplicate value will fail with a database error anyway.

I've implemented a workaround which suppresses the query by adding the following to my model:

def validate_unique(self, *args, **kwargs):
    # Skip Django's uniqueness check on `id` (it costs an extra SELECT).
    # The database's primary key constraint still rejects duplicates.
    # `or []` guards against an explicit exclude=None.
    current_exclude = set(kwargs.get('exclude') or [])
    current_exclude.add('id')
    kwargs['exclude'] = list(current_exclude)

    super().validate_unique(*args, **kwargs)

This will ensure that uniqueness on the id field is never checked.

This works: I no longer get the extra query. I also verified that if I try to re-insert a duplicate UUID, I do get an error, and it originates from the database.
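To illustrate the point that the database constraint alone is enough to reject duplicates, here is a minimal sketch using only the standard library. It assumes SQLite in place of PostgreSQL so that it is self-contained, and `insert_customer` is a hypothetical helper, not part of Django or of my app. The table name mirrors the query above.

```python
import sqlite3
import uuid

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id TEXT PRIMARY KEY)")


def insert_customer(conn, customer_id):
    """Hypothetical helper: INSERT with no prior SELECT.

    Returns True if the row was inserted, False if the primary key
    constraint rejected it as a duplicate.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO customer (id) VALUES (?)",
                (str(customer_id),),
            )
        return True
    except sqlite3.IntegrityError:
        return False


new_id = uuid.uuid4()
first = insert_customer(conn, new_id)   # first insert goes through
second = insert_customer(conn, new_id)  # same id again is rejected
row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(first, second, row_count)
```

Each insert is a single statement, and the duplicate is still caught. In Django the equivalent failure surfaces as `django.db.IntegrityError` raised from `save()`.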

My question is this: Why does Django do this? I'm tempted to prevent Django from checking uniqueness ever, unless the extra round trip to the DB accomplishes some valuable purpose.

Env:

django==2.2.12
psycopg2-binary==2.8.5
Caleb Mac
  • For reference, here's what Django docs say: https://docs.djangoproject.com/en/2.2/ref/models/instances/#how-django-knows-to-update-vs-insert However `select_on_save` hasn't been set to true. So it seems it's not following default behavior. – Caleb Mac Sep 04 '20 at 14:11
  • I added an answer, but it would help if you explained the context of your save call. Is it coming from a `ModelForm`? Are you using `force_insert`? – Kevin Christopher Henry Sep 04 '20 at 16:51

1 Answer


In Django, model validation is a distinct step from model saving. It appears that whatever you're doing is triggering validation.

There are a number of good reasons for those to be separate steps. One is that you can express many more constraints in arbitrary Python code than you can with database constraints. Another is that it allows you to generate much more descriptive error messages than you would get by trying to parse non-standardized database errors. Another is that sometimes you simply want to know whether something's valid but don't want to actually save it.

By default, Django does not validate models before saving them. Some Django components, though, like the admin (more generally, ModelForms) do trigger validation.

So, you need to figure out why validation is being triggered in your case, and if that's not what you want, prevent it.
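One way to track down the trigger is to record the call stack at the moment validation runs. The sketch below is a toy, not Django's implementation: `ToyModel` and `ValidatingModel` are hypothetical stand-ins that only mimic the `save()` / `full_clean()` / `validate_unique()` call flow, with a counter in place of the real uniqueness SELECT.

```python
import traceback


class ToyModel:
    """Hypothetical stand-in for a model class (not Django code)."""

    lookups = 0  # counts simulated uniqueness SELECTs

    def __init__(self, pk):
        self.pk = pk
        self.validation_callers = []

    def validate_unique(self):
        # Simulates the extra "SELECT ... LIMIT 1" round trip.
        ToyModel.lookups += 1
        # Record the names of the functions that led here,
        # excluding this frame itself.
        stack = traceback.extract_stack()[:-1]
        self.validation_callers = [frame.name for frame in stack]

    def full_clean(self):
        self.validate_unique()

    def save(self):
        # Plain save: insert only, no validation.
        pass


class ValidatingModel(ToyModel):
    """Mimics a base class whose save() override forces validation."""

    def save(self):
        self.full_clean()
        super().save()


plain = ToyModel(1)
plain.save()
lookups_after_plain = ToyModel.lookups

validated = ValidatingModel(2)
validated.save()
lookups_after_validated = ToyModel.lookups

print(lookups_after_plain, lookups_after_validated)
print(validated.validation_callers)
```

In a real project, the same idea would be to temporarily call `traceback.print_stack()` inside a `validate_unique()` override on the model and read off which caller invoked `full_clean()`.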

Kevin Christopher Henry
  • Thank you! This isn't related to a ModelForm. Any ideas on how to find out why it is being validated? EDIT: a full-text search for `clean` and `validate` shows that we call `full_clean()` in an override of `save` on a base model class... that seems like the culprit – Caleb Mac Sep 15 '20 at 22:05
  • @CalebMac: That explains it. Note that Django does not particularly support doing `full_clean()` on `save()`, and many people (including me, and this [core contributor](https://code.djangoproject.com/ticket/29655#comment:3)) consider it an anti-pattern. That said, you can override `validate_unique()` if you want. Of course, that will prevent validation even when you want it. And if performance is your concern, note that doing `full_clean()` on `save()` will cause the validation to happen twice when using the admin or `ModelForms`, since they already call `full_clean()`. – Kevin Christopher Henry Sep 15 '20 at 23:58