Example code
Imagine you have the following model:
from django.db import models

class DictionaryEntry(models.Model):
    name = models.CharField(max_length=255, null=False, blank=False)
    definition = models.TextField(null=True, blank=False)
and the following code:
obj, created = DictionaryEntry.objects.get_or_create(
    name='apple', definition='some kind of fruit')
get_or_create
In case you have not seen the code for get_or_create:
# simplified
def get_or_create(cls, **kwargs):
    try:
        instance, created = cls.get(**kwargs), False
    except cls.DoesNotExist:
        instance, created = cls.create(**kwargs), True
    return instance, created
about webservers...
Now imagine that you have a webserver with 2 worker processes, each with its own connection to the database, so both can access it concurrently.
# simplified
def get_or_create(cls, **kwargs):
    try:
        instance, created = cls.get(**kwargs), False  # <===== nope not there...
    except cls.DoesNotExist:
        instance, created = cls.create(**kwargs), True
    return instance, created
If the timing goes right (or wrong, depending on how you want to phrase it), both processes can do the lookup and not find the item. Both may then create the item. Everything is fine...
MultipleObjectsReturned: get() returned more than one DictionaryEntry -- it returned 2!
Everything is fine... until you call get_or_create a third time. "Third time's a charm", they say.
# simplified
def get_or_create(cls, **kwargs):
    try:
        instance, created = cls.get(**kwargs), False  # <==== kaboom, 2 objects.
    except cls.DoesNotExist:
        instance, created = cls.create(**kwargs), True
    return instance, created
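The whole scenario (two workers missing the lookup, then a third call blowing up) can be reproduced without a database. The sketch below is only an illustration under stated assumptions: Table, racy_get_or_create and simulate are hypothetical names, an in-memory list stands in for the table, threads stand in for the worker processes, and a threading.Barrier forces the unlucky timing that would normally happen only now and then.

```python
import threading


class DoesNotExist(Exception):
    pass


class MultipleObjectsReturned(Exception):
    pass


class Table:
    """Hypothetical in-memory stand-in for a table with no unique constraint."""

    def __init__(self):
        self.rows = []
        self._lock = threading.Lock()

    def get(self, **kwargs):
        with self._lock:
            matches = [row for row in self.rows if row == kwargs]
        if not matches:
            raise DoesNotExist()
        if len(matches) > 1:
            raise MultipleObjectsReturned(
                "get() returned %d rows" % len(matches))
        return matches[0]

    def create(self, **kwargs):
        row = dict(kwargs)
        with self._lock:
            self.rows.append(row)
        return row


def racy_get_or_create(table, barrier, **kwargs):
    try:
        return table.get(**kwargs), False
    except DoesNotExist:
        # Wait until the other worker has also missed the lookup,
        # so that both of them go on to insert.
        barrier.wait()
        return table.create(**kwargs), True


def simulate():
    table = Table()
    barrier = threading.Barrier(2)
    results = []

    def worker():
        results.append(racy_get_or_create(
            table, barrier, name='apple', definition='some kind of fruit'))

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return table, results
```

After simulate() returns, the table holds two identical rows and both workers report created=True; any later table.get(name='apple', definition='some kind of fruit') raises MultipleObjectsReturned.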
unique_together
How could you solve this? Maybe enforce a constraint at the database level:
class DictionaryEntry(models.Model):
    name = models.CharField(max_length=255, null=False, blank=False)
    definition = models.TextField(null=True, blank=False)

    class Meta:
        unique_together = (('name', 'definition'),)
Back to the function:
# simplified
def get_or_create(cls, **kwargs):
    try:
        instance, created = cls.get(**kwargs), False
    except cls.DoesNotExist:
        instance, created = cls.create(**kwargs), True  # <==== this handles IntegrityError
    return instance, created
Say you have the same race as before: both processes fail to find the item and proceed to the insert. Each insert runs in its own transaction; one of them wins the race, while the other gets an IntegrityError from the database. The real get_or_create catches that error and fetches the row the winner created.
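To make the "loser sees an IntegrityError" path concrete, here is a minimal sketch using the standard library's sqlite3 instead of Django. It is not Django's implementation: the entry table, the helper names and the after_miss hook are all hypothetical, and the hook exists only so that a single process can imitate the other worker inserting between the lookup and the insert.

```python
import sqlite3


def connect():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entry ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " definition TEXT,"
        " UNIQUE (name, definition))"
    )
    return conn


def get(conn, name, definition):
    # Returns a (id, name, definition) tuple, or None when nothing matches.
    return conn.execute(
        "SELECT id, name, definition FROM entry"
        " WHERE name = ? AND definition = ?",
        (name, definition),
    ).fetchone()


def get_or_create(conn, name, definition, after_miss=None):
    row = get(conn, name, definition)
    if row is not None:
        return row, False
    if after_miss is not None:
        after_miss()  # hypothetical hook: the other worker wins the race here
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO entry (name, definition) VALUES (?, ?)",
                (name, definition),
            )
    except sqlite3.IntegrityError:
        # We lost the race; the unique constraint rejected our insert,
        # so return the row that the winner created.
        return get(conn, name, definition), False
    return get(conn, name, definition), True
```

With the constraint in place the table can never hold two identical rows, so the loser simply comes back with created=False.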
mysql ?
The example uses a TextField, which on MySQL translates to a LONGTEXT (in my case). Adding the unique_together constraint makes syncdb fail:
django.db.utils.InternalError: (1170, u"BLOB/TEXT column 'definition' used in key specification without a key length")
So, no luck: you may have to deal with MultipleObjectsReturned manually.
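One way to deal with it manually is to stop calling get() on data that may already be duplicated. The helper below is a hedged sketch, not part of Django: get_first_or_create is a hypothetical name, and manager is assumed to behave like DictionaryEntry.objects on a Django version whose querysets provide first().

```python
def get_first_or_create(manager, **kwargs):
    """Like get_or_create, but tolerates rows that are already duplicated.

    `manager` is assumed to offer filter(), order_by(), first() and
    create(), as a Django manager such as DictionaryEntry.objects does.
    """
    # Pick the oldest matching row instead of calling get(), so existing
    # duplicates no longer raise MultipleObjectsReturned.
    instance = manager.filter(**kwargs).order_by('pk').first()
    if instance is not None:
        return instance, False
    # Still racy: two workers can both reach this line and insert twice.
    return manager.create(**kwargs), True
```

This only hides the symptom. Duplicates can still be created, so a periodic cleanup of duplicate rows may be needed as well.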
possible solutions
- It may be possible to replace the TextField with a CharField.
- It may be possible to add a CharField holding a strong hash of the TextField, which you can compute in pre_save and use in unique_together.
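The second idea can be sketched without Django. In the example below, definition_hash is a hypothetical helper that a pre_save handler could call to fill the extra column, SHA-256 is one possible choice of strong hash, and sqlite3 is used only to show that a fixed-length hash column can carry the unique constraint in place of the long text.

```python
import hashlib
import sqlite3


def definition_hash(definition):
    """Return a fixed-length (64 hex characters) SHA-256 digest of the text."""
    return hashlib.sha256(definition.encode('utf-8')).hexdigest()


def connect():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entry ("
        " id INTEGER PRIMARY KEY,"
        " name VARCHAR(255) NOT NULL,"
        " definition TEXT,"
        " definition_hash CHAR(64) NOT NULL,"
        " UNIQUE (name, definition_hash))"  # constraint on the hash, not the text
    )
    return conn


def insert_entry(conn, name, definition):
    # Compute the hash just before saving, as a pre_save handler would.
    with conn:
        conn.execute(
            "INSERT INTO entry (name, definition, definition_hash)"
            " VALUES (?, ?, ?)",
            (name, definition, definition_hash(definition)),
        )
```

Inserting the same name and definition twice raises sqlite3.IntegrityError, while the same name with a different definition is accepted.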