19

We run into a known issue in Django:

IntegrityError during Many To Many add()

There is a race condition if several processes/requests try to add the same row to a ManyToManyRelation.

How to work around this?

Environment:

  • Django 1.9
  • Linux Server
  • Postgres 9.3 (An update could be made, if necessary)

Details

How to reproduce it:

my_user.groups.add(foo_group)

The above fails if two requests try to execute this code at once. Here is the database table and the failing constraint:

myapp_egs_d=> \d auth_user_groups
  id       | integer | not null default ...
  user_id  | integer | not null
  group_id | integer | not null
Indexes:
           "auth_user_groups_pkey" PRIMARY KEY, btree (id)
fails ==>  "auth_user_groups_user_id_group_id_key" UNIQUE CONSTRAINT,
                                            btree (user_id, group_id)

Environment

Since this only happens on production machines, and all production machines in my context run Postgres, a Postgres-only solution would work.
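To make the failure mode concrete, here is a minimal sketch of the race (not the actual Django code) using the standard-library sqlite3 module in place of Postgres; the table and constraint mirror auth_user_groups above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE auth_user_groups (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        group_id INTEGER NOT NULL,
        UNIQUE (user_id, group_id)
    )
""")

def add_user_to_group(user_id, group_id):
    # This mimics what two racing requests end up doing: each one's
    # "does the row exist?" check passed, so both attempt the INSERT.
    conn.execute(
        "INSERT INTO auth_user_groups (user_id, group_id) VALUES (?, ?)",
        (user_id, group_id),
    )

add_user_to_group(1, 1)          # first request: succeeds
try:
    add_user_to_group(1, 1)      # second request: violates the unique constraint
except sqlite3.IntegrityError as e:
    print("IntegrityError:", e)
```

The second insert fails on the (user_id, group_id) unique constraint, exactly like the `auth_user_groups_user_id_group_id_key` violation above.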

guettli
  • @e4c5 yes, I can confirm it. I added an example. – guettli Jun 21 '16 at 12:28
  • thanks for the update, posting your model might also help. Or are you using django.contrib.auth.models.User? – e4c5 Jun 21 '16 at 12:31
  • @e4c5 Does the model matter? AFAIK there is a race condition in add() for all ManyToManyFields. – guettli Jun 21 '16 at 13:07
  • If you're just looking for a workaround, would http://stackoverflow.com/questions/19686204/django-orm-and-locking-table suffice? I'm not saying this is a perfect answer, but it would help to understand what quality bar is good enough for the final solution... – Peter Brittain Jun 21 '16 at 13:21
  • @guettli The model does not matter but understanding the problem requires quite a bit of reading off-site: the bug report, but even the report is not crystal clear. I did not understand the problem until I read the code in Django. I initially thought the bug report was about the ids created *for the table that models the relation*. In other words, I thought `auth_user_groups_pkey` was violated. Your edit showed that was not the case but *for a moment* at least I thought like Aquiles did. It is not surprising that you are getting comments or answers that seem to miss the mark. – Louis Jun 21 '16 at 13:45
  • @PeterBrittain your linked StackO answer could help: Locking the table ... But I guess this would reduce the db speed. I would like to avoid locks. But thank you very much for pointing me into this direction. – guettli Jun 21 '16 at 13:51
  • @Louis yes, this is a special problem. I guess it is not related to Django at all. How can Django know if another process is accessing the same db at the very same moment... – guettli Jun 21 '16 at 13:51
  • @guettli Oh, it is a Django problem. It can be solved with solutions that are not Django-specific but the fact is that the specs that Django advertises for `.add()` are not being respected. It baffles me that the problem has not been fixed yet. – Louis Jun 21 '16 at 14:01
  • what model you are using, does matter a great deal. If the models were your own, there is a simple work around. – e4c5 Jun 22 '16 at 06:48
  • @e4c5 please explain the simple work around in an answer. I am very interested. – guettli Jun 22 '16 at 07:43

3 Answers

16

Can the error be reproduced?

Yes, let us use the famed Publication and Article models from the Django docs. Then, let's create a few threads.

import threading
import random

def populate():
    # create some objects to work with
    for i in range(100):
        Article.objects.create(headline='headline{0}'.format(i))
        Publication.objects.create(title='title{0}'.format(i))
    print('created objects')


class MyThread(threading.Thread):

    def run(self):
        for q in range(1, 100):
            for i in range(1, 5):
                pub = Publication.objects.all()[random.randint(1, 2)]
                for j in range(1, 5):
                    article = Article.objects.all()[random.randint(1, 15)]
                    pub.article_set.add(article)
            print(self.name)


Article.objects.all().delete()
Publication.objects.all().delete()
populate()
thrd1 = MyThread()
thrd2 = MyThread()
thrd3 = MyThread()

thrd1.start()
thrd2.start()
thrd3.start()

You are sure to see unique key constraint violations of the type reported in the bug report. If you don't see them, try increasing the number of threads or iterations.

Is there a work around?

Yes. Use a through model and get_or_create. Here is models.py, adapted from the example in the Django docs.

class Publication(models.Model):
    title = models.CharField(max_length=30)

    def __str__(self):              # __unicode__ on Python 2
        return self.title

    class Meta:
        ordering = ('title',)

class Article(models.Model):
    headline = models.CharField(max_length=100)
    publications = models.ManyToManyField(Publication, through='ArticlePublication')

    def __str__(self):              # __unicode__ on Python 2
        return self.headline

    class Meta:
        ordering = ('headline',)

class ArticlePublication(models.Model):
    article = models.ForeignKey('Article', on_delete=models.CASCADE)
    publication = models.ForeignKey('Publication', on_delete=models.CASCADE)
    class Meta:
        unique_together = ('article','publication')

Here is the new threading class which is a modification of the one above.

class MyThread2(threading.Thread):

    def run(self):
        for q in range(1, 100):
            for i in range(1, 5):
                pub = Publication.objects.all()[random.randint(1, 2)]
                for j in range(1, 5):
                    article = Article.objects.all()[random.randint(1, 15)]
                    ap, c = ArticlePublication.objects.get_or_create(
                        article=article, publication=pub)
            print('Get or create', self.name)

You will find that the exception no longer shows up. Feel free to increase the number of iterations. I went up to 1000 iterations with get_or_create and it never threw the exception, whereas add() usually threw one within 20 iterations.

Why does this work?

Because get_or_create is atomic.

This method is atomic assuming correct usage, correct database configuration, and correct behavior of the underlying database. However, if uniqueness is not enforced at the database level for the kwargs used in a get_or_create call (see unique or unique_together), this method is prone to a race-condition which can result in multiple rows with the same parameters being inserted simultaneously.
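The pattern behind this (insert first, catch the database's uniqueness error, then fetch the row a concurrent writer just created) can be sketched without Django at all. The following is a hand-rolled sketch of a get-or-create helper, not Django's actual implementation, using the standard-library sqlite3 module in place of Postgres:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE articlepublication (
        id INTEGER PRIMARY KEY,
        article_id INTEGER NOT NULL,
        publication_id INTEGER NOT NULL,
        UNIQUE (article_id, publication_id)
    )
""")

def get_or_create(article_id, publication_id):
    """Try to fetch; if absent, try to insert; on a uniqueness error,
    fall back to fetching the row a concurrent writer just inserted."""
    row = conn.execute(
        "SELECT id FROM articlepublication"
        " WHERE article_id = ? AND publication_id = ?",
        (article_id, publication_id),
    ).fetchone()
    if row:
        return row[0], False
    try:
        cur = conn.execute(
            "INSERT INTO articlepublication (article_id, publication_id)"
            " VALUES (?, ?)",
            (article_id, publication_id),
        )
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # Another process won the race; the row exists now, so re-fetch it.
        row = conn.execute(
            "SELECT id FROM articlepublication"
            " WHERE article_id = ? AND publication_id = ?",
            (article_id, publication_id),
        ).fetchone()
        return row[0], False

pk1, created1 = get_or_create(1, 1)   # created
pk2, created2 = get_or_create(1, 1)   # already exists: no error
print(created1, created2)             # True False
```

The crucial difference from add() is that the unique constraint itself arbitrates the race: whichever writer loses simply re-fetches instead of propagating the IntegrityError.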

Update: thanks to @Louis for pointing out that the through model can in fact be eliminated. Thus the get_or_create in MyThread2 can be changed to:

ap, c = article.publications.through.objects.get_or_create(
    article=article, publication=pub)
e4c5
  • `get_or_create()` was fixed here: https://github.com/django/django/commit/1b331f6c1e ... would be nice to have this for `add()` of ManyToMany, too. Thank you very much for this answer. – guettli Jun 22 '16 at 13:11
  • glad you find it useful. – e4c5 Jun 22 '16 at 13:12
  • Is there something special you do with `ArticlePublication`? If not, then I'm pretty sure you do not need it. If you do not specify `through` on an `ManyToManyField`, then Django creates one for you. I'm able to access `Article.publication.through.object.get_or_create` using the models shown in the Django documentation. – Louis Jun 22 '16 at 13:25
  • This is short, beautiful and working: `article.publications.through.objects.get_or_create( article=article, publication=pub)`. Thank you all for the great team work! You get the bounty :-) – guettli Jun 23 '16 at 10:53
  • just to add it looks like `bulk_create` now has an `ignore_conflicts` kwarg as of 2.2: https://github.com/django/django/commit/f1fbef6cd171ddfae41fcc901f1f60ccad039f51 – jmunsch Feb 20 '19 at 18:53
0

If you are ready to solve it in PostgreSQL you may do the following in psql:

-- Create a RULE and function to intercept all INSERT attempts to the table and perform a check whether row exists:

CREATE RULE auth_user_group_ins AS 
    ON INSERT TO auth_user_groups 
    WHERE (EXISTS (SELECT 1 
                   FROM auth_user_groups 
                   WHERE user_id=NEW.user_id AND group_id=NEW.group_id)) 
    DO INSTEAD NOTHING;

The rule will then silently ignore duplicates; only new rows are inserted into the table:

db=# TRUNCATE auth_user_groups;
TRUNCATE TABLE

db=# INSERT INTO auth_user_groups (user_id, group_id) VALUES (1,1);
INSERT 0 1   --  added

db=# INSERT INTO auth_user_groups (user_id, group_id) VALUES (1,1);
INSERT 0 0   -- no insert no error

db=# INSERT INTO auth_user_groups (user_id, group_id) VALUES (1,2);
INSERT 0 1   -- added

db=# SELECT * FROM auth_user_groups;  -- check
 id | user_id | group_id
----+---------+----------
 14 |       1 |        1
 16 |       1 |        2
(2 rows)

db=#
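As an aside (an assumption on my part, since the question mentions Postgres 9.3 but says an update is possible): Postgres 9.5 introduced INSERT ... ON CONFLICT DO NOTHING, which gives the same ignore-duplicates behaviour without a rule. SQLite (3.24+) supports the same clause, so the behaviour can be sketched with the standard-library sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE auth_user_groups (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        group_id INTEGER NOT NULL,
        UNIQUE (user_id, group_id)
    )
""")

def add_ignoring_duplicates(user_id, group_id):
    # ON CONFLICT DO NOTHING: a duplicate insert is silently skipped
    # instead of raising a unique-constraint error.
    conn.execute(
        "INSERT INTO auth_user_groups (user_id, group_id) VALUES (?, ?)"
        " ON CONFLICT DO NOTHING",
        (user_id, group_id),
    )

add_ignoring_duplicates(1, 1)   # added
add_ignoring_duplicates(1, 1)   # duplicate: no insert, no error
add_ignoring_duplicates(1, 2)   # added
count = conn.execute("SELECT COUNT(*) FROM auth_user_groups").fetchone()[0]
print(count)   # 2
```

This mirrors the psql session above (INSERT 0 1, INSERT 0 0, INSERT 0 1) while keeping the logic in the statement itself rather than in a table-level rule.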
Eugene Lisitsky
  • Does this work if 100 processes try to insert values at once? – guettli Jun 22 '16 at 08:00
  • If you have a really large number of inserts, you may possibly run into transaction clashes. – Eugene Lisitsky Jun 22 '16 at 10:33
  • Could you describe in details situation when one user is being added to the same group 100 times at a moment? May be we need some other tools to solve it? BTW, I've updated a solution using RULES to make it easier. – Eugene Lisitsky Jun 22 '16 at 10:41
  • @EugeneLisitsky The example with user and group is just an example. The problem recurs everywhere a `ManyToManyField` exists. So while the scenario of 100 processes trying to add user to the same group may seem bizarre that's not the only case. It could be 100 users saving in their favorites a post that has just appeared on a site. – Louis Jun 22 '16 at 10:59
  • It is not a problem when 100 users each save a post for themselves, because there is no constraint violation: each of them works with their own row in the database. The problem starts when 1 user saves 1 post for himself 100 times simultaneously. If you need to implement some kind of counter (where different users try to update the same row at the same moment), it is possible to use other techniques for high-load counters. – Eugene Lisitsky Jun 22 '16 at 13:22
  • @EugeneLisitsky Yea that was a bad example – Louis Jun 22 '16 at 13:32
  • It's too hard to solve the task without details :( Could you describe it in detail? – Eugene Lisitsky Jun 22 '16 at 15:35
-3

From what I'm seeing in the code provided, I believe you have a uniqueness constraint on the pair (user_id, group_id) in groups. That's why running the same query twice fails: both attempts try to add a row with the same user_id and group_id; the first one to execute passes, but the second raises an exception.

Aquiles
  • Yes, all you write is correct. Have you read the linked issue of django? – guettli Jun 21 '16 at 13:05
  • `my_user.groups.add(foo_group)` should be a no-op if `my_user.groups` already contains `foo_group`. So if two processes try to do it at the same time and `foo_group` was not initially in `my_user.groups`, whichever process is first should get to add it. And the second process should perform a no-op: no `insert` should happen and there should not be an exception. This does not happen because the code of `.add()` does a check-then-insert but there is no locking between the check and the insert. So two processes could finish the check thinking that they should insert. – Louis Jun 21 '16 at 13:34