
Imagine I have a Python dictionary where the keys are existing user IDs and the values are scores to be added to those users' existing scores.

For example: {1: 1580, 4: 540, 2: 678} (this could stretch to n key/value pairs)

I need to update the scores of all these user objects (updated_score = original_score + new_score). One way to do it is iteratively, like so:

from django.db.models import F

scores = {1: 1580, 4: 540, 2: 678}
for user_id, score_to_add in scores.items():
    # One UPDATE query per user; F() keeps the addition on the database side
    UserProfile.objects.filter(user_id=user_id).update(score=F('score') + score_to_add)

But that's multiple DB calls. Can I do it in a single call? An illustrative example would be great. As you would have guessed, this is for a Django project.

Hassan Baig
    Never used it, but django-bulk-update might help? https://github.com/aykut/django-bulk-update – DA-- Mar 11 '17 at 08:25
  • I don't think this is feasible since you need a different value in each row. As @DA-- said, you should use this package or wrap the `for` loop in a `transaction.atomic()` context. – nik_m Mar 11 '17 at 08:27

2 Answers


Something like this:

from django.db.models import F
from django.db import transaction

scores = {1: 1580, 4: 540, 2: 678}

with transaction.atomic():
    # Still one UPDATE per user, but they are all committed in a single transaction
    for user_id, score_to_add in scores.items():
        UserProfile.objects.filter(user_id=user_id).update(score=F('score') + score_to_add)

More on this in the Django docs on database transactions.

You can take a look at this answer too.

[UPDATE]:

TL;DR: it will not collapse everything into a single DB query, but it will be faster because the per-query commit overhead is removed.

As the docs and @ahmed in his answer say:

Django’s default behavior is to run in autocommit mode. Each query is immediately committed to the database, unless a transaction is active.

By using `with transaction.atomic()` all the updates are grouped into a single transaction. The time needed to commit the transaction is amortized over all the enclosed statements, so the time per statement is greatly reduced.
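
For completeness: a genuinely single-statement update is also possible through the ORM's conditional expressions (`Case`/`When`, available since Django 1.8), without dropping to raw SQL. A minimal sketch, assuming the same `UserProfile` model and an integer `score` field as in the question:

from django.db.models import Case, F, IntegerField, Value, When

scores = {1: 1580, 4: 540, 2: 678}

# A single UPDATE: SET score = score + CASE WHEN user_id=1 THEN 1580 WHEN ... END
UserProfile.objects.filter(user_id__in=scores.keys()).update(
    score=F('score') + Case(
        *[When(user_id=uid, then=Value(delta)) for uid, delta in scores.items()],
        output_field=IntegerField(),
    )
)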

nik_m
  • So to be exact, does this imply a single DB operation? – Hassan Baig Mar 11 '17 at 18:22
  • Well "kind of" a single DB operation (updated my answer). Without the DB overhead each time. I don't believe what you ask is feasible with the Django ORM, unless you use `extra()` to write a custom SQL query. – nik_m Mar 11 '17 at 18:32
  • Yea I read the docs you linked to as well. I think this would primarily help me avoid race conditions. This could perhaps be called an optimization as well - but what happens when the dictionary is large? The updates being performed on the table would block all others. So technically, as `n` scales, the atomic nature of the transaction comes back to bite us. Don't you think? And one can avoid this scenario if it was a simple for loop. – Hassan Baig Mar 12 '17 at 03:50
  • Quoting from [here](https://makandracards.com/makandra/31937-differences-between-transactions-and-locking): *Nothing special happens if the "same" transaction [...] runs at the same time. They will simply run at the same time. There is no guarantee of the order*. So, in my belief, the size of `n` will not affect updates on the table. That would be another great SO question. – nik_m Mar 12 '17 at 07:09

transaction.atomic() proposed by @nik_m is a good idea, but you should also fetch the records from the database in a single query.

from django.db.models import F
from django.db import transaction

scores = {1: 1580, 4: 540, 2: 678}

with transaction.atomic():
    # One SELECT for all the affected rows
    users_to_update = UserProfile.objects.filter(user_id__in=scores.keys())
    for user in users_to_update:
        # Model instances have no .update(); assign the F() expression and save
        user.score = F('score') + scores[user.user_id]
        user.save(update_fields=['score'])
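
Note that even with the single SELECT above, the loop still issues one UPDATE per user through save(). If a newer Django is available, QuerySet.bulk_update() (added in Django 2.2, after this question was asked) can batch the writes too. A sketch under that assumption, using select_for_update() so the read-modify-write stays safe against concurrent updates:

from django.db import transaction

scores = {1: 1580, 4: 540, 2: 678}

with transaction.atomic():
    # One SELECT ... FOR UPDATE locks and fetches all affected rows
    users = list(
        UserProfile.objects.select_for_update().filter(user_id__in=scores.keys())
    )
    for user in users:
        user.score += scores[user.user_id]
    # One UPDATE per batch, built from CASE expressions by bulk_update()
    UserProfile.objects.bulk_update(users, ['score'])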
Ihor Pomaranskyy