Efficiently updating records in MongoDB / pymongo (perhaps in-place?)

Question

I'm updating records in my collection like so:

for document in myDB.find():
    if compliesToSomeRule(document):
        myDB.update({'_id':document['_id']}, {'$set':{'something':'somevalue'}})

Is this the most efficient way to do it?

It seems odd that I have to pass the document's _id as the first parameter of update. Doesn't that mean querying the index again for a document I have already found?

Is there a way to update "in place" so to speak?

score 1 · Answer 1 · answered Feb 26 '13 at 22:55

The first argument of the update call is essentially where the logic of your "compliesToSomeRule" method belongs. For example, let's say that compliesToSomeRule is defined as follows:

def compliesToSomeRule(doc):
    # True when field 'a' equals 0
    return doc['a'] == 0

Then you could skip this step and just do

myDB.update({ 'a' : 0 }, {'$set' : { 'x' : 'y' }}, multi=True)

This applies your update document (the second argument) to all documents in your collection that have a field a equal to 0. Note the multi=True: without it, update modifies only the first matching document.

Another example: if you want to update all documents where 'a' > 0 you could do

myDB.update({ 'a' : { '$gt' : 0 }}, {'$set' : {'x' : 'y'}}, multi=True)

The first argument of the update call, the "query", is how you specify which documents receive the update. The second argument defines what the update actually is.

Here is a document that you might find helpful while reviewing this material: http://docs.mongodb.org/manual/applications/update/
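
For readers on newer PyMongo releases: Collection.update was deprecated in PyMongo 3.0 and removed in 4.0, with update_one and update_many taking its place. Below is a minimal sketch of the same query-driven update, assuming a pymongo Collection is passed in. The helper name set_fields_where is made up for illustration and is not part of any library.

```python
def set_fields_where(collection, query, changes):
    """Apply a $set update to every document matching `query`.

    `collection` is assumed to be a pymongo Collection (PyMongo 3.0+).
    Returns the number of documents the server reports as modified.
    """
    result = collection.update_many(query, {"$set": changes})
    return result.modified_count


# Possible usage, assuming a reachable server and a collection named "foo":
#   from pymongo import MongoClient
#   coll = MongoClient()["test"]["foo"]
#   set_fields_where(coll, {"a": {"$gt": 0}}, {"x": "y"})
```

The whole selection happens on the server, so no documents travel to the client at all.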

This is a great idea. But usually my compliesToSomeRule function is very complex and would probably be quite hard to implement as part of the finder argument :/ – LittleBobbyTables, Feb 26 '13 at 23:00
Ahh, that does make things harder. Perhaps if you post an example function that describes what you want to do, we might be able to come up with a good solution. That being said, I do often advise people to do complex operations like the ones you describe on the client side, so perhaps your original solution (the one in your question) is the best way to handle this problem. – ACE, Feb 26 '13 at 23:19
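
If the rule really cannot be expressed as a query, one way to cut down on round trips is to keep evaluating it client-side, buffer the matching _id values, and issue one update_many per batch using $in. This is a rough sketch, assuming PyMongo 3.0+ and a pymongo Collection passed in. The helper name update_where_rule and the default batch size are illustrative choices, not something from this thread.

```python
def update_where_rule(collection, rule, changes, batch_size=500):
    """Set `changes` on every document for which rule(document) is true.

    Matching _ids are buffered and updated in batches with $in, so the
    number of update round trips is roughly matches / batch_size.
    Returns the total modified count reported by the server.
    """
    def flush(ids):
        # Send one update for the whole batch; skip empty batches.
        if not ids:
            return 0
        result = collection.update_many(
            {"_id": {"$in": ids}}, {"$set": changes}
        )
        return result.modified_count

    modified = 0
    batch = []
    for document in collection.find():
        if rule(document):
            batch.append(document["_id"])
            if len(batch) >= batch_size:
                modified += flush(batch)
                batch = []
    # Flush whatever is left after the cursor is exhausted.
    return modified + flush(batch)
```

If the rule only looks at a few fields, passing a projection to find should reduce the data transferred further.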

Ian McMahon · Answer 2 · 2013-02-26T16:24:35.110

0

You are updating in place. The first argument to update is a find clause, and that's how it identifies which record to update.

edit

Does MongoDB handle caching? Yes. MongoDB keeps all of the most recently used data in RAM. If you have created indexes for your queries and your working data set fits in RAM, MongoDB serves all queries from memory.

MongoDB does not implement a query cache: MongoDB serves all queries directly from the indexes and/or data files.

Also, your username is awesome :)

edited Feb 26 '13 at 16:24

answered Feb 26 '13 at 16:17


But the find clause has to refind a record...that's already been found surely? Perhaps not. p.s. - the best XKCD eh? :D – LittleBobbyTables Feb 26 '13 at 16:21
I think what Ian is getting at is that the first part of the update call is doing the "compliesToSomeRule" section. In this case it is saying that document._id == expected_id, very basic / simple. But you could do: myDB.update( { 'a' : 0 }, { '$set' : ....}). In that case you are specifying that a == 0. Doing that update will modify any document where a == 0, it is the same as your "compliesToSomeRule" method. EDIT: I am going to write up a more complete answer... – ACE Feb 26 '13 at 22:37

score 0 · Answer 3 · answered Feb 26 '13 at 23:08


I think you want to look at the "$where" operator: http://docs.mongodb.org/manual/reference/operator/where/ It lets you query based on a JavaScript function, so everything happens on the database. This might be slower though, since $where evaluates the JavaScript for each document and cannot use indexes.


Josh Buell


score -1 · Answer 4 · answered May 08 '13 at 20:28

If you need to find a specific document and then add something:

    # find_one returns None when no document matches
    doc = db.foo.find_one({"_id": 333})
    if doc is not None:
        db.foo.update({"_id": 333}, {"$set": {"name": "mongo"}})
    else:
        print("none")

But if you need to go through all documents and add something to each:

    flag = db.foo.find()
    for row in flag:
        # use the current row's _id, not the cursor object
        db.foo.update({"_id": row['_id']}, {"$set": {"name": "mongo"}})
