Say I have the following MongoDB collection (I am using mongomock for this example so that it is easy to reproduce):

import mongomock

collection = mongomock.MongoClient().db.collection

objects = [{'name': 'Alice', 'age': 21}, {'name': 'Bob', 'age': 20}]
collection.insert_many(objects)

I would then like to update my existing objects with the fields from some new objects, inserting any object that does not exist yet:

new_objects = [{'name': 'Alice', 'height': 170}, {'name': 'Caroline', 'height': 160}]

The only way I could think of doing this is:

for record in new_objects:
    if collection.find_one({'name': record['name']}) is not None:
        collection.update_one({'name': record['name']}, {'$set': {'height': record['height']}})
    else:
        collection.insert_one(record)

However, if new_objects is very large, this method becomes slow. Is there a way to use update_many for this?

ignoring_gravity

1 Answer

You can't use update_many() here, because it applies a single filter to every document it updates, whereas in your use case each record needs its own filter.

A simpler construct uses upsert=True to avoid the separate insert/update logic. Passing the whole record to $set also sets every field in the record, which means less code:

for record in objects + new_objects:
    collection.update_one({'name': record.get('name')}, {'$set': record}, upsert=True)
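
To make the effect of the upsert concrete, here is a rough pure-Python model of what that loop does. It is only a sketch: upsert_set and store are hypothetical names invented for illustration, a plain dict stands in for the collection, and it only models a filter on one field with a flat $set.

```python
def upsert_set(store, key_field, record):
    """Rough model of update_one({key_field: value}, {'$set': record}, upsert=True)."""
    key = record[key_field]
    # "upsert": create an empty document if nothing matches the filter yet
    document = store.setdefault(key, {})
    # "$set": add or overwrite each field of the record, leaving other fields alone
    document.update(record)


objects = [{'name': 'Alice', 'age': 21}, {'name': 'Bob', 'age': 20}]
new_objects = [{'name': 'Alice', 'height': 170}, {'name': 'Caroline', 'height': 160}]

store = {}  # hypothetical stand-in for the collection, keyed by name
for record in objects + new_objects:
    upsert_set(store, 'name', record)

result = list(store.values())
for document in result:
    print(document)
```

Alice keeps her age and gains a height, Bob is untouched, and Caroline is created, which is the behaviour the upsert gives you without a separate find_one.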

If it slows down with a larger number of updates, make sure you have an index on the name field, since every update filters on it. In the mongo shell (from PyMongo, the equivalent call is collection.create_index('name')):

db.collection.createIndex( { "name": 1 } )

You can squeeze a bit more performance out by using a bulk_write operation, which sends the updates to the server in batches instead of making one round trip per record. Worked example:

from pymongo import MongoClient, UpdateOne

collection = MongoClient().db.collection

objects = [{'name': 'Alice', 'age': 21}, {'name': 'Bob', 'age': 20}]
new_objects = [{'name': 'Alice', 'height': 170}, {'name': 'Caroline', 'height': 160}]

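# Build one upsert per record: update the document with that name, or insert it if none exists
updates = []

for record in objects + new_objects:
    updates.append(UpdateOne({'name': record.get('name')}, {'$set': record}, upsert=True))

# Send all the operations in a single bulk_write call
collection.bulk_write(updates)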

for record in collection.find({}, {'_id': 0}):
    print(record)

Gives:

{'name': 'Alice', 'age': 21, 'height': 170}
{'name': 'Bob', 'age': 20}
{'name': 'Caroline', 'height': 160}
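
If new_objects is too large to comfortably build one list of operations in memory, one option is to feed bulk_write in fixed-size batches. The helper below is a minimal sketch of the batching step only: chunked is a hypothetical name, the batch size is arbitrary, and the records are made-up stand-in data.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Stand-in data; in practice this would be the large new_objects list
records = [{'name': f'user{i}', 'height': 150 + i} for i in range(5)]

batches = list(chunked(records, 2))
print([len(batch) for batch in batches])
```

Each batch can then be turned into a list of UpdateOne(..., upsert=True) operations and passed to collection.bulk_write() exactly as in the worked example. Since these upserts do not depend on each other's order, passing ordered=False to bulk_write may also help, but measure it on your own data.
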
Belly Buster