3

I am using pymongo's "insert_one" and want to prevent insertion of two documents with the same "name" attribute.

  1. How do I generally prevent duplicates?
  2. How do I configure it for a specific attribute like name?

Thanks!

My code:

import pymongo
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:8888/db')
db = client[<db>]
heights = db.heights

post_id = heights.insert_one({"name": "Tom", "height": 2}).inserted_id

try:
    # I expect this second insert to be rejected as a duplicate
    post_id2 = heights.insert_one({"name": "Tom", "height": 3}).inserted_id
except pymongo.errors.DuplicateKeyError as e:
    print(e.details)

print(post_id)
print(post_id2)

output:

56aa7ad84f9dcee972e15fb7

56aa7ad84f9dcee972e15fb8

Tom

4 Answers

6

There is a general answer on preventing duplicate documents in MongoDB at How to stop insertion of Duplicate documents in a mongodb collection.

The idea is to use update with upsert=True instead of insert_one. With pymongo, the insert then becomes:

db[collection_name].update(document, document, upsert=True)
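
Collection.update is deprecated in PyMongo 3 and was removed in PyMongo 4, so on current versions the same idea would probably be written with replace_one. A minimal sketch, assuming `collection` is a pymongo Collection; the helper name `insert_unless_identical` is made up for illustration:

```python
def insert_unless_identical(collection, document):
    """Insert `document` unless a matching document is already stored.

    Returns the new _id, or None when an existing document matched.
    """
    # The document doubles as the filter: if a stored document has all of
    # these field values it is replaced in place, otherwise upsert=True
    # inserts a new one.
    result = collection.replace_one(document, document, upsert=True)
    return result.upserted_id
```

Note that this only catches documents whose fields all match. In the question's example, {"name": "Tom", "height": 3} does not match {"name": "Tom", "height": 2}, so both would still be stored; to deduplicate on name alone, the filter would have to be {"name": document["name"]}.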
deerishi
5

You need to create a unique index on name, which makes MongoDB reject any insert whose name already exists in that collection.

For example:

db.heights.create_index([('name', pymongo.ASCENDING)], unique=True)

Please see the official docs for further details and clarifying examples.
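
As a rough sketch of how the unique index and the question's try/except fit together: once the index exists, insert_one raises pymongo.errors.DuplicateKeyError for a repeated name. The helper name `insert_if_name_is_new` is hypothetical, and `collection` is assumed to be a pymongo Collection such as db.heights.

```python
def insert_if_name_is_new(collection, document):
    """Insert `document`; return its _id, or None if the name is taken."""
    # Imported here only to keep the sketch self-contained.
    from pymongo.errors import DuplicateKeyError

    # Creating an index that already exists with the same options is a
    # no-op, so this is safe to call before every insert. It is kept
    # outside the try block so index-creation errors are not swallowed.
    collection.create_index("name", unique=True)
    try:
        return collection.insert_one(document).inserted_id
    except DuplicateKeyError:
        # The unique index rejected the insert: the name already exists.
        return None
```

Building the index fails if the collection already contains duplicate names (such as the two "Tom" documents from the question), so those need to be removed first.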

Pynchia
2

Say this is your document:

doc = {"key": val}

Then use $set with your document as the update:

update = {"$set": doc}  # it is important to use $set in your update
db[collection_name].update(doc, update, upsert=True)
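
For reference, a sketch of the same $set upsert on current PyMongo, where update has been replaced by update_one. It filters on name only, which is an assumption taken from the question rather than from this answer; the helper name `upsert_by_name` is made up for illustration.

```python
def upsert_by_name(collection, document):
    """Create or update the single document that has this name.

    Returns the new _id on insert, or None when an existing document
    with the same name was updated instead.
    """
    result = collection.update_one(
        {"name": document["name"]},  # match on the unique attribute only
        {"$set": document},          # overwrite or add the given fields
        upsert=True,                 # insert when nothing matches
    )
    return result.upserted_id
```

With this variant a second {"name": "Tom", "height": 3} updates Tom's height to 3 instead of adding a second document.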
heilala
Ali Reza Ebadat
0

I had a similar problem myself and couldn't find a solution that worked for me.

Eventually I came up with the idea of

  1. searching the collection for the exact data I was going to put into it,
  2. counting the number of results, and
  3. inserting only if that number is 0 (if it is 1, the data already exists).

This works for what I needed; maybe it helps others:

import json

# dummy data we want to insert into MongoDB
json_data = '{"location": "Berlin","date": 2023,"temperature": 19, "clouds": true}'
# find() and insert_one() expect a dict, not a JSON string
data = json.loads(json_data)

# Connect to collection (client and event are defined elsewhere)
db = client[event["database"]]
collection = db[event["collection"]]

# Check if the data already exists
count = collection.count_documents(data)
# If no documents match, store the data
if count == 0:
    collection.insert_one(data)
Marvin