I'm getting the same document (with the same _id) returned more than once by a single MongoDB query.

db.getCollection('my-collection').find({ "my_key" : "my_value" }, { _id : 1}).toArray()

returns:

[
    {
        "_id" : ObjectId("57d8ea76cee3c6d2299890f2")
    },
    {
        "_id" : ObjectId("57d7975b5981a0a5f27e260c")
    },
    {
        "_id" : ObjectId("57d8ea76cee3c6d2299890f2")
    }
]

I'm trying to figure out why, as it messes up some of our script results, but couldn't find the reason, or any mention of this online.

Our current guess is that it's related to having a sharded collection and shard migrations, but if that were the case I would have expected this to be a momentary issue, and this query has been returning the same document twice for the whole day now.

Notes:

  • Behavior is the same with or without toArray()
  • The projection was added for clarity only; the same full document appears twice in the results without it
  • We have 4 shards, each running 3 servers (primary, secondary, backup)
marmor

1 Answer

_id is always unique, but only within a single shard. In a sharded collection, documents with the same _id can exist on different shards at the same time. This can happen when a chunk migration between shards does not finish cleanly and leaves orphaned documents behind.

You can safely remove orphaned documents with the cleanupOrphaned command: https://docs.mongodb.com/v3.0/reference/command/cleanupOrphaned/
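As a rough sketch of how cleanupOrphaned is usually driven: the command cleans one range per call and reports where it stopped in stoppedAtKey, so it is called in a loop until no key comes back. The helper name cleanupAllOrphans, the runAdminCommand wrapper, and the database name "mydb" are all placeholders I made up; the loop itself follows the pattern in the linked documentation. It has to run against the primary of each shard directly, not through mongos.

```javascript
// Sketch only: repeat cleanupOrphaned until the whole key range is covered.
// `runAdminCommand` is a hypothetical wrapper; in the mongo shell, connected
// to a shard PRIMARY (not mongos), it could be:
//   function (cmd) { return db.adminCommand(cmd); }
function cleanupAllOrphans(runAdminCommand, namespace) {
  var nextKey = {};   // empty document, as in the documentation's example
  var passes = 0;

  while (nextKey != null) {
    var result = runAdminCommand({
      cleanupOrphaned: namespace,      // "<database>.<collection>"
      startingFromKey: nextKey
    });

    if (result.ok !== 1) {
      throw new Error("cleanupOrphaned failed: " + JSON.stringify(result));
    }

    passes += 1;
    // stoppedAtKey is absent once the last range has been cleaned,
    // which ends the loop.
    nextKey = result.stoppedAtKey;
  }

  return passes;      // number of cleanupOrphaned calls made
}
```

In the shell this would be called on each shard's primary in turn, for example cleanupAllOrphans(function (cmd) { return db.adminCommand(cmd); }, "mydb.my-collection"), with "mydb" replaced by the real database name.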

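When the query filters on the shard key (for example on _id, if the collection is sharded on _id), mongos knows which shard to target, so in that case you should only see one result. But when querying on a different key, the query is run on all shards, so an orphaned copy on another shard can be returned alongside the real document, which is what you are seeing.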
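To check how many _id values are affected before and after the cleanup, something like the following could be run over the result of the broadcast query. This is only a sketch; findDuplicateIds is a name I made up, and it compares _id values by their string form.

```javascript
// Sketch only: list every _id that occurs more than once in an array of
// documents, e.g. the output of find(...).toArray().
function findDuplicateIds(docs) {
  var counts = Object.create(null);   // plain map: string form of _id -> count
  var duplicates = [];

  docs.forEach(function (doc) {
    var key = String(doc._id);
    counts[key] = (counts[key] || 0) + 1;
    // Record an _id once, at the moment its second copy is seen.
    if (counts[key] === 2) {
      duplicates.push(key);
    }
  });

  return duplicates;
}
```

For example, findDuplicateIds(db.getCollection('my-collection').find({ "my_key" : "my_value" }, { _id : 1 }).toArray()) lists each _id that came back from more than one shard.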

Meni