4

I want to return all IDs from a MongoDB collection and I used the code below:

db.coll.find({}, { _id: 1})

But MongoDB scans the whole collection instead of reading the information from the default index { _id: 1 }.

From the log:

{ find: "collection", filter: {}, projection: { _id: 1 } } 
planSummary: COLLSCAN cursorid:30463374118 keysExamined:0
docsExamined:544783 numYields:4286 nreturned:544782 reslen:16777238
locks:{ Global: { acquireCount: { r: 8574 } }, Database: {
acquireCount: { r: 4287 } }, Collection: { acquireCount: { r: 4287 } }
} protocol:op_query 7024ms

Any idea how to fix this, please?

bejvisek
  • 61
  • 5
  • 1
    What specific version of MongoDB server are you using? Since you haven't specified any query criteria or sort order, the fastest plan for iteration is generally (but not always) a collection scan in natural order. See [SERVER-23406: index scan is slower than full collection scan in some scenarios](https://jira.mongodb.org/browse/SERVER-23406) and linked issues for background details. Adding a sort order or hint on `_id` should result in a covered index query with your projection. For comparison, can you try: `db.coll.find({},{_id: 1}).sort({'_id':1})`)? – Stennie Mar 28 '18 at 13:10
  • Stennie, thanks `db.coll.find({},{_id: 1}).sort({'_id':1})` really works! Also `.hint({_id: 1})` works as you suggested in other comment: https://docs.mongodb.com/manual/reference/method/cursor.hint/#cursor.hint – bejvisek Mar 30 '18 at 15:00

2 Answers2

3

You would need to add a filter that can use this index like so in order to change that query plan - not all operations remove the FETCH stage, e.g. $exists does not appear to work:

db.coll.find({ _id: { $ne: 0 }, { _id: 1 }) // works for ObjectIds

One part of the explanation for this is in the documentation:

A covered query is a query that can be satisfied entirely using an index and does not have to examine any documents. An index covers a query when both of the following apply:

  • all the fields in the query are part of an index,
  • and all the fields returned in the results are in the same index.

I seem to remember seeing a JIRA request to support that without the filter but cannot remember which ticket that was...

dnickless
  • 10,733
  • 1
  • 19
  • 34
  • Thank you, but unfortunately, `$exists: true` does not skip the FETCH phase, which reads all documents from the collection. – bejvisek Mar 28 '18 at 15:22
  • True. I was only looking at the log output to see if the index was used and not the full execution plan... – dnickless Mar 29 '18 at 07:48
2

I have found a solution, which works if one knows the type of the _id field. In case of string _id:

db.coll.find({ _id: { $regex: ".*" }, { _id: 1 })

In case of integer _id:

db.coll.find({ _id: { $gte: 0 }, { _id: 1 })
bejvisek
  • 61
  • 5