0

The MongoDB documentation says that MongoDB doesn't store documents in a collection in a particular order. So if you have this collection:

db.restaurants.insertMany( [
   { "_id" : 1, "name" : "Central Park Cafe", "borough" : "Manhattan"},
   { "_id" : 2, "name" : "Rock A Feller Bar and Grill", "borough" : "Queens"},
   { "_id" : 3, "name" : "Empire State Pub", "borough" : "Brooklyn"},
   { "_id" : 4, "name" : "Stan's Pizzaria", "borough" : "Manhattan"},
   { "_id" : 5, "name" : "Jane's Deli", "borough" : "Brooklyn"},
] );

and sorting like this:

db.restaurants.aggregate(
   [
     { $sort : { borough : 1 } }
   ]
)

Then the sort order can be inconsistent since:

the borough field contains duplicate values for both Manhattan and Brooklyn. Documents are returned in alphabetical order by borough, but the order of those documents with duplicate values for borough might not to be the same across multiple executions of the same sort.

To return a consistent result it's recommended to modify the query to:

db.restaurants.aggregate(
   [
     { $sort : { borough : 1, _id: 1 } }
   ]
)

My question relates to the efficiency of such a query. Let's say you have millions of documents, should you create a compound index, something like { borough: 1, _id: -1 }, to make it efficient? Or is it enough to index { borough: 1 } due to the, potentially, special nature of the _id field?

I'm using MongoDB 4.4.

Johan
  • 37,479
  • 32
  • 149
  • 237

1 Answers1

1

If you need stable sort, you will have to sort on both the fields and for performant query you will need to have a compound index on both the fields.

{ borough: 1, _id: -1 }
Holy_diver
  • 377
  • 1
  • 15
  • But what if I just query for "borough" and only using "{ borough: 1, _id: -1 }" for sort? Do I still need the compound index? – Johan Feb 18 '21 at 14:28
  • The _id index is a bit special, which creates automatically and is always unique. Normally, the _id index is an **ObjectId**, a **UUID** or maybe an **integer** or a **string** that contains some kind of hash. here you need reverse order then you need to have a compound index. – ROHIT KHURANA Feb 19 '21 at 01:51
  • 1
    If you create index on just borough. Mongo will store them sorted by borough not necessarily sorted by _id. Although _id field is index. But the indexes can not intersect to make a make query perform better. Here filtering stage will use indexes but sorting stage won't. – Holy_diver Feb 19 '21 at 11:21