0

I have two collections, as follows:

db.ships
document format: { mmsi: Long, ...some other fields }
indexes: { {mmsi: 1}, {unique: true} }

db.navUpdates
document format: { mmsi: Long, time: ISODate, ...some other fields }
indexes: { mmsi: 1 }, { time: 1 }

For each document within db.ships, I need to find the most recent document within db.navUpdates that matches by mmsi. I cannot use _id to find most recent as documents are not necessarily entered in chronological (as defined by timestamp time) order.

E.g.:

ship document:
{ mmsi: 12345 }

navUpdate documents:
{ mmsi: 12345, time: ISODate("2012-09-19T12:00:00.000Z") }
{ mmsi: 12345, time: ISODate("2012-09-18T12:00:00.000Z") }
{ mmsi: 54321, time: ISODate("2012-09-19T12:00:00.000Z") }

So for the ship with mmsi:12345, the most recent navUpdate is the first document in the list above, with time:ISODate("2012-09-19T12:00:00.000Z").

I tried the following mongo shell script, but it's incredibly slow (multiple seconds for just 10 queries), and messages appearing on the server indicate I'm missing an index.

db.ships.find().limit(10).forEach(function(ship) {
    var n = db.navUpdates.find({mmsi:ship.mmsi}).count();
    if (n==0) { return; }
    var t = db.navUpdates.find({mmsi:ship.mmsi}).sort({time:-1}).limit(1)[0].time;
    print(t);
});

Why is this query so slow? I tried adding a {time: -1} index to navUpdate, thinking perhaps the sort({time: -1}) might be the culprit, but still no improvement.

Also, can this query be optimized? I have the count() call in there because there are some ship documents whose mmsis are not found within navUpdates.

Community
  • 1
  • 1
ericsoco
  • 24,913
  • 29
  • 97
  • 127

1 Answers1

2

A single query can only use one index, so you should add a compound index of { mmsi: 1, time: -1 } to navUpdates that can be used for both your find and sort needs.

Then use .explain() to determine if your indexes are being used in your queries.

JohnnyHK
  • 305,182
  • 66
  • 621
  • 471
  • Ah, so each of the documents returned by `find({mmsi})` must be traversed to examine its `time` value, and then `sort()`ed. Makes sense. Will `sort({time:-1})` be able to take advantage of an index with `{time:1}`? (Does order matter for this `sort()`?) – ericsoco Sep 19 '12 at 21:17
  • 1
    Answer to my comment: order does not matter in this case. When I `sort({time:-1})`, `explain()` reveals the use of `"cursor" : "BtreeCursor mmsi_1_time_1 reverse"`. – ericsoco Sep 19 '12 at 21:39