I have a relatively straightforward query, yet I can't seem to create the proper index to make the read as efficient as possible (or can't seem to instruct MongoDB to use my index). The query is:
const query = {
  'location.geoJson': {
    $geoWithin: {
      $centerSphere: [
        user.location.geoJson.coordinates,
        defaultRadiusInMiles / earthRadiusInMiles,
      ],
    },
  },
  _id: { $lt: lastId },
};
results = collections.myCollection.find(query).sort({ _id: -1 }).limit(limit);
The index I've created in an attempt to make this query more efficient is:
collections.myCollection.createIndex({ 'location.geoJson': '2dsphere', _id: -1 })
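For context, I'm gathering the stats below by running explain on the same cursor, roughly like this (assuming the Node.js driver):

const stats = await collections.myCollection
  .find(query)
  .sort({ _id: -1 })
  .limit(limit)
  .explain('executionStats'); // 'queryPlanner' verbosity also shows the winning plan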
However, when I review the explain stats, I see the following:
"winningPlan": {
"stage": "LIMIT",
...
"inputStage": {
"stage": "FETCH",
"filter": {
"location.geoJson": {
"$geoWithin": {
"$centerSphere": [
...
]
}
}
},
"inputStage": {
"stage": "IXSCAN",
"keyPattern": {
"_id": 1
},
"indexName": "_id_",
According to the documentation, this indicates that MongoDB is first doing an index scan on _id, THEN fetching and filtering on location, and lastly limiting the results, which is not what I want.
So, is this because my compound index is incorrect and doesn't support this query? Or how can I force MongoDB to use my compound index? Lastly, the queryPlanner shows that the parsed query is:
"parsedQuery": {
"$and": [
{
"_id": {
"$lt": ...
}
},
{
"location.geoJson": {
"$geoWithin": {
"$centerSphere": [
...
]
}
}
}
]
},
Maybe it's executing the _id portion of the query first because that's the first element in the $and array? Any help whatsoever is appreciated.
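For what it's worth, I believe I could force the index with a hint, sketched below (assuming the default generated index name), but I'd rather understand why the planner isn't picking it on its own:

results = collections.myCollection
  .find(query)
  .sort({ _id: -1 })
  .limit(limit)
  // Either the key pattern or the index name ('location.geoJson_2dsphere__id_-1'
  // by default) should identify the index here.
  .hint({ 'location.geoJson': '2dsphere', _id: -1 });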
EDIT: I should probably provide some context. I am attempting to implement newest-to-oldest pagination, which is why I am passing an ObjectId to begin with. When I obtain the first batch of results (that is, the limit newest posts), I pass ObjectId('f'.repeat(24)) as lastId. This is when I observe the performance issues.
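To make the flow concrete, here is roughly how I request the batches (fetchBatch is just a hypothetical stand-in for the find() shown above):

// First batch: there is no cursor yet, so I pass a maximal ObjectId as a sentinel.
let lastId = ObjectId('f'.repeat(24));
const firstBatch = fetchBatch(lastId, limit);

// Subsequent batches: continue from the last (smallest) _id of the previous batch,
// since the sort is { _id: -1 }.
lastId = firstBatch[firstBatch.length - 1]._id;
const secondBatch = fetchBatch(lastId, limit);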
I know for a fact that I have 110 documents within that radius, and only when I obtain this first batch are all 110 documents scanned, even though limit < 110. What is interesting, though, is that when I obtain the next batch in the pagination (by passing the last ObjectId from the first batch), it does not scan all 110 documents and is very efficient: in that case it examines only limit documents and keys. Could anyone help make sense of why the first batch has performance issues but the second batch, with a specific ObjectId, does not?
Perhaps I should implement newest-to-oldest pagination some other way; I'm open to suggestions on that as well.
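For example, I've wondered whether omitting the _id predicate entirely on the first page (instead of passing the all-'f' sentinel) would sidestep the issue; a sketch of that variant:

// Only add the _id predicate once a real cursor exists from a previous batch.
const query = {
  'location.geoJson': {
    $geoWithin: {
      $centerSphere: [
        user.location.geoJson.coordinates,
        defaultRadiusInMiles / earthRadiusInMiles,
      ],
    },
  },
  ...(lastId ? { _id: { $lt: lastId } } : {}),
};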
Here are the executionStats for when I obtain the first batch vs. the second batch: