0

I have a collection of customers with their visited places, organised as follows:

{
  "customer_id": 151,
  "first_name": "Nakia",
  "last_name": "Boyle",
  "visited_places": [
    {
      "country": "Portugal",
      "cities": [
        "South Nicklausburgh",
        "East Graham"
      ]
    },
    {
      "country": "Rwanda",
      "cities": [
        "West Kristofer",
        "Effertzbury",
        "Stokeston",
        "South Darionfort",
        "Lewisport"
      ]
    }
  ]
}

I am trying to find all customers that have visited a specific city in a specific country. The query object is:

{
  "visited_places.country" : "Portugal", 
  "visited_places.cities" : "South Nicklausburgh" 
}

What would the ideal index for this query be? I tried to create a compound index like so:

collection.createIndex({
  'visited_places.cities': 1,
  'visited_places.country': 1
}

This index is indeed used, but only for finding the cities, as the execution plan explains in the IXSCAN stage:

"indexBounds": {
  "visited_places.cities": [
   "[\"South Nicklausburgh\", \"South Nicklausburgh\"]"
  ],
  "visited_places.country": [
   "[MinKey, MaxKey]"
  ]

The country is filtered out in the subsequent FETCH stage:

"filter": {
  "visited_places.country": {
   "$eq": "Portugal"
  }
 }

Why can't the query be completed only from the compound index, and what would the ideal index be for this schema and query?

3bst0r
  • 35
  • 3
  • Could very well be that filtering on just city is more efficient than searching on country and then city, as most cities will be unique to a country. So it would make more sense to search just the city binary tree (or whatever Mongo uses to implement indexes) than doing both. Once it has already done the reads, checking the in-memory data to check the country is very cheap. – gjvdkamp Nov 03 '21 at 15:03

1 Answers1

1

use $elemMatch like this

db.collection.find({
  "visited_places": {
    "$elemMatch": {
      "country": "Portugal",
      "cities": {
        "$elemMatch": {
          "$eq": "South Nicklausburgh"
        }
      }
    }
  }
},
)

https://mongoplayground.net/p/CKnz8VCT5rX

mohammad Naimi
  • 2,259
  • 1
  • 5
  • 15
  • Perfect, thanks! Would you mind explaining why this query can be answered only from the index, and my original one can't? Is it a missing optimization or is the query actually different in that it would return a different result for a certain data configuration? – 3bst0r Nov 10 '21 at 06:45
  • Also I noticed the second nested $elementMatch is not needed: https://mongoplayground.net/p/JmmkAghuoJu will give the same result and execution plan. – 3bst0r Nov 10 '21 at 07:01
  • I was able to answer my own question from the first comment: it is indeed a different query, yours only matches the customers that have visited a specific city in a specific country, mine would match all customers that visited the country and the city, with no constraint that the city has to be within the specified country. – 3bst0r Nov 10 '21 at 08:35