0

In the example below, if the collection inventory has an index on the sku field, will it be used in this $lookup operation?

db.orders.insertMany( [
   { "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2 },
   { "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1 },
   { "_id" : 3  }
] )

db.inventory.insertMany( [
   { "_id" : 1, "sku" : "almonds", "description": "product 1", "instock" : 120 },
   { "_id" : 2, "sku" : "bread", "description": "product 2", "instock" : 80 },
   { "_id" : 3, "sku" : "cashews", "description": "product 3", "instock" : 60 },
   { "_id" : 4, "sku" : "pecans", "description": "product 4", "instock" : 70 },
   { "_id" : 5, "sku": null, "description": "Incomplete" },
   { "_id" : 6 }
] )

db.orders.aggregate( [
   {
     $lookup:
       {
         from: "inventory",
         localField: "item",
         foreignField: "sku",
         as: "inventory_docs"
       }
  }
] )

EDITED:

It does not. Why not?

{
    "explainVersion" : "1",
    "stages" : [
        {
            "$cursor" : {
                "queryPlanner" : {
                    "namespace" : "6303c64faf8ef53d8ba2062f_y22_test2.orders",
                    "indexFilterSet" : false,
                    "parsedQuery" : {

                    },
                    "queryHash" : "8B3D4AB8",
                    "planCacheKey" : "D542626C",
                    "maxIndexedOrSolutionsReached" : false,
                    "maxIndexedAndSolutionsReached" : false,
                    "maxScansToExplodeReached" : false,
                    "winningPlan" : {
                        "stage" : "COLLSCAN",
                        "direction" : "forward"
                    },
                    "rejectedPlans" : [

                    ]
                }
            }
        },
        {
            "$lookup" : {
                "from" : "inventory",
                "as" : "inventory_docs",
                "localField" : "item",
                "foreignField" : "sku"
            }
        }
    ],
  • 3
    The explain (thanks for taking a look at it and for sharing it) is showing that there is a collection scan on the _source_ (`orders`) collection. It actually isn't saying anything about how the `$lookup` is operating. Can you gather the `"executionStats"` verbosity? The database should use an index, but that would help confirm – user20042973 Jan 08 '23 at 00:22

1 Answers1

1

In the case of simple lookups (e.g., when specifying a localField + foreignField), an index will be used

Things are sadly more complicated when using a $lookup + pipeline, the following limitations apply:

  1. Multikey indexes are not used.

  2. Indexes are not used for comparisons where the operand is an array or the operand type is undefined.

  3. Indexes are not used for comparisons with more than one field path operand.

https://www.mongodb.com/docs/manual/reference/operator/aggregation/lookup/

It is really annoying that the explain() call doesn't provide any information on index usage of lookup stages. The best way I've found to determine whether an index was used was to (separately) use the $indexStats aggregation on the collection being looked up, in the above case:

db.inventory.aggregate([{$indexStats: {}}])

Then find the index you think is being used and watch the accesses.ops field.

Zack Newsham
  • 2,810
  • 1
  • 23
  • 43