-3

The base collection has 100,000 documents. All of which will be run through a $lookup stage. The collection that will be looked up in is empty.

This query takes 30 seconds.

When I remove the $lookup stage, this query takes 3 seconds.

What explains why it is so slow, despite the foreign collection be empty, having no documents?

Even if it has documents, there is an index for the pipeline.

EDITED:

The $lookup is dynamic, which means that it needs to match on a different value for each of the 100,000 documents.

bits of the explain():

      "$lookup": {
         from: ....
      },
      ....           
      "totalDocsExamined": 0,
      "totalKeysExamined": 0,
      "collectionScans": 0,
      "indexesUsed": [
        "A_1_B_1_C_1"
      ],
      "nReturned": 100000,
      "executionTimeMillisEstimate": 35602
    },

1 Answers1

2

The explain plan clearly states that the time taken is around 35 seconds and the documents returned are 100000.

One reason behind this is that $lookup is evaluated for each and every document of the base collection. Even though the joined collection is empty, the lookup will still happen for each document, and there aren't any internal query optimizations possible in this case. The only thing you can do is filter some documents using $match before $lookup. Joining one complete collection with another using $lookup, will not scale as the collections grow in size.

Charchit Kapoor
  • 8,934
  • 2
  • 8
  • 24