
I am trying to dive deep into the MongoDB Change Streams implementation to understand whether configuring the full document update lookup will impact DB performance in a production environment.

I assume the full document lookup is just a simple query by ID. Therefore, my main concern is how it will impact the source DB when there are a lot of writes happening to it. Maybe there is a configuration to query by batches of IDs; that might help.

If the change stream cursor queries my collection each time it sees an update in the oplog, then each application write to the collection effectively becomes a write plus a read. I don't want to degrade the operational DB's performance and, in turn, the application's performance.

From my understanding, change streams that read from the oplog do not impact the DB much, but if my assumption above is right, this main advantage is gone.
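
For context, this is the option I am asking about (a minimal sketch with the Node.js driver; the connection string, database, collection and field names are just placeholders):

const { MongoClient } = require('mongodb');

async function main() {
    const client = await MongoClient.connect('mongodb://localhost:27017'); // placeholder URI
    const collection = client.db('mydb').collection('mycollection');       // placeholder names

    // Without the option, update events carry only updateDescription (the delta).
    // With fullDocument: 'updateLookup', the server also fetches the current
    // version of the document by _id when it processes the update event.
    const changeStream = collection.watch([], { fullDocument: 'updateLookup' });

    changeStream.on('change', (event) => {
        // For update events, event.fullDocument is populated because of updateLookup
        console.log(event.operationType, event.fullDocument);
    });
}

main().catch(console.error);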

Setup background:

Appreciate any piece of information on this matter.

Thank you all.

YFl

1 Answer


If you use a pipeline with a $match and a $project, you can minimize the data returned when a change stream watch fires. It certainly minimizes the data load in your application, because you can project just the fields you're interested in as opposed to the whole document. I do not know whether this truly saves bandwidth in the database, because I don't fully understand the inner workings, but it could. Here's how I do it:

const pipeline01 = [
    { $match: { 'updateDescription.updatedFields.fieldIamInterestedIn': { $ne: undefined } } },
    { $project: { 'fullDocument._id': 1, 'fullDocument.anotherFieldIamInterestedIn': 1 } },
];
collectionIamWatching.watch(pipeline01, { fullDocument: 'updateLookup' }).on('change', async (data) => {
    // then do what you want with data.fullDocument - it will only contain the fields you've named in the $project step
});
Desmond Mullen