
I am trying to dive deep into the MongoDB Change Streams implementation to understand whether configuring the full document update lookup will impact DB performance in a production environment.

I assume the full document lookup is just a simple query by ID. Therefore, my main concern is how it will impact the source DB when there are a lot of writes happening to it. Maybe there is a configuration to query by batches of IDs; that might help.

If the change stream cursor queries my collection each time it sees an update in the oplog, then each application write to the collection effectively becomes a write plus a read. I don't want to degrade the operational DB's performance and, in turn, the application's performance.

From my understanding, change streams that read from the oplog do not impact the DB much, but if my assumption above is right, this main advantage is gone.
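
For context, this is the option I am asking about (a minimal sketch with the Node.js driver; the connection string, database, collection and field names are just placeholders):

const { MongoClient } = require('mongodb');

async function main() {
    const client = await MongoClient.connect('mongodb://localhost:27017'); // placeholder URI
    const collection = client.db('mydb').collection('mycollection');       // placeholder names

    // Without the option, update events carry only updateDescription (the delta).
    // With fullDocument: 'updateLookup', the server also fetches the current
    // version of the document by _id when it processes the update event.
    const changeStream = collection.watch([], { fullDocument: 'updateLookup' });

    changeStream.on('change', (event) => {
        // For update events, event.fullDocument is populated because of updateLookup
        console.log(event.operationType, event.fullDocument);
    });
}

main().catch(console.error);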

Setup background:

Appreciate any piece of information on this matter.

Thank you all.

YFl

1 Answer


If you use a pipeline with a $match and a $project, you can minimize the data returned when a change stream watch fires. It certainly minimizes the data load in your application, because you can project just the fields you're interested in as opposed to the whole document. I do not know whether this truly saves bandwidth in the database, because I don't fully understand the inner workings, but it could. Here's how I do it:

const pipeline01 = [
    { $match: { 'updateDescription.updatedFields.fieldIamInterestedIn': { $ne: undefined } } },
    { $project: { 'fullDocument._id': 1, 'fullDocument.anotherFieldIamInterestedIn': 1 } },
];
collectionIamWatching.watch(pipeline01, { fullDocument: 'updateLookup' }).on('change', async (data) => {
    // then do what you want with data.fullDocument - it will only contain the fields you've named in the $project step
});
Desmond Mullen