This is MongoDB's API:

db.foo.watch([{$match: {"bar.baz": "qux" }}])

Let's say that collection foo contains millions of documents. As I understand it, the pipeline passed to watch means that every document change in the collection triggers the stream behind the scenes, and the system then keeps only the changes that satisfy the $match stage.

The problem is that as my application scales, the number of listeners will grow too, and my intuition is that I will end up with n^2 complexity with this approach.

I think that as I add more listeners, database performance will deteriorate because of changes to documents that do not satisfy the $match query. There are other ways to deal with this (WebSockets and rooms), but before prematurely optimizing the system, I would like to know whether my intuition is correct.

Actual question: Can I attach a listener to a single document, such that watch's performance isn't affected by sibling documents? When I call collection.watch([$matchQuery]), does the MongoDB driver listen to all documents and then filter out the relevant ones? (This is what I am trying to avoid.)

1 Answer

The call collection.watch([$matchQuery]) actually watches the change stream for that collection rather than the collection itself.

As far as I know, there is no way to attach a listener to a single document, so I will give you a couple of tips on avoiding scalability problems with the approach you have chosen. Your code appears to be using change streams, which should not cause problems unless you open too many of them.

There are two ways to accomplish this by watching the entire collection from a single outside process, neither of which should degrade database performance.

If you use change streams, you can open a single change stream and put the logic that checks all the conditions you need to filter for over time behind it. A common mistake is to open a separate change stream for each single-document filtering task, and that is when people run into problems.
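As a rough illustration of the single-stream idea, here is a minimal sketch of an in-process dispatcher that routes change events to listeners keyed by document _id. The class name DocumentDispatcher and its subscribe and dispatch methods are hypothetical and not part of the MongoDB driver; the commented wiring at the bottom assumes a collection object from the official Node.js driver.

```javascript
// Sketch: one change stream per collection, fanned out to many in-process listeners.
// DocumentDispatcher and its methods are hypothetical, not part of the MongoDB driver.
class DocumentDispatcher {
  constructor() {
    // Maps String(_id) to the Set of callbacks interested in that document.
    this.listeners = new Map();
  }

  // Register a callback for one document; returns a function that unsubscribes it.
  subscribe(id, callback) {
    const key = String(id);
    if (!this.listeners.has(key)) this.listeners.set(key, new Set());
    this.listeners.get(key).add(callback);
    return () => {
      const set = this.listeners.get(key);
      if (!set) return;
      set.delete(callback);
      if (set.size === 0) this.listeners.delete(key);
    };
  }

  // Route one change event to the listeners of its document.
  // Returns the number of listeners registered for that document.
  dispatch(event) {
    if (!event.documentKey) return 0; // events such as "invalidate" carry no documentKey
    const set = this.listeners.get(String(event.documentKey._id));
    if (!set) return 0;
    const count = set.size;
    for (const callback of set) callback(event);
    return count;
  }
}

// Wiring (assumes `collection` is a Collection from the official Node.js driver):
// const dispatcher = new DocumentDispatcher();
// const changeStream = collection.watch();
// changeStream.on("change", (event) => dispatcher.dispatch(event));
```

With this shape, adding a listener costs a Map entry in the watcher process rather than another change stream on the database, and each event is delivered with a single lookup.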

The simpler way, since you mentioned Atlas, is to use Triggers. You can put a match expression in your Trigger configuration so that the trigger function does not run unless the match expression evaluates to true. In the example below, taken from the documentation, the function will not execute unless the status field is updated to "blocked", and many other match expressions are possible:

{
  "updateDescription.updatedFields": {
    "status": "blocked"
  }
}
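If you stay with change streams instead of Triggers, a similar condition can be written as a $match stage on the change event. The sketch below is only an illustration: updatedFieldPipeline is a hypothetical helper name, and the commented usage assumes a collection object from the official Node.js driver.

```javascript
// Hypothetical helper: builds a change stream pipeline that only passes
// update events in which `field` was set to `value`.
function updatedFieldPipeline(field, value) {
  return [
    {
      $match: {
        operationType: "update",
        // Dotted path into the change event's updateDescription.updatedFields.
        [`updateDescription.updatedFields.${field}`]: value,
      },
    },
  ];
}

// Usage (assumes `collection` comes from the official Node.js driver):
// const changeStream = collection.watch(updatedFieldPipeline("status", "blocked"));
```

One difference to be aware of: under standard MongoDB query semantics, the dotted path used here matches whenever status is among the updated fields, whereas comparing the whole updatedFields subdocument, as in the expression above, requires an exact match.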

I hope this helps. If not, I can keep digging. I think with change streams or Triggers, you are ok if you want to write a bit of code. :)

Nice-Guy
    I like this approach. Having a single watch service allows me to scale more, yet I still need some kind of pub/sub so that the listeners can communicate with the main watch service. I will accept the current answer in a couple of days, if there is no other response – AlTunegenie May 09 '22 at 20:35