
I want to tail the Mongo oplog and stream it through Kafka. But there are many databases and collections, and I only want the update data for one of them. Filtering the desired operation records out of all the operation records in the oplog could affect performance, so I would like to ask for a better solution. Please give me some suggestions.

DotWait

1 Answer


It's not clear what tool you are using, but Debezium supports these properties for applying filtering:

  • database.whitelist
  • collection.whitelist
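For example, a Kafka Connect configuration for Debezium's MongoDB connector restricted to a single collection might look like the following sketch (the connector name, hosts, logical name, and namespaces are placeholders, not values from the question):

```json
{
  "name": "mongo-source",
  "config": {
    "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
    "mongodb.hosts": "rs0/mongo1:27017",
    "mongodb.name": "mydeployment",
    "database.whitelist": "mydb",
    "collection.whitelist": "mydb.orders"
  }
}
```

Note that `collection.whitelist` takes fully-qualified `database.collection` names, so the filtering happens inside the connector rather than in your own code.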

It is also not clear what would "affect performance", since you are already reading the full oplog. Performing a filter (i.e., dropping all records that don't match a condition) should not have a significant impact, as boolean/regex checks usually finish very quickly.
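To illustrate why the per-record check is cheap: oplog documents carry an `ns` field of the form `database.collection`, so filtering reduces to a single string comparison per record. A minimal sketch (the namespace and entries are made up, not from the question; real entries come from `local.oplog.rs`):

```python
# Hypothetical target namespace to keep
TARGET_NS = "mydb.orders"

def wanted(entry):
    """Return True if this oplog-style entry touches the target collection."""
    return entry.get("ns") == TARGET_NS

# Simulated oplog entries; "op" is "u" for update, "i" for insert
oplog = [
    {"op": "u", "ns": "mydb.orders", "o": {"status": "paid"}},
    {"op": "i", "ns": "otherdb.logs", "o": {"msg": "noise"}},
    {"op": "u", "ns": "mydb.users", "o": {"name": "x"}},
]

# Keep only updates on the target collection
updates = [e for e in oplog if wanted(e) and e["op"] == "u"]
print(len(updates))  # 1
```

Each discarded record costs one dictionary lookup and one string equality check, which is negligible next to the I/O of reading the oplog itself.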

OneCricketeer
  • The reason for worrying about performance is that Mongo stores lots of data, and the other databases will have many operations updating data. If you use code to read the operation records from the oplog, every record needs to be checked by that code. Too many useless checks can affect performance. What do you think? – DotWait Oct 19 '18 at 07:54
  • Performance of Mongo? No, because the oplog isn't filtered otherwise, and reading it in full cannot be avoided, AFAIK. And you're not performing actual database lookups or writes against Mongo, so that also wouldn't have an impact... Basically, if using Kafka causes performance issues, then so would just having a replicated or sharded Mongo instance – OneCricketeer Oct 19 '18 at 13:53