The goal: Real time CDC from Oracle and PostgreSQL to Kinesis on a single thread/process without much time lag and no record drop.
The system: We have a system where we are doing a real time CDC from Oracle and PostgreSQL to Kinesis using AWS DMS. The problem with doing a real time CDC with only one thread is that it takes many hours to replicate the changes to Kinesis when the data grows big(MBs).
Alternate approach: The approach we took was to pull the real time changes from Oracle and PostgreSQL using multiple threads and push to Kinesis while still using DMS.
The challenge: We noticed that while pulling data in real time using multiple threads, there is a drop in some records from Oracle and PostgreSQL. This happens in like 1 in 3 million records.
Tried different solutions on the Oracle and PostgreSQL side, talked to AWS and nothing works.
Notes: We are using Logminner or Binary leader on Oracle and PostgreSQL side.
Is there a solution to this or has anybody tried to build this kind of system? Please let me know.