
I recently started working with CDC on MS SQL Server, and I have the following scenario:

  1. Enabled CDC on a SQL Server
  2. Enabled CDC on a certain table
  3. Data was ingested into Kafka using the Debezium connector
  4. Data has been cleared by the CDC cleanup job

Is it possible to capture the changes once again from the beginning? That is, can I restart the whole CDC process from its initial point?

2 Answers


Kind of, but you may or may not like it.

It seems like you're looking for "can I get the missing change history back?". The answer is "not really". But you have options.

If you have historic backups of the database that has CDC on it, you could restore those somewhere and grab the CDC data from those. Depending on the size of your database, the configured retention on the CDC data, and the rate of change (i.e. how much change data has been captured), this is probably not a great option. That is, let's say that you have a backup from a month ago and your configured retention is two days. Once you restore the database, it will have the two days of change data from a month ago. You could continue to restore successively newer backups to get to current but that seems like a lot to me.

If you're using CDC to keep a target in sync with the source, you could either restore a backup of the source db somewhere or use a database snapshot of it to grab the initial state of the data and then consume CDC data from that point forward (based on the LSN of the source).
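As a rough sketch of the "consume from that point forward" step: assuming you recorded the source LSN at the moment the backup or snapshot was taken, something like the following builds the T-SQL to read every change after it. The helper name `changes_since_query` and the capture instance `dbo_orders` are made up for illustration; `sys.fn_cdc_increment_lsn`, `sys.fn_cdc_get_max_lsn` and `cdc.fn_cdc_get_all_changes_<capture_instance>` are the built-in CDC functions. Note that the query will fail if the starting LSN is older than what the cleanup job has retained.

```python
import re

# Capture instance names end up inside a function name, so only allow
# plain identifiers here (an assumption made to keep the sketch safe).
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def changes_since_query(capture_instance: str, snapshot_lsn: bytes) -> str:
    """Build T-SQL that reads all changes committed after snapshot_lsn."""
    if not _IDENT.match(capture_instance):
        raise ValueError(f"unexpected capture instance name: {capture_instance!r}")
    if len(snapshot_lsn) != 10:
        raise ValueError("SQL Server LSNs are binary(10) values")

    lsn_literal = "0x" + snapshot_lsn.hex().upper()
    return (
        # Start just after the LSN that the snapshot already contains.
        f"DECLARE @from_lsn binary(10) = sys.fn_cdc_increment_lsn({lsn_literal});\n"
        # Read up to the newest change captured so far.
        "DECLARE @to_lsn binary(10) = sys.fn_cdc_get_max_lsn();\n"
        f"SELECT * FROM cdc.fn_cdc_get_all_changes_{capture_instance}"
        "(@from_lsn, @to_lsn, N'all');"
    )


if __name__ == "__main__":
    # Hypothetical LSN recorded when the snapshot was taken.
    print(changes_since_query("dbo_orders", bytes.fromhex("0000002a000001b80003")))
```

You would run the resulting statement against the source database with whatever client you already use, then remember the last `__$start_lsn` you processed as the next starting point.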

Ben Thul

Sounds like you are asking for snapshots, not just CDC... Debezium maintains a schema history topic, and Kafka Connect source connectors store their offsets as well.

These topics could be modified such that you reset both. For example, here's a blog explaining how it's done with the FileStream source connector.

Otherwise, posting a new Debezium connector with a different name should achieve the same effect, since the new name has no stored offsets and the connector starts with a fresh snapshot.
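A minimal sketch of that second option, assuming the Kafka Connect REST API and Debezium 2.x property names (older releases use `database.server.name` and `database.history.*` instead). The host names, credentials, table and helper functions below are all placeholders, not values from the question:

```python
import json
import urllib.request


def build_connector_payload(name: str, topic_prefix: str) -> dict:
    """Build the JSON body for POST /connectors on Kafka Connect."""
    return {
        # A name Kafka Connect has not seen before has no stored offsets.
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
            "database.hostname": "sqlserver.example.internal",  # placeholder
            "database.port": "1433",
            "database.user": "debezium",  # placeholder
            "database.password": "change-me",  # placeholder
            "database.names": "inventory",  # placeholder
            "table.include.list": "dbo.orders",  # placeholder
            # A new prefix keeps the re-snapshotted data out of the old topics.
            "topic.prefix": topic_prefix,
            "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
            "schema.history.internal.kafka.topic": f"schema-history.{topic_prefix}",
            # Take a full snapshot of the tables first, then stream changes.
            "snapshot.mode": "initial",
        },
    }


def register(connect_url: str, payload: dict) -> int:
    """POST the connector definition and return the HTTP status code."""
    request = urllib.request.Request(
        f"{connect_url}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    body = build_connector_payload("orders-cdc-v2", "orders_v2")
    print(json.dumps(body, indent=2))
    # register("http://localhost:8083", body)  # uncomment against a real cluster
```

This gives you the current table contents again plus changes from now on; it does not bring back the change rows the cleanup job already removed.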

OneCricketeer