We have a requirement to set up a Kafka MicrosoftSqlServerSource connector. It should capture all transactions (inserts/updates) performed on one of the sales tables in an Azure SQL database.
To support this source connector, we first enabled CDC at both the database and the table level. We also created a view over the source table, which serves as the input for the connector (table.types = VIEW in the connector configuration). Once the setup was complete on both the connector and the database side, we could see messages flowing to the topic, which was created automatically along with the connector, whenever rows were inserted or updated in the table.
One strange behavior we observed during testing: once we stopped making changes to the table, the last message received in the topic kept being duplicated until a new message arrived.
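To quantify the duplication, the consumed messages can be grouped by the two offset columns from the configuration. The following is a minimal Python sketch, assuming each message value is a flat JSON object that carries SalesandRefundItemStatusID and ProcessedDateTime as fields; the sample payloads are made up for illustration and are not taken from the real topic.

```python
import json
from collections import Counter


def count_duplicates(messages):
    """Return {(id, timestamp): count} for every key seen more than once."""
    seen = Counter()
    for raw in messages:
        record = json.loads(raw)
        # Assumed field names, taken from incrementing.column.name and
        # timestamp.column.name in the connector configuration.
        key = (record["SalesandRefundItemStatusID"], record["ProcessedDateTime"])
        seen[key] += 1
    return {key: n for key, n in seen.items() if n > 1}


# Made-up sample values standing in for messages read from the topic.
sample = [
    '{"SalesandRefundItemStatusID": 101, "ProcessedDateTime": 1672574400000}',
    '{"SalesandRefundItemStatusID": 102, "ProcessedDateTime": 1672574405000}',
    '{"SalesandRefundItemStatusID": 102, "ProcessedDateTime": 1672574405000}',
    '{"SalesandRefundItemStatusID": 102, "ProcessedDateTime": 1672574405000}',
]
duplicates = count_duplicates(sample)
print(duplicates)
```

In this made-up sample only the last row repeats, which matches the pattern we see: the same (ID, timestamp) pair is emitted again on each poll.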
Could you please help us understand whether this is expected system behavior, or whether we missed some configuration that is causing these duplicate entries? Please guide us on how we can tackle this duplication issue.
Attaching a snapshot of the connector configuration:
Connector Summary
Connector Class = MicrosoftSqlServerSource
Max Tasks = 1
kafka.auth.mode = SERVICE_ACCOUNT
kafka.service.account.id = **********
topic.prefix = ***********
connection.host = **************8
connection.port = 1433
connection.user = ***************
db.name = **************88
table.whitelist = item_status_view
timestamp.column.name = ProcessedDateTime
incrementing.column.name = SalesandRefundItemStatusID
table.types = VIEW
schema.pattern = dbo
db.timezone = Europe/London
mode = timestamp+incrementing
timestamp.initial = -1
poll.interval.ms = 10000
batch.max.rows = 1000
timestamp.delay.interval.ms = 30000
output.data.format = JSON
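
For reference, here is a rough Python sketch of how mode = timestamp+incrementing selects rows, as far as I understand it from the open-source JDBC source connector. This is an assumption, not confirmed behavior of the managed MicrosoftSqlServerSource connector. The poll function, the row dictionaries, and the starting offset are hypothetical and only illustrate the predicate: a row is returned when its timestamp is older than "now minus timestamp.delay.interval.ms" and its (timestamp, id) pair is strictly after the stored offset.

```python
from datetime import datetime, timedelta


def poll(rows, last_ts, last_id, now, delay=timedelta(seconds=30)):
    """Return rows strictly after the (last_ts, last_id) offset, oldest first."""
    upper = now - delay  # rows newer than the delay window are not visible yet
    picked = [
        r for r in rows
        if r["ts"] < upper
        and ((r["ts"] == last_ts and r["id"] > last_id) or r["ts"] > last_ts)
    ]
    return sorted(picked, key=lambda r: (r["ts"], r["id"]))


t0 = datetime(2023, 1, 1, 12, 0, 0)
rows = [
    {"id": 1, "ts": t0},
    {"id": 2, "ts": t0},
    {"id": 3, "ts": t0 + timedelta(seconds=5)},
]
now = t0 + timedelta(minutes=5)

# Hypothetical starting offset that precedes every row
# (this does not model the semantics of timestamp.initial = -1).
first = poll(rows, datetime.min, -1, now)

# The offset advances to the last row returned: the next poll finds nothing.
last = first[-1]
second = poll(rows, last["ts"], last["id"], now + timedelta(seconds=10))

# The offset is stuck one row behind: the last row is returned again.
stale = poll(rows, t0, 2, now + timedelta(seconds=20))

print([r["id"] for r in first], second, [r["id"] for r in stale])
```

If this model is right, a repeated last message would mean that either the stored offset is not advancing past that row, or the row's ProcessedDateTime as seen by the connector keeps comparing as newer than the stored offset (for example through the view definition or the db.timezone conversion). That is a guess on our side, not something we have verified.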