I'm currently looking into Kafka Connect to stream some of our databases to a data lake. To test out Kafka Connect I've setup a database with one of our project databases in. So far so good.
Next step I configured Kafka Connect with mode following properties:
{
"connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
"timestamp.column.name": "updated_at,created_at",
"incrementing.column.name": "id",
"dialect.name": "SqlServerDatabaseDialect",
"validate.non.null": "false",
"tasks.max": "1",
"mode": "timestamp+incrementing",
"topic.prefix": "mssql-jdbc-",
"poll.interval.ms": "10000",
}
While this works for the majority of my tables where I got an ID and a created_at / updated_at field, it won't work for my tables where I solved my many-to-many relationships with a table in between and a composite key. Note that I'm using the generic JDBC configuration with a JDBC driver from Microsoft.
Is there a way to configure Kafka Connect for these special cases?