I have used Approach 1 as mentioned in the post above.
I created a Cosmos DB source connector to fetch User data from Cosmos DB and publish it to the Kafka topic "new_user_registered".
Cosmos DB source connector configuration:
{
"name": "cosmosdb-source-connector",
"config": {
"connector.class": "com.azure.cosmos.kafka.connect.source.CosmosDBSourceConnector",
"tasks.max": "1",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"key.converter.schemas.enable": "false",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable": "false",
"connect.cosmos.task.poll.interval": "1000",
"connect.cosmos.connection.endpoint": "https://cosmos-instance.documents.azure.com:443/",
"connect.cosmos.master.key": "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==",
"connect.cosmos.databasename": "UserDb",
"connect.cosmos.containers.topicmap": "new_user_registered#User",
"connect.cosmos.offset.useLatest": true,
"topic.creation.enable": "false",
"topic.creation.default.replication.factor": 1,
"topic.creation.default.partitions": 1,
"output.data.format": "JSON",
"transforms": "replacefield",
"transforms.replacefield.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
"transforms.replacefield.exclude": "id,_rid,_self,_etag,_attachements,_ts",
"transforms.replacefield.include": "Login,Email,Password,Name"
}
}
Then I created an Azure SQL sink connector that fetches data from the Kafka topic "new_user_registered".
Azure SQL sink connector configuration:
{
"name": "sqlserver-sink-azure-connector",
"config": {
"name": "sqlserver-sink-azure-connector",
"connector.class": "io.confluent.connect.azuresqldw.AzureSqlDwSinkConnector",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable": "false",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"key.converter.schemas.enable": "false",
"transforms": "RenameFields",
"transforms.RenameFields.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
"transforms.RenameFields.renames": "Email:vchEmail,Login:vchLogin,Name:vchName,Password:vchPassword",
"topics": "NEW_USER_REGISTERED_AVRO",
"azure.sql.dw.url": "jdbc:sqlserver://192.168.154.131:1433;",
"azure.sql.dw.user": "sa",
"azure.sql.dw.password": "password123",
"azure.sql.dw.database.name": "DatabaseName",
"table.name.format": User"
"insert.mode": "insert",
"auto.create": "true",
"auto.evolve": "true",
"tasks.max": "1",
"confluent.topic.bootstrap.servers": "broker:29092"
}
}
But the sink connector throws the exception "No fields found using key and value schemas for table: User". As far as I understand, the sink needs a schema to map record fields to table columns, and JsonConverter with schemas.enable=false produces schemaless records.
For this, I found possible solutions in the following post:
Kafka Connect JDBC sink connector not working
Solution 1: Send the schema together with the payload in every message (this is not suitable for us; see the sketch below).
Solution 2: Use the Confluent Avro serializer.
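For reference, here is a minimal sketch of what Solution 1 would mean: with value.converter.schemas.enable set to true, every message value has to carry an inline Connect schema next to the payload, roughly like this (the field list is just the one from my ReplaceField transform, and the sample values are made up):

{
  "schema": {
    "type": "struct",
    "name": "User",
    "optional": false,
    "fields": [
      { "field": "Login", "type": "string", "optional": false },
      { "field": "Email", "type": "string", "optional": false },
      { "field": "Password", "type": "string", "optional": false },
      { "field": "Name", "type": "string", "optional": true }
    ]
  },
  "payload": {
    "Login": "saurabh",
    "Email": "saurabh@example.com",
    "Password": "********",
    "Name": "Saurabh"
  }
}

Wrapping every message in this envelope is exactly the overhead that makes Solution 1 unsuitable for us.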
To go with Avro, we found a video from Confluent (https://www.youtube.com/watch?v=b-3qN_tlYR4&t=1300s) where KSQL and streams are used to convert JSON to Avro, and the sink connector then reads from the new topic created by the stream.
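If we did go that Avro route, my understanding is that mainly the converter settings and the topic of the sink connector above would change, roughly like this (the Schema Registry URL is an assumption for my local setup; the topic is the Avro topic produced by the KSQL stream):

"value.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url": "http://schema-registry:8081",
"topics": "NEW_USER_REGISTERED_AVRO"

With Avro the schema lives in Schema Registry instead of being embedded in every message, so the sink can still resolve the table columns.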
But I am wondering whether I should really use a sink connector plus all this KSQL/Streams machinery for my case, where I just want to sync users between two services without any transformation or schema.
Can anybody suggest whether I should go with a traditional consumer or with Kafka Connect?
Thanks,
Saurabh