
I have used KSQL to create a stream and an aggregated table off that stream.

  {  
   "ksql":"DROP Stream IF EXISTS StreamLegacyNames; DROP Stream IF EXISTS StreamLegacy; CREATE Stream  StreamLegacy (payload  STRUCT<AgeYr  varchar>)WITH (KAFKA_TOPIC='eip-legacy-13',VALUE_FORMAT='JSON' );  CREATE Stream  StreamLegacyNames As Select payload->AgeYr Age from StreamLegacy; Create Table DimAge As SELECT Age FROM StreamLegacyNames Group By Age;",
   "streamsProperties":{  
      "ksql.streams.auto.offset.reset":"earliest"
   }
}
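For reference, this body is what gets POSTed to the KSQL server's /ksql REST endpoint. A minimal sketch, assuming the server listens on localhost:8088 and the JSON above is saved to create-dimage.json:

    curl -s -X POST http://localhost:8088/ksql \
         -H "Content-Type: application/vnd.ksql.v1+json" \
         -d @create-dimage.json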

What is the easiest way to export this aggregated table to a SQL Server table? We are using the JDBC sink connector for the topic, but I am unclear whether that would work for an aggregated KSQL table (in this example, DIMAGE).

Even if I set the topic to DIMAGE and add the following to the JDBC Connect configuration file:

value.converter.schemas.enable=false

The full configuration file is

connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
connection.password=PASSWORD
auto.evolve=true
topics=DIMAGE
tasks.max=1
connection.user=USER
value.converter.schemas.enable=false
auto.create=true
connection.url=jdbc:sqlserver://SERVER

I still receive the following error from the connector.

Caused by: org.apache.kafka.connect.errors.DataException: JsonConverter with schemas.enable requires "schema" and "payload" fields and may not contain additional fields. If you are trying to deserialize plain JSON data, set schemas.enable=false in your converter configuration.

A KSQL query through Postman shows the format of the KTable as:

{"row":{"columns":["83"]},"errorMessage":null,"finalMessage":null}
{"row":{"columns":["74"]},"errorMessage":null,"finalMessage":null}
{"row":{"columns":["36"]},"errorMessage":null,"finalMessage":null}
John Bowyer

2 Answers


When you CREATE STREAM foo AS SELECT ("CSAS") in KSQL, you are creating a new Kafka topic and populating it continually with the results of the SELECT statement.

So you have just a Kafka topic, in your case called STREAMLEGACYNAMES (KSQL usually forces objects to upper case). You can use the JDBC Sink connector to stream this topic to a target RDBMS, including MS SQL.
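If you want to double-check which Kafka topic backs a given stream or table, a quick sketch from the KSQL CLI (object names taken from the question):

    ksql> SHOW TOPICS;
    ksql> DESCRIBE EXTENDED STREAMLEGACYNAMES;

The "Kafka topic" line in the DESCRIBE EXTENDED output is the name to put in the sink connector's topics setting.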

Robin Moffatt
  • Hi, thanks. What I am trying to accomplish is to send the aggregated KTABLE DimAge to SQL Server using the JDBC sink connector. Even if I set value.converter.schemas.enable=false, I get an error message that says Caused by: org.apache.kafka.connect.errors.DataException: JsonConverter with schemas.enable requires "schema" and "payload" fields and may not contain additional fields. If you are trying to deserialize plain JSON data, set schemas.enable=false in your converter configuration. – John Bowyer Jan 04 '19 at 19:41
  • The JDBC Sink requires a schema. If you're using KSQL then the only option (and a good option) is to use Avro. If you're using JSON then the JSON that KSQL generates does not include the schema itself. See this article for more details https://www.confluent.io/blog/kafka-connect-deep-dive-converters-serialization-explained – Robin Moffatt Jan 06 '19 at 22:34
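Following the comment above, a minimal sketch of the Avro-based approach, assuming a Schema Registry is running at localhost:8081 (the object and topic names here are only illustrative):

    -- re-create the aggregate with Avro so the schema travels with the data
    CREATE TABLE DIMAGE_AVRO WITH (VALUE_FORMAT='AVRO') AS
      SELECT Age FROM StreamLegacyNames GROUP BY Age;

and in the sink connector configuration:

    topics=DIMAGE_AVRO
    value.converter=io.confluent.connect.avro.AvroConverter
    value.converter.schema.registry.url=http://localhost:8081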

A KTable is just another topic at the end of the day. You can use KSQL PRINT or kafka-console-consumer to see what data the JDBC Sink connector will be getting.
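For example, a quick sketch (topic name from the question, broker address assumed):

    ksql> PRINT 'DIMAGE' FROM BEGINNING;

or, outside of KSQL:

    kafka-console-consumer --bootstrap-server localhost:9092 \
        --topic DIMAGE --from-beginning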

If you are expecting the KSQL table to exactly match the SQL Server table, it will not. The SQL Server table will contain each and every change event that has occurred on the KTable, including null values, because deletes are not yet supported by the JDBC sink.


Not sure what data you are expecting, but what you could do is perform a windowed output of the events you are trying to capture; then you effectively have a micro-batch insert into your downstream database.
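A rough sketch of what such a windowed aggregate could look like (the window size and column names are only examples):

    CREATE TABLE DIMAGE_WINDOWED AS
      SELECT Age, COUNT(*) AS AGE_COUNT
      FROM StreamLegacyNames
      WINDOW TUMBLING (SIZE 5 MINUTES)
      GROUP BY Age;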

OneCricketeer