
I have used KSQL to create a stream and an aggregated table off that stream.

  {  
   "ksql":"DROP Stream IF EXISTS StreamLegacyNames; DROP Stream IF EXISTS StreamLegacy; CREATE Stream  StreamLegacy (payload  STRUCT<AgeYr  varchar>)WITH (KAFKA_TOPIC='eip-legacy-13',VALUE_FORMAT='JSON' );  CREATE Stream  StreamLegacyNames As Select payload->AgeYr Age from StreamLegacy; Create Table DimAge As SELECT Age FROM StreamLegacyNames Group By Age;",
   "streamsProperties":{  
      "ksql.streams.auto.offset.reset":"earliest"
   }
}
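For reference, this body is what gets POSTed to the KSQL server's /ksql REST endpoint. A minimal sketch, assuming the server listens on localhost:8088 and the JSON above is saved to create-dimage.json:

    curl -s -X POST http://localhost:8088/ksql \
         -H "Content-Type: application/vnd.ksql.v1+json" \
         -d @create-dimage.json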

What is the easiest way to export this aggregated table to a SQL Server table? We are using the JDBC sink connector for the topic, but I am unclear whether that would work for an aggregated KSQL table (in this example, DIMAGE).

Even if I set the topic to DIMAGE and add the following to the JDBC Connect configuration file:

value.converter.schemas.enable=false

The full configuration file is

connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
connection.password=PASSWORD
auto.evolve=true
topics=DIMAGE
tasks.max=1
connection.user=USER
value.converter.schemas.enable=false
auto.create=true
connection.url=jdbc:sqlserver://SERVER

I still receive the following error from the connector.

Caused by: org.apache.kafka.connect.errors.DataException: JsonConverter with schemas.enable requires "schema" and "payload" fields and may not contain additional fields. If you are trying to deserialize plain JSON data, set schemas.enable=false in your converter configuration.

A KSQL query through Postman shows the format of the KTable as:

{"row":{"columns":["83"]},"errorMessage":null,"finalMessage":null}
{"row":{"columns":["74"]},"errorMessage":null,"finalMessage":null}
{"row":{"columns":["36"]},"errorMessage":null,"finalMessage":null}
John Bowyer

2 Answers


When you CREATE STREAM foo AS SELECT ("CSAS") in KSQL, you are creating a new Kafka topic and populating it continually with the results of the SELECT statement.

So you have just a Kafka topic, in your case called STREAMLEGACYNAMES (KSQL usually forces objects to upper case). You can use the JDBC Sink connector to stream this topic to a target RDBMS, including MS SQL.
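If you want to double-check which Kafka topic backs a given stream or table, a quick sketch from the KSQL CLI (object names taken from the question):

    ksql> SHOW TOPICS;
    ksql> DESCRIBE EXTENDED STREAMLEGACYNAMES;

The "Kafka topic" line in the DESCRIBE EXTENDED output is the name to put in the sink connector's topics setting.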

Robin Moffatt
  • Hi, thanks. What I am trying to accomplish is to send the aggregated KTABLE DimAge to SQL Server using the JDBC sink connector. Even if I set value.converter.schemas.enable=false, I get an error message that says Caused by: org.apache.kafka.connect.errors.DataException: JsonConverter with schemas.enable requires "schema" and "payload" fields and may not contain additional fields. If you are trying to deserialize plain JSON data, set schemas.enable=false in your converter configuration. – John Bowyer Jan 04 '19 at 19:41
  • The JDBC Sink requires a schema. If you're using KSQL then the only option (and a good option) is to use Avro. If you're using JSON then the JSON that KSQL generates does not include the schema itself. See this article for more details https://www.confluent.io/blog/kafka-connect-deep-dive-converters-serialization-explained – Robin Moffatt Jan 06 '19 at 22:34
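Following the comment above, a minimal sketch of the Avro-based approach, assuming a Schema Registry is running at localhost:8081 (the object and topic names here are only illustrative):

    -- re-create the aggregate with Avro so the schema travels with the data
    CREATE TABLE DIMAGE_AVRO WITH (VALUE_FORMAT='AVRO') AS
      SELECT Age FROM StreamLegacyNames GROUP BY Age;

and in the sink connector configuration:

    topics=DIMAGE_AVRO
    value.converter=io.confluent.connect.avro.AvroConverter
    value.converter.schema.registry.url=http://localhost:8081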

A KTable is just another topic at the end of the day. You can use KSQL PRINT or kafka-console-consumer to see what data the JDBC Sink connector will be getting.
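For example, a quick sketch (topic name from the question, broker address assumed):

    ksql> PRINT 'DIMAGE' FROM BEGINNING;

or, outside of KSQL:

    kafka-console-consumer --bootstrap-server localhost:9092 \
        --topic DIMAGE --from-beginning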

If you are expecting the KSQL table to exactly match the SQL Server table, it will not. The SQL Server table will contain each and every change event that has occurred on the KTable, including null values, because deletes are not yet supported by the JDBC sink.


Not sure what data you are expecting, but what you could do is perform a windowed output of the events you are trying to capture; then you effectively have a micro-batch insert into your downstream database.
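A rough sketch of what such a windowed aggregate could look like (the window size and column names are only examples):

    CREATE TABLE DIMAGE_WINDOWED AS
      SELECT Age, COUNT(*) AS AGE_COUNT
      FROM StreamLegacyNames
      WINDOW TUMBLING (SIZE 5 MINUTES)
      GROUP BY Age;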

OneCricketeer