
My sink connector config file contains the following configuration:

...
"connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
"tasks.max": "1",
"topics": "DELETESRC",
"insert.mode": "upsert",
"batch.size": "50000",
"table.name.format": "DELETESRC",
"pk.mode": "record_key",
"pk.fields": "ID,C_NO",
"delete.enabled": "true",
"auto.create": "true",
"auto.evolve": "true",
"max.retries": "10",
"retry.backoff.ms": "3000",
"mode": "bulk",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"value.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter.schemas.enable": "true",
"value.converter.schema.registry.url": "http://localhost:8081",
"transforms": "ValueToKey",
"transforms.ValueToKey.type":"org.apache.kafka.connect.transforms.ValueToKey",
"transforms.ValueToKey.fields": "ID,C_NO"
...

I am able to do upserts using the keys, but I am not able to get delete mode working in the JDBC sink. I configured the topic DELETESRC with cleanup.policy=compact and delete.retention.ms=0. I created a KSQL stream with 4 columns (ID, CMP, SEG, C_NO) and am using INSERT INTO statements in KSQL to push the data.

INSERT INTO DELETESRC VALUES ('null','11','D','1','3')
INSERT INTO DELETESRC VALUES ('null','11','C','1','4')
INSERT INTO DELETESRC VALUES ('null','12','F','1','3')

But when I run INSERT INTO DELETESRC VALUES ('null','11','null','null','3'), the sink updates the table row to 11,null,null,3. I have looked at other answers on Stack Overflow, but those solutions did not work.

Am I making a mistake in creating the tombstone record?

I have tried other variations of the INSERT statement in KSQL, but the delete operation does not occur.

Sanjay Nayak

1 Answer


In order to generate a proper tombstone message, you need to produce a keyed message with a null value. In the INSERT statements you show, the value is never actually null: a KSQL INSERT still produces a (non-null) Avro record even if its fields are null, which is why the sink upserts a row of nulls instead of deleting.
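
For illustration, a minimal producer sketch (using the confluent-kafka Python client; the broker address and the literal key below are placeholders, and the key bytes must match whatever your sink's key.converter and pk.fields expect):

# pip install confluent-kafka
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

# A tombstone is just a keyed record whose value is None (null).
# With delete.enabled=true the JDBC sink turns it into a DELETE for
# the row whose primary key matches the record key.
producer.produce(topic="DELETESRC", key="11,1", value=None)
producer.flush()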


In order not to lose the tombstones due to high consumer lag, I guess you also need to increase delete.retention.ms (you currently have it set to 0):

The amount of time to retain delete tombstone markers for log compacted topics. This setting also gives a bound on the time in which a consumer must complete a read if they begin from offset 0 to ensure that they get a valid snapshot of the final stage (otherwise delete tombstones may be collected before they complete their scan).
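
If you want to change it programmatically, something along these lines should work (a sketch with the confluent-kafka AdminClient; the one-day value is just an example):

from confluent_kafka.admin import AdminClient, ConfigResource

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

# Note: the non-incremental alter_configs call replaces the topic's whole
# config override set, so repeat every override you want to keep.
resource = ConfigResource(ConfigResource.Type.TOPIC, "DELETESRC",
                          set_config={"cleanup.policy": "compact",
                                      "delete.retention.ms": "86400000"})
admin.alter_configs([resource])[resource].result()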

Giorgos Myrianthous
  • Can you please give me an example of a tombstone record? How can I insert one into the stream so that it gets ingested into the topic? – Sanjay Nayak Apr 15 '20 at 12:06
  • @SanjayNayak Do you have a source connector that replicates data from MySQL to Kafka and then a sink connector that reads from Kafka and replicates back to MySQL? – Giorgos Myrianthous Apr 15 '20 at 12:12
  • No, my source is a CSV file whose data gets read and loaded into MySQL. For the above I was just checking how delete.enabled works by creating a stream. – Sanjay Nayak Apr 15 '20 at 12:14
  • 1
    @SanjayNayak OK. So you need to write a simple producer and generate tombstone messages. Please refer to this answer: https://stackoverflow.com/questions/61195652/how-to-produce-a-tombstone-avro-record-in-kafka-using-python/61195822#61195822 – Giorgos Myrianthous Apr 15 '20 at 12:21
  • @SanjayNayak Any luck? – Giorgos Myrianthous Apr 16 '20 at 13:41
  • Sorry, I have not tried it yet; another requirement came up, so I couldn't look into it. I will surely let you know. – Sanjay Nayak Apr 16 '20 at 14:19
  • It worked. The tombstones are getting generated, but the key is not coming through in record format; it comes through as a String, while the values are in Avro. I checked the Schema Registry and I do have a schema for the key. – Sanjay Nayak May 11 '20 at 16:23