
I am working with Kafka Connect (using the Confluent implementation) and am seeing some strange behavior. I configure a source connector to pull data from a DB table and populate a topic. This works. But if I delete the topic, remove the source config, and then re-create the config (perhaps adding another column to the query), the topic does not get populated. If I change the topic name to something I haven't used before, it works. I am using Postman to set the configuration, though I don't believe that matters here.

My Connect config:

{
    "name": "my-jdbc-connector",
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:db2://db2server.mycompany.com:4461/myDB",
        "connection.user": "dbUser",
        "connection.password": "dbPass",
        "dialect.name": "Db2DatabaseDialect",
        "mode": "timestamp",
        "query": "select fname, lname, custId, custRegion, lastUpdate from CustomerMaster",
        "timestamp.column.name": "lastUpdate",
        "table.types": "TABLE",
        "topic.prefix": "master.customer"
    }
}
OneCricketeer
M. Ferris

1 Answer


The Kafka Connect JDBC source connector keeps a high watermark on the timestamp column, lastUpdate in your case. The watermark is tied to the connector name, not to the topic, so even if you delete the connector and recreate it with the same name, it still uses the same watermark. That is why recreating the topic does not load the data again. To reprocess all of the data, use any one of the following options:

  1. Drop the topic and delete the JDBC connector, then recreate the topic and create the JDBC connector with a different name.

  2. Delete the JDBC connector and recreate it with the same name but with "mode": "bulk". This dumps the full result of the query into the topic again. Once it has loaded, change the mode back to timestamp. See the JDBC connector configuration details: https://docs.confluent.io/current/connect/kafka-connect-jdbc/source-connector/source_config_options.html

  3. Update lastUpdate to the current timestamp for all records.
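
To make the reasoning concrete, here is a toy model in Python of a watermark keyed by connector name. It is only a sketch and not the connector's actual code: the `stored_offsets` dict, the `poll` function, the sample rows, and the name `my-jdbc-connector-v2` are all made up for illustration. It shows why a connector recreated under the same name publishes nothing, and why each of the options above produces a full reload.

```python
from datetime import datetime

# Toy stand-in for Connect's offset storage: connector name -> last timestamp seen.
# (Hypothetical structure for illustration only.)
stored_offsets: dict[str, datetime] = {}

# Hypothetical CustomerMaster rows as (custId, lastUpdate) pairs.
rows = [
    (1, datetime(2020, 1, 1)),
    (2, datetime(2020, 1, 2)),
    (3, datetime(2020, 1, 3)),
]

def poll(connector_name: str, mode: str = "timestamp") -> list[int]:
    """Return the custIds one poll would publish under the given mode."""
    if mode == "bulk":
        # Bulk mode does not consult the watermark and returns everything.
        return [cust_id for cust_id, _ in rows]
    watermark = stored_offsets.get(connector_name, datetime.min)
    fresh = [(cust_id, ts) for cust_id, ts in rows if ts > watermark]
    if fresh:
        # Advance the watermark to the newest timestamp just read.
        stored_offsets[connector_name] = max(ts for _, ts in fresh)
    return [cust_id for cust_id, _ in fresh]

first_load = poll("my-jdbc-connector")         # every row is new
# Deleting the topic or the connector never touches stored_offsets.
same_name = poll("my-jdbc-connector")          # nothing is newer than the watermark
new_name = poll("my-jdbc-connector-v2")        # option 1: no stored offset for this name
bulk = poll("my-jdbc-connector", mode="bulk")  # option 2: watermark not consulted

# Option 3: touch every row so lastUpdate moves past the stored watermark.
rows = [(cust_id, datetime(2020, 2, 1)) for cust_id, _ in rows]
touched = poll("my-jdbc-connector")
```

In this model `same_name` comes back empty while `new_name`, `bulk`, and `touched` each contain all three customers, which mirrors the behavior described in the question.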
Nitin