I am using a custom query in the JDBC Kafka source connector. Can anyone tell me which mode to use with a custom query? If I use bulk mode, it will re-insert all the data into the Kafka topic. Note: my table has no primary key or timestamp column.
2 Answers
You can use incrementing, timestamp, or timestamp+incrementing:
incrementing
- use a strictly incrementing column on each table to detect only new rows. Note that this will not detect modifications or deletions of existing rows.
timestamp
- use a timestamp (or timestamp-like) column to detect new and modified rows. This assumes the column is updated with each write, and that values are monotonically incrementing, but not necessarily unique.
timestamp+incrementing
- use two columns, a timestamp column that detects new and modified rows and a strictly incrementing column which provides a globally unique ID for updates so each row can be assigned a unique stream offset.
Example for timestamp:
name=mysql-source-test
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=10
connection.url=jdbc:mysql://mysql.example.com:3306/my_database?user=myuser&password=mypass
table.whitelist=users,products
mode=timestamp
timestamp.column.name=last_modified
topic.prefix=mysql-test-
Example for incrementing:
name=mysql-source-test
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=10
connection.url=jdbc:mysql://mysql.example.com:3306/my_database?user=myuser&password=mypass
table.whitelist=users,products
mode=incrementing
incrementing.column.name=id
topic.prefix=mysql-test-
Example for timestamp+incrementing:
name=mysql-source-test
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=10
connection.url=jdbc:mysql://mysql.example.com:3306/my_database?user=myuser&password=mypass
table.whitelist=users,products
mode=timestamp+incrementing
incrementing.column.name=id
timestamp.column.name=last_modified
topic.prefix=mysql-test-
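Since the question is about a custom query: with the JDBC source connector you set the query property instead of table.whitelist (the two are mutually exclusive), and the connector appends its incremental WHERE clause to your query. A minimal sketch, assuming a hypothetical employees table that does have a last_modified column; your query and column names will differ:

name=mysql-source-custom-query
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
# a custom query is limited to a single task
tasks.max=1
connection.url=jdbc:mysql://mysql.example.com:3306/my_database?user=myuser&password=mypass
# query replaces table.whitelist; do not set both
query=SELECT name, salary, last_modified FROM employees
mode=timestamp
timestamp.column.name=last_modified
topic.prefix=mysql-custom-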

- Thanks for the reply, but I don't have any incrementing or timestamp column, so how can I specify one? – santoXme May 30 '19 at 05:55
- @santoXme What columns do you have on the table side? – Giorgos Myrianthous May 30 '19 at 10:21
- Suppose my table only has name and salary columns and I want to migrate it to a Kafka topic, so I use a custom query for the data I need. What mode do I have to specify in that case? – santoXme May 30 '19 at 10:41
- @santoXme You can include one more column like `id` for `incrementing` mode, or `rowversion` for `timestamp` mode. – Giorgos Myrianthous May 30 '19 at 10:42
- This table was created by a client and I can't change any client tables; that's the main issue. I need to migrate those tables as they are in the database. – santoXme May 30 '19 at 10:47
- @santoXme Then I'm afraid you can only use `bulk` mode. – Giorgos Myrianthous May 30 '19 at 10:51
- Did you mean to say `bulk` mode? – santoXme May 30 '19 at 10:52
- @santoXme Yes, I meant `bulk` mode. – Giorgos Myrianthous May 30 '19 at 10:53
- In bulk mode, if I run my connector once it will migrate the data to the Kafka topic, but it keeps running and migrates the same data to the topic again. Is there any way to discontinue the connector once it has migrated the data? – santoXme May 30 '19 at 10:57
- @santoXme You can just stop the connector (see the sketch below). – Giorgos Myrianthous May 30 '19 at 11:12
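Following up on the last two comments, a minimal bulk mode sketch under the same hypothetical connection details and custom query as above; bulk re-reads the whole result set on every poll, so you stop or delete the connector once the data has been copied:

name=mysql-source-bulk
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:mysql://mysql.example.com:3306/my_database?user=myuser&password=mypass
query=SELECT name, salary FROM employees
mode=bulk
topic.prefix=mysql-bulk-

To stop it, delete the connector through the Kafka Connect REST API (assuming the worker listens on the default port 8083):

curl -X DELETE http://localhost:8083/connectors/mysql-source-bulk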
If you don't have a timestamp or incrementing ID column, then you cannot do query-based CDC; you can only do a bulk load.
Your alternative is to use log-based CDC with a tool such as Debezium.
This talk goes into the details of each option and tools available: http://rmoff.dev/ksny19-no-more-silos
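For reference, a minimal sketch of a Debezium MySQL connector configuration; the hostnames, credentials, and table list below are placeholders, the source database must have binlog replication enabled, and the exact property names vary by Debezium version:

name=mysql-debezium-source
connector.class=io.debezium.connector.mysql.MySqlConnector
database.hostname=mysql.example.com
database.port=3306
database.user=debezium
database.password=dbz-pass
# unique numeric ID this connector presents when joining the MySQL cluster as a replica
database.server.id=184054
# logical name that prefixes all change-event topic names
database.server.name=my_database
table.whitelist=my_database.employees
database.history.kafka.bootstrap.servers=kafka:9092
database.history.kafka.topic=schema-changes.my_database

Because Debezium reads the MySQL binlog rather than querying the table, it needs no timestamp or incrementing column and captures inserts, updates, and deletes.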
