0

How can I load Oracle table data into a Kafka topic? I did some research and learned that I should use a CDC tool, but all the CDC tools I found are paid. Can anyone suggest how to achieve this?

MT0

7 Answers

1

You'll find this article useful: No More Silos: How to Integrate your Databases with Apache Kafka and CDC

It details all of your options and currently-available tools. In short, you can do bulk (or query-based CDC) with the Kafka Connect JDBC Connector, or you can use a log-based CDC approach with one of several CDC tools that support Oracle as a source, including Attunity, GoldenGate, SQ Data, and IBM's IIDR.
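
As a rough illustration of the query-based option, here is a minimal sketch of the request body you could POST to a Kafka Connect worker's `/connectors` REST endpoint to create a JDBC source connector. The connector class and config keys are the JDBC connector's own; the helper name, the connection details, and the `ORDERS` table with `ID` and `UPDATED_AT` columns are assumptions made up for the example.

```python
import json


def build_jdbc_source_payload(name, jdbc_url, user, password, table, topic_prefix):
    """Build the JSON body for POST /connectors (hypothetical helper)."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
            "connection.url": jdbc_url,
            "connection.user": user,
            "connection.password": password,
            "table.whitelist": table,
            # Poll for new rows (incrementing id) and changed rows (timestamp).
            "mode": "timestamp+incrementing",
            "timestamp.column.name": "UPDATED_AT",   # assumed column
            "incrementing.column.name": "ID",        # assumed column
            "topic.prefix": topic_prefix,
            "poll.interval.ms": "5000",
        },
    }


# All values below are placeholders, not a real environment.
payload = build_jdbc_source_payload(
    name="oracle-orders-source",
    jdbc_url="jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1",
    user="kafka_connect",
    password="change-me",
    table="ORDERS",
    topic_prefix="oracle-",
)
print(json.dumps(payload, indent=2))
```

With these settings the rows would land in a topic named `oracle-ORDERS`. The Oracle JDBC driver also has to be available to the Connect worker.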

You'll generally find that if you've paid for your database (e.g. Oracle, DB2, etc) you're going to have to pay for a log-based CDC tool. Open source CDC tools are available for open source databases. For example, Debezium is open source and works great with MongoDB, MySQL, and PostgreSQL.

Robin Moffatt
  • Thank you @Robin for the reply. Without Confluent Platform, is it possible to get CDC data from an Oracle table into Kafka through the JDBC source connector in Kafka Connect? – Priyanka Marihal Apr 05 '18 at 05:08
  • Kafka Connect is part of Apache Kafka. The JDBC connector is open source, and available [standalone](https://github.com/confluentinc/kafka-connect-jdbc/), or as part of [Confluent Platform](https://www.confluent.io/download/). – Robin Moffatt Apr 05 '18 at 06:07
1

You might be interested in the Debezium project, which provides open-source CDC connectors for a variety of databases. Amongst others, we provide one for Oracle DB. Note that this connector currently is based on the XStream API of Oracle, which itself requires a separate license, but we hope to add a fully free alternative soon.

Disclaimer: I'm the lead of Debezium

Gunnar
  • Hi Gunnar, any updates regarding the free alternative for the Oracle CDC connector? Thanks for developing debezium. – Gustavo Jan 14 '20 at 21:19
  • Hi Gunnar, I read about the internals of Debezium, and it seems it can also send duplicate messages to Kafka if the connector fails just before updating the offset in the binlog. In that case, the consumer needs to be idempotent. Is there any way to avoid duplicate messages in Kafka? – Rajat Goyal Feb 25 '20 at 15:10
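
On the idempotent-consumer point raised in the comment above, here is a minimal sketch of one way a consumer can tolerate redelivered change events. It does not remove duplicates from Kafka itself; it assumes at-least-once delivery and makes applying the same event twice harmless. The event shape (`key`, `position`, `value`) is a hypothetical simplification and not Debezium's actual message format; `position` stands in for a monotonically increasing source log position.

```python
def apply_events(events, state=None, last_position=None):
    """Apply change events as upserts, skipping already-applied positions."""
    state = {} if state is None else state
    last_position = {} if last_position is None else last_position
    for event in events:
        key, pos = event["key"], event["position"]
        if pos <= last_position.get(key, -1):
            continue  # duplicate or stale event: already applied
        if event["value"] is None:
            state.pop(key, None)  # a None value is treated as a delete
        else:
            state[key] = event["value"]
        last_position[key] = pos
    return state


events = [
    {"key": 1, "position": 10, "value": {"name": "a"}},
    {"key": 1, "position": 11, "value": {"name": "b"}},
    {"key": 1, "position": 11, "value": {"name": "b"}},  # redelivered after a failure
    {"key": 2, "position": 12, "value": {"name": "c"}},
]
state = apply_events(events)
print(state)
```

In a real consumer the `state` and `last_position` maps would live in the target store, updated in the same transaction as the data.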
0

Please refer to the Kafka Connect JDBC source connector. Here is the link: https://docs.confluent.io/current/connect/connect-jdbc/docs/index.html

Liju John
0

You don't need a Change Data Capture (CDC) tool in order to load data from an Oracle table into a Kafka topic.

You can use Confluent's Kafka Connect JDBC Source Connector to load the data.

However, if you need to capture deletes (or updates on tables that have no reliable timestamp or incrementing column), you must use a log-based CDC tool, for which you need to pay for a licence. Confluent has certified the following CDC tools (source connectors):

  1. Attunity
  2. Dbvisit
  3. Striim
  4. Oracle GoldenGate
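
To see why query-based polling falls short for deletes, here is a small self-contained sketch. It uses an in-memory SQLite table as a stand-in for Oracle, and the `customers` table, its `updated_at` column, and the `poll` helper are assumptions for the example. It only mimics the idea of timestamp-based polling and is not the connector's actual implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)", [(1, "alice", 1), (2, "bob", 1)]
)


def poll(conn, since):
    """Return rows changed after `since`, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at, id",
        (since,),
    ).fetchall()
    new_since = max([r[2] for r in rows], default=since)
    return rows, new_since


first, watermark = poll(conn, 0)  # initial load: both rows

# One update (timestamp bumped) and one delete happen on the source.
conn.execute("UPDATE customers SET name = 'alicia', updated_at = 2 WHERE id = 1")
conn.execute("DELETE FROM customers WHERE id = 2")

second, watermark = poll(conn, watermark)
print(second)  # only the updated row; the deleted row leaves no trace
```

The update is visible only because the application bumped `updated_at`; the delete never shows up in any later poll, which is the gap that log-based tools close.
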
Giorgos Myrianthous
0

As others have mentioned, CDC requires paid products. If you'd just like to try something out, Striim is available for free for the first 30 days.

https://www.striim.com/instant-download/

There are 'free' options, which include JDBC, but you would be introducing a significant load on your database if you actually want to use triggers to capture changes.

Disclaimer: I work at Striim

capkutay
0

There's a custom Kafka source connector for Oracle Database, based on LogMiner, here:

https://github.com/erdemcer/kafka-connect-oracle

This project is in development.

ecer
0

You might be interested in OpenLogReplicator. It is an open source GPL-licensed tool written completely in C++. It reads binary format of Oracle Redo logs and sends them to Kafka.

It is very fast - you can achieve low latency without much effort, since it operates fully in memory. It supports all Oracle database versions since 11.2.0.1 and requires no additional licensing.

It can run on the database host, but you can also configure it to read the redo logs over sshfs from another host, with minimal load on the database.

Disclaimer: I am the author of this solution

Adam Leszczyński