I have a need to capture any changed / deleted / added data from a set of related Oracle 11g tables (parent and multiple children) and capture this set of table data in a file. Once the file is created, I can reset the capture window and later (batch mode) repeat the process. These files can be provided to interested parties. Rather than create something homegrown I'd like to utilize what's in Oracle such as CDC or an equivalent to perform this task. However, I'm not sure Oracle CDC or an equivalent feature can capture a related set of tables (i.e. a change happens to a parent and all children data also gets captured, or a change happens to a child and the parent and all other children data is captured), or which of the many features in 11g / 12c (soon upgrading) best satisfy the requirement.

In the future we'd like to publish this information to one or more subscribers, some real-time and some batch, so that I can ensure that all subscribers have received the full set of changed data before it's automatically removed.

Any advice on the above two scenarios would be appreciated. I haven't been able to keep up with all the Oracle advancements as of late. As a note, we also have SSIS and Informatica, which may be able to aid in the solution.

  • Striim will allow you to do change data capture from Oracle and stream changes to your big data infrastructure – capkutay Nov 17 '16 at 11:52

1 Answer


I am not sure whether you have found a solution yet.

I recently evaluated Oracle GoldenGate and some third-party products (Attunity, Striim) that use Oracle LogMiner to capture change data at the source.

Oracle GoldenGate allows you to specify an entire schema or specific tables to capture changes from. In your case, you can list the parent table and all related child tables, and GoldenGate will start capturing their changes.
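As a rough illustration, an extract parameter file can list the parent and child tables explicitly. This is a minimal sketch; the schema, table names, alias, and trail prefix (SCOTT.ORDERS, gg_admin, ./dirdat/lt, etc.) are all placeholders, not names from your environment:

```
EXTRACT ext1
USERIDALIAS gg_admin
EXTTRAIL ./dirdat/lt
-- capture the parent and every related child table
TABLE SCOTT.ORDERS;
TABLE SCOTT.ORDER_LINES;
TABLE SCOTT.ORDER_PAYMENTS;
```

Listing the related tables in one extract keeps their changes in the same trail, which preserves transactional ordering across the parent/child set.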

On the publishing side, they offer several adapters; they recently added a Big Data adapter that includes Kafka support. You can also write a custom adapter for any specific need.
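For the Kafka route, the Big Data adapter is driven by a properties file along the following lines. Treat this as a sketch only: property names follow the GoldenGate for Big Data Kafka handler conventions, but the exact set varies by version, and the topic template and broker address here are assumptions:

```
gg.handlerlist=kafkahandler
gg.handler.kafkahandler.type=kafka
-- emit each change record as JSON, one topic per source table
gg.handler.kafkahandler.format=json
gg.handler.kafkahandler.topicMappingTemplate=${tableName}
gg.handler.kafkahandler.kafkaProducerConfigFile=custom_kafka_producer.properties
```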

The way GoldenGate works, it captures changes from the database and writes them into its own files, called trail files. The process that performs this operation is called an extract. A second process, the replicat (or a pump), reads the trail files and either applies the changes to a target database or publishes them to messaging systems or big data platforms.
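The delivery side can be sketched the same way: a replicat parameter file maps the captured source tables onto the target. Again, the schema and table names are placeholders, not your actual objects:

```
REPLICAT rep1
USERIDALIAS gg_admin_target
-- map each captured source table onto the target schema
MAP SCOTT.ORDERS, TARGET REPORTING.ORDERS;
MAP SCOTT.ORDER_LINES, TARGET REPORTING.ORDER_LINES;
MAP SCOTT.ORDER_PAYMENTS, TARGET REPORTING.ORDER_PAYMENTS;
```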

I have developed a real-time data streaming application for my client using Oracle GoldenGate, Kafka, and Spark Streaming, with Oracle as both source and target.
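On the consuming side, a subscriber mostly needs to normalize the JSON change records the Kafka handler emits. Below is a minimal Python sketch, assuming the handler's default JSON layout (`table`, `op_type` with I/U/D, `op_ts`, and `before`/`after` row images); the field names and the sample data are assumptions, not output captured from a real installation:

```python
import json

def parse_gg_record(raw: str) -> dict:
    """Normalize a GoldenGate-style JSON change record into a flat dict."""
    rec = json.loads(raw)
    op = {"I": "insert", "U": "update", "D": "delete"}[rec["op_type"]]
    # a delete carries only the before-image; otherwise prefer the after-image
    row = rec.get("after") or rec.get("before") or {}
    return {"table": rec["table"], "op": op, "ts": rec["op_ts"], "row": row}

# hypothetical update record, in the shape the Kafka handler's JSON formatter uses
sample = json.dumps({
    "table": "SCOTT.ORDERS",
    "op_type": "U",
    "op_ts": "2016-11-17 11:52:00.000000",
    "before": {"ORDER_ID": 42, "STATUS": "NEW"},
    "after": {"ORDER_ID": 42, "STATUS": "SHIPPED"},
})
change = parse_gg_record(sample)
print(change["op"], change["row"]["STATUS"])  # update SHIPPED
```

In a real subscriber this function would sit inside the Kafka consumer loop (or a Spark Streaming job), with the parsed records grouped by transaction before being handed to downstream consumers.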

The only issue with Oracle GoldenGate is that it is expensive.

Do let me know if this is helpful.

Thanks