I am evaluating Talend and have little experience with the tool. My question is about auditing. Does Talend offer a way to run a job with an audit option that extracts the source data, transforms it, and then compares the transformed data to the existing target data? I would like a report that tells me which records would have triggered an update if the audit option had not prevented it.

I know that ETLs are triggered on change, but I want to run this audit without regard to change timestamps. That is, a query would specify which source records to audit.
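To make the idea concrete, here is a minimal sketch in plain Python (not Talend) of the kind of dry-run audit I have in mind. Every name in it (`transform`, `audit`, the `id` and `name` columns, the sample rows) is hypothetical and only illustrates the extract, transform, compare flow; nothing is written to the target.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


def transform(row: Row) -> Row:
    """Hypothetical transformation: trim and upper-case the name column."""
    return {"id": row["id"], "name": str(row["name"]).strip().upper()}


def audit(source_rows: Iterable[Row],
          target_by_key: Dict[object, Row],
          transform_fn: Callable[[Row], Row],
          key: str = "id") -> List[dict]:
    """Report rows that WOULD be inserted or updated; nothing is written."""
    findings = []
    for src in source_rows:
        new = transform_fn(src)
        old = target_by_key.get(new[key])
        if old is None:
            # No matching target row: a real run would insert this record.
            findings.append({"key": new[key], "action": "insert", "changes": new})
            continue
        # Collect only the columns whose values differ, as (old, new) pairs.
        changes = {col: (old.get(col), val)
                   for col, val in new.items() if old.get(col) != val}
        if changes:
            findings.append({"key": new[key], "action": "update", "changes": changes})
    return findings


# Sample data standing in for the rows selected by the audit query.
source = [
    {"id": 1, "name": " alice "},
    {"id": 2, "name": "bob"},
    {"id": 3, "name": "carol"},
]
# Sample data standing in for the existing target table, indexed by key.
target = {
    1: {"id": 1, "name": "ALICE"},
    2: {"id": 2, "name": "ROBERT"},
}

report = audit(source, target, transform)
for finding in report:
    print(finding)
```

With this sample data, row 1 is unchanged after transformation and is not reported, row 2 is reported as an update, and row 3 is reported as an insert.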

I have searched the Talend documentation, but nothing jumped out at me.

Thank you.

EDIT

Looking outside of Talend, I found where Tim Mitchell wrote:

While I have not yet found an auditing framework with enough customization to suit me, I have learned that it is possible to audit ETL processes in a customized way without reinventing the wheel every time. I keep on hand a common (but never set in stone) set of table definitions and supporting logic to shorten the path to the following auditing objectives:

  • Simple aggregate auditing of row counts, financial data, and other key metrics
  • Documentation templates for source-to-target mappings and transformation logic
  • Lightweight auditing of batch-level changes
  • Full auditing of every change made in the transformation process

So I guess that auditing is on me.
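As a starting point for the first objective in the quoted list (aggregate auditing of row counts and a financial total), something like the following might do. This is only a sketch under stated assumptions: an in-memory SQLite database stands in for both Oracle and Redshift, and the table names `src_orders` and `tgt_orders`, the `amount` column, and the `aggregate_metrics` helper are all hypothetical.

```python
import sqlite3


def aggregate_metrics(conn, table):
    """Return (row count, total amount) for one table."""
    # The table name is interpolated directly; that is acceptable here only
    # because it is a hard-coded constant, never user input.
    cur = conn.execute(
        f"SELECT COUNT(*), COALESCE(SUM(amount), 0) FROM {table}"
    )
    return cur.fetchone()


# In-memory SQLite stands in for the real source and target databases.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders (id INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE tgt_orders (id INTEGER PRIMARY KEY, amount INTEGER);
    INSERT INTO src_orders VALUES (1, 100), (2, 250), (3, 75);
    INSERT INTO tgt_orders VALUES (1, 100), (2, 250);
""")

src = aggregate_metrics(conn, "src_orders")
tgt = aggregate_metrics(conn, "tgt_orders")

# Record each metric that disagrees as a (source, target) pair.
mismatches = {}
for name, s, t in zip(("row_count", "total_amount"), src, tgt):
    if s != t:
        mismatches[name] = (s, t)

print(mismatches)
```

In a real pipeline the two metric queries would run against separate connections, and the mismatches would be written to an audit table or sent as an alert rather than printed.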

MikeJRamsey56
  • As with any other ETL tool, this is something you need to build along with your ETL pipelines. Also, why does this question have `Oracle` and `Redshift` tags? The question has nothing to do with any database. – demircioglu Apr 18 '19 at 22:38
  • Because I am ETL'ing from Oracle to Redshift but OK, I will remove the tags. Build along with the ETL pipeline; OK, I will look in that direction. – MikeJRamsey56 Apr 18 '19 at 22:42
  • You are right to say "it's on me", because every ETL process/DW implementation has its own set of use cases. I have implemented 3-4 different variations of these in the past, ranging from just having audit tables and audit columns in fact/dim tables to running user-customizable queries on the data daily and sending out reports/alerts when anomalies are detected. So in short, you need to build it based on your use cases and your needs. – demircioglu Apr 18 '19 at 23:48
