2

I have an existing HANA warehouse which was built without create/update timestamps. I need to generate a number of nightly batch delta files to send to another platform. My problem is how to detect which records are new or changed so that I can capture those records within the replication process.

Is there a way to use HANA's built-in features to detect new/changed records?

Suncatcher
  • 10,355
  • 10
  • 52
  • 90
dbjstein
  • 31
  • 1
  • 2
  • Can't you use Insert Date or Modified Date fields on original tables? – Eralper Jan 11 '18 at 07:50
  • Hi Eralper, the source tables are mostly SAP tables and many or most of them don't have modified dates or timestamps. Most don't even have the equivalent of a Create Date, though some do (like Orders KNVK).If you mean the HANA tables replicated from the SAP tables, it would have been smart to add an InsertTS and a ModifiedTS to every one of those during the extract process (we use SLT from SAP to HANA.) Since we didn't add those fields at the time, I'm trying to save time and money by using some alternate means to determine: "When was this record on this HANA table created or last modified?" – dbjstein Jan 11 '18 at 18:46

2 Answers2

5

SAP HANA does not provide a general change data capture interface for tables (up to current version HANA 2 SPS 02).
That means, to detect "changed records since a given point in time" some other approach has to be taken.

Depending on the information in the tables different options can be used:

  • if a table explicitly contains a reference to the last change time, this can be used
  • if a table has guaranteed update characteristics (e.g. no in-place update and monotone ID values), this could be used. E.g. read all records where ID is larger than the last processed ID
  • if the table does not provide intrinsic information about change time then one could maintain a copy of the table that contains only the records processed so far. This copy can then be used to compare the current table and compute the difference. SAP HANA's Smart Data Integration (SDI) flowgraphs support this approach.

In my experience, efforts to try "save time and money" on this seemingly simple problem of a delta load usually turn out to be more complex, time-consuming and expensive than using the corresponding features of ETL tools.

Lars Br.
  • 9,949
  • 2
  • 15
  • 29
2

It is possible to create a Log table and organize columns according to your needs so that by creating a trigger on your database tables you can create a log record with timestamp values. Then you can query your log table to determine which records are inserted, updated or deleted from your source tables.

For example, following is from one of my test trigger codes

CREATE TRIGGER "A00077387"."SALARY_A_UPD" AFTER UPDATE ON "A00077387"."SALARY" REFERENCING OLD ROW MYOLDROW,
     NEW ROW MYNEWROW FOR EACH ROW 
begin INSERT 
INTO SalaryLog ( Employee,
     Salary,
     Operation,
     DateTime ) VALUES ( :mynewrow.Employee,
     :mynewrow.Salary,
     'U',
     CURRENT_DATE )
;

end
;

You can create AFTER INSERT and AFTER DELETE triggers as well similar to AFTER UPDATE

You can organize your Log table so that so can track more than one table if you wish just by keeping table name, PK fields and values, operation type, timestamp values, etc. But it is better and easier to use seperate Log tables for each table.

Eralper
  • 6,461
  • 2
  • 21
  • 27
  • Adding a log to a table certainly is an option, but it's important to understand the consequences. By creating triggers, "invisible" dependencies are created, i.e. as long as the DBA/Developer does not look out for them, the won't be aware that these exist. The solution requires primary keys in the source tables and a management of the logging tables - both for data volumes and structural changes. Basically, this approach is what SAP SLT is doing, which should be a testament to how the little details make this approach not **that** easy to build out. – Lars Br. Jan 12 '18 at 06:34
  • Actually what is used for these kind of requirements are the CDC Change Data Capture features http://www.kodyaz.com/articles/change-data-capture.aspx provided by different database vendors. SQL Server provided this tool with SQL Server 2008. I did not experience it on HANA. But https://apps.support.sap.com/sap/support/knowledge/preview/en/2505510 I though it is already included for SDI Smart Data Integration – Eralper Jan 12 '18 at 08:48
  • Lars, Eralper, thanks for your feedback. I think you've answered my core question, which is whether there is anything built-in, such as a system attribute on every HANA record, that could be used as a kind of "backdoor" timestamp. – dbjstein Jan 12 '18 at 19:04
  • 1
    Seems pretty clear that such a backdoor doesn't exist - or is extremely well hidden if you guys don't know about it! I think I will have to bite the bullet and go through the work of enhancing the SLT jobs to add the created/modified timestamps to each raw table, where the timestamps represent the time the HANA record was created/updated (not the actual source record date/time, which in many cases we can't easily know.) That will provide the necessary basis for doing the delta extracts. – dbjstein Jan 12 '18 at 19:05