My goal is to map entries from a large table to a smaller one, based on a query that uses a GROUP BY statement. I want to sync them through Kiba in an incremental way, i.e., without rewriting unchanged entries.

Is Kiba able to identify and run the minimum number of INSERTs, UPDATEs and DELETEs needed to sync the two tables?

Cheers!

1 Answer

Kiba author here! Today Kiba itself does not provide a built-in, generic mechanism for this, because in real life there are many different ways to achieve it depending on both your needs and your actual setup (is everything local or is part of the processing remote, how much data has to be handled, what the stack is, etc.).

That said, this type of scenario is very commonly implemented with Kiba in production today: it's a common need, and people rely on their existing knowledge and the specific capabilities of their datastore to implement it in the way that suits them best, on top of Kiba.

A few pointers that can help today:

  • Subscribe to my blog and I'll make sure to share an example of simple "sync" between two stores in the future, including code etc.
  • An upcoming "Kiba Pro" offering will cover very specific implementations of this (so not necessarily for everyone).
  • The best keyword to search for to find good patterns about this is "Change Data Capture" (make sure to read the Wikipedia page first).
  • Ralph Kimball's book "The Data Warehouse ETL Toolkit", albeit old, contains a lot of interesting insights on related topics.
  • Most databases provide useful commands to merge data and only update what changed (e.g. MERGE, or more recently UPSERT in PostgreSQL 9.5).
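
To make the idea of "only touching what changed" concrete, here is a minimal sketch in plain Ruby. It is not a Kiba feature and not necessarily how a production job would do it: `diff_rows`, the `orders` rows and the `target` hash are all hypothetical names and data invented for illustration, and both tables are assumed to fit in memory. The aggregated source rows are compared with the current content of the smaller table, keyed by the grouping column, to work out which rows to insert, update or delete.

```ruby
# Hypothetical helper (not part of Kiba): compute the changes needed to make
# `target` match `desired`. Both are hashes keyed by the grouping key.
def diff_rows(desired, target)
  # Keys present in the source aggregate but missing from the target.
  inserts = desired.reject { |key, _row| target.key?(key) }
  # Keys present on both sides whose values differ.
  updates = desired.select { |key, row| target.key?(key) && target[key] != row }
  # Keys present in the target but no longer produced by the source.
  deletes = target.keys - desired.keys
  { inserts: inserts, updates: updates, deletes: deletes }
end

# Stand-in for the large table (in practice, rows read from the database).
orders = [
  { customer_id: 1, amount: 10 },
  { customer_id: 1, amount: 5 },
  { customer_id: 2, amount: 7 },
  { customer_id: 3, amount: 2 }
]

# In-memory equivalent of:
#   SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id
desired = orders
  .group_by { |row| row[:customer_id] }
  .transform_values { |rows| { total: rows.sum { |r| r[:amount] } } }

# Stand-in for the current content of the smaller table.
target = {
  1 => { total: 15 }, # already up to date, must be left alone
  2 => { total: 4 },  # stale, must be updated
  4 => { total: 9 }   # no longer in the source, must be deleted
}

changes = diff_rows(desired, target)
```

With this sample data, `changes[:inserts]` contains only customer 3, `changes[:updates]` only customer 2, and `changes[:deletes]` only key 4; customer 1 is not touched. For large tables, the same comparison is usually better pushed down to the database (MERGE/UPSERT) or driven by a change-tracking column rather than done in memory.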

Hope this helps!

Thibaut Barrère