We have a data warehouse on MySQL with dimension tables, fact tables, and some aggregate tables. I want to implement the same structure using Apache Kafka, where the source would be another MySQL server. I want to address the following requirements:
- Add foreign keys to fact tables
- Add aggregate tables (a rough sketch of what I have in mind is after this list)
- Handle schema changes
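For the aggregate-table requirement, this is roughly what I picture doing in plain Python if there is no better way. The topic name `fact_orders`, the columns, and the target table `agg_sales_by_product` are made up for illustration, and I am using the confluent-kafka client plus mysql-connector-python:

```python
import json

import mysql.connector
from confluent_kafka import Consumer

# Hypothetical fact topic produced by the source side.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "agg-builder",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["fact_orders"])

db = mysql.connector.connect(host="dwh-mysql", user="etl", password="secret", database="dwh")
cur = db.cursor()

totals = {}  # running total per product_id, kept in memory for this sketch

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    fact = json.loads(msg.value())
    key = fact["product_id"]
    totals[key] = totals.get(key, 0) + fact["amount"]
    # Upsert the running total into the hypothetical aggregate table.
    cur.execute(
        "INSERT INTO agg_sales_by_product (product_id, total_amount) VALUES (%s, %s) "
        "ON DUPLICATE KEY UPDATE total_amount = %s",
        (key, totals[key], totals[key]),
    )
    db.commit()
```

Is something like this the intended approach, or is there a built-in mechanism I am missing?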
I read somewhere that a dimension/fact-table architecture is hard to build well with Kafka, because a star schema (dimensions and fact tables) has to be loaded in order: dimensions first, then fact tables. If a fact row lands in the fact table before its dimension row exists in the dimension table, the foreign key ends up NULL.
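To make the ordering concern concrete, this is the kind of workaround I imagine with a plain Python consumer: keep the dimension keys seen so far in memory and buffer fact rows whose dimension has not arrived yet. The topic names (`dim_customer`, `fact_orders`) and the columns are invented for illustration:

```python
import json

from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "fact-loader",
    "auto.offset.reset": "earliest",
})
# Hypothetical topics fed by a MySQL source connector.
consumer.subscribe(["dim_customer", "fact_orders"])

dim_keys = {}       # customer_id -> surrogate key seen so far
pending_facts = []  # fact rows waiting for their dimension

def load_fact(fact, surrogate_key):
    """Write the fact row with a resolved FK (warehouse insert omitted in this sketch)."""
    print("loading fact", fact, "with FK", surrogate_key)

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    row = json.loads(msg.value())
    if msg.topic() == "dim_customer":
        dim_keys[row["customer_id"]] = row["customer_sk"]
        # Retry facts that were waiting for this dimension.
        pending_facts, retry = [], pending_facts
        for fact in retry:
            sk = dim_keys.get(fact["customer_id"])
            if sk is None:
                pending_facts.append(fact)
            else:
                load_fact(fact, sk)
    else:  # fact_orders
        sk = dim_keys.get(row["customer_id"])
        if sk is None:
            pending_facts.append(row)  # dimension not seen yet: buffer instead of writing a NULL FK
        else:
            load_fact(row, sk)
```

This feels fragile (the buffer is lost on restart), which is why I am asking whether Kafka Streams or the connectors have a proper way to handle it.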
How can we handle these problems using Kafka Streams (with Python)?
Is it possible to do this using only a source and a sink connector?
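For context on the connector-only question, this is how I understand that route would look: register a Debezium MySQL source connector and a JDBC sink connector through the Kafka Connect REST API. All host names, credentials, database names, and topic names below are placeholders, and I am not sure the sink alone can resolve foreign keys or build aggregates:

```python
import requests

CONNECT_URL = "http://localhost:8083/connectors"  # Kafka Connect REST endpoint (assumed local)

# Debezium MySQL source connector: streams row changes from the source MySQL server.
source = {
    "name": "mysql-source",
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.hostname": "source-mysql",
        "database.port": "3306",
        "database.user": "debezium",
        "database.password": "secret",
        "database.server.id": "184054",
        "topic.prefix": "dwh",
        "database.include.list": "sales",
        "schema.history.internal.kafka.bootstrap.servers": "localhost:9092",
        "schema.history.internal.kafka.topic": "schema-changes.sales",
    },
}

# JDBC sink connector: writes the topics into the warehouse MySQL server.
sink = {
    "name": "mysql-sink",
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        "connection.url": "jdbc:mysql://dwh-mysql:3306/dwh",
        "connection.user": "etl",
        "connection.password": "secret",
        "topics": "dwh.sales.dim_customer,dwh.sales.fact_orders",
        "insert.mode": "upsert",
        "pk.mode": "record_key",
        "auto.create": "true",
        "auto.evolve": "true",  # lets the sink add new columns when the source schema changes
    },
}

for connector in (source, sink):
    requests.post(CONNECT_URL, json=connector).raise_for_status()
```

Would this cover the schema-change requirement, and where would the FK resolution and the aggregate tables fit in?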