I have a requirement to ingest continuous/streaming data (JSON format) from Event Hub into Azure Data Lake. I want to follow the layered approach (raw, clean, prepared) and finally store the data in Delta tables. My question is about the raw layer. Which of the two approaches below would you recommend?
- Event Hub -> raw layer (raw JSON format) -> clean layer (Delta table) -> prepared layer (Delta table)
- Event Hub -> raw layer (Delta table) -> clean layer (Delta table) -> prepared layer (Delta table)
So should I store the raw JSON in the raw layer, or is it suggested to create a Delta table in the raw layer as well?
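To make the second option more concrete, here is a rough sketch of one way it could look: each event's JSON payload is kept unparsed as a string column next to some Event Hub metadata, and parsing happens only when moving to the clean layer. This is only an illustration in plain Python, not Spark or Delta code. The function names and the fields `body`, `enqueued_time`, `offset`, and `ingest_date` are hypothetical, chosen just for this example.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def to_raw_record(body: bytes, enqueued_time: datetime, offset: str) -> dict:
    """Build one raw-layer row: the payload stays an unparsed string."""
    return {
        "body": body.decode("utf-8"),                     # original JSON text, untouched
        "enqueued_time": enqueued_time.isoformat(),       # hypothetical metadata column
        "offset": offset,                                 # hypothetical metadata column
        "ingest_date": enqueued_time.date().isoformat(),  # possible partition column
    }


def to_clean_record(raw: dict) -> Optional[dict]:
    """Parse a raw row for the clean layer; return None if the JSON is malformed."""
    try:
        payload = json.loads(raw["body"])
    except json.JSONDecodeError:
        return None  # the bad event still exists in the raw layer for later inspection
    return {"payload": payload, "enqueued_time": raw["enqueued_time"]}


if __name__ == "__main__":
    ts = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
    raw = to_raw_record(b'{"id": 1}', ts, "42")
    print(raw)
    print(to_clean_record(raw))
```

In this sketch the raw layer does not depend on the JSON schema, so malformed or changed events would still land there and only be rejected at the clean step.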
Regards,