I am using a very simple architecture to copy data from an external source into Azure Data Lake Storage Gen2 and serve it to Power BI via a serverless SQL pool (where I perform some aggregations).
For the initial load I used a Copy Data activity in a Synapse pipeline and stored the data as Parquet files.
Since Parquet files in ADLS Gen2 do not support UPDATE operations, I am looking for best practices to implement the incremental load (watermarking process) without using an additional database to hold the control/watermark table and run a stored procedure that updates the last run date.
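To make the question concrete, the direction I am considering is to keep the watermark itself as a small JSON file in the lake instead of a control table. Roughly something like this (a minimal sketch using the azure-storage-file-datalake SDK; the account URL, container and file path are placeholders, not my real setup):

```python
import json
from datetime import datetime, timezone

from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Placeholders -- not my real account/container/path
ACCOUNT_URL = "https://<storage-account>.dfs.core.windows.net"
CONTAINER = "control"
WATERMARK_PATH = "watermarks/my_source_table.json"


def _watermark_file():
    """File client pointing at the JSON watermark file in ADLS Gen2."""
    service = DataLakeServiceClient(ACCOUNT_URL, credential=DefaultAzureCredential())
    return service.get_file_system_client(CONTAINER).get_file_client(WATERMARK_PATH)


def read_watermark() -> str:
    """Return the last successful load time (ISO 8601); default for the initial load."""
    try:
        raw = _watermark_file().download_file().readall()
        return json.loads(raw)["last_run_utc"]
    except Exception:
        # No watermark file yet -> copy everything
        return "1900-01-01T00:00:00Z"


def write_watermark(ts: str | None = None) -> None:
    """Overwrite the watermark file after a successful incremental copy."""
    ts = ts or datetime.now(timezone.utc).isoformat()
    payload = json.dumps({"last_run_utc": ts}).encode("utf-8")
    _watermark_file().upload_data(payload, overwrite=True)
```

In a Synapse pipeline the equivalent would presumably be a Lookup activity reading that JSON file and a Copy activity overwriting it after a successful run, but I am not sure whether this is considered good practice or whether there is a better built-in option.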
Has anyone bumped into this before? Thanks!
PS: I first checked the documented best practice here: https://learn.microsoft.com/en-us/azure/data-factory/tutorial-incremental-copy-overview