I have spent a while reading about how to load my S3 data into Redshift (the COPY command, Glue, etc.). My pipeline runs almost entirely in NiFi and looks something like this: extract_data -> insert into S3 -> execute a Lambda process that transforms or enriches the data using Athena, in 2 or 3 stages, and writes the result to another S3 bucket (let's call it the processed bucket).
Now I want to continue this pipeline by loading the data from the processed bucket into Redshift. I have already created an empty table for this.
The idea is to append incrementally to some tables and, for others, to delete all the data loaded that day and reload it.
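To make the two load patterns concrete, here is a rough sketch of the SQL I imagine running against Redshift for each case. It only builds the statements as strings and does not connect to anything. The table name, the S3 prefix, the IAM role ARN, the `load_date` column and the Parquet file format are all placeholders I made up for illustration, not details of my actual setup, and the values are assumed to come from trusted configuration since they are interpolated directly into the SQL.

```python
from datetime import date


def build_copy_sql(table: str, s3_prefix: str, iam_role: str) -> str:
    """Build a Redshift COPY statement for files under an S3 prefix.

    Assumption: the processed files are Parquet; adjust FORMAT AS otherwise.
    """
    return (
        f"COPY {table} "
        f"FROM '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' "
        f"FORMAT AS PARQUET;"
    )


def incremental_load_statements(table: str, s3_prefix: str, iam_role: str) -> list[str]:
    """Incremental tables: append the new files with a single COPY."""
    return [build_copy_sql(table, s3_prefix, iam_role)]


def daily_reload_statements(
    table: str,
    s3_prefix: str,
    iam_role: str,
    date_column: str,
    load_date: date,
) -> list[str]:
    """Reload tables: delete that day's rows, then COPY them again.

    Wrapping both steps in one transaction means readers never see the
    table with the day's rows missing.
    """
    return [
        "BEGIN;",
        f"DELETE FROM {table} WHERE {date_column} = '{load_date.isoformat()}';",
        build_copy_sql(table, s3_prefix, iam_role),
        "COMMIT;",
    ]


if __name__ == "__main__":
    # Hypothetical values, only to show what the statements look like.
    role = "arn:aws:iam::123456789012:role/redshift-copy"
    prefix = "s3://processed-bucket/sales/2024-01-15/"
    for stmt in daily_reload_statements("sales", prefix, role, "load_date", date(2024, 1, 15)):
        print(stmt)
```

I assume the statements themselves could then be executed from NiFi or from another Lambda at the end of the pipeline, but that is exactly the part I am unsure about.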
Can anyone give me a hint on where to start? Thank you!