We have a use case for a data pipeline solution that needs the following:
- The ability to have multiple steps (the output of one step should feed in as the input to the next).
- The ability to run multiple algorithms in each step (a SQL query, or possibly a call to a REST endpoint).
The input to the first step can be anything. Our data currently lives in data warehouse (DW) tables, but we can pre-process it and keep the relevant information in AWS S3 or another data store.
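To make the requirements concrete, here is a minimal sketch of the shape we have in mind. It is only an illustration, not an existing tool: `Step`, `run_pipeline`, `sql_algorithm`, and `fake_rest_algorithm` are hypothetical names, the `events` table and its columns are made up, an in-memory SQLite database stands in for the DW, and the REST call is replaced by a local placeholder function.

```python
import sqlite3
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# An "algorithm" takes the previous step's output and returns a result.
Algorithm = Callable[[Any], Any]


@dataclass
class Step:
    name: str
    algorithms: Dict[str, Algorithm]  # several algorithms may run in one step

    def run(self, step_input: Any) -> Dict[str, Any]:
        # Run every algorithm on the same input and collect results by name.
        return {key: algo(step_input) for key, algo in self.algorithms.items()}


def run_pipeline(steps: List[Step], initial_input: Any) -> Any:
    data = initial_input
    for step in steps:
        data = step.run(data)  # the output of one step is the input to the next
    return data


def sql_algorithm(query: str) -> Algorithm:
    # Stand-in for a DW query: loads the input rows into in-memory SQLite.
    def run(rows: List[tuple]) -> List[tuple]:
        conn = sqlite3.connect(":memory:")
        try:
            conn.execute("CREATE TABLE events (user_id INTEGER, amount REAL)")
            conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
            return conn.execute(query).fetchall()
        finally:
            conn.close()
    return run


def fake_rest_algorithm(previous: Dict[str, Any]) -> int:
    # Placeholder for a REST call; a real version would send `previous`
    # to an endpoint and return the response.
    return len(previous["totals"])


steps = [
    Step("aggregate", {
        "totals": sql_algorithm(
            "SELECT user_id, SUM(amount) FROM events "
            "GROUP BY user_id ORDER BY user_id"
        ),
        "count": sql_algorithm("SELECT COUNT(*) FROM events"),
    }),
    Step("score", {"n_users": fake_rest_algorithm}),
]

result = run_pipeline(steps, [(1, 10.0), (1, 5.0), (2, 7.5)])
print(result)
```

Here the first step runs two SQL algorithms on the same input, and the second step consumes the combined output of the first. What we are looking for is a managed equivalent of this pattern.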
Is there an existing solution that already provides this kind of functionality, or that could be modified to support it?
Something that runs in AWS would be preferable, since it would be easier for us to integrate.