I was wondering if there are any AWS Services or projects which allow us to configure a data pipeline using AWS Lambdas in code. I am looking for something like below. Assume there is a library called pipeline
from pipeline import connect, s3, lambda, deploy
p = connect(s3('input-bucket/prefix'),
lambda(myPythonFunc, dependencies=[list_of_dependencies])
s3('output-bucket/prefix'))
deploy(p)
There can be many variations of this idea of course. This use case assumes only one s3 bucket for e.g. There could be a list of input s3 buckets.
Can this be done by AWS Data Pipeline? The documentation I have(quickly) read says that Lambda is used to trigger a pipeline.