Every day we receive one incremental file from each of several sources. Each source places its file under a different S3 prefix, and the files arrive at different times. We want to process both files in one go and generate a report from them. For this I will be using AWS Lambda and AWS Data Pipeline: Lambda is triggered whenever a new file arrives, and it in turn starts the Data Pipeline.
We already do this with a single source. We created an S3 event trigger for the Lambda, so whenever the file arrives the Lambda is invoked and starts the pipeline, the EMR activity runs, and the report is generated at the end.
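For reference, the single-source flow can be sketched roughly as below. This is a minimal illustration rather than our actual code: `PIPELINE_ID`, `uploaded_keys`, and `handler` are placeholder names, and the only AWS calls assumed are the standard S3 event payload shape and boto3's `activate_pipeline`.

```python
from urllib.parse import unquote_plus

PIPELINE_ID = "df-EXAMPLE"  # placeholder, not a real pipeline ID


def uploaded_keys(event):
    """Extract (bucket, key) pairs from an S3 event notification payload.

    Object keys arrive URL-encoded in the event, so they are decoded here.
    """
    return [
        (r["s3"]["bucket"]["name"], unquote_plus(r["s3"]["object"]["key"]))
        for r in event.get("Records", [])
    ]


def handler(event, context):
    # boto3 is imported lazily so uploaded_keys() can be exercised offline.
    import boto3

    for bucket, key in uploaded_keys(event):
        print(f"new object s3://{bucket}/{key}, activating {PIPELINE_ID}")
    # Single-source behaviour: any upload starts the pipeline.
    boto3.client("datapipeline").activate_pipeline(pipelineId=PIPELINE_ID)
```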
Now we have a second source as well, and we want to start the pipeline only once both files have been uploaded.
I am not sure whether AWS Lambda can be triggered on more than one dependency. I know this can be done with Step Functions, and I might go that route if Lambda does not support triggers with multiple dependencies.
In short: trigger the AWS Lambda function only when new files have arrived under both S3 prefixes. Do not trigger it if a file has arrived under only one prefix and not the other.
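One way to approximate this without a multi-dependency trigger is to let every upload invoke the Lambda, and have the Lambda itself check both prefixes before starting the pipeline. The sketch below is only an illustration under assumptions that are not part of our setup: the prefixes `source-a/` and `source-b/`, the bucket and pipeline placeholders, and the idea that each file's key contains the day's date in ISO format are all hypothetical.

```python
from datetime import date

BUCKET = "example-bucket"               # hypothetical bucket name
PIPELINE_ID = "df-EXAMPLE"              # placeholder pipeline ID
PREFIXES = ("source-a/", "source-b/")   # hypothetical prefixes


def both_files_present(list_keys, day):
    """Return True only if every prefix holds a key containing `day`.

    `list_keys(prefix)` is any callable returning the object keys under
    a prefix. Assumes keys embed the date as YYYY-MM-DD.
    """
    stamp = day.isoformat()
    return all(
        any(stamp in key for key in list_keys(prefix))
        for prefix in PREFIXES
    )


def s3_lister(bucket):
    """Build a list_keys callable backed by S3 (first 1000 keys per prefix)."""
    import boto3  # imported lazily so the check above can be tested offline

    s3 = boto3.client("s3")

    def list_keys(prefix):
        resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
        return [obj["Key"] for obj in resp.get("Contents", [])]

    return list_keys


def handler(event, context):
    import boto3

    # Invoked on every upload to either prefix; acts only when both exist.
    if both_files_present(s3_lister(BUCKET), date.today()):
        boto3.client("datapipeline").activate_pipeline(pipelineId=PIPELINE_ID)
```

With this approach the first upload of the day is a no-op and the second one starts the pipeline. If the two files can land at almost the same moment, both invocations may see both files, so some guard against activating the pipeline twice would still be needed.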