
I have a SWF workflow that has two activities.

The first activity (A1) simply calls a service API that in turn launches an application that will (eventually) upload a file in a specific S3 bucket.

The second activity (A2) downloads this file and evaluates the data it contains.

My problem is that I have A2 repeatedly failing and retrying because the S3 file is not there until the file is uploaded by the application.

A1 simply launches an external application and completes immediately after getting an "Application Successfully Launched" response, so having A2 wait on the Promise returned by A1 doesn't make A2 wait until the file is in S3.

My initial solution is to catch the exception that's caused by the file not being there yet and retry within the activity, but that's a bad alternative since the activity will keep running and prevent other workflows running on the same machine from doing useful work.

The ideal solution I think would be to "hibernate" the activity and "wake up" every X minutes to see if the file is there or not in a way that doesn't potentially starve other workflows.

Is this possible?

Pomacanthidae

2 Answers


An alternative would be to separate your two steps:

  1. Call the API service. No need to even use SWF to do this.

  2. Create an Amazon S3 Event that triggers an AWS Lambda function when a file is created in the bucket. The Lambda function can then process the file.

So, instead of continually checking for the existence of the file, use the Event to trigger a Lambda function when it does appear.

Of course, there might be added complexity if you are expecting lots of files, so the Lambda function will need to know what to do with the particular file that appears.
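As a rough sketch of such a Lambda function in Python: the handler below reads the bucket and key from each record of the S3 event notification and skips objects it does not care about. `EXPECTED_PREFIX` and `process_file` are hypothetical placeholders, not part of the original setup; the real filtering rule and evaluation logic would depend on how the application names its uploads.

```python
from urllib.parse import unquote_plus

# Hypothetical key prefix used to ignore unrelated uploads to the same bucket.
EXPECTED_PREFIX = "results/"


def process_file(bucket, key):
    """Placeholder for the evaluation step; here it only reports the location."""
    return f"s3://{bucket}/{key}"


def handler(event, context):
    """Entry point invoked by an S3 object-created event notification."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        if not key.startswith(EXPECTED_PREFIX):
            continue  # not a file this function is responsible for
        processed.append(process_file(bucket, key))
    return processed
```

Filtering on a prefix (or suffix) can also be configured on the bucket notification itself, which would keep unrelated uploads from invoking the function at all.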

John Rotenstein
  • This is how I would do it if I were creating this whole system from scratch, but alas I do not have the time and resources to do it like that. One hurdle this approach might have is the case where multiple workflows are running at the same time, meaning the possibility of other unrelated uploads to the target bucket triggering the Lambda function. Keeping everything in SWF means that each workflow knows exactly what file it's looking for in the bucket. – Pomacanthidae Nov 21 '19 at 20:05
  • Okay. Another alternative is to switch from SWF to AWS Step Functions. It is a meta-layer above AWS Lambda. It includes things like retries and delays, which sound ideal for your situation. If you're already using Lambda, it might be a relatively easy switch. – John Rotenstein Nov 21 '19 at 20:13
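To illustrate the retries and delays mentioned in the last comment, here is a minimal sketch that builds a Step Functions state machine definition (Amazon States Language) as a Python dictionary. It assumes the evaluation task raises an error named `FileNotReady` when the file is missing; that error name, the state name, and the ARN are hypothetical placeholders.

```python
import json


def build_state_machine(task_arn, interval_seconds=300, max_attempts=12):
    """Return a definition that retries the evaluation task on a fixed interval."""
    return {
        "Comment": "Retry the evaluation until the uploaded file is available",
        "StartAt": "EvaluateFile",
        "States": {
            "EvaluateFile": {
                "Type": "Task",
                "Resource": task_arn,
                "Retry": [
                    {
                        # Hypothetical error name raised when the file is absent.
                        "ErrorEquals": ["FileNotReady"],
                        "IntervalSeconds": interval_seconds,
                        "MaxAttempts": max_attempts,
                        "BackoffRate": 1.0,  # constant delay between attempts
                    }
                ],
                "End": True,
            }
        },
    }


# Placeholder ARN; substitute the real function or activity ARN.
definition = json.dumps(
    build_state_machine("arn:aws:lambda:REGION:ACCOUNT_ID:function:evaluate-file"),
    indent=2,
)
```

With these defaults the task would be retried every 5 minutes, up to 12 times, and nothing runs between attempts.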

You can use manual activity completion to release the activity worker thread and implement retries and heartbeating asynchronously.
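A minimal sketch of that idea in Python, assuming a boto3-style SWF client rather than the Flow Framework: the worker that receives A2's task only records the task token and the file it is waiting for, then returns its thread. A separate periodic sweep (for example, run every X minutes) checks each parked task once. `sweep_pending`, `pending`, and `file_exists` are hypothetical names introduced for illustration; `respond_activity_task_completed` and `record_activity_task_heartbeat` are the SWF client operations.

```python
import json


def sweep_pending(pending, file_exists, swf):
    """Check each parked activity task once and return the ones still waiting.

    pending: dict mapping SWF task token -> (bucket, key) the task waits for.
    file_exists: callable(bucket, key) -> bool, e.g. wrapping an S3 HEAD request.
    swf: object exposing the boto3 SWF client methods used below.
    """
    still_waiting = {}
    for token, (bucket, key) in pending.items():
        if file_exists(bucket, key):
            # Complete the activity out of band, using the saved task token.
            swf.respond_activity_task_completed(
                taskToken=token,
                result=json.dumps({"bucket": bucket, "key": key}),
            )
        else:
            # Keep the task alive so its heartbeat timeout does not expire.
            swf.record_activity_task_heartbeat(
                taskToken=token, details="waiting for file"
            )
            still_waiting[token] = (bucket, key)
    return still_waiting
```

Between sweeps no worker thread is held, so the wait does not take capacity from other workflows. The sweep interval would need to be shorter than the activity's heartbeat timeout.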

Another option is to have a separate task list and worker just for that activity, with a much higher limit of open activities. This way it does not take capacity from the worker that executes other activity types.
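The dedicated-worker option could look roughly like the following Python sketch, again assuming a boto3-style SWF client. The worker polls only its own task list and hands each task to a large thread pool. The function name, the `identity` string, `max_open`, and `max_polls` are assumptions made for illustration; `handle_task` stands in for the code that waits for the file and then responds to SWF.

```python
from concurrent.futures import ThreadPoolExecutor


def run_dedicated_worker(swf, domain, task_list, handle_task,
                         max_open=50, max_polls=None):
    """Poll one task list and run its tasks on a pool reserved for them.

    max_polls bounds the loop for testing; None means poll indefinitely.
    """
    polls = 0
    with ThreadPoolExecutor(max_workers=max_open) as pool:
        while max_polls is None or polls < max_polls:
            polls += 1
            task = swf.poll_for_activity_task(
                domain=domain,
                taskList={"name": task_list},
                identity="file-wait-worker",
            )
            # A long poll that times out returns no (or an empty) task token.
            if not task.get("taskToken"):
                continue
            pool.submit(handle_task, task)
```

The workflow would have to schedule A2 on that same task list for this worker to receive it. Note that the pool queues tasks beyond `max_open` instead of refusing them, so a strict cap on open activities would need an extra guard such as a semaphore.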

Maxim Fateev