Situation:

I want to use AWS SWF to coordinate long-running manual activities. When an activity is scheduled in SWF, I copy it to a DB so that the UI can show which tasks are pending. These tasks can take weeks to complete, so they have very long timeouts in SWF.

Problem:

If my application fails to populate the DB (it hangs or dies without reporting an error), nobody sees the task, and a retry can only happen weeks later, when the activity times out. That is obviously unacceptable.

Question:

So I would like to be able to "start" the task with a short timeout (say 30 seconds) and, once the application is sure the activity has started, extend the timeout to weeks. Is there an elegant way to do this with SWF?

(I've read through the docs and several examples and still don't understand the intended way of running manual tasks.)

Monty Joe

1 Answer

Unfortunately, the SWF service doesn't support a "start activity task" API call. The workaround I used was an activity with a short timeout that inserts the record into the DB. When the manual task completes, the workflow is signaled about it. A separate timer is needed to handle the manual task timeout. All this logic can be encapsulated in a separate class for reuse.

An added benefit of using signals is that manual tasks usually have more than one state. For example, the workflow can be signaled when a task is claimed and again when it is released back. Each state can have a different timeout.
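To illustrate the per-state timeouts, here is a minimal plain-Java sketch that is independent of SWF. The class name `ManualTaskTimeouts`, the `PENDING` and `CLAIMED` states, and all method names are hypothetical. In a real workflow each transition would presumably be driven by a signal and each deadline by a workflow timer; here the current time is passed in explicitly to keep the logic deterministic.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

/** Tracks the current state and deadline of each outstanding manual task (illustrative only). */
class ManualTaskTimeouts {

    /** Hypothetical states; the answer only mentions a task being claimed and released. */
    enum State { PENDING, CLAIMED }

    private final Map<State, Long> timeoutSeconds = new EnumMap<>(State.class);
    private final Map<String, State> states = new HashMap<>();
    private final Map<String, Long> deadlines = new HashMap<>();

    ManualTaskTimeouts(long pendingTimeoutSeconds, long claimedTimeoutSeconds) {
        timeoutSeconds.put(State.PENDING, pendingTimeoutSeconds);
        timeoutSeconds.put(State.CLAIMED, claimedTimeoutSeconds);
    }

    /** Registers a new task; it waits in PENDING until somebody claims it. */
    void start(String id, long nowSeconds) {
        if (states.containsKey(id)) {
            throw new IllegalStateException("Task " + id + " is already outstanding");
        }
        enter(id, State.PENDING, nowSeconds);
    }

    /** Called when a "claimed" signal arrives; restarts the clock with the CLAIMED timeout. */
    void claim(String id, long nowSeconds) {
        require(id, State.PENDING);
        enter(id, State.CLAIMED, nowSeconds);
    }

    /** Called when a "released" signal arrives; the task goes back to PENDING. */
    void release(String id, long nowSeconds) {
        require(id, State.CLAIMED);
        enter(id, State.PENDING, nowSeconds);
    }

    /** Called when the completion signal arrives; the task is no longer tracked. */
    void complete(String id) {
        states.remove(id);
        deadlines.remove(id);
    }

    /** Returns the current state, or null if the task is not outstanding. */
    State stateOf(String id) {
        return states.get(id);
    }

    /** True if the task is outstanding and the deadline of its current state has passed. */
    boolean isTimedOut(String id, long nowSeconds) {
        Long deadline = deadlines.get(id);
        return deadline != null && nowSeconds >= deadline;
    }

    private void require(String id, State expected) {
        if (states.get(id) != expected) {
            throw new IllegalStateException("Task " + id + " is not in state " + expected);
        }
    }

    // Entering a state resets the deadline using that state's own timeout.
    private void enter(String id, State state, long nowSeconds) {
        states.put(id, state);
        deadlines.put(id, nowSeconds + timeoutSeconds.get(state));
    }
}
```

With `new ManualTaskTimeouts(60, 600)`, a task started at time 0 times out at 60 seconds unless it is claimed first; claiming it at time 30 moves the deadline to 630.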

[Edit: Added strawman ManualActivityClient example]

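import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeoutException;

import com.amazonaws.services.simpleworkflow.flow.WorkflowClock;
import com.amazonaws.services.simpleworkflow.flow.core.Promise;
import com.amazonaws.services.simpleworkflow.flow.core.Settable;
import com.amazonaws.services.simpleworkflow.flow.core.Task;
import com.amazonaws.services.simpleworkflow.flow.core.TryFinally;

public class ManualActivityClient {

    // Completion promises of manual activities that have been started but not yet signaled.
    private final Map<String, Settable<Void>> outstandingManualActivities = new HashMap<>();

    // Client for the short-timeout activity that inserts the task record into the DB.
    private StartManualActivityClient startActivityClient;
    private WorkflowClock clock;

    // timeout is the manual task timeout in seconds (the unit used by WorkflowClock.createTimer).
    public Promise<Void> invoke(String id, String activityArgs, long timeout) {
        Promise<Void> started = startActivityClient.start(id, activityArgs);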
        Settable<Void> completionPromise = new Settable<>();
        outstandingManualActivities.put(id, completionPromise);
        // TryFinally is used to define cancellation scope for automatic timer cancellation.
        new TryFinally() {
            @Override
            protected void doTry() throws Throwable {
                // Wrap timer invocation in Task(true) to give it daemon flag. Daemon tasks are automatically
                // cancelled when all other tasks in the same scope (defined by doTry) are done.
                new Task(true) {
                    @Override
                    protected void doExecute() throws Throwable {
                        Promise<Void> manualActivityTimeout = clock.createTimer(timeout);
                        new Task(manualActivityTimeout) {
                            @Override
                            protected void doExecute() throws Throwable {
                                throw new TimeoutException("Manual activity " + id + " timed out");
                            }
                        };
                    }
                };
                // This task is used to "wait" for manual task completion. Without it the timer would be
                // immediately cancelled.
                new Task(completionPromise) {
                    @Override
                    protected void doExecute() throws Throwable {
                        // Intentionally empty
                    }
                };
            }

            @Override
            protected void doFinally() throws Throwable {
                // Nothing to clean up; TryFinally is used only for its cancellation scope.
            }
        };
        return completionPromise;
    }

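    public void signalManualActivityCompletion(String id) {
        // Remove the entry so completed activities don't accumulate in the map.
        Settable<Void> completionPromise = outstandingManualActivities.remove(id);
        // Ignore signals for unknown or already completed activities.
        if (completionPromise != null) {
            // Set completionPromise to ready state
            completionPromise.set(null);
        }
    }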
}

And this class can be used as:

@Workflow(...)
public class ManualActivityWorkflow {

    private ManualActivityClient manualActivityClient;

    @Execute(...)
    public void execute() {
        // ...
        Promise<Void> activity1 = manualActivityClient.invoke("activity1", "someArgs1", 300);
        Promise<Void> activity2 = manualActivityClient.invoke("activity2", "someArgs2", 300);

        // ...
    }

    @Signal(...)
    public void signalManualActivityCompletion(String id) {
        manualActivityClient.signalManualActivityCompletion(id);
    }

}

Maxim Fateev
  • Thanks for the great answer! Unfortunately I find this workaround inconvenient for my case as it's going to be hard to coordinate multiple parallel manual processes using timer+signal. – Monty Joe Sep 23 '16 at 15:45
  • It is not that complicated and can be encapsulated in a well defined class. Ping me directly if you need help in implementing it. – Maxim Fateev Sep 23 '16 at 15:53
  • I realized that I need to send timer id with a manual task and coordinate timer cancellation this way, so it is possible to coordinate multiple manual activities. – Monty Joe Sep 24 '16 at 14:19
  • There is no need to send it out. Just keep Map of outstanding tasks as a field of the workflow object. – Maxim Fateev Sep 26 '16 at 19:00