Basically what the title says: I want different workflows to be able to wait on the completion of shared tasks. For example, workflow 1 needs tasks A, B, and C completed, while workflow 2 needs tasks C, D, and E completed, before they move on to do other things. I know activities have unique IDs, so if workflow 2 tries to start "C" while workflow 1 has already started "C", it will get an ACTIVITY_ID_ALREADY_IN_USE error and will know not to start a duplicate copy of the activity. The problem is: how do I notify both workflows once C is complete?

Thanks

  • I don't really understand the question or the assumptions behind it. You can start C twice in parallel in two different workflow executions. This makes sense for specific activities like "compress this file" or "calculate `n!`". You may also want to make sure C only executes once, which makes sense for general activities like "generate daily reports". In this case, however, you wouldn't include C in two workflows. Can you please explain your scenario better? What is the relation between the workflows? – Kobi Aug 02 '15 at 05:45
  • Let's say we have a bunch of different files, like A, B, C, D, etc. Each workflow represents a group of these files, and the workflow needs these files, let's say, decompressed before it can do its computation over the group. So workflow 1 needs to do a reduction over A, B, and C once they're decompressed, and workflow 2 needs to do a different reduction over C, D, and E once they're decompressed. Both workflows need C decompressed before they can proceed to their reduction stage. Hence your statement "you wouldn't include C in two workflows" is false. – user3784712 Aug 03 '15 at 07:16
  • Good point, thanks! So, what is C in your case? If C is "decompress file", you only want to schedule it once (because you only want to decompress the file once). If C is "check if file is decompressed and decompress it if needed", you can schedule it twice, but you need to synchronize the implementation somehow (e.g. a database or a file system lock). You ask "how do I notify both workflows once C is complete?" Who owns C? Who requested it? What triggers the notification? – Kobi Aug 03 '15 at 08:06
  • Also, I'm not sure SWF is suited for a general Publish/Subscribe pattern. I think it usually relies on workflows knowing each other by execution ID. – Kobi Aug 03 '15 at 08:10
  • I think what I'll try to do is have the workflows try to start the activities, and then on ACTIVITY_ID_ALREADY_IN_USE I'll manually poll until the decompressed file exists in the server/cache, or until the activity_id is no longer in use and the activity can be restarted (due to a failure or whatever). – user3784712 Aug 03 '15 at 18:08
  • I was able to schedule two activities of the same type with the same activityId. That's not supposed to be possible, right? – user3784712 Aug 28 '15 at 23:45
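
The polling fallback described in the last comments could look roughly like the sketch below. It only covers the "wait until the decompressed file exists" part and does not call SWF at all; `wait_for_output` and its parameters are hypothetical names chosen for illustration, and it assumes the writer makes the output file appear atomically (for example by renaming a temporary file), so that existence implies completeness.

```python
import os
import time


def wait_for_output(path, timeout_s=60.0, interval_s=1.0):
    """Poll until `path` exists.

    Returns True if the file appeared before the timeout, False otherwise.
    A False result is the caller's cue to check whether the activity failed
    and needs to be rescheduled.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A workflow that gets ACTIVITY_ID_ALREADY_IN_USE could call something like `wait_for_output("/cache/C", timeout_s=300)` (the path is an assumption) and treat a timeout as a signal to retry scheduling C.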

1 Answer

Which SDK are you using to implement the workflow? If you use the Flow Framework from the Java SDK, the framework abstracts away the logic of creating activity IDs. Both workflow 1 and workflow 2 can schedule activity C, and whenever the activity C worker completes the respective task, the corresponding workflow is informed. I think the Flow Framework is available for Ruby as well.
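
If both workflows schedule C independently, the activity implementation itself has to tolerate being run twice for the same file. A minimal sketch of that idea, using the file system lock suggested in the comments, might look like this. The function name `decompress_once`, the `.lock`/`.tmp` suffixes, and the choice of gzip are all assumptions for illustration; none of this is SWF or Flow Framework API.

```python
import gzip
import os
import shutil


def decompress_once(src_gz, dest):
    """Decompress `src_gz` to `dest` unless `dest` already exists.

    Returns "done" if this call did the work, "exists" if the output was
    already there, and "busy" if another caller currently holds the lock.
    """
    if os.path.exists(dest):
        return "exists"
    lock = dest + ".lock"
    try:
        # O_CREAT | O_EXCL fails if the lock file already exists,
        # so only one caller can acquire it.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return "busy"
    try:
        os.close(fd)
        # Re-check: another caller may have finished just before we locked.
        if os.path.exists(dest):
            return "exists"
        tmp = dest + ".tmp"
        with gzip.open(src_gz, "rb") as fin, open(tmp, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        # Rename into place so readers never observe a partial file.
        os.replace(tmp, dest)
        return "done"
    finally:
        os.remove(lock)
```

With this shape, each workflow's copy of C can simply retry on "busy" and proceed on "done" or "exists". The sketch does not handle a stale lock left behind by a crashed worker, which a real deployment would need to address (for example with a lock timeout).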

Rohit