0

I have a list of items whose length is not known in advance. I would like to use a loop to perform some fairly heavy processing on each item, and I am wondering whether this is possible with a state machine. More specifically, can I have one step of the machine perform the loop and start an instance of the next step per item? If not, what would you suggest? The goal is to keep this as serverless as possible.

P.S. I know that I can iterate with step functions, but the state machine will time out after a few minutes, and I am afraid that wouldn't be enough time to process all the items, if the processing is done serially.

na1368

2 Answers

1

By "serverless", I hope you mean not having to start and terminate instances manually every day (or periodically).

Option 1: Put the items in an Amazon SQS queue and launch an instance that processes the items one by one and terminates once they are all done. Both adding the items to SQS and launching the instance can be automated, depending on how and where the item list originates.
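A minimal sketch of the queue side of option 1, in Python. It assumes a boto3-style SQS client is passed in as `sqs` (for example `boto3.client("sqs")`); the function names, the JSON encoding of items, and the "stop after one empty long poll" rule are illustrative assumptions, not something prescribed above.

```python
import json


def enqueue_items(sqs, queue_url, items):
    """Send items to the queue in batches of 10 (the SQS per-call limit)."""
    sent = 0
    for start in range(0, len(items), 10):
        chunk = items[start:start + 10]
        entries = [
            {"Id": str(start + i), "MessageBody": json.dumps(item)}
            for i, item in enumerate(chunk)
        ]
        response = sqs.send_message_batch(QueueUrl=queue_url, Entries=entries)
        sent += len(response.get("Successful", []))
    return sent


def drain_queue(sqs, queue_url, process_item):
    """Process messages until a long poll returns nothing (a simplification)."""
    handled = 0
    while True:
        response = sqs.receive_message(
            QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
        )
        messages = response.get("Messages", [])
        if not messages:
            return handled
        for message in messages:
            process_item(json.loads(message["Body"]))
            # Delete only after processing succeeds, so failed items are retried.
            sqs.delete_message(
                QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
            )
            handled += 1
```

The worker instance would call `drain_queue` on startup and shut itself down once it returns. A single empty poll does not strictly prove the queue is empty, so a real worker may want to retry a few times before terminating.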

Option 2: Use AWS Batch, which can process all the items in parallel if that works for your use case.
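One way option 2 could look is an AWS Batch array job, which starts one child job per item. The sketch below assumes a boto3-style Batch client is passed in as `batch` (for example `boto3.client("batch")`) and that a job queue and job definition already exist; the function names and the default job name are hypothetical.

```python
import os


def submit_array_job(batch, job_queue, job_definition, item_count,
                     job_name="process-items"):
    """Submit one array job with one child job per item."""
    # AWS Batch only accepts array sizes from 2 to 10,000.
    if not 2 <= item_count <= 10000:
        raise ValueError("AWS Batch array size must be between 2 and 10,000")
    response = batch.submit_job(
        jobName=job_name,
        jobQueue=job_queue,
        jobDefinition=job_definition,
        arrayProperties={"size": item_count},
    )
    return response["jobId"]


def item_for_this_child(items):
    """Inside a child job: pick the item that matches this child's index."""
    # Batch sets this environment variable in every child of an array job.
    index = int(os.environ["AWS_BATCH_JOB_ARRAY_INDEX"])
    return items[index]
```

Each child job would load the shared item list (from S3, for instance), call `item_for_this_child`, and process only that item.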

Additional components would be needed depending on the trigger point. For example, if you want to fetch the item list from a URL every 6 hours (like a feed), add a CloudWatch scheduled event that triggers a Lambda function, which downloads the items and hands them to option 1 or 2.
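The scheduled Lambda function described here might be sketched as follows. It assumes the feed returns a JSON array, and `make_handler`, `fetch_items`, and the `dispatch` callback are hypothetical names; `dispatch` stands in for whichever of the two options is used (enqueueing to SQS or submitting a Batch job).

```python
import json
import urllib.request


def fetch_items(url, opener=urllib.request.urlopen):
    """Download the feed and parse it as a JSON array of items."""
    with opener(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def make_handler(feed_url, dispatch, fetch=fetch_items):
    """Build a Lambda handler that fetches the items and hands them off."""
    def handler(event, context):
        items = fetch(feed_url)
        # The heavy processing happens elsewhere; this function only dispatches.
        dispatch(items)
        return {"dispatched": len(items)}
    return handler
```

The schedule itself would be a rule with the expression `rate(6 hours)` targeting this function. Because the handler only downloads and dispatches, it should finish well within the Lambda time limit even for long lists.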

dy10
  • Thanks for your help, dy10 :) In option 1, do you mean an EC2 instance by "instance", or a Lambda instance as stated in the question? I am trying to avoid dedicated servers; however, if I can programmatically create an EC2 instance and terminate it when I am done, I guess it should be fine. – na1368 May 23 '17 at 17:14
  • I meant an EC2 instance, because Lambda is capped at 5 minutes. If your items can be processed in parallel and each one takes less than 5 minutes, then you can use Lambda too. But do note that you can automate EC2 instance launch and shutdown, so in practice it shouldn't matter how your code is running. – dy10 May 24 '17 at 04:58
-2

Lambda will help only if the items can be processed in parallel. Also, an AWS Lambda invocation runs for at most 5 minutes, so keep that in mind!

For more, visit here.

Let me know if you have other queries.

tom