I am using AWS Batch to execute jobs, and I calculate the initial memory for each job from its content size. About 90% of the time the job succeeds, but about 10% of the time it fails with an OutOfMemory error.
For the next attempt of these failed jobs, I would like to increase the memory and submit the job again. I cannot use AWS Batch job attempts for this, so I need a different failover strategy.
One approach I can think of is a Lambda function that checks the job status every hour and, if a job has failed, submits it again with additional memory.
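A minimal sketch of that polling Lambda, to make the idea concrete. It assumes the job queue name arrives in the event as `jobQueue`, that an out-of-memory failure can be recognised by the text "OutOfMemory" in the container's `reason` field, and that memory grows by a factor of 1.5 up to a cap. The helper names, the growth factor, and the cap are all hypothetical choices, not something AWS Batch prescribes.

```python
import math


def next_memory_mib(current_mib, factor=1.5, cap_mib=30720):
    """Memory for the retry: grown by `factor`, never above `cap_mib` (assumed cap)."""
    return min(cap_mib, math.ceil(current_mib * factor))


def is_oom(job):
    """Assumption: OOM failures mention 'OutOfMemory' in the container reason."""
    reason = job.get("container", {}).get("reason") or ""
    return "OutOfMemory" in reason


def current_memory_mib(job):
    """Read the memory the failed job ran with, from a describe_jobs entry."""
    container = job["container"]
    for req in container.get("resourceRequirements", []):
        if req["type"] == "MEMORY":
            return int(req["value"])
    return int(container["memory"])  # older field, used if no MEMORY requirement


def build_retry_request(job, factor=1.5):
    """Build submit_job arguments that rerun `job` with more memory.

    Only the memory override is set; parameters and environment overrides
    from the original submission are not copied in this sketch.
    """
    memory = next_memory_mib(current_memory_mib(job), factor)
    return {
        "jobName": job["jobName"] + "-retry",
        "jobQueue": job["jobQueue"],
        "jobDefinition": job["jobDefinition"],
        "containerOverrides": {
            "resourceRequirements": [{"type": "MEMORY", "value": str(memory)}]
        },
    }


def resubmit_oom_jobs(batch, job_queue):
    """Resubmit FAILED jobs in `job_queue` that look like OOM failures."""
    summaries = batch.list_jobs(jobQueue=job_queue, jobStatus="FAILED")
    job_ids = [s["jobId"] for s in summaries["jobSummaryList"]][:100]
    if not job_ids:
        return []
    jobs = batch.describe_jobs(jobs=job_ids)["jobs"]
    return [
        batch.submit_job(**build_retry_request(job))["jobId"]
        for job in jobs
        if is_oom(job)
    ]


def lambda_handler(event, context):
    import boto3  # imported here so the helpers above can be tested without AWS

    return resubmit_oom_jobs(boto3.client("batch"), event["jobQueue"])
```

As written, this does not remember which failed jobs it has already resubmitted and reads only the first page of `list_jobs` results, so a real version would need to track retried job IDs (for example in a tag or a small table) to avoid resubmitting the same job on every run.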
Are there any better ways to implement a failover strategy for AWS Batch jobs?