I have a state machine consisting of a first pre-process task that generates an array as output, which a subsequent map state loops over. The output array of the first task has grown too big, and the state machine throws the error States.DataLimitExceeded: "The state/task 'arn:aws:lambda:XYZ' returned a result with a size exceeding the maximum number of characters service limit."
Here is an example of the state machine YAML:
stateMachines:
  myStateMachine:
    name: "myStateMachine"
    definition:
      StartAt: preProcess
      States:
        preProcess:
          Type: Task
          Resource:
            Fn::GetAtt: [preProcessLambda, Arn]
          Next: mapState
          ResultPath: "$.preProcessOutput"
        mapState:
          Type: Map
          ItemsPath: "$.preProcessOutput.data"
          MaxConcurrency: 100
          Iterator:
            StartAt: doMap
            States:
              doMap:
                Type: Task
                Resource:
                  Fn::GetAtt: [doMapLambda, Arn]
                End: true
          Next: ### next steps, not relevant
A possible solution I came up with would be for the preProcess state to save its output to an S3 bucket and for mapState to read directly from it. Is this possible? At the moment preProcess writes its output to ResultPath: "$.preProcessOutput", and mapState takes the array at ItemsPath: "$.preProcessOutput.data" as its input. How would I need to adapt the YAML so that the map state reads directly from S3?
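To illustrate what I have in mind, preProcess could upload the array to S3 and return only a small reference instead of the data itself. A rough sketch of the Lambda handler, where the bucket name and the generated data are placeholders for my actual pre-processing:

    import json
    import boto3

    s3 = boto3.client("s3")

    def handler(event, context):
        # Stand-in for the real pre-processing that builds the large array
        data = [{"id": i} for i in range(100000)]

        # Upload the full array to S3 instead of returning it in the state output
        bucket = "my-preprocess-bucket"  # placeholder bucket name
        key = f"preprocess/{context.aws_request_id}.json"
        s3.put_object(Bucket=bucket, Key=key, Body=json.dumps(data))

        # Return only a small pointer, which stays well below the payload limit
        return {"bucket": bucket, "key": key}

What I am unsure about is the consuming side: whether mapState can point ItemsPath at that S3 object somehow, or whether each doMap iteration would have to download and slice the file itself.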