I have a nested JSON source in gzip format. In my Synapse pipeline I am using a Data Flow activity, and in the source dataset I have set the compression type to gzip. The pipeline executed fine for small files under 10 MB, but I ran into a problem with a larger gzip file of about 89 MB.
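For reference, the source dataset is defined roughly as follows. This is a sketch assuming ADLS Gen2 storage; the linked service name, container, folder, and file name are placeholders, not my real values:

```json
{
    "name": "ds_nested_json_gz",
    "properties": {
        "type": "Json",
        "linkedServiceName": {
            "referenceName": "<ADLS Gen2 linked service>",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobFSLocation",
                "fileSystem": "<container>",
                "folderPath": "<folder>",
                "fileName": "<file>.json.gz"
            },
            "compression": {
                "type": "gzip"
            }
        }
    }
}
```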
The Data Flow activity failed with the error below:
Error1 {"message":"Job failed due to reason: Cluster ran into out of memory issue during execution,
please retry using an integration runtime with bigger core count and/or memory optimized compute type.
Details:null","failureType":"UserError","target":"df_flatten_inf_provider_references_gz","errorCode":"DF-Executor-OutOfMemoryError"}
Requesting your help and guidance.
To resolve Error1, I tried an Azure integration runtime with a bigger core count (128 + 16 cores) and the memory optimized compute type, but I still get the same error. I also thought it might be too intensive to read the JSON directly from gzip, so I tried a basic Copy Data activity to decompress the gzip file first, but it is still failing with the same error.
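For context, the decompression attempt was roughly the Copy activity below. The dataset names are the placeholder ones from the sketch above; the sink dataset is the same Json dataset definition but without the compression block, so the output should be written uncompressed:

```json
{
    "name": "CopyDecompressGz",
    "type": "Copy",
    "inputs": [
        { "referenceName": "ds_nested_json_gz", "type": "DatasetReference" }
    ],
    "outputs": [
        { "referenceName": "ds_nested_json_decompressed", "type": "DatasetReference" }
    ],
    "typeProperties": {
        "source": {
            "type": "JsonSource",
            "storeSettings": { "type": "AzureBlobFSReadSettings" },
            "formatSettings": { "type": "JsonReadSettings" }
        },
        "sink": {
            "type": "JsonSink",
            "storeSettings": { "type": "AzureBlobFSWriteSettings" },
            "formatSettings": { "type": "JsonWriteSettings" }
        }
    }
}
```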