I have an S3 bucket with a lot of small files: over 100K objects that add up to about 700 GB. When I load the objects from a data bag and then call persist, the client always runs out of memory, consuming gigabytes very quickly.
Limiting the scope to a few hundred objects allows the job to run, but the client still uses a lot of memory.
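For a sense of scale, here is a rough back-of-the-envelope calculation using the rounded figures above. It assumes decimal units and exactly 100K objects (the real count is higher, so the average is an upper bound), and `SUBSET = 300` is only a hypothetical stand-in for "a few hundred objects".

```python
# Back-of-the-envelope sizes; all inputs are rounded figures from the post.
TOTAL_BYTES = 700 * 10**9   # about 700 GB (decimal units assumed)
NUM_OBJECTS = 100_000       # "over 100K", so this is a lower bound on the count
SUBSET = 300                # hypothetical stand-in for "a few hundred objects"

# Average payload per object, and the raw payload of the small subset.
avg_object_bytes = TOTAL_BYTES / NUM_OBJECTS
subset_bytes = SUBSET * avg_object_bytes

print(f"average object size: {avg_object_bytes / 10**6:.1f} MB")
print(f"{SUBSET} objects: {subset_bytes / 10**9:.1f} GB")
```

Under these assumptions the objects average at most about 7 MB each, so the raw data for a few hundred of them is on the order of a couple of gigabytes.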
Shouldn't the client be tracking only the futures? How much memory do futures take?