I have created a Lambda function that iterates over all the files in a given S3 bucket and deletes them. The bucket has around 100K files, and I am selecting and deleting around 60K of them. I have set the Lambda timeout to the maximum value (15 minutes). The Lambda consistently returns a "network error" after a few minutes, though it seems to keep running in the background for some time even after the error is returned. How can I get around this?
3 Answers
S3 has rate limiting, which restricts the number of reads and writes you can perform per second.
Amazon S3 automatically scales to high request rates. For example, your application can achieve at least 3,500 PUT/POST/DELETE and 5,500 GET requests per second per prefix in a bucket. There are no limits to the number of prefixes in a bucket. It is simple to increase your read or write performance exponentially. For example, if you create 10 prefixes in an Amazon S3 bucket to parallelize reads, you could scale your read performance to 55,000 read requests per second.
If all of those objects have the same 8 characters at the start of their key (file path), then they are on the same prefix and limited to 3,500 DELETEs and 5,500 GETs per second. If this is the case and you need to do this regularly, think about changing the key naming so the first 8 characters differ, forcing the objects to be spread across more nodes. One of my previous answers goes into more detail on that.
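As a sketch of that idea, you can prepend a short, deterministic hash to each key so objects fan out across many prefixes. The `spread_key` helper below is hypothetical (not an S3 API), just to illustrate the key-naming scheme:

```python
import hashlib


def spread_key(key: str, width: int = 2) -> str:
    """Prepend a short hex digest so keys land on many different prefixes."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return f"{digest[:width]}/{key}"


# Objects that previously shared one prefix now fan out across
# up to 256 prefixes (two hex characters), e.g.:
#   "logs/2018/file1.txt" -> "<xx>/logs/2018/file1.txt"
```

The same object is always mapped to the same prefixed key, so reads only need to recompute the hash.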
Alternatively, you can use the bulk delete (DeleteObjects) operation, which deletes up to 1,000 objects per request.
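A minimal boto3 sketch of that approach, assuming you have already collected the list of keys to remove (the bucket and key names here are placeholders):

```python
def chunked(seq, size=1000):
    """Yield successive batches; DeleteObjects accepts at most 1,000 keys per call."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]


def bulk_delete(bucket: str, keys: list) -> None:
    """Delete keys in batches of up to 1,000 using the DeleteObjects API."""
    import boto3  # imported here so the batching helper is usable without boto3

    s3 = boto3.client("s3")
    for batch in chunked(keys):
        s3.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": k} for k in batch], "Quiet": True},
        )
```

For 60K objects this turns ~60,000 individual DeleteObject calls into ~60 DeleteObjects calls, which should fit comfortably within the 15-minute Lambda limit.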
The delay you see is probably due to eventual consistency when S3 is synchronising across the AZs in the region.
Amazon S3 offers eventual consistency for overwrite PUTS and DELETES in all regions.
Thanks @Matt. This contains about 100 prefixes and each prefix has around 600 - 1500 files. I'll try the bulk delete option. – Punter Vicky Nov 06 '18 at 22:21
I was testing another function and this error came up as a result. Reading the documentation, I found that I had activated the throttle option, which limits the invocation rate for your function.
The solution is to create another function and check whether the throttle is causing that error.
Make sure that you have the right VPC, subnets, and security group configured on the Lambda function itself.