
We are finding that using AWS storage (EFS, or EBS with gp2 or gp3 volumes) from an EC2 instance is very slow when doing simultaneous reads. Here's an example:

I'm reading 30 binary files into memory, totaling 46 MB.

Doing this once takes about 16 ms. However, if I spawn 8 parallel processes on the same EC2 instance, each reading different sets of 30 binary files, each one takes an average of 105 ms (556% slower than a single process). It's almost like the 8 reads are happening serially instead of in parallel (though not quite). Note: There is no writing happening to these files at the time.
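For reference, the test is roughly equivalent to the sketch below (Python with multiprocessing is used here only for illustration; the `/data/setN` paths are placeholders, not my actual layout):

```python
import glob
import time
from multiprocessing import Pool


def read_set(file_set):
    """Read every file in file_set fully into memory; return elapsed seconds."""
    start = time.perf_counter()
    for path in file_set:
        with open(path, "rb") as f:
            f.read()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical layout: one directory of 30 binary files per process.
    file_sets = [sorted(glob.glob(f"/data/set{i}/*.bin")) for i in range(8)]

    # Baseline: a single process reading one set.
    print("single:", read_set(file_sets[0]))

    # Eight processes, each reading a different set at the same time.
    with Pool(processes=8) as pool:
        print("parallel:", pool.map(read_set, file_sets))
```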

If I repeat the same test on my laptop, using local file storage, the same 8 simultaneous reads of the same files are each only about 70% slower than a single read.

Why is the performance hit of simultaneous reads so much greater using AWS storage?

Is there anything I can configure about the volume that would reduce that performance penalty?

Update: This does not seem to be dependent on reading the same files. I get the same performance whether each process is reading the same 30 files or 30 different files. Title and details updated to account for this.

JoeMjr2
  • Interesting question. I'm not sure of the answer, but I wonder if you could look into disk caching: do the first read, and the subsequent reads should come from the RAM cache and be near instant. I also wonder if it's due to the disk being across the network. 105 ms still seems fairly quick; is it being this slow actually causing a problem? – Tim Jan 20 '23 at 18:46
  • @Tim This is not the actual use case. I just simplified it to demonstrate the issue. The actual use case is more involved, and getting the actual data needed out of the 8 files takes about 360 ms one at a time, and an average of 2.5 seconds each when 8 are done at once. This is indeed a problem at scale. The issue with caching is that (in this example) the file set totals 46 MB, and there may be many such sets of files needed at a time, which would be a lot to cache in memory, so keeping them only on disk is ideal. – JoeMjr2 Jan 20 '23 at 19:20
  • Maybe you could work around it somehow - one thread starts, downloads the files, then makes them available locally. Hopefully someone can help answer your question. – Tim Jan 20 '23 at 20:26
  • 1
    Have you tested with a larger EC2 instance type, striping the data across multiple EBS volumes or a larger EBS volume? EBS performance is a function of network capacity and I'm guessing EFS as well. If the files are large enough you might be hitting the limit of the EC2 instance or a single EBS volume. As an example, we were able to increase DB performance by creating a RAID 0 array across two EBS volumes coupled with the larger instance we used for the DB. For smaller instances we did not see the same gains. – Tim P Jan 20 '23 at 20:32

1 Answer


It turns out that this performance hit was due to a CPU bottleneck on the client. I was trying to read the files with 8 simultaneous processes, but the Docker container I was running them in was limited to only 2 cores. When I upped this to at least 8 cores, performance improved considerably.
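If anyone else hits this, a quick sanity check is to look at what CPU budget the process actually sees inside the container. This is only a rough sketch (Linux-only; the cgroup file paths differ between cgroup v1 and v2, so adjust for your host):

```python
import os

# CPUs the kernel exposes vs. CPUs this process is allowed to run on.
print("os.cpu_count():", os.cpu_count())
print("sched_getaffinity:", len(os.sched_getaffinity(0)))  # Linux-only

# A limit set with `--cpus` is a CFS quota and will NOT show up in the two
# values above; `--cpuset-cpus`, by contrast, would shrink the affinity count.
try:
    # cgroup v2: "max 100000" means unlimited; "200000 100000" means 2 CPUs.
    with open("/sys/fs/cgroup/cpu.max") as f:
        print("cpu.max:", f.read().strip())
except FileNotFoundError:
    try:
        # cgroup v1 fallback: quota / period gives the effective CPU limit.
        with open("/sys/fs/cgroup/cpu/cpu.cfs_quota_us") as fq, \
                open("/sys/fs/cgroup/cpu/cpu.cfs_period_us") as fp:
            print("cfs quota/period:", fq.read().strip(), "/", fp.read().strip())
    except FileNotFoundError:
        print("no CPU quota files found")
```

The fix was simply to start the container with a higher CPU limit (for example `docker run --cpus=8 ...`, depending on how the limit was originally applied).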

JoeMjr2