I have a Postgres data-warehouse-type machine that does continuous 24/7 work. It goes through periods of very problematic performance whose cause I've had the hardest time pinning down. Here are some bullet points:
- t3.2xlarge
- gp3, 1.6 TB, 8,000 IOPS + 200 MB/s throughput (though I've tried various IOPS/throughput levels, and io1 as well)
- Ubuntu 18.04
Without going into database specifics, which could be a whole other conversation, assume that everything else is configured about as well as one could reasonably expect.
The main issues are:
The "Average Queue Length" is always in the 3-4 range. My impression is that this is high, though I've had a hard time finding a concrete answer for that - what I've seen is mostly relative advice (lower is better - obviously).
Though not always, many times and for extended periods the total IOPS throughput is flatlined at exactly 4000 - despite my provisioning always being well above that. This is very specific and suspicious. Meanwhile, my combined throughput is << my provisioned throughput.
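For reference, this is roughly how I pull the raw numbers behind those two observations from the per-volume CloudWatch metrics (the volume ID and region below are placeholders):

```python
from datetime import datetime, timedelta, timezone
import boto3

VOLUME_ID = "vol-0123456789abcdef0"                       # placeholder: the gp3 data volume
cw = boto3.client("cloudwatch", region_name="us-east-1")  # placeholder region

end = datetime.now(timezone.utc)
start = end - timedelta(hours=6)

def volume_metric(name, stat):
    """Fetch one AWS/EBS metric for the volume in 5-minute buckets."""
    resp = cw.get_metric_statistics(
        Namespace="AWS/EBS",
        MetricName=name,
        Dimensions=[{"Name": "VolumeId", "Value": VOLUME_ID}],
        StartTime=start,
        EndTime=end,
        Period=300,
        Statistics=[stat],
    )
    return sorted(resp["Datapoints"], key=lambda d: d["Timestamp"])

reads = volume_metric("VolumeReadOps", "Sum")
writes = volume_metric("VolumeWriteOps", "Sum")
queue = volume_metric("VolumeQueueLength", "Average")

# IOPS = completed operations in the bucket / seconds per bucket.
for r, w, q in zip(reads, writes, queue):
    iops = (r["Sum"] + w["Sum"]) / 300.0
    print(f"{r['Timestamp']:%H:%M}  iops={iops:8.0f}  queue_len={q['Average']:.2f}")
```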
Combining 1 & 2 together, it is very clear that something is constraining the disk. I am unsure where to start (over again) to clearly answer that, despite trying many things already. Could it be a ubuntu setting? Could it be a network card thing? How do I know if AWS isn't just cheating me?