Capacity planning on AWS

Question

I need to understand how to do capacity planning for AWS and what kind of infrastructure components to use. Take the example below.

I need to set up a Node.js-based server which uses Kafka, Redis and MongoDB. There will be 250 devices connecting to the server and sending data every 10 seconds. Each data packet will be approximately 10 KB. I will be using the 64-bit Ubuntu image.

What I need to estimate:

MongoDB requires at least 3 servers for redundancy. How do I estimate the size of the VM and EBS volume required, e.g. should it be m4.large, m4.xlarge or something else? The default EBS volume size is 30 GB.
What size of VM should run the other application components, which include 3-4 Node.js processes, Kafka and Redis, e.g. m4.large, m4.xlarge or something else?
Can I keep just one application server in an Auto Scaling group and add more as the load increases, or should I start with a minimum of 2?

In general, given the number of devices, the data packet size and the data frequency, how do we go about estimating which VM to choose, how much storage to provision, and what else to consider?

score 1 · Answer 1 · answered Sep 04 '18 at 07:52

Nobody can answer this question for you. It all depends on your application and usage patterns.

The only way to correctly answer this question is to deploy some infrastructure and simulate standard usage while measuring the performance of the systems (throughput, latency, disk access, memory, CPU load, etc).

Then, modify the infrastructure (add/remove instances, change instance types, etc) and measure again.

You should certainly run a minimal deployment per your requirements (e.g. instances in separate Availability Zones for High Availability) and you can use Auto Scaling to add extra capacity when required, but simulated testing would also be required to determine the right trigger points at which more capacity should be added. For example, the best indicator might be memory, or CPU, or latency. It all depends on the application and how it behaves under load.
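
The raw ingest figures can at least be derived from the numbers in the question, which gives a baseline load to simulate in the first test deployment. The following is a rough sketch only, not a sizing recommendation. It assumes 1 KB = 1000 bytes, counts payload only (no protocol, index or journal overhead), assumes no compression or data expiry, and assumes each of the 3 MongoDB replica set members stores a full copy of the data. The function and key names are made up for illustration.

```python
# Back-of-envelope ingest estimate from the figures in the question.
# Assumptions (not from the original post): 1 KB = 1000 bytes, payload only,
# no compression or expiry, every replica set member holds a full copy.

DEVICES = 250          # devices sending data
PACKET_KB = 10         # approximate size of one packet, in KB
INTERVAL_S = 10        # each device sends once per interval
REPLICA_MEMBERS = 3    # MongoDB servers from the question
SECONDS_PER_DAY = 86_400


def ingest_estimate(devices=DEVICES, packet_kb=PACKET_KB, interval_s=INTERVAL_S):
    """Return average message rate, throughput and daily raw data volume."""
    messages_per_s = devices / interval_s
    kb_per_s = messages_per_s * packet_kb
    gb_per_day = kb_per_s * SECONDS_PER_DAY / 1_000_000  # KB -> GB
    return {
        "messages_per_s": messages_per_s,
        "kb_per_s": kb_per_s,
        "gb_per_day_per_member": gb_per_day,
        "gb_per_day_all_members": gb_per_day * REPLICA_MEMBERS,
    }


if __name__ == "__main__":
    for name, value in ingest_estimate().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions the average load is 25 messages per second and 250 KB/s, or about 21.6 GB of raw data per day on each MongoDB member, so a 30 GB volume would fill in under two days if everything were retained uncompressed. These are averages; they say nothing about CPU or memory needs, which is what the load test has to measure.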

Yes, I understand that, but there must be some way of estimating this even at a theoretical level. If a customer asks for a budget estimate for this kind of requirement, it is not feasible to actually build the application, set up the servers, run a load test and then provide the estimate. - Avi, Sep 04 '18 at 08:55
