
In theory, say I have a 3-node LXD cluster running under a load balancer. Traffic begins ramping up and I need to scale horizontally by adding another node into the cluster. At this point, I have a 4-node cluster under the LB.

In what ways is this approach beneficial? I'm trying to understand whether it reduces the load on the overall bare metal that hosts the nodes or just allows more requests to be processed. If it's the latter, would I want to be measuring the load average of each individual container?
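A quick way to sample the load average is shown below (a minimal sketch, not from the original thread). One caveat: inside an LXD container, `/proc/loadavg` may report the host's figures unless lxcfs load-average virtualisation is enabled in your setup, so treat this as a host-level signal unless you know otherwise:

```python
import os

# 1-, 5- and 15-minute load averages as reported by the kernel.
# Inside an LXD container these may be the *host's* numbers unless
# lxcfs virtualises /proc/loadavg for the container.
one, five, fifteen = os.getloadavg()
print(f"load average: 1m={one:.2f} 5m={five:.2f} 15m={fifteen:.2f}")
```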

Ryan Shocker
  • If you ran a supermarket and the lines were always full, would you employ new staff and increase the amount of available lines or would you attempt to train the staff to work faster? – Marty Jul 04 '16 at 00:00
  • So more requests will process, cool. What does this do to the overall CPU of the system housing the nodes? It's still getting hit the same amount of times, so does that value stay the same? – Ryan Shocker Jul 04 '16 at 00:05
  • I've never been given the opportunity to deal with scaling up across machines, so I don't know the specifics, but I was under the impression that each "node" was its own machine with its own resources. – Marty Jul 04 '16 at 00:08
  • One machine with a cluster of nodes. – Ryan Shocker Jul 04 '16 at 00:09

2 Answers

1

As long as there are spare CPU cycles and sufficient network bandwidth, adding a new node will almost always allow more requests to be handled simultaneously. In some cases, though, this will reduce the responsiveness of each individual request.

If CPU load is already high, adding another node (on the same box) will extend response times, because you're asking the CPU to do more work than it can do simultaneously. If instead the load is high due to blocking I/O, adding another node to the box should not significantly affect the processing time of each request.

In the blocking I/O case you can add new nodes until the CPU reaches its threshold level -- you never want to max out the CPU; set the threshold at, say, 75% to allow for variations in load.
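The threshold idea can be sketched as a simple headroom check (the 75% figure is from the answer above; the function name and sampling approach are illustrative, not from any particular tool):

```python
CPU_THRESHOLD = 0.75  # leave headroom for variations in load


def has_headroom(cpu_utilisation: float, threshold: float = CPU_THRESHOLD) -> bool:
    """True if box-wide CPU utilisation (0.0-1.0) is below the threshold,
    i.e. there are spare cycles available for another node."""
    return cpu_utilisation < threshold


print(has_headroom(0.40))  # spare cycles available -> True
print(has_headroom(0.80))  # past the 75% threshold -> False
```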

However, you shouldn't be running code with blocking I/O these days -- you should be writing everything in Node.js or Go ;)

Software Engineer
  • Just to make this clear. Blasting HTTP requests to a machine with 3 nodes and a high CPU usage and then adding a 4th node to compensate for this would reduce CPU usage? This idea still doesn't make sense to me. – Ryan Shocker Jul 04 '16 at 00:27
  • I don't think that's what I said. I think I said that if you're at high cpu and you add more work then everything will get slower. – Software Engineer Jul 04 '16 at 00:30
1

You would really need to get more specific about your set-up, and this may not be the best forum for that. If your app is multithreaded and you are using a single machine, then, in theory, you gain nothing by adding more nodes other than the ability to isolate a particularly "greedy" client if you are using resource limits on the nodes. Even if the app is single-threaded (e.g., Node.js) and you use clusters or child processes, you can still max out the CPU.

One advantage of separate nodes each running a single process is that you can tailor resource limits to each application. At a certain point, you are going to need the ability to scale to multiple hosts, particularly if you need high availability.
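For the per-application limits mentioned above, LXD exposes these as container config keys (a config sketch; the container name `web1` and the specific values are placeholders for your own setup):

```shell
# Cap one container at 2 CPUs and 2 GiB of RAM; a different app on the
# same host can be given its own, different profile.
lxc config set web1 limits.cpu 2
lxc config set web1 limits.memory 2GiB
```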

ldg
  • Thanks for the reply -- which forum is best? So basically the correct solution is scaling out to another bare-metal machine and reconfiguring the load balancer to distribute requests to that machine. I see. Thanks – Ryan Shocker Jul 04 '16 at 00:38
  • I just mean you will get better results here asking more specific questions like even specifying which app/s you are balancing. ServerFault or SuperUser would be the start of other options but see [here](http://meta.stackoverflow.com/questions/276579/should-docker-questions-go-on-stackoverflow-or-serverfault-or-superuser) as it can be a gray area. – ldg Jul 04 '16 at 00:54
  • But at a high-level, I would say if you are looking for best scalability and availability, balancing between hosts would be prudent, and as mentioned, optimize resource limits per service. You may also want to check out the latest version of [swarm](https://docs.docker.com/engine/swarm/) – ldg Jul 04 '16 at 01:00