1

We are using vSphere technologies for virtualization and I am tasked with developing a scalable application infrastructure. I'm more familiar with AWS.

Doing some research on scaling, I'm not sure I understand the point of scaling containers like Docker. Say I have a single VM with 32GB of RAM that's running our Docker containers. All of a sudden there's a lot of traffic and workload, so I want to load balance and scale out. If I start up more identical containers on this server, I'm not really allocating new resources for the application to be load balanced.

So how can I justify using a container management application to scale out Docker containers on the fly when it doesn't actually add resources, and why is this offered as a solution to load balancing?

EDIT 1: From the answers... say I have 2 virtual servers where my containers have room to grow and shrink. The 2 servers are taking up 128GB RAM. That's 128GB of RAM that other systems on the same server rack could be using if they needed it, but I am instead taking up the 128GB for my two virtual servers. That's what I don't want; I want my allocated resources to grow and shrink. So if we were doing VMs instead of containers, I'd spin up another VM with 64GB RAM and/or stop VMs to free up RAM on the fly. I'm still not understanding the benefit of scaling containers compared to scaling VMs. It doesn't give me more resources to run my application, and it doesn't shrink my resources so other systems can have them.

GolangFunk
  • 11
  • 2
  • "It doesn't give me more resources to run my application". Then you're doing it wrong. Containers should have allocated resources. Otherwise, auto-scaling is the least of your problems. – SYN Nov 20 '19 at 14:55
  • So you're saying containers compete for resources on the server. Maybe there's a container with 10mb allocated on the server but then there's a peak and I need to autoscale to create another container with 10mb allocated to load balance that part of the application. But the server machine only has 30MB ram so creating new containers doesn't give me more ram. I need to create another server. But i'm not going to give my server 60MB ram initially because I don't need that until peak time. I would usually start with 30mb then at peak auto scale and deploy another server vm at 30mb ram. – GolangFunk Nov 20 '19 at 15:43

2 Answers

2

Autoscaling is designed for multi-machine environments. Whether it's bare metal or VMs, you can utilize other machines' resources to keep up with more load coming to your containers.

Just have a look at "Autoscaling in Kubernetes" (which is one of the most popular container orchestrating tools out there).
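For instance, a HorizontalPodAutoscaler resource tells Kubernetes to add or remove replicas of a workload based on observed load. A minimal sketch, assuming a hypothetical Deployment named `my-api` (the name and thresholds are made up for illustration):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-api          # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # add pods when average CPU exceeds 70%
```

Kubernetes then schedules the extra replicas wherever the cluster has free capacity, which is exactly where having more than one machine pays off.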

Also, if you move your applications to the cloud, autoscaling becomes very easy, and that is where you will see its full potential in action.

But you can also set up a cluster of machines on premises that will autoscale your applications up to the point where the servers are fully utilized, for example a Kubernetes cluster on premises (non-cloud).

Useful links:

MetalLoadBalancer: Kubernetes On-Prem /BareMetal LoadBalancing

Kubernetes TCP load balancer service on premise (non-cloud)

Wojtek_B
  • 1,013
  • 4
  • 14
  • So are you saying it's common practice to allocate the maximum resources you'd need, either on one machine or multiple machines, so that your containers can scale within them? This still isn't freeing up resources on the on-prem server rack if I scale down my containers, because the machines are still up and running with the resources that were allocated for them. – GolangFunk Nov 20 '19 at 14:41
  • 1
    Containers are sort of like lightweight VMs. A modest compute infrastructure can be carved up into hundreds of containers. They don't solve capacity planning problems any more than VMs do. You still need to purchase or rent capacity for peak utilization. – John Mahowald Nov 20 '19 at 21:22
2

Kubernetes' origins as a container orchestration system go back to Google inventing a cluster to schedule jobs on. (VMs were just too heavyweight for them.) They were already launching millions of containers across they-won't-tell-you-how-many machines. Scaling across multiple nodes was a given from the beginning.

Elements of this design can be scaled down.

Say a container instance is 1 GB RAM, enough to do something interesting like a web server. Also assume you have 128 GB RAM in a physical server. But one is none, so for high availability you have two physical servers. Whether or not you divide the compute into VMs, that is capacity for roughly 200 containers in total.
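The arithmetic behind that estimate can be sketched out; the per-node overhead figure below is an assumption chosen to match the answer's round number, not a Kubernetes default:

```python
# Back-of-the-envelope capacity math for the imaginary cluster above.
# Assumptions (hypothetical): two physical servers with 128 GB RAM each,
# 1 GB per container, and some RAM reserved per node for the OS,
# hypervisor/kubelet, and other system overhead.

NODES = 2
RAM_PER_NODE_GB = 128
SYSTEM_OVERHEAD_GB = 28   # assumed reservation per node, for illustration
CONTAINER_RAM_GB = 1

usable_per_node = RAM_PER_NODE_GB - SYSTEM_OVERHEAD_GB
total_containers = NODES * (usable_per_node // CONTAINER_RAM_GB)
print(total_containers)  # roughly 200, matching the answer's estimate
```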

Being lightweight, each application can have many containers, enabling rolling upgrades and scale-out. Perhaps an API runs out of 4 containers, a web site powered by it out of 2, and some related batch jobs out of another 2. There is capacity for a couple dozen such microservice-based apps on our imaginary cluster, which is quite dense packing.

The web site service could be scaled to more containers. If more compute is needed in aggregate, more nodes could be added to the cluster. Kubernetes is capable of auto scaling in both these dimensions.
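Concretely, in Kubernetes the two dimensions look something like this (a hedged sketch; `web` is a hypothetical Deployment name, and the commands assume a working cluster):

```shell
# Scale the pod dimension: run 6 replicas of the web site's Deployment.
kubectl scale deployment/web --replicas=6

# Or let Kubernetes decide, between 2 and 10 replicas, based on CPU.
kubectl autoscale deployment/web --min=2 --max=10 --cpu-percent=70

# Scale the node dimension: join another machine to the cluster
# (on-prem; in clouds the cluster autoscaler can add nodes automatically).
kubeadm join <control-plane-endpoint> --token <token>
```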


As an analogy, consider the physical shipping container. A ship being unloaded isn't particularly concerned with ground transport; there are always more trains hauling the standard-sized containers away. Bigger nodes = longer trains. Scaling out = more trains. There are still specialty heavy loads, though: those are databases and the like that are not easy to put in containers.

John Mahowald
  • 32,050
  • 2
  • 19
  • 34