From what I understand, running multiple replicas together with auto-scaling is supposed to help when lots of people visit your website and call services provided by your Kubernetes cluster.
How do the replicas help with scaling?
Aren't these extra pods all just running on the same machine with a fixed amount of resources?
That would mean they're all limited by the same fixed amount of CPU and memory.
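
To make the concern concrete, here is a rough sketch of the arithmetic I have in mind. It assumes a single hypothetical node with 4 CPU cores and 16 GiB of memory (made-up numbers) and that the replicas split that capacity evenly under load, which is a simplification and not how Kubernetes necessarily divides resources:

```python
# Hypothetical single node; the capacity figures are made up for illustration.
NODE_CPU_CORES = 4.0
NODE_MEMORY_GIB = 16.0


def per_replica_share(replicas: int) -> tuple[float, float]:
    """Return the (CPU cores, memory GiB) each replica gets if the node's
    fixed capacity is split evenly among them (a simplifying assumption)."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return NODE_CPU_CORES / replicas, NODE_MEMORY_GIB / replicas


if __name__ == "__main__":
    for replicas in (1, 2, 4, 8):
        cpu, mem = per_replica_share(replicas)
        # The total is always the node's capacity, regardless of replica count.
        total_cpu = cpu * replicas
        print(f"{replicas} replicas: {cpu:.2f} cores and {mem:.2f} GiB each, "
              f"{total_cpu:.2f} cores in total")
```

In this model the total never grows no matter how many replicas I add, which is exactly the part I don't understand.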