I have a Deployment with some number of gameservers (pods) running. I'm using Agones to make sure gameservers with players connected to them don't get stopped when scaling down. In addition, I use a Service (selecting all of the gameservers) that acts as a LoadBalancer for the pods; as I understand it, it picks a gameserver at random when a player connects to the Service. This all works great when scaling up, but not so much when scaling down. Since Agones prevents gameservers with players on them from being removed, the number of pods will essentially never decrease: the Service doesn't take the desired replica count into account, so it keeps sending new players to every running gameserver, and the actual count stays above the desired count.
Is there a way to prevent the LoadBalancer Service from picking a gameserver (replica) that's no longer desired? For example: the current network load only requires 3 replicas, but there are 5 running because all 5 gameservers have players on them, which prevents them from shutting down. I would like to spread new load across only the 3 desired replicas (gameservers), to give the other 2 a chance to reach 0 players so that they can eventually shut down.
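
To make the behaviour I'm after concrete, here is a rough Python sketch of the selection logic. It is plain Python, not Agones or Kubernetes API; the function names, the server names, and the player counts are all made up for illustration. Which of the gameservers stay eligible is also my own assumption: I keep the fullest ones so that the emptiest ones drain first.

```python
import random


def eligible_servers(player_counts, desired_replicas):
    """Return the gameservers that should still receive new players.

    Assumption (hypothetical policy): keep the `desired_replicas` fullest
    servers eligible, so the emptiest ones can drain to 0 players.
    Ties are broken by name to keep the result deterministic.
    """
    ranked = sorted(player_counts, key=lambda name: (-player_counts[name], name))
    return ranked[:desired_replicas]


def pick_server(player_counts, desired_replicas, rng=random):
    """Pick a random gameserver, but only among the eligible ones."""
    candidates = eligible_servers(player_counts, desired_replicas)
    if not candidates:
        raise ValueError("no eligible gameservers")
    return rng.choice(candidates)


# Hypothetical state: 5 gameservers running, only 3 desired.
counts = {"gs-a": 4, "gs-b": 1, "gs-c": 7, "gs-d": 2, "gs-e": 5}
print(eligible_servers(counts, 3))  # ['gs-c', 'gs-e', 'gs-a']
print(pick_server(counts, 3))       # one of the three names above
```

With these numbers, `gs-b` and `gs-d` would stop receiving new players and could shut down once their current players leave.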