I'm running some experiments with my application. It runs in a Docker container and is programmed to send 4 requests per second to a web server. When I host 30 containers on a single server, everything runs smoothly. However, when I scale to 50 containers, I see performance degradation: the number of requests each container sends drops to 2 or 3 per second. I checked CPU and memory utilization, and both are stable and below 50%. The load average on the server is around 4. My guess is that this is caused by excessive context switching, but I don't know where to look to confirm or rule that out. My first question is: how do I detect software contention on a server? My second question is: how do I find bottlenecks in general?
PS. I'm using a Linux machine with 4 vCPUs and 8 GB of RAM.
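
One way to start testing the context-switching guess is to sample the `ctxt` line in `/proc/stat`, which holds the total number of context switches since boot, and compare the rate with 30 containers against the rate with 50. The following is only a minimal sketch under that assumption. The helper names (`parse_ctxt`, `switches_per_second`, `sample_rate`) are hypothetical and not part of any tool, and the counter is system-wide, so it does not by itself show which container is responsible.

```python
import time


def parse_ctxt(stat_text: str) -> int:
    """Return the cumulative context-switch count from /proc/stat contents."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The line of interest looks like: "ctxt 123456789"
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def switches_per_second(before: int, after: int, elapsed: float) -> float:
    """Convert two cumulative counter readings into a per-second rate."""
    if elapsed <= 0:
        raise ValueError("elapsed must be positive")
    return (after - before) / elapsed


def sample_rate(interval: float = 1.0, path: str = "/proc/stat") -> float:
    """Read the counter twice, `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_ctxt(f.read())
    start = time.monotonic()
    time.sleep(interval)
    with open(path) as f:
        after = parse_ctxt(f.read())
    return switches_per_second(before, after, time.monotonic() - start)


if __name__ == "__main__":
    # Print a few samples; run once per container count and compare.
    for _ in range(5):
        print(f"{sample_rate():.0f} context switches/s")
```

Per-process counts are also available in `/proc/<pid>/status` as `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches`, which may help narrow the numbers down to individual container processes.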