We are running 3 replicas of a Java/Spring Boot microservice in a Docker/Kubernetes environment. After days of running without any issues, one of the 3 becomes very slow due to extreme CPU usage and very long GC pauses. One odd metric that caught my eye was that the JVM (for whatever reason) decided to shrink the eden space and enlarge the old gen (by a factor of approximately 6).
Even after restarting the pod/container, the odd eden/old gen ratio and the extreme CPU usage still occur. The three replicas serve the same requests, and only one of the three shows this behavior.
What could cause this behavior? The JVM args are -Xmx5900m -Xms5g.
(In this screenshot, the issue occurs between 14:15 and 14:20.)
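
To compare the affected replica against the healthy ones, something like the following could be run (or exposed via a log line) inside each pod. This is only a minimal sketch using the standard java.lang.management beans; the class name HeapPoolReport and its method names are made up for illustration, and the pool names it prints depend on which collector the JVM selected. It also prints the CPU count and max heap the JVM sees, on the assumption that differences there between pods might be worth ruling out.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of the service itself.
public class HeapPoolReport {

    // One line per heap memory pool (eden, survivor, old gen; names vary by collector).
    public static List<String> poolLines() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache, etc.
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when the maximum is undefined for this pool.
            String max = usage.getMax() < 0 ? "n/a" : (usage.getMax() >> 20) + "MB";
            lines.add(String.format("%s: used=%dMB committed=%dMB max=%s",
                    pool.getName(), usage.getUsed() >> 20, usage.getCommitted() >> 20, max));
        }
        return lines;
    }

    // One line per collector with its cumulative collection count and time.
    public static List<String> gcLines() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("%s: collections=%d totalTimeMs=%d",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    // What the JVM believes about its environment (relevant inside a container).
    public static String environmentLine() {
        Runtime rt = Runtime.getRuntime();
        return String.format("availableProcessors=%d maxHeap=%dMB",
                rt.availableProcessors(), rt.maxMemory() >> 20);
    }

    public static void main(String[] args) {
        System.out.println(environmentLine());
        poolLines().forEach(System.out::println);
        gcLines().forEach(System.out::println);
    }
}
```

Running this on all three replicas and diffing the output should show whether the slow pod really has a different eden/old gen split, and whether it reports a different collector or CPU count than the other two.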