Our Spring Boot application (version 2.5.10) exposes a single API that reads from Redis 6 times and does some calculations. There are no other I/O operations such as database calls, and it is deployed on Kubernetes with Helm. The actual API execution time is under 20 milliseconds, which I measured using Instant.now().
The current k6 load test output shows a p99 of under 3 seconds when load testing from an m4.4xlarge instance at 1000 VUs.
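For context on what the 20 ms figure covers, here is a minimal sketch of this kind of measurement. It assumes the timestamps are taken inside the handler; the `HandlerTiming` class, the `time` helper, and the sleep standing in for the Redis reads are hypothetical, not the application's real code.

```java
import java.time.Duration;
import java.time.Instant;

public class HandlerTiming {

    // Runs the given work and returns the elapsed wall-clock time around it.
    static Duration time(Runnable work) {
        Instant start = Instant.now();
        work.run();
        return Duration.between(start, Instant.now());
    }

    public static void main(String[] args) {
        Duration elapsed = time(() -> {
            // Placeholder for the six Redis reads plus the calculations.
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("handler took " + elapsed.toMillis() + " ms");
    }
}
```

A measurement taken this way only covers the time spent inside the measured block. Any time a request spends waiting before it reaches the handler is not included, whereas the latency k6 reports is end to end.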
I have already tried the following Tomcat settings in the application properties file:
server.tomcat.threads.min-spare=100
server.tomcat.threads.max=400
server.tomcat.accept-count=200
Initially the deployment had just 0.5 CPU in Kubernetes. Once the requirements were shared, we increased CPU to 2000m and memory to 1 GB, and increased the replica count to 3.
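Since the CPU allocation changed, it may be worth confirming what the JVM inside the pod actually sees. The sketch below is a small check using only standard `Runtime` calls; the `ContainerResources` class name is hypothetical. It assumes a container-aware JVM, and the value reported depends on whether the 2000m is configured as a request or as a limit, which is not stated above.

```java
public class ContainerResources {

    // Number of processors the JVM believes it can use.
    static int cpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("availableProcessors = " + cpus());
        System.out.println("maxHeapMiB          = " + maxHeapMiB());
    }
}
```

Running this once at startup (or logging the same two values from the application) shows whether the JVM's view matches the pod spec.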
While load testing, I noticed that the runnable thread count is low. Can I make any changes to increase the runnable count, or is this okay?
jvm_threads_states_threads{state="runnable",} 5.0
jvm_threads_states_threads{state="blocked",} 0.0
jvm_threads_states_threads{state="waiting",} 202.0
jvm_threads_states_threads{state="timed-waiting",} 3.0
jvm_threads_states_threads{state="new",} 0.0
jvm_threads_states_threads{state="terminated",} 0.0
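To see which threads make up these counts, the per-state totals can be reproduced in-process with the standard `ThreadMXBean`. This is a minimal sketch, not the metric's own implementation: the `ThreadStateCounts` class and the `namePrefix` parameter are hypothetical, and the `http-nio-` prefix assumes Tomcat's default worker thread naming.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

public class ThreadStateCounts {

    // Counts live threads per state, restricted to threads whose name
    // starts with namePrefix (pass "" to count every thread).
    static Map<Thread.State, Integer> countByState(String namePrefix) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread.State state : Thread.State.values()) {
            counts.put(state, 0);
        }
        for (ThreadInfo info : bean.getThreadInfo(bean.getAllThreadIds())) {
            // Entries are null for threads that ended between the two calls.
            if (info != null && info.getThreadName().startsWith(namePrefix)) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("all threads:    " + countByState(""));
        // Assumed default Tomcat worker naming, e.g. http-nio-8080-exec-1.
        System.out.println("tomcat workers: " + countByState("http-nio-"));
    }
}
```

Comparing the two maps under load would show whether the 202 waiting threads are mostly idle Tomcat workers or belong to something else, such as the Redis client.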
What optimizations or configuration changes could bring the p99 latency down to under 500 ms?