I see higher throughput together with a long average response delay (requests wait 20-50 seconds for a worker); see the output from Grafana:
I know that part of the optimization can be to:
- use more workers (for each pod/replica)
- increase resources for each pod/replica
- use more pods/replicas in k8s
I tuned performance by increasing the resources and the number of pods/replicas:
# increase resources (for faster execution)
fn.with_requests(mem="500Mi", cpu=0.5) # requested (default) resources
fn.with_limits(mem="2Gi", cpu=1) # maximum resources
# increase parallel execution by adding pods/replicas
fn.spec.replicas = 2 # default replicas
fn.spec.min_replicas = 2 # min replicas
fn.spec.max_replicas = 5 # max replicas
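For context, here is a rough sketch of the cluster-wide footprint these settings imply. It only multiplies the per-pod values above by the replica counts; `total_footprint` is a hypothetical helper of mine, not part of MLRun, and I am assuming 2Gi = 2048Mi.

```python
# Hypothetical helper (not an MLRun API): total memory in Mi and CPU in cores
# consumed by a given number of identical replicas.
def total_footprint(replicas, mem_mi, cpu):
    return replicas * mem_mi, replicas * cpu


# Per-pod values copied from the configuration above (assuming 2Gi = 2048Mi).
REQUEST = {"mem_mi": 500, "cpu": 0.5}
LIMIT = {"mem_mi": 2048, "cpu": 1}

# Lower bound: min_replicas pods, each holding only its requested resources.
low_mem, low_cpu = total_footprint(2, **REQUEST)
# Upper bound: max_replicas pods, each running at its limit.
high_mem, high_cpu = total_footprint(5, **LIMIT)

print(f"reserved at min replicas: {low_mem} Mi, {low_cpu} CPU")
print(f"ceiling at max replicas: {high_mem} Mi, {high_cpu} CPU")
```

These numbers are per-pod totals only; they say nothing about how several workers inside one pod would share that CPU and memory, which is the part I am unsure about.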
Do you know how I can increase the number of workers, and what impact I should expect on CPU/memory?