Connection bias in Cloud Run using http/2, GCLB

Question

The backend is configured as GCLB, a Serverless NEG, and Cloud Run (HTTP/2). The chart below shows the request count per container, recorded with our own log-based metrics.

(Chart: Request Count)

The containers clearly fall into groups, and the number of requests differs widely from one group to another: the groups of containers that have just been launched are allocated too few requests. We did not see this when HTTP/1 was used for the connection between GCLB and Cloud Run.

During this time the application logs are normal (few 503s), so requests are not being limited at the application layer. (The metrics are based on the standard request log, so they only record whether or not a request was made.)
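To put a number on the imbalance rather than reading it off the chart, something like the following could be used. This is only a sketch: the `instance_id` field name, the helper names, and the sample entries are assumptions made up for illustration, not our real log schema or data.

```python
from collections import Counter
from statistics import mean, pstdev


def count_per_instance(log_entries):
    """Count requests per container instance from request-log entries."""
    # "instance_id" is a hypothetical field name for the container identifier.
    return Counter(entry["instance_id"] for entry in log_entries)


def imbalance(counts):
    """Return (max/mean ratio, coefficient of variation) of per-instance counts."""
    values = list(counts.values())
    avg = mean(values)
    return max(values) / avg, pstdev(values) / avg


# Made-up entries, for illustration only: two older instances, two new ones.
sample = (
    [{"instance_id": "old-1"}] * 40
    + [{"instance_id": "old-2"}] * 40
    + [{"instance_id": "new-1"}] * 10
    + [{"instance_id": "new-2"}] * 10
)
counts = count_per_instance(sample)
ratio, cv = imbalance(counts)
print(dict(counts), ratio, cv)
```

With perfectly even distribution the ratio would be 1 and the coefficient of variation 0, so larger values indicate a stronger bias between container groups.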

How should this be remedied?

The Cloud Run settings are as follows:

    metadata:
      name: app
      annotations:
        run.googleapis.com/client-name: gcloud
        run.googleapis.com/client-version: 411.0.0
        autoscaling.knative.dev/minScale: '4'
        run.googleapis.com/execution-environment: gen2
        autoscaling.knative.dev/maxScale: '200'
        run.googleapis.com/cpu-throttling: 'false'
        run.googleapis.com/startup-cpu-boost: 'true'
    spec:
      containerConcurrency: 100
      timeoutSeconds: 300
      containers:
        ports:
        - name: h2c
          containerPort: 8080

I have tried changing the value of containerConcurrency, but this does not improve things much. The imbalance remains: containers that have just been launched appear to receive only a few requests, or none at all.

