
I moved from gvisor-containerd-shim (Shim V1) to containerd-shim-runsc-v1 (Shim V2). The metrics server and the Horizontal Pod Autoscaler worked fine with gvisor-containerd-shim.

With containerd-shim-runsc-v1, however, I still get CPU and memory metrics for nodes and runc pods, but only memory metrics for runsc (gVisor) pods.

For example, I deployed a PHP server in a gvisor pod with containerd-shim-runsc-v1. I get the following metrics:

kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         10        1          68s


kubectl top nodes
NAME         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
snf-877559   549m         13%    2327Mi          39%


kubectl top pods
NAME                                 CPU(cores)   MEMORY(bytes)
php-apache-gvisor-6f7bb6cf84-28qdk   0m           52Mi

After sending some load to the php-apache-gvisor pod, I can see CPU and memory usage increase for the node and for the runc pod (load-generator). I can also see that php-apache-gvisor's memory increases from 52Mi to 72Mi, but its CPU usage remains at 0m. Why does the CPU usage remain at zero?

I also tried different container images, but I keep getting the same results.

With load I get the following metrics:

kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         10        1          68s


kubectl top nodes
NAME         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
snf-877559   946m         23%    2413Mi          41%


kubectl top pods
NAME                                 CPU(cores)   MEMORY(bytes)
load-generator-7d549cd44-xmbqw       3m           1Mi
php-apache-gvisor-6f7bb6cf84-28qdk   0m           72Mi

Further information:

kubeadm, kubernetes 1.15.3, containerd 1.3.3, runsc nightly/2019-09-18, flannel

kubectl logs metrics-server-74657b4dc4-8nlzn -n kube-system
I0728 09:33:42.449921       1 serving.go:312] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0728 09:33:44.153682       1 secure_serving.go:116] Serving securely on [::]:4443
E0728 09:35:24.579804       1 reststorage.go:160] unable to fetch pod metrics for pod default/php-apache-gvisor-6f7bb6cf84-28qdk: no metrics known for pod
E0728 09:35:39.940417       1 reststorage.go:160] unable to fetch pod metrics for pod default/php-apache-gvisor-6f7bb6cf84-28qdk: no metrics known for pod

/etc/containerd/config.toml (containerd-shim-runsc-v1)

subreaper = true
oom_score = -999
disabled_plugins = ["restart"]


[debug]
    level = "debug"

[metrics]
    address = "127.0.0.1:1338"

[plugins.linux]
    runtime = "runc"
    shim_debug = true


[plugins.cri.containerd.runtimes.runsc]
  runtime_type = "io.containerd.runsc.v1"

/etc/containerd/config.toml (gvisor-containerd-shim)

subreaper = true
oom_score = -999
disabled_plugins = ["restart"]


[debug]
    level = "debug"

[metrics]
    address = "127.0.0.1:1338"

[plugins.linux]
    runtime = "runc"
    shim_debug = true
    shim = "/usr/local/bin/gvisor-containerd-shim"


[plugins.cri.containerd.runtimes.runsc]
  runtime_type = "io.containerd.runtime.v1.linux"
  runtime_engine = "/usr/local/bin/runsc"
  runtime_root = "/run/containerd/runsc"

The metrics server YAML is based on https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml with the following args:

....
      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.6
        imagePullPolicy: IfNotPresent
        args:
          - --kubelet-preferred-address-types=InternalIP
          - --kubelet-insecure-tls
          - --cert-dir=/tmp
          - --secure-port=4443
....

The deployment currently has the following resources section:

  resources:
    limits:
      cpu: 500m
    requests:
      cpu: 200m
virt
  • What was your containerd config for your gVisor only config? What did you add to configure runc? – Rico Jul 28 '20 at 13:09
  • Thank you for your comment. For containerd configuration, I used the /etc/containerd/config.toml file (you can see it in my question). I also used RuntimeClass for gvisor. For the installation I relied on this guide -> https://github.com/google/gvisor-containerd-shim/blob/master/docs/runtime-handler-shim-v2-quickstart.md . Please let me know if I misunderstood your question. – virt Jul 28 '20 at 16:18
  • yes, but you said initially you had gVisor. Then you add runc and gVisor together. What did you change in the configs? – Rico Jul 28 '20 at 16:30
  • Thank you. In both cases I ran both runc and runsc containers. In the first case I was using gvisor-containerd-shim and in the second case I was using containerd-shim-runsc-v1. To move from gvisor-containerd-shim to containerd-shim-runsc-v1, I deleted the deployments, changed the /etc/containerd/config.toml file, restarted containerd and kubelet and then I deployed again the metrics server and the example deployment. Moreover, I deployed containerd-shim-runsc-v1 to a brand new cluster, and this time I got the same results. I updated my answer so you can see the initial config file. – virt Jul 28 '20 at 17:16

1 Answer

gVisor currently reports only memory and PIDs on a per-Pod basis, so no CPU usage is available for runsc pods. See: https://github.com/google/gvisor/blob/add40fd/runsc/boot/events.go#L62-L68

We are planning to export more stats and the issue for tracking that work is here: https://gvisor.dev/issue/172
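
To illustrate why the HPA in the question stays at 0%/50% with one replica, here is a minimal sketch of the utilization and replica calculation. It assumes the standard HPA formula from the Kubernetes documentation (desired = ceil(current replicas * current utilization / target utilization), with a default tolerance of 0.1). The function names are hypothetical, and the 200m request, 50% target, and 1 to 10 replica bounds are taken from the question.

```python
import math


def cpu_utilization_percent(usage_millicores, request_millicores):
    """CPU utilization as a percentage of the pod's CPU request."""
    return 100.0 * usage_millicores / request_millicores


def desired_replicas(current_replicas, current_util, target_util,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the HPA scaling decision (assumed standard formula)."""
    ratio = current_util / target_util
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (MINPODS / MAXPODS).
    return max(min_replicas, min(max_replicas, desired))


# With no CPU stats from the sandbox, usage is seen as 0m.
util = cpu_utilization_percent(0, 200)
print(util, desired_replicas(1, util, 50, 1, 10))

# Hypothetical case: if 300m of usage were reported instead.
util = cpu_utilization_percent(300, 200)
print(util, desired_replicas(1, util, 50, 1, 10))
```

Under these assumptions, a pod whose CPU usage is always seen as 0m never pushes utilization above the target, so the replica count stays at the minimum regardless of load.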

Ian Lewis