
I have a Kubernetes onebox deployment with the following (containerized) components, all running with --net=host, and kubelet running as a privileged Docker container with the Kubernetes flag --allow-privileged set to true.

gcr.io/google_containers/hyperkube-amd64:v1.7.9   "/bin/bash -c './hype"     kubelet
gcr.io/google_containers/hyperkube-amd64:v1.7.9   "/bin/bash -c './hype"     kube-proxy
gcr.io/google_containers/hyperkube-amd64:v1.7.9   "/bin/bash -c './hype"     kube-scheduler
gcr.io/google_containers/hyperkube-amd64:v1.7.9   "/bin/bash -c './hype"     kube-controller-manager
gcr.io/google_containers/hyperkube-amd64:v1.7.9   "/bin/bash -c './hype"     kube-apiserver
quay.io/coreos/etcd:v3.1.0                        "/usr/local/bin/etcd "     etcd

On top of this, I enabled the addon manager with kubectl create -f https://github.com/kubernetes/kubernetes/blob/master/test/kubemark/resources/manifests/kube-addon-manager.yaml, with the default YAML manifests for Calico 2.6.1 and kube-dns 1.14.5 mounted to /etc/kubernetes/addons/. The Calico pod comes up with both of its containers (install-cni and calico-node) as expected.

However, kube-dns gets stuck in ContainerCreating or ContainerCannotRun, with the following kubelet error while trying to start the Kubernetes pause container:

{"log":"I1111 00:35:19.549318       1 manager.go:913] Added container: \"/kubepods/burstable/pod3173eef3-c678-11e7-ac4b-e41d2d59689e/1dd57d6f6c996d7abe061f6236fc8a0150cf6f95d16d5c3c462c9ed7158d3c54\" (aliases: [k8s_POD_kube-dns-v20-141138543-pmdww_kube-system_3173eef3-c678-11e7-ac4b-e41d2d59689e_0 1dd57d6f6c996d7abe061f6236fc8a0150cf6f95d16d5c3c462c9ed7158d3c54], namespace: \"docker\")\n","stream":"stderr","time":"2017-11-11T00:35:19.5526284Z"}
{"log":"I1111 00:35:19.549433       1 cni.go:291] About to add CNI network cni-loopback (type=loopback)\n","stream":"stderr","time":"2017-11-11T00:35:19.5526748Z"}
{"log":"I1111 00:35:19.549504       1 handler.go:325] Added event \u0026{/kubepods/burstable/pod3173eef3-c678-11e7-ac4b-e41d2d59689e/1dd57d6f6c996d7abe061f6236fc8a0150cf6f95d16d5c3c462c9ed7158d3c54 2017-11-11 00:35:19.3931718 +0000 UTC containerCreation {\u003cnil\u003e}}\n","stream":"stderr","time":"2017-11-11T00:35:19.5527217Z"}
{"log":"I1111 00:35:19.551134       1 container.go:407] Start housekeeping for container \"/kubepods/burstable/pod3173eef3-c678-11e7-ac4b-e41d2d59689e/1dd57d6f6c996d7abe061f6236fc8a0150cf6f95d16d5c3c462c9ed7158d3c54\"\n","stream":"stderr","time":"2017-11-11T00:35:19.5527441Z"}
{"log":"E1111 00:35:19.555099       1 cni.go:294] Error adding network: failed to Statfs \"/proc/54226/ns/net\": no such file or directory\n","stream":"stderr","time":"2017-11-11T00:35:19.5553606Z"}
{"log":"E1111 00:35:19.555122       1 cni.go:237] Error while adding to cni lo network: failed to Statfs \"/proc/54226/ns/net\": no such file or directory\n","stream":"stderr","time":"2017-11-11T00:35:19.5553887Z"}
{"log":"I1111 00:35:19.600281       1 manager.go:970] Destroyed container: \"/kubepods/burstable/pod3173eef3-c678-11e7-ac4b-e41d2d59689e/1dd57d6f6c996d7abe061f6236fc8a0150cf6f95d16d5c3c462c9ed7158d3c54\" (aliases: [k8s_POD_kube-dns-v20-141138543-pmdww_kube-system_3173eef3-c678-11e7-ac4b-e41d2d59689e_0 1dd57d6f6c996d7abe061f6236fc8a0150cf6f95d16d5c3c462c9ed7158d3c54], namespace: \"docker\")\n","stream":"stderr","time":"2017-11-11T00:35:19.6005722Z"}

I see pause containers keep coming up just to exit a second later, with an innocuous log message (this example is old; I stopped the cluster so it wouldn't keep spawning more containers):

ubuntu@r172-16-6-39:~$ docker ps -a | grep 216e39defa36
216e39defa36        gcr.io/google_containers/pause-amd64:3.0          "/pause"                 About an hour ago   Exited (0) About an hour ago                         k8s_POD_kube-dns-v20-141138543-xvdmv_kube-system_0594732f-c688-11e7-9da5-e41d2d59689e_17
ubuntu@r172-16-6-39:~$ docker logs 216e39defa36
shutting down, got signal: Terminated

The directory /proc/54226 doesn't exist on my host, which I assume is why CNI is complaining. But the pause containers for Calico are fine, running the same image, so something must be either tearing down the network namespace too early only in the case of kube-dns, or never touching it in the case of Calico. I found some references to a similar SELinux-related error on OpenShift, but I'm running a bare Ubuntu 14.04 VM without SELinux even installed.
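To confirm the namespace really is gone, the path CNI calls Statfs on can be checked directly. A rough sketch (the docker inspect line shows how I'd get the PID for a real pause container; the container ID is just the one from above, for illustration):

```shell
# Check whether a process's network-namespace path exists on the host,
# i.e. the /proc/<pid>/ns/net path that CNI fails to Statfs.
check_netns() {
  if [ -e "/proc/$1/ns/net" ]; then
    echo "netns present for pid $1"
  else
    echo "netns missing for pid $1"
  fi
}

# For a live pause container, the PID would come from docker inspect:
#   pid=$(docker inspect --format '{{.State.Pid}}' 216e39defa36)
#   check_netns "$pid"

# Demo against the current shell's own pid, which always has a netns:
check_netns "$$"
```

For the failing kube-dns pod, the pause container has already exited by the time CNI runs, so the check reports the namespace as missing.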

ubuntu@r172-16-6-39:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.4 LTS
Release:        14.04
Codename:       trusty
ubuntu@r172-16-6-39:~$ setenforce
The program 'setenforce' is currently not installed. You can install it by typing:
sudo apt-get install selinux-utils

My CNI conf is also pretty simple, generated by the install-cni calico container:

ubuntu@r172-16-6-39:~$ cat /etc/cni/net.d/10-calico.conf
{
    "name": "k8s-pod-network",
    "cniVersion": "0.1.0",
    "type": "calico",
    "log_level": "debug",
    "datastore_type": "kubernetes",
    "nodename": "172.16.6.39",
    "mtu": 1500,
    "ipam": {
        "type": "host-local",
        "subnet": "usePodCidr"
    },
    "policy": {
        "type": "k8s",
        "k8s_auth_token": "****"
    },
    "kubernetes": {
        "k8s_api_root": "https://168.16.0.1:443",
        "kubeconfig": "/etc/kubernetes/kubeconfig"
    }
}
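As a sanity check, the conf parses as JSON, and its "type" field names the plugin binary kubelet will exec from the CNI bin dir (default /opt/cni/bin). Sketch below with a trimmed copy of the conf inlined, since the real file lives on the node:

```shell
# Trimmed, inlined copy of the generated conf (illustrative, not the
# full file) to show the parse / plugin-lookup steps.
conf='{"name": "k8s-pod-network", "cniVersion": "0.1.0", "type": "calico"}'

# 1. Does the conf parse as JSON at all?
echo "$conf" | python3 -m json.tool >/dev/null && echo "conf parses"

# 2. Which plugin binary should exist on the node?
plugin=$(echo "$conf" | python3 -c 'import json,sys; print(json.load(sys.stdin)["type"])')
echo "expected plugin binary: /opt/cni/bin/${plugin}"
# On the node itself, you'd then verify:  ls -l /opt/cni/bin/calico
```

Both checks pass on my node, so the conf itself doesn't look like the culprit.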

Has anyone hit something similar?
