
I have installed 64-bit Ubuntu on my Raspberry Pi 4, and it seems that each pod restarts frequently:

microk8s.kubectl describe pod redis-c49fd5d65-g8ghn
Name:         redis-c49fd5d65-g8ghn
Namespace:    default
Priority:     0
Node:         raspberrypi4-docker1/192.168.0.45
Start Time:   Thu, 10 Sep 2020 08:11:38 +0000
Labels:       app=redis
              pod-template-hash=c49fd5d65
Annotations:  <none>
Status:       Running
IP:           10.1.42.201
IPs:
  IP:           10.1.42.201
Controlled By:  ReplicaSet/redis-c49fd5d65
Containers:
  redis:
    Container ID:   containerd://9b8300e456691025ccbfbee588a52069a1fa25ffa6f0c1b5f5f652227a1172f3
    Image:          hypriot/rpi-redis:latest
    Image ID:       sha256:2e0128f189c5b19a15001e48fac1d0326326cebb4195abf6a56519e374636f1f
    Port:           6379/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sun, 07 Mar 2021 10:15:57 +0000
    Last State:     Terminated
      Reason:       Unknown
      Exit Code:    255
      Started:      Sun, 07 Mar 2021 09:24:16 +0000
      Finished:     Sun, 07 Mar 2021 10:14:43 +0000
    Ready:          True
    Restart Count:  4579
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dn4bk (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  default-token-dn4bk:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-dn4bk
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason          Age                    From     Message
  ----     ------          ----                   ----     -------
  Normal   SandboxChanged  8d                     kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal   SandboxChanged  8d                     kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled          8d                     kubelet  Container image "hypriot/rpi-redis:latest" already present on machine
  Normal   Created         8d                     kubelet  Created container redis
  Normal   Started         8d                     kubelet  Started container redis
  Normal   SandboxChanged  8d                     kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled          8d                     kubelet  Container image "hypriot/rpi-redis:latest" already present on machine
  Normal   Created         8d                     kubelet  Created container redis
  Normal   Started         8d                     kubelet  Started container redis
  Normal   SandboxChanged  8d                     kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled          8d                     kubelet  Container image "hypriot/rpi-redis:latest" already present on machine
  Normal   Created         8d                     kubelet  Created container redis
  Normal   Started         8d                     kubelet  Started container redis

...
  Normal   SandboxChanged  108m                   kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled          107m                   kubelet  Container image "hypriot/rpi-redis:latest" already present on machine
  Normal   Created         107m                   kubelet  Created container redis
  Normal   Started         107m                   kubelet  Started container redis
  Normal   SandboxChanged  101m                   kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled          101m                   kubelet  Container image "hypriot/rpi-redis:latest" already present on machine
  Normal   Created         101m                   kubelet  Created container redis
  Normal   Started         101m                   kubelet  Started container redis
  Normal   SandboxChanged  49m                    kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled          49m                    kubelet  Container image "hypriot/rpi-redis:latest" already present on machine
  Normal   Started         49m                    kubelet  Started container redis
  Normal   Created         49m                    kubelet  Created container redis

I have read that this error can be the result of a networking failure. What I could find are DNS error messages in my journalctl logs:

Mar 07 11:24:52 raspberrypi4-docker1 systemd-resolved[1760]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Mar 07 11:24:52 raspberrypi4-docker1 systemd-resolved[1760]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Mar 07 11:24:52 raspberrypi4-docker1 systemd-resolved[1760]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Mar 07 11:24:52 raspberrypi4-docker1 systemd-resolved[1760]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Mar 07 11:24:55 raspberrypi4-docker1 microk8s.daemon-kubelet[4953]: E0307 11:24:55.190320    4953 summary_sys_containers.go:47] Failed to get system container stats for "/systemd/system.slice": failed to get cgroup stats for "/systemd/system.slice": failed to get container info for "/systemd/system.slice": unknown container "/systemd/system.slice"
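
A quick way to check whether cluster DNS is healthy from inside a pod (a sketch; the busybox image and the pod name dnstest are arbitrary choices, not something from my setup):

# one-off pod that resolves the in-cluster kubernetes service, removed afterwards
microk8s.kubectl run dnstest --rm -it --image=busybox --restart=Never -- nslookup kubernetes.default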

Microk8s inspect output:

Inspecting Certificates
Inspecting services
  Service snap.microk8s.daemon-cluster-agent is running
  Service snap.microk8s.daemon-flanneld is running
  Service snap.microk8s.daemon-containerd is running
  Service snap.microk8s.daemon-apiserver is running
  Service snap.microk8s.daemon-apiserver-kicker is running
  Service snap.microk8s.daemon-proxy is running
  Service snap.microk8s.daemon-kubelet is running
  Service snap.microk8s.daemon-scheduler is running
  Service snap.microk8s.daemon-controller-manager is running
  Service snap.microk8s.daemon-etcd is running
  Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
  Copy processes list to the final report tarball
  Copy snap list to the final report tarball
  Copy VM name (or none) to the final report tarball
  Copy disk usage information to the final report tarball
  Copy memory usage information to the final report tarball
  Copy server uptime to the final report tarball
  Copy current linux distribution to the final report tarball
  Copy openSSL information to the final report tarball
  Copy network configuration to the final report tarball
Inspecting kubernetes cluster
  Inspect kubernetes cluster

Building the report tarball
  Report tarball is at /var/snap/microk8s/2038/inspection-report-20210307_113359.tar.gz

How can I prevent the containers from restarting?

Dániel Kis
  • Hi, I am facing the same issue, though my pods restart only twice a day. I saw a few errors in different logs, but so far nothing that points to the root cause. If I find a solution I will come back to you. – Josef Biehler Mar 23 '21 at 22:30
  • Hi, I updated both of my MicroK8s instances (I have two bare-metal servers at home) to MicroK8s 1.20, and there has been no restart for 16 hours. Not sure if this is the final solution, but normally I would see at least 1-2 restarts in that time period. – Josef Biehler Mar 25 '21 at 11:02
  • Unfortunately, after 22 hours I saw the first restart. Have you taken a look at all the log files generated by "microk8s inspect"? In my case, for example, the kubelet log is full of errors all the time; I think there is no period without errors, for example "Failed to create existing container:" and much more. – Josef Biehler Mar 26 '21 at 07:14
  • A completely fresh installation did not solve the problem. I opened an issue because I have run out of ideas about what is wrong: https://github.com/ubuntu/microk8s/issues/2132 – Josef Biehler Mar 26 '21 at 18:42
  • The fresh installation of MicroK8s made it even worse, as the pods were not able to connect to the internet. Finally I decided to give K3s a try. – Josef Biehler Mar 26 '21 at 22:31

1 Answer


I switched back from K3s to MicroK8s and got the restarts on a freshly installed single-node MicroK8s cluster. After inspecting the logs I found a strange entry from one of the MicroK8s services:

root@biehler2:/git/webnut-ups-k8s/k8s# snap logs microk8s.daemon-apiserver-kicker
2022-11-11T18:49:14Z microk8s.daemon-apiserver-kicker[335004]: CSR change detected. Reconfiguring the kube-apiserver

A Google search led me to this: https://github.com/canonical/microk8s/issues/1710#issuecomment-721043408

When the kicker detects some change in the network it will reconfigure and restart the apiserver. This is to help those who are moving from one network to another.

I think the API server restarts may have forced the pod restarts (I don't know how K8s works in this regard). At least the timestamps of the pod restarts matched the timestamps in the service log.
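
To check that correlation yourself, the two timelines can be pulled side by side (a sketch; note that Kubernetes keeps events only for a short time by default):

# kicker activity over the last day
journalctl -u snap.microk8s.daemon-apiserver-kicker --since "1 day ago" --no-pager
# container start events, sorted by time, to compare against the kicker log
microk8s.kubectl get events --field-selector reason=Started --sort-by=.lastTimestamp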

I also found a possible solution:

I think yes, you can turn it off: systemctl stop snap.microk8s.daemon-apiserver-kicker
You can give that a try though.
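
Note that stopping the service with systemctl does not survive a reboot. A persistent variant using snap (a sketch; confirm the exact service name first):

# list the microk8s services to confirm the name
snap services microk8s
# stop the kicker and keep it stopped across reboots
sudo snap stop --disable microk8s.daemon-apiserver-kicker
# undo with: sudo snap start --enable microk8s.daemon-apiserver-kicker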

A few comments below I found this:

Do you by any chance have IPv6 enabled on the machine? For example, when you do hostname -I, does it show IPs which are not IPv4? This can be the reason why the kicker is constantly restarting the apiserver.

If you can, turn off IPv6 and then re-enable the kicker; it shouldn't restart the apiserver anymore.

And this:

If you want to disable the kicker from detecting network changes, you can follow the instructions below. See if this helps.

The author of the comment referenced this link: https://github.com/canonical/microk8s/issues/1822#issuecomment-745335208

MicroK8s has a service that periodically checks for changes on your network and reconfigures and restarts the API server if needed. Looking at this service's logs (journalctl -n 1000 -u snap.microk8s.daemon-apiserver-kicker) I see a few restarts. To stop the API server restarts even if the network changes, you can configure it to use a specific interface. To do so, edit /var/snap/microk8s/current/args/kube-apiserver and add one of the arguments --advertise-address or --bind-address as described in [1]; then do a microk8s.stop; microk8s.start.

[1] https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/
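
Put together, that reconfiguration might look like this (a sketch; 192.168.0.45 is a placeholder for the node's LAN IP):

# pin the API server to a fixed address so network changes no longer trigger the kicker
echo "--advertise-address=192.168.0.45" | sudo tee -a /var/snap/microk8s/current/args/kube-apiserver
microk8s.stop
microk8s.start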

So I see three options:

  • disable IPv6 (see the sketch after this list)
  • disable the kicker service
  • reconfigure the API server
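
For the first option, a sketch of disabling IPv6 on Ubuntu via sysctl (runtime only; persisting it requires adding the keys to /etc/sysctl.conf):

# turn off IPv6 on all interfaces until the next reboot
sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1
# to persist, add both keys to /etc/sysctl.conf and apply with: sudo sysctl -p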

I recently disabled the kicker service. Since then I have seen no restarts and no impact on my cluster (beyond that, I have no idea what the kicker service does). Maybe this can help other people, too.

Josef Biehler
  • Disabling the kicker service was not successful, but maybe I made a mistake and the shutdown of the service was only temporary. I then disabled IPv6 and have seen no restart for 26 hours. Looks good so far. – Josef Biehler Nov 14 '22 at 11:35