
I found many questions about this on StackOverflow; most are unanswered or over-complicated. I have shrunk my issue down to a simple "Hello world" test in a brand-new empty cluster.

I have a K3s cluster: the master is an online bare-metal AMD64 server, and the nodes are local Raspberry Pi 400 ARM64 Debian hosts.

I'm trying to deploy this DaemonSet:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: hello-world
  labels:
    app: hello-world
spec:
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
        - name: hello-world
          image: nginxdemos/hello
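
I apply it with (assuming the manifest is saved as hello-world.yaml):

kubectl apply -f hello-world.yaml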

Then all my pods stay in the Pending state:

kubectl get pods

NAME                READY   STATUS    RESTARTS   AGE
hello-world-qsv2d   0/1     Pending   0          7m53s
hello-world-6rn5d   0/1     Pending   0          7m53s

A description of one of the pods gives me:

kubectl describe pod hello-world-6rn5d

Name:           hello-world-6rn5d
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app=hello-world
                controller-revision-hash=649569d94c
                pod-template-generation=1
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  DaemonSet/hello-world
Containers:
  hello-world:
    Image:        hello-world
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8bh8p (ro)
Volumes:
  kube-api-access-8bh8p:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:                      <none>
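
The pod events are empty, so the only other signal I know to look at is the cluster-wide event log (plain kubectl, nothing k3s-specific):

kubectl get events --all-namespaces --sort-by=.metadata.creationTimestamp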

I have already used these same nodes in a local ARM64 cluster, and they worked fine.

kubectl get nodes

NAME    STATUS   ROLES    AGE   VERSION
pi417   Ready    <none>   11h   v1.24.4+k3s1
pi400   Ready    <none>   11h   v1.24.4+k3s1

kubectl version --output=yaml

clientVersion:
  buildDate: "2022-06-15T14:22:29Z"
  compiler: gc
  gitCommit: f66044f4361b9f1f96f0053dd46cb7dce5e990a8
  gitTreeState: clean
  gitVersion: v1.24.2
  goVersion: go1.18.3
  major: "1"
  minor: "24"
  platform: windows/amd64
kustomizeVersion: v4.5.4
serverVersion:
  buildDate: "2022-08-25T03:45:26Z"
  compiler: gc
  gitCommit: c3f830e9b9ed8a4d9d0e2aa663b4591b923a296e
  gitTreeState: clean
  gitVersion: v1.24.4+k3s1
  goVersion: go1.18.1
  major: "1"
  minor: "24"
  platform: linux/amd64

A node description:

kubectl.exe describe node pi400

Name:               pi400
Roles:              <none>
Labels:             adb=true
                    beta.kubernetes.io/arch=arm64
                    beta.kubernetes.io/instance-type=k3s
                    beta.kubernetes.io/os=linux
                    egress.k3s.io/cluster=true
                    kubernetes.io/arch=arm64
                    kubernetes.io/hostname=pi400
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=k3s
Annotations:        flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"26:a8:bd:f3:1d:fd"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 192.168.3.25
                    k3s.io/hostname: pi400
                    k3s.io/internal-ip: 192.168.3.25
                    k3s.io/node-args: ["agent"]
                    k3s.io/node-config-hash: CBEQF3QV5PMMQWO2GECMRPJVEIFSCEFARQFZKX4RNV4K5FPB7FGQ====
                    k3s.io/node-env:
                      {"K3S_DATA_DIR":"/var/lib/rancher/k3s/data/8...2a","K3S_NODE_NAME":"pi400" ...}
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Mon, 12 Sep 2022 20:44:50 +0300
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  pi400
  AcquireTime:     <unset>
  RenewTime:       Tue, 13 Sep 2022 08:53:29 +0300
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Tue, 13 Sep 2022 08:51:08 +0300   Mon, 12 Sep 2022 21:33:41 +0300   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Tue, 13 Sep 2022 08:51:08 +0300   Mon, 12 Sep 2022 21:33:41 +0300   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Tue, 13 Sep 2022 08:51:08 +0300   Mon, 12 Sep 2022 21:33:41 +0300   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Tue, 13 Sep 2022 08:51:08 +0300   Mon, 12 Sep 2022 21:33:41 +0300   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  192.168.3.25
  Hostname:    pi400
Capacity:
  cpu:                4
  ephemeral-storage:  30473608Ki
  memory:             3885428Ki
  pods:               110
Allocatable:
  cpu:                4
  ephemeral-storage:  29644725840
  memory:             3885428Ki
  pods:               110
System Info:
  Machine ID:                 d2eb1415b12e45ebac766cc20ce58012
  System UUID:                d2eb1415b12e45ebac766cc20ce58012
  Boot ID:                    c2531ffa-96b0-4463-9f51-08e0dce6d5c3
  Kernel Version:             5.15.61-v8+
  OS Image:                   Debian GNU/Linux 11 (bullseye)
  Operating System:           linux
  Architecture:               arm64
  Container Runtime Version:  containerd://1.6.6-k3s1
  Kubelet Version:            v1.24.4+k3s1
  Kube-Proxy Version:         v1.24.4+k3s1
PodCIDR:                      10.42.1.0/24
PodCIDRs:                     10.42.1.0/24
ProviderID:                   k3s://pi400
Non-terminated Pods:          (0 in total)
  Namespace                   Name    CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----    ------------  ----------  ---------------  -------------  ---
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests  Limits
  --------           --------  ------
  cpu                0 (0%)    0 (0%)
  memory             0 (0%)    0 (0%)
  ephemeral-storage  0 (0%)    0 (0%)
Events:              <none>

The issue may be due to:

  • the mixing of AMD64 / ARM64 (a quick check is sketched after this list)
  • a connection issue between the nodes
  • K3s
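
For the first point, a quick check is to compare the node architectures against the image's manifest (the docker command assumes a Docker CLI on the workstation):

kubectl get nodes -L kubernetes.io/arch
docker manifest inspect nginxdemos/hello | grep architecture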

I do not know how to get more info about the situation, and I do not have any other AMD64 node for more tests right now.


1 Answer


After more testing: I had the same issue even when all the nodes used the same architecture.

The issue was due to an incompatibility with the distro on my master node. I could have spotted it right after the k3s setup on the master.

Once the setup is done on the server, run kubectl get nodes. On a K3s setup, the master must be visible as a node; if it is not, do not try to add any more nodes.

So here is my k3s setup:

STEP 1 preconfigure your k3s

mkdir -p /etc/rancher/k3s/
nano /etc/rancher/k3s/config.yaml

Add all the options that are not available via environment variables to config.yaml before starting the setup script. Each key corresponds to a k3s CLI flag (e.g. tls-san is --tls-san); the content may look like:

write-kubeconfig-mode: "0644"
tls-san:
  - "1.2.3.4"

STEP 2 start the master node setup

curl -sfL https://get.k3s.io | sh -

STEP 3 check that the master node is live

kubectl get nodes

If you do not see the master node, start investigating (kernel options, and more); two starting checks are sketched below.
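
A minimal starting point, assuming the default k3s systemd unit name and, for Raspberry Pi hosts, the usual cgroup requirement from the k3s docs:

# inspect the k3s service logs ("k3s" on the server, "k3s-agent" on agents)
sudo journalctl -u k3s -e

# on Raspberry Pi hosts, check that the memory cgroup is enabled;
# if not, add "cgroup_memory=1 cgroup_enable=memory" to /boot/cmdline.txt and reboot
grep memory /proc/cgroups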

STEP 4 get your credentials

Get your credentials for remote access to your k3s from the file /etc/rancher/k3s/k3s.yaml, and change the clusters.cluster.server IP from 127.0.0.1 to a valid remote IP. Paste the new config file into ~/.kube/config.
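
A minimal sketch of that edit (the <remote-ip> placeholder is yours to fill in; file paths as above):

# rewrite the server address and install the result as the local kubeconfig
sed 's/127.0.0.1/<remote-ip>/' /etc/rancher/k3s/k3s.yaml > ~/.kube/config

Run this on the master and copy the result to your workstation, or copy k3s.yaml over first.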

STEP 5 try to connect with kubectl

kubectl get node

STEP 6 add nodes

Start adding your nodes using your token from /var/lib/rancher/k3s/server/node-token

# FOR SLAVE customise hostname
# export K3S_NODE_NAME=pi417
export K3S_TOKEN=<token from /var/lib/rancher/k3s/server/node-token>
export K3S_URL=https://<remote-ip>:6443
curl -sfL https://get.k3s.io | sh -
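
The token itself can be read on the master (standard k3s location; root access needed):

sudo cat /var/lib/rancher/k3s/server/node-token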

STEP 7 if the setup gets stuck

Press CTRL+C, then restart the cgroup slices, or simply reboot:

sudo systemctl restart kubepods.slice kubepods-besteffort.slice

STEP 8 start a hello world on all nodes to check that everything is OK

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: hello-world
  labels:
    app: hello-world
spec:
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
        - name: hello-world
          image: nginxdemos/hello
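
To check the rollout (plain kubectl; the wide listing should show one Running pod per node):

kubectl rollout status daemonset/hello-world
kubectl get pods -o wide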