I am running Kubernetes on Talos, with these versions:
clientVersion:
  buildDate: "2023-08-13T08:25:01Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: dirty
  gitVersion: v1.27.3-dirty
  goVersion: go1.20.5
  major: "1"
  minor: 27+
  platform: linux/amd64
kustomizeVersion: v5.0.1
serverVersion:
  buildDate: "2023-06-14T09:47:40Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: clean
  gitVersion: v1.27.3
  goVersion: go1.20.5
  major: "1"
  minor: "27"
  platform: linux/arm64
I have run into an issue where the scheduler keeps trying to schedule a pod onto a node that no longer exists (referred to below as node X; its name was talos-j5b-9sl).
This seems to have started after I did maintenance on my single control-plane node (the etcd leader) and restored an etcd snapshot that still contained node X. After the restore, I deleted node X. I have a feeling I deleted it too soon, and now there is an inconsistency somewhere: part of the system thinks node X exists while another part knows it does not.
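For context, the restore and the deletion were roughly the following; the node IP and snapshot path are placeholders, not the exact commands I ran:
talosctl -n <control-plane-ip> bootstrap --recover-from=./etcd.snapshot
kubectl delete node talos-j5b-9sl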
When I run k get nodes, I get the following:
NAME            STATUS   ROLES           AGE    VERSION
talos-00b-3lq   Ready    worker          9d     v1.27.3
talos-9lo-ono   Ready    control-plane   9d     v1.27.3
talos-tob-7ed   Ready    worker          150d   v1.27.3
talos-wyr-reh   Ready    worker          126d   v1.27.3
talos-y15-g0m   Ready    worker          92d    v1.27.3
You can see that node X is not in the list of known nodes; however, I have one StatefulSet pod that is failing to schedule with the following error:
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  4m8s (x18 over 69m)  default-scheduler  nodeinfo not found for node name "talos-j5b-9sl"
And when checking the scheduler's logs:
E0813 08:32:11.154650 1 schedule_one.go:883] "Error scheduling pod; retrying" err="nodeinfo not found for node name \"talos-j5b-9sl\"" pod="psql-operator/main-0"
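(I pulled those logs from the kube-scheduler static pod on the control-plane node; the pod name below is derived from my node name, so treat it as approximate:)
k logs -n kube-system kube-scheduler-talos-9lo-ono | grep talos-j5b-9sl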
I've spent some time reading the k8s code base and came to the conclusion that this has something to do with caching: from what I can tell, the scheduler keeps an in-memory snapshot that maps node names to NodeInfo objects, and the error above comes from a lookup miss in that map. However, I don't understand the details well enough to turn that into a solution.
Source: https://github.com/kubernetes/kubernetes/blob/v1.27.3/pkg/scheduler/internal/cache/snapshot.go#L193 https://github.com/kubernetes/kubernetes/blob/v1.27.3/pkg/scheduler/internal/cache/snapshot.go#L27-L28
I tried to look into etcd manually, but most of the values are binary (protobuf) data. The idea was to find the inconsistency there, fix it, create a snapshot from my locally modified etcd copy, and restore that on the control-plane, but I did not succeed.
etcdctl get "" --prefix --keys-only | grep talos-j5b-9sl
Yields no results.
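For reference, node objects are normally stored under the /registry/minions/ prefix, so a more targeted lookup (assuming the default key prefix) would be something like:
etcdctl get /registry/minions/talos-j5b-9sl --keys-only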
I tried deleting and re-creating the StatefulSet, but after re-creating it, the pod still fails to schedule because the scheduler still wants to place it on node X.
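The delete step was roughly the following (the StatefulSet name here is inferred from the pod name main-0, so it is approximate):
k delete statefulset -n psql-operator main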
The StatefulSet does not have a nodeSelector set:
❯ kgp -o yaml -n psql-operator main-0 | grep -i selector
- labelSelector:
Nor is the nodeName field set:
k explain pod.spec.nodeName
KIND:       Pod
VERSION:    v1
FIELD:      nodeName <string>
DESCRIPTION:
     NodeName is a request to schedule this pod onto a specific node. If it is
     non-empty, the scheduler simply schedules this pod onto that node, assuming
     that it fits resource requirements.
kgp -n psql-operator main-0 -o json | jq '.spec.nodeName'
Returns null, i.e. nodeName is not set.
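The same kind of check, covering every spec field that could pin the pod to a specific node, would be something like:
kgp -n psql-operator main-0 -o json | jq '{nodeName: .spec.nodeName, nodeSelector: .spec.nodeSelector, affinity: .spec.affinity}'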
I ran out of ideas and would love some tips on how to solve this. My last resort is to start from a fresh install, but that would be painful.