I am running Kubernetes on Talos, with these versions:
clientVersion:
  buildDate: "2023-08-13T08:25:01Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: dirty
  gitVersion: v1.27.3-dirty
  goVersion: go1.20.5
  major: "1"
  minor: 27+
  platform: linux/amd64
kustomizeVersion: v5.0.1
serverVersion:
  buildDate: "2023-06-14T09:47:40Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: clean
  gitVersion: v1.27.3
  goVersion: go1.20.5
  major: "1"
  minor: "27"
  platform: linux/arm64
I have run into an issue where the scheduler keeps trying to schedule a pod onto a node that no longer exists (referred to below as node X; its name was talos-j5b-9sl).
This seems to have started after I did maintenance on my single control-plane node (the etcd leader) and restored an etcd snapshot that still contained node X. After the restore, I deleted node X. I have a feeling I deleted it too soon, and now there is an inconsistency somewhere: part of the system thinks node X exists while another part knows it does not.
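For context, the restore and the deletion were roughly the following; the node IP and snapshot path are placeholders, not the exact commands I ran:
talosctl -n <control-plane-ip> bootstrap --recover-from=./etcd.snapshot
kubectl delete node talos-j5b-9sl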
When I run k get nodes, I get the following:
NAME            STATUS   ROLES           AGE    VERSION
talos-00b-3lq   Ready    worker          9d     v1.27.3
talos-9lo-ono   Ready    control-plane   9d     v1.27.3
talos-tob-7ed   Ready    worker          150d   v1.27.3
talos-wyr-reh   Ready    worker          126d   v1.27.3
talos-y15-g0m   Ready    worker          92d    v1.27.3
You can see that node X is not in the list of known nodes; however, I have one StatefulSet pod that is failing to schedule with the following error:
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  4m8s (x18 over 69m)  default-scheduler  nodeinfo not found for node name "talos-j5b-9sl"
And when checking the scheduler's logs:
E0813 08:32:11.154650 1 schedule_one.go:883] "Error scheduling pod; retrying" err="nodeinfo not found for node name \"talos-j5b-9sl\"" pod="psql-operator/main-0"
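(I pulled those logs from the kube-scheduler static pod on the control-plane node; the pod name below is derived from my node name, so treat it as approximate:)
k logs -n kube-system kube-scheduler-talos-9lo-ono | grep talos-j5b-9sl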
I've spent some time reading the k8s code base and came to the conclusion that this has something to do with caching: from what I can tell, the scheduler keeps an in-memory snapshot that maps node names to NodeInfo objects, and the error above comes from a lookup miss in that map. However, I don't understand the details well enough to turn that into a solution.
Source: https://github.com/kubernetes/kubernetes/blob/v1.27.3/pkg/scheduler/internal/cache/snapshot.go#L193 https://github.com/kubernetes/kubernetes/blob/v1.27.3/pkg/scheduler/internal/cache/snapshot.go#L27-L28
I tried to look into etcd manually, but most of the values are binary (protobuf) data. The idea was to find the inconsistency there, fix it, create a snapshot from my locally modified etcd copy, and restore that on the control-plane, but I did not succeed.
etcdctl get "" --prefix --keys-only | grep talos-j5b-9sl
Yields no results.
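For reference, node objects are normally stored under the /registry/minions/ prefix, so a more targeted lookup (assuming the default key prefix) would be something like:
etcdctl get /registry/minions/talos-j5b-9sl --keys-only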
I tried deleting and re-creating the StatefulSet, but after re-creating it, the pod still fails to schedule because the scheduler still wants to place it on node X.
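The delete step was roughly the following (the StatefulSet name here is inferred from the pod name main-0, so it is approximate):
k delete statefulset -n psql-operator main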
The StatefulSet does not have a nodeSelector set:
❯ kgp -o yaml -n psql-operator main-0 | grep -i selector
- labelSelector:
Nor is the nodeName field set:
k explain pod.spec.nodeName
KIND:       Pod
VERSION:    v1
FIELD:      nodeName <string>
DESCRIPTION:
     NodeName is a request to schedule this pod onto a specific node. If it is
     non-empty, the scheduler simply schedules this pod onto that node, assuming
     that it fits resource requirements.
kgp -n psql-operator main-0 -o json | jq '.spec.nodeName'
Returns null, i.e. nodeName is not set.
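The same kind of check, covering every spec field that could pin the pod to a specific node, would be something like:
kgp -n psql-operator main-0 -o json | jq '{nodeName: .spec.nodeName, nodeSelector: .spec.nodeSelector, affinity: .spec.affinity}'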
I ran out of ideas and would love some tips on how to solve this. My last resort is to start from a fresh install, but that would be painful.