I'm not sure whether I'm using pod anti-affinity incorrectly or hitting a limitation of it, but I'm at a loss. I have a batch of jobs, and I want each one to run on its own node. This should be relatively simple: add an anti-affinity rule so that a pod is only scheduled onto a node (by hostname) that has no pod with the same label. Despite this, I still end up with multiple pods on the same node.
My best guess right now is that, because I create all the jobs at once with kubectl apply -f ./folder, the scheduler doesn't count pods that are still in the ContainerCreating state as matches for the anti-affinity rule, and so it schedules another pod onto the same node.
Each job needs a slightly different command line, so I can't use a single Job manifest with spec.parallelism, at least not until 1.22 comes out with job indexing.
Below is the Job YAML, in case there's something I'm missing.
apiVersion: batch/v1
kind: Job
metadata:
  name: testjob-$SHARD
spec:
  backoffLimit: 1
  template:
    metadata:
      labels:
        run: testjob
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: run
                operator: In
                values:
                - testjob
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: testjob
        imagePullPolicy: Always
        image: image
        resources:
          requests:
            memory: "3072Mi"
          limits:
            memory: "4Gi"
        command: ["./foo"]
        securityContext:
          privileged: true
      restartPolicy: OnFailure
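
For context on how the per-shard files end up in ./folder: the manifest uses a $SHARD placeholder, so one file is rendered per shard before applying. Here is a minimal sketch of that step, assuming the manifest is saved as a template file. The function name render_jobs, the template file name, and the output file naming are all hypothetical and not part of my actual setup.

```shell
# render_jobs TEMPLATE OUTDIR SHARD...
# Hypothetical helper: writes one manifest per shard into OUTDIR by replacing
# every literal "$SHARD" in TEMPLATE with the shard id.
# Assumption: shard ids contain no "/" or "&", which sed would treat specially.
render_jobs() {
    template=$1
    outdir=$2
    shift 2

    mkdir -p "$outdir"

    for shard in "$@"; do
        # \$ makes sed match a literal dollar sign rather than end-of-line.
        sed 's/\$SHARD/'"${shard}"'/g' "$template" \
            > "${outdir}/testjob-${shard}.yaml"
    done
}
```

Calling render_jobs job-template.yaml ./folder 0 1 2 would then produce three files, which kubectl apply -f ./folder submits together. That is why all the pods reach the scheduler at almost the same time.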