
I have an Indexed Job. For some runs, Kubernetes (1.24.10 on AKS) creates pods for certain completion indices multiple times before marking the Job as complete. I am at a loss as to what might be causing this behaviour.

apiVersion: batch/v1
kind: Job
metadata:
  name: lombard-ttc-lgd-20230228
spec:
  completions: 6
  parallelism: 6
  completionMode: Indexed
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never

The describe job output below shows multiple pod creations for the same index, even though all the previous executions were successful. Note the zero failed pods under Pods Statuses. Eventually the Job is marked as complete. Not all indices are retried the same number of times.

$ kubectl describe job lombard-pit-pd-20230228
Name:           lombard-pit-pd-20230228
Namespace:      default
Selector:       controller-uid=111df9b6-d1ad-44af-9e9f-a3aa548bc6f4
Labels:         controller-uid=111df9b6-d1ad-44af-9e9f-a3aa548bc6f4
                job-name=lombard-pit-pd-20230228
Annotations:    Parallelism:  6
Completions:    6
Start Time:     Fri, 28 Apr 2023 16:06:05 +0100
Pods Statuses:  4 Running / 2 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=111df9b6-d1ad-44af-9e9f-a3aa548bc6f4
           job-name=lombard-pit-pd-20230228
  Containers:
   ubuntu:
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  54m   job-controller  Created pod: lombard-pit-pd-20230228-0-5tk9f
  Normal  SuccessfulCreate  54m   job-controller  Created pod: lombard-pit-pd-20230228-2-xlxxb
  Normal  SuccessfulCreate  54m   job-controller  Created pod: lombard-pit-pd-20230228-1-8mlzj
  Normal  SuccessfulCreate  54m   job-controller  Created pod: lombard-pit-pd-20230228-5-l8wxx
  Normal  SuccessfulCreate  54m   job-controller  Created pod: lombard-pit-pd-20230228-3-fcnd6
  Normal  SuccessfulCreate  54m   job-controller  Created pod: lombard-pit-pd-20230228-4-9t5vt
  Normal  SuccessfulCreate  10m   job-controller  Created pod: lombard-pit-pd-20230228-2-rn2cn
  Normal  SuccessfulCreate  2m8s  job-controller  Created pod: lombard-pit-pd-20230228-5-dctlq
  Normal  SuccessfulCreate  42s   job-controller  Created pod: lombard-pit-pd-20230228-1-clgcs
  Normal  SuccessfulCreate  42s   job-controller  Created pod: lombard-pit-pd-20230228-3-x4hpg
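
To make the repeated creations easier to see, the events above can be tallied per completion index. This is only a rough sketch: it assumes the Indexed Job pod naming pattern `<job-name>-<index>-<5-character suffix>`, and the names `EVENTS`, `POD_RE` and `creations_per_index` are made up for illustration. The event messages are copied from the describe output.

```python
import re
from collections import Counter

# "Created pod" messages copied from the describe output above.
EVENTS = """\
Created pod: lombard-pit-pd-20230228-0-5tk9f
Created pod: lombard-pit-pd-20230228-2-xlxxb
Created pod: lombard-pit-pd-20230228-1-8mlzj
Created pod: lombard-pit-pd-20230228-5-l8wxx
Created pod: lombard-pit-pd-20230228-3-fcnd6
Created pod: lombard-pit-pd-20230228-4-9t5vt
Created pod: lombard-pit-pd-20230228-2-rn2cn
Created pod: lombard-pit-pd-20230228-5-dctlq
Created pod: lombard-pit-pd-20230228-1-clgcs
Created pod: lombard-pit-pd-20230228-3-x4hpg
"""

# Assumed name layout: <job-name>-<completion index>-<random 5-char suffix>.
POD_RE = re.compile(r"Created pod: (?P<job>.+)-(?P<index>\d+)-(?P<suffix>[a-z0-9]{5})$")


def creations_per_index(text):
    """Count how many pods were created for each completion index."""
    counts = Counter()
    for line in text.splitlines():
        match = POD_RE.search(line.strip())
        if match:
            counts[int(match.group("index"))] += 1
    return counts


counts = creations_per_index(EVENTS)
# Indices for which more than one pod was created.
recreated = sorted(index for index, n in counts.items() if n > 1)

if __name__ == "__main__":
    for index in sorted(counts):
        print(f"index {index}: {counts[index]} pod(s) created")
    print("indices created more than once:", recreated)
```

For the events shown, this reports indices 1, 2, 3 and 5 as created twice and indices 0 and 4 as created once. It only summarises the symptom and does not explain why the pods were recreated.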

Neither the Kubernetes events nor the pod logs show any error messages.
