
I am trying to install the kube-prometheus-stack Helm chart with the default configuration in values.yaml:

helm install prometheus prometheus-community/kube-prometheus-stack

The install fails with "Error: INSTALLATION FAILED: failed pre-install: timed out waiting for the condition". I then added the "--timeout" flag to increase the timeout:

helm install prometheus prometheus-community/kube-prometheus-stack --timeout 30m

The error then became "Error: INSTALLATION FAILED: failed pre-install: job failed: BackoffLimitExceeded" with the message "Job has reached the specified backoff limit". I captured the logs of the "prometheus-admission-create" pod during installation:

W0404 01:47:16.302029       1 client_config.go:615] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
{"err":"Get \"https://10.96.0.1:443/api/v1/namespaces/default/secrets/prometheus-kube-prometheus-admission\": dial tcp 10.96.0.1:443: i/o timeout","level":"fatal","msg":"error getting secret","source":"k8s/k8s.go:232","time":"2023-04-04T01:47:46Z"}
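The fatal line in this log is a network timeout rather than a permissions or certificate error: the pod could not reach the API server's ClusterIP. As a minimal sketch for confirming that, the hypothetical helper `dial_target` below extracts the unreachable address from an abbreviated copy of the log line. The commented `kubectl run` line then probes that address from a throwaway pod, assuming the `curlimages/curl` image can be pulled in this cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the HOST:PORT from a "dial tcp HOST:PORT:" message on stdin.
dial_target() {
  sed -n 's/.*dial tcp \([0-9.]*:[0-9]*\):.*/\1/p'
}

# Abbreviated copy of the fatal log line from the admission-create pod.
log_line='{"err":"Get \"https://10.96.0.1:443/api/v1/namespaces/default/secrets/prometheus-kube-prometheus-admission\": dial tcp 10.96.0.1:443: i/o timeout","level":"fatal"}'

target=$(printf '%s\n' "$log_line" | dial_target)
echo "unreachable API endpoint: $target"

# Assumption: run this manually against the cluster; it is not executed here.
# kubectl run api-check --rm -it --restart=Never --image=curlimages/curl -- \
#   curl -k -m 10 "https://$target/version"
```

Any HTTP response from the probe, even an authorization error, would suggest that pod-to-API-server traffic works. Another timeout would point at the cluster network (CNI plugin, kube-proxy, or firewall rules) rather than at the chart.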
 

These are my events:

    LAST SEEN   TYPE      REASON                 OBJECT                                                    MESSAGE
    15m         Normal    Scheduled              pod/prometheus-kube-prometheus-admission-create-pxhml     Successfully assigned default/prometheus-kube-prometheus-admission-create-pxhml to dangln
    12m         Normal    Pulled                 pod/prometheus-kube-prometheus-admission-create-pxhml     Container image "registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6" already present on machine
    12m         Normal    Created                pod/prometheus-kube-prometheus-admission-create-pxhml     Created container create
    12m         Normal    Started                pod/prometheus-kube-prometheus-admission-create-pxhml     Started container create
    10m         Warning   BackOff                pod/prometheus-kube-prometheus-admission-create-pxhml     Back-off restarting failed container
    9m57s       Normal    Scheduled              pod/prometheus-kube-prometheus-admission-create-wljj7     Successfully assigned default/prometheus-kube-prometheus-admission-create-wljj7 to dangln
    6m29s       Normal    Pulled                 pod/prometheus-kube-prometheus-admission-create-wljj7     Container image "registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6" already present on machine
    6m29s       Normal    Created                pod/prometheus-kube-prometheus-admission-create-wljj7     Created container create
    6m29s       Normal    Started                pod/prometheus-kube-prometheus-admission-create-wljj7     Started container create
    4m53s       Warning   BackOff                pod/prometheus-kube-prometheus-admission-create-wljj7     Back-off restarting failed container
    15m         Normal    SuccessfulCreate       job/prometheus-kube-prometheus-admission-create           Created pod: prometheus-kube-prometheus-admission-create-pxhml
    9m57s       Normal    SuccessfulCreate       job/prometheus-kube-prometheus-admission-create           Created pod: prometheus-kube-prometheus-admission-create-wljj7
    75s         Normal    SuccessfulDelete       job/prometheus-kube-prometheus-admission-create           Deleted pod: prometheus-kube-prometheus-admission-create-wljj7
    75s         Warning   BackoffLimitExceeded   job/prometheus-kube-prometheus-admission-create           Job has reached the specified backoff limit
    2m49s       Normal    FailedBinding          persistentvolumeclaim/storage-prometheus-alertmanager-0   no persistent volumes available for this claim and no storage class is set

This is the job description:

Name:             prometheus-kube-prometheus-admission-create
Namespace:        default
Selector:         controller-uid=894df657-3d3d-4c09-9f5b-e59cc475bb28
Labels:           app=kube-prometheus-stack-admission-create
                  app.kubernetes.io/instance=prometheus
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/part-of=kube-prometheus-stack
                  app.kubernetes.io/version=45.8.1
                  chart=kube-prometheus-stack-45.8.1
                  heritage=Helm
                  release=prometheus
Annotations:      helm.sh/hook: pre-install,pre-upgrade
                  helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded
Parallelism:      1
Completions:      1
Completion Mode:  NonIndexed
Start Time:       Tue, 04 Apr 2023 08:30:28 +0700
Pods Statuses:    0 Running / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           app=kube-prometheus-stack-admission-create
                    app.kubernetes.io/instance=prometheus
                    app.kubernetes.io/managed-by=Helm
                    app.kubernetes.io/part-of=kube-prometheus-stack
                    app.kubernetes.io/version=45.8.1
                    chart=kube-prometheus-stack-45.8.1
                    controller-uid=894df657-3d3d-4c09-9f5b-e59cc475bb28
                    heritage=Helm
                    job-name=prometheus-kube-prometheus-admission-create
                    release=prometheus
  Service Account:  prometheus-kube-prometheus-admission
  Containers:
   create:
    Image:      registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6
    Port:       <none>
    Host Port:  <none>
    Args:
      create
      --host=prometheus-kube-prometheus-operator,prometheus-kube-prometheus-operator.default.svc
      --namespace=default
      --secret-name=prometheus-kube-prometheus-admission
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type     Reason                Age                    From            Message
  ----     ------                ----                   ----            -------
  Normal   SuccessfulCreate      12m                    job-controller  Created pod: prometheus-kube-prometheus-admission-create-wljj7
  Normal   SuccessfulDelete      3m28s                  job-controller  Deleted pod: prometheus-kube-prometheus-admission-create-wljj7
  Warning  BackoffLimitExceeded  3m28s (x2 over 3m28s)  job-controller  Job has reached the specified backoff limit

I considered changing the backoffLimit value in values.yaml, but I can't find a backoffLimit setting in that file. How can I fix this error?
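One way to check which settings the chart exposes for this hook job is to dump the chart's default values and search them. The sketch below uses a hypothetical helper, `search_values`, and a short stand-in sample instead of the real output of `helm show values`, so it runs without a cluster or Helm installed.

```shell
#!/bin/sh
# Hypothetical helper: case-insensitive search of values text on stdin, with line numbers.
search_values() {
  grep -n -i -- "$1"
}

# Stand-in sample for illustration only; not the chart's real values file.
sample='prometheusOperator:
  admissionWebhooks:
    enabled: true'

# grep exits non-zero when nothing matches, so tolerate that with "|| true".
matches=$(printf '%s\n' "$sample" | search_values 'backofflimit' || true)
echo "matches for backoffLimit: ${matches:-none}"

# Against the real chart (requires helm and the prometheus-community repo):
# helm show values prometheus-community/kube-prometheus-stack | search_values 'admissionWebhooks'
```

Note that a higher backoff limit would only allow more retries. If every attempt ends in the same i/o timeout shown in the pod log, the job would presumably still fail after the extra attempts.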

  • According to log, the container `create` has a problem. Check the log during installation. – Andromeda Mar 29 '23 at 09:27
  • [Please do not upload images of code/data/errors.](//meta.stackoverflow.com/q/285551) Without being able to see the text of the Job YAML and without knowing what program is running in the Pod it's hard to tell what might be going on. – David Maze Mar 29 '23 at 11:25
  • Thank you so much, I will replace the image with code-formatted text later – Đăng Lương Mar 29 '23 at 14:52
