
I am new to Kubernetes and am trying to set up Prometheus monitoring for my cluster. I used `helm install` to deploy Prometheus. Now:

  1. Two pods are still in the Pending state:
    • prometheus-server
    • prometheus-alertmanager
  2. I manually created a PersistentVolume for each.

Can anyone help me with how to map these PVs to the PVCs created by the Helm chart?
[centos@k8smaster1 ~]$ kubectl get pod -n monitoring
NAME                                             READY   STATUS    RESTARTS   AGE
prometheus-alertmanager-7757d759b8-x6bd7         0/2     Pending   0          44m
prometheus-kube-state-metrics-7f85b5d86c-cq9kr   1/1     Running   0          44m
prometheus-node-exporter-5rz2k                   1/1     Running   0          44m
prometheus-pushgateway-5b8465d455-672d2          1/1     Running   0          44m
prometheus-server-7f8b5fc64b-w626v               0/2     Pending   0          44m
[centos@k8smaster1 ~]$ kubectl get pv
prometheus-alertmanager   3Gi        RWX            Retain           Available                                                                       22m
prometheus-server         12Gi       RWX            Retain           Available                                                                       30m
[centos@k8smaster1 ~]$ kubectl get pvc -n monitoring
NAME                      STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
prometheus-alertmanager   Pending                                                     20m
prometheus-server         Pending                                                     20m
[centos@k8smaster1 ~]$ kubectl describe pvc prometheus-alertmanager -n monitoring
Name:          prometheus-alertmanager
Namespace:     monitoring
StorageClass:
Status:        Pending
Volume:
Labels:        app=prometheus
               chart=prometheus-8.15.0
               component=alertmanager
               heritage=Tiller
               release=prometheus
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Events:
  Type       Reason         Age                  From                         Message
  ----       ------         ----                 ----                         -------
  Normal     FailedBinding  116s (x83 over 22m)  persistentvolume-controller  no persistent volumes available for this claim and no storage class is set
Mounted By:  prometheus-alertmanager-7757d759b8-x6bd7
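
The FailedBinding event means the PVC requests no StorageClass and the controller found no PV it could match; a claim binds only to a PV whose capacity, access modes, and storage class are compatible. One way to force the pairing is a `claimRef` on the manually created PV, pinning it to the chart's PVC. A minimal sketch (the `hostPath` location is an assumption; the PVC name and namespace come from the output above):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-server
spec:
  capacity:
    storage: 12Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  # Pin this PV to the chart-created claim so nothing else binds it.
  claimRef:
    namespace: monitoring
    name: prometheus-server
  hostPath:
    path: /mnt/prometheus-server   # assumed directory on the node
```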

I am expecting the pods to reach the Running state.

!!!UPDATE!!!

NAME                      STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS    AGE
prometheus-alertmanager   Pending                                      local-storage   4m29s
prometheus-server         Pending                                      local-storage   4m29s
[centos@k8smaster1 prometheus_pv_storage]$ kubectl describe pvc prometheus-server -n monitoring
Name:          prometheus-server
Namespace:     monitoring
StorageClass:  local-storage
Status:        Pending
Volume:
Labels:        app=prometheus
               chart=prometheus-8.15.0
               component=server
               heritage=Tiller
               release=prometheus
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Events:
  Type       Reason                Age                   From                         Message
  ----       ------                ----                  ----                         -------
  Normal     WaitForFirstConsumer  11s (x22 over 4m59s)  persistentvolume-controller  waiting for first consumer to be created before binding
Mounted By:  prometheus-server-7f8b5fc64b-bqf42
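
With a no-provisioner (local) StorageClass, `volumeBindingMode: WaitForFirstConsumer` defers binding until a pod is scheduled, and each local PV must carry `nodeAffinity` so the scheduler knows which node actually holds the disk. A sketch of the pair, assuming the data lives on a worker node named `k8sworker1` under an assumed path:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-server
spec:
  capacity:
    storage: 12Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  local:
    path: /mnt/prometheus-data     # assumed; must exist on the node below
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8sworker1       # assumed worker node name
```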

!!UPDATE-2!!

[centos@k8smaster1 ~]$ kubectl get pods prometheus-server-7f8b5fc64b-bqf42 -n monitoring  -o yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2019-08-18T16:10:54Z"
  generateName: prometheus-server-7f8b5fc64b-
  labels:
    app: prometheus
    chart: prometheus-8.15.0
    component: server
    heritage: Tiller
    pod-template-hash: 7f8b5fc64b
    release: prometheus
  name: prometheus-server-7f8b5fc64b-bqf42
  namespace: monitoring
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: prometheus-server-7f8b5fc64b
    uid: c1979bcb-c1d2-11e9-819d-fa163ebb8452
  resourceVersion: "2461054"
  selfLink: /api/v1/namespaces/monitoring/pods/prometheus-server-7f8b5fc64b-bqf42
  uid: c19890d1-c1d2-11e9-819d-fa163ebb8452
spec:
  containers:
  - args:
    - --volume-dir=/etc/config
    - --webhook-url=http://127.0.0.1:9090/-/reload
    image: jimmidyson/configmap-reload:v0.2.2
    imagePullPolicy: IfNotPresent
    name: prometheus-server-configmap-reload
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/config
      name: config-volume
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: prometheus-server-token-7h2df
      readOnly: true
  - args:
    - --storage.tsdb.retention.time=15d
    - --config.file=/etc/config/prometheus.yml
    - --storage.tsdb.path=/data
    - --web.console.libraries=/etc/prometheus/console_libraries
    - --web.console.templates=/etc/prometheus/consoles
    - --web.enable-lifecycle
    image: prom/prometheus:v2.11.1
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /-/healthy
        port: 9090
        scheme: HTTP
      initialDelaySeconds: 30
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 30
    name: prometheus-server
    ports:
    - containerPort: 9090
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /-/ready
        port: 9090
        scheme: HTTP
      initialDelaySeconds: 30
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 30
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/config
      name: config-volume
    - mountPath: /data
      name: storage-volume
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: prometheus-server-token-7h2df
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    fsGroup: 65534
    runAsGroup: 65534
    runAsNonRoot: true
    runAsUser: 65534
  serviceAccount: prometheus-server
  serviceAccountName: prometheus-server
  terminationGracePeriodSeconds: 300
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - configMap:
      defaultMode: 420
      name: prometheus-server
    name: config-volume
  - name: storage-volume
    persistentVolumeClaim:
      claimName: prometheus-server
  - name: prometheus-server-token-7h2df
    secret:
      defaultMode: 420
      secretName: prometheus-server-token-7h2df
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2019-08-18T16:10:54Z"
    message: '0/2 nodes are available: 1 node(s) didn''t find available persistent
      volumes to bind, 1 node(s) had taints that the pod didn''t tolerate.'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
  phase: Pending
  qosClass: BestEffort

Also, I have the volumes created and assigned to the local-storage StorageClass:

[centos@k8smaster1 prometheus_pv]$ kubectl get pv -n monitoring

prometheus-alertmanager   3Gi        RWX            Retain           Available                                               local-storage            2d19h
prometheus-server         12Gi       RWX            Retain           Available                                               local-storage            2d19h
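
The PodScheduled message in the YAML above points at two separate problems: the one tainted node (typically the master) repels the pod, and the other node has no bindable PV. Commands to confirm and, if acceptable, clear the master taint (node names are assumptions taken from the prompt in the transcripts):

```shell
# Inspect taints on the nodes.
kubectl describe node k8smaster1 | grep -i taint

# If the PV's backing directory is on the master, remove its taint
# (trailing "-" deletes the taint)...
kubectl taint nodes k8smaster1 node-role.kubernetes.io/master-

# ...otherwise recreate the PVs with nodeAffinity pointing at the worker.
```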

Jsingh
  • Also, in case you feel it's a duplicate: I already referred to other answers; none of them map to the `helm install` version of Prometheus. – Jsingh Aug 16 '19 at 11:08
  • Hi, check the storage classes in your cluster: `kubectl get sc` – Suresh Vishnoi Aug 16 '19 at 11:14
  • Hi @SureshVishnoi, I checked; it says "No resources found". I thought specifying a storage class was not mandatory. – Jsingh Aug 16 '19 at 11:18
  • When you are creating a PV dynamically, your PVC needs to reference it – Suresh Vishnoi Aug 16 '19 at 11:43
  • Check out this page: https://kubernetes.io/docs/tasks/configure-pod-container/configure-persistent-volume-storage/ – Suresh Vishnoi Aug 16 '19 at 11:44
  • @SureshVishnoi: I tried creating a local StorageClass (details are in the update); still the same issue after reinstalling the Prometheus instance. The local provisioner doesn't support Immediate binding, so I tried WaitForFirstConsumer and it was no success. Mine is a bare-metal setup, not cloud. Any ideas? – Jsingh Aug 18 '19 at 16:14
  • Hi, it's a different issue now. Can you run `kubectl get pod prometheus-server-7f8b5fc64b-w626v -o yaml`? And how many nodes are there? – Suresh Vishnoi Aug 18 '19 at 20:18
  • @SureshVishnoi: I have one worker node here, but now the PVC has a different message. Because the StorageClass uses the local provisioner, I had to set **volumeBindingMode**: **WaitForFirstConsumer**, which is why the PVC status above now says **waiting for first consumer to be created before binding** – Jsingh Aug 19 '19 at 06:17
  • Here is the issue: `1 node(s) had taints that the pod didn't tolerate`. You need to put a toleration on the pod, so check the taint on the node first: `kubectl describe nodes your-node-name` – Suresh Vishnoi Aug 19 '19 at 07:36
  • `message: '0/2 nodes are available: 1 node(s) didn''t find available persistent volumes to bind, 1 node(s) had taints that the pod didn''t tolerate.'` – Suresh Vishnoi Aug 19 '19 at 07:36
  • I guess the worker node does not have the PV and the master node has a taint – Suresh Vishnoi Aug 19 '19 at 07:37

3 Answers


If you are on EKS, your nodes need the following IAM policy:

arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy

and the Amazon EBS CSI Driver add-on must be installed.
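
A sketch of setting this up with the AWS CLI (the role and cluster names are placeholders you would replace with your own):

```shell
# Attach the managed policy to the node's IAM role.
aws iam attach-role-policy \
  --role-name my-node-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy

# Enable the EBS CSI driver as a managed add-on.
aws eks create-addon \
  --cluster-name my-cluster \
  --addon-name aws-ebs-csi-driver
```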


Regarding your first issue, the two pods stuck in the Pending state, you can follow this procedure:

  1. Clean up your Helm deployment:

helm uninstall ...

  2. Remove the Helm repository:

helm repo remove ...

Now run the following commands and the problem should be solved:

  3. Add the repository:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

  4. Fetch the latest chart versions:

helm repo update

  5. Install the chart ("monitoring" is the release name):

helm install monitoring prometheus-community/kube-prometheus-stack

And let us know the result.
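
If you still want the data on your manually created local-storage class, you can point the chart's volume claims at it through a values file. A sketch following kube-prometheus-stack's values layout (the storage class name and size are assumptions carried over from the question):

```yaml
# values.yaml
prometheus:
  prometheusSpec:
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: local-storage
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 12Gi
```

Then install with `helm install monitoring prometheus-community/kube-prometheus-stack -f values.yaml`.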

Tchatua

Prometheus will create its PersistentVolumeClaims with accessModes set to ReadWriteOnce, and a PVC binds only to a PersistentVolume whose accessModes include the requested mode. Change the accessModes of your PVs from ReadWriteMany to ReadWriteOnce and they should bind.
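
A sketch of the corrected PV for the alertmanager claim (the reclaim policy and capacity match the `kubectl get pv` output in the question; the `hostPath` location is an assumption):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-alertmanager
spec:
  capacity:
    storage: 3Gi
  accessModes:
    - ReadWriteOnce           # was ReadWriteMany (RWX) in the question
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/alertmanager   # assumed directory on the node
```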