
I have an EKS cluster that is Fargate-only; I really don't want to have to manage instances myself. I'd like to deploy Prometheus to it, which requires a persistent volume. As of two months ago this should be possible with EFS (managed NFS share). I feel that I am almost there, but I cannot figure out what the current issue is.

What I have done:

  • Set up an EKS Fargate cluster and a suitable Fargate profile
  • Set up an EFS filesystem with an appropriate security group
  • Installed the EFS CSI driver and validated the EFS filesystem as per the AWS walkthrough (see the StorageClass sketch below)

All good so far
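
For reference, the efs-sc StorageClass used throughout is just the static EFS CSI class from that walkthrough; a minimal sketch, assuming the class name matches what the walkthrough creates:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com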

I set up the persistent volumes (which for EFS I understand must be provisioned statically) with:

kubectl apply -f pvc/

where

tree pvc/
pvc/
├── two_pvc.yml
└── ten_pvc.yml

and

cat pvc/*

apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv-two
spec:
  capacity:
    storage: 2Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-ec0e1234
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv-ten
spec:
  capacity:
    storage: 8Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-ec0e1234

then

helm upgrade --install myrelease-helm-02 prometheus-community/prometheus \
    --namespace prometheus \
    --set alertmanager.persistentVolume.storageClass="efs-sc",server.persistentVolume.storageClass="efs-sc"

What happens?

The Prometheus Alertmanager comes up fine with its PVC, as do the other pods in this deployment, but the Prometheus server goes into CrashLoopBackOff with

invalid capacity 0 on filesystem

Diagnostics

kubectl get pv -A
NAME                          CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                                               STORAGECLASS   REASON   AGE
efs-pv-ten                    8Gi        RWO            Retain           Bound      prometheus/myrelease-helm-02-prometheus-server         efs-sc                  11m
efs-pv-two                    2Gi        RWO            Retain           Bound      prometheus/myrelease-helm-02-prometheus-alertmanager   efs-sc                  11m

and

kubectl get pvc -A
NAMESPACE    NAME                                     STATUS   VOLUME       CAPACITY   ACCESS MODES   STORAGECLASS   AGE
prometheus   myrelease-helm-02-prometheus-alertmanager   Bound    efs-pv-two   2Gi        RWO            efs-sc         12m
prometheus   myrelease-helm-02-prometheus-server         Bound    efs-pv-ten   8Gi        RWO            efs-sc         12m

kubectl describe pod just shows 'Error'.
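
For a crash-looping pod the previous container's output can also be pulled directly, e.g. (pod and container names here are illustrative):

kubectl logs -n prometheus myrelease-helm-02-prometheus-server-85765f9895-vxrkn \
    -c prometheus-server --previous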

lastly, this (from a colleague):

level=info ts=2020-10-09T15:17:08.898Z caller=main.go:346 msg="Starting Prometheus" version="(version=2.21.0, branch=HEAD, revision=e83ef207b6c2398919b69cd87d2693cfc2fb4127)"
level=info ts=2020-10-09T15:17:08.898Z caller=main.go:347 build_context="(go=go1.15.2, user=root@a4d9bea8479e, date=20200911-11:35:02)"
level=info ts=2020-10-09T15:17:08.898Z caller=main.go:348 host_details="(Linux 4.14.193-149.317.amzn2.x86_64 #1 SMP Thu Sep 3 19:04:44 UTC 2020 x86_64 myrelease-helm-02-prometheus-server-85765f9895-vxrkn (none))"
level=info ts=2020-10-09T15:17:08.898Z caller=main.go:349 fd_limits="(soft=1024, hard=4096)"
level=info ts=2020-10-09T15:17:08.898Z caller=main.go:350 vm_limits="(soft=unlimited, hard=unlimited)"
level=error ts=2020-10-09T15:17:08.901Z caller=query_logger.go:87 component=activeQueryTracker msg="Error opening query log file" file=/data/queries.active err="open /data/queries.active: permission denied"
panic: Unable to create mmap-ed active query log
goroutine 1 [running]:
github.com/prometheus/prometheus/promql.NewActiveQueryTracker(0x7fffeb6e85ee, 0x5, 0x14, 0x30ca080, 0xc000d43620, 0x30ca080)
    /app/promql/query_logger.go:117 +0x4cf
main.main()
    /app/cmd/prometheus/main.go:377 +0x510c

Beyond the appearance of a permissions issue I am baffled. I know that the storage 'works' and is accessible: the other pod in the deployment seems happy with it, but this one is not.

jmkite
  • check `kubectl get events` - you may strike it lucky and get some usable debugging info out of that. Also - have you verified that alertmanager is actually *writing* to the pvc? It could be that both pvcs are broken but only prometheus is reporting it as alertmanager might not have tried to write to the pvc yet. Also, are you aware of the massive latency EFS comes with? It might not be ideal for prometheus. – mcfinnigan Oct 09 '20 at 21:24
  • `kubectl get events -A NAMESPACE LAST SEEN TYPE REASON OBJECT MESSAGE prometheus 118s Warning BackOff pod/myrelease-helm-02-prometheus-server-85765f9895-bftwd Back-off restarting failed container` – jmkite Oct 09 '20 at 22:18

1 Answer


Working now, and writing it up here for the common good. Thanks to /u/EmiiKhaos on reddit for the suggestions on where to look.

Problem:

By default an EFS share is owned root:root only, and Prometheus forbids running its pods as root, so the pods cannot write to the volume.

Solution:

  • Create an EFS access point for each pod requiring a persistent volume, to permit access for a specified user
  • Specify these access points in the persistent volumes
  • Apply a suitable security context to run the pods as the matching user

Method:

Create two EFS access points, which end up described by something like:

{
    "Name": "prometheuserver",
    "AccessPointId": "fsap-<hex01>",
    "FileSystemId": "fs-ec0e1234",
    "PosixUser": {
        "Uid": 500,
        "Gid": 500,
        "SecondaryGids": [
            2000
        ]
    },
    "RootDirectory": {
        "Path": "/prometheuserver",
        "CreationInfo": {
            "OwnerUid": 500,
            "OwnerGid": 500,
            "Permissions": "0755"
        }
    }
},
{
    "Name": "prometheusalertmanager",
    "AccessPointId": "fsap-<hex02>",
    "FileSystemId": "fs-ec0e1234",
    "PosixUser": {
        "Uid": 501,
        "Gid": 501,
        "SecondaryGids": [
            2000
        ]
    },
    "RootDirectory": {
        "Path": "/prometheusalertmanager",
        "CreationInfo": {
            "OwnerUid": 501,
            "OwnerGid": 501,
            "Permissions": "0755"
        }
    }
}
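
(That JSON is describe-style output rather than the creation call. The access points can be created in the console, with Terraform, or with the CLI; roughly, using the uid/gid and path values from above:)

aws efs create-access-point \
    --file-system-id fs-ec0e1234 \
    --posix-user Uid=500,Gid=500,SecondaryGids=2000 \
    --root-directory 'Path=/prometheuserver,CreationInfo={OwnerUid=500,OwnerGid=500,Permissions=0755}' \
    --tags Key=Name,Value=prometheuserver

aws efs describe-access-points --file-system-id fs-ec0e1234 then shows the fsap-... IDs needed below.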

Update the persistent volumes:

kubectl apply -f pvc/

where the manifests now look something like:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheusalertmanager
spec:
  capacity:
    storage: 2Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-ec0e1234::fsap-<hex02>
---    
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheusserver
spec:
  capacity:
    storage: 8Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-ec0e1234::fsap-<hex01>
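
After applying, it's easy to confirm each PV now carries the access point in its volume handle, e.g. (PV name from the manifest above):

kubectl get pv
kubectl describe pv prometheusserver | grep -i volumehandle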

Re-install prometheus as before:

helm upgrade --install myrelease-helm-02 prometheus-community/prometheus \
    --namespace prometheus \
    --set alertmanager.persistentVolume.storageClass="efs-sc",server.persistentVolume.storageClass="efs-sc"

Take an educated guess from

kubectl describe pod myrelease-helm-02-prometheus-server -n prometheus

and

kubectl describe pod myrelease-helm-02-prometheus-alertmanager -n prometheus

as to which container needs the security context. Then apply a security context to run the pods with the appropriate uid:gid, e.g. with

kubectl apply -f setpermissions/

where

cat setpermissions/*

gives

apiVersion: v1
kind: Pod
metadata:
  name: myrelease-helm-02-prometheus-alertmanager
spec:
  securityContext:
    runAsUser: 501
    runAsGroup: 501
    fsGroup: 501
  volumes:
    - name: prometheusalertmanager
      # bind to the PVC created by the chart
      persistentVolumeClaim:
        claimName: myrelease-helm-02-prometheus-alertmanager
  containers:
    - name: prometheusalertmanager
      image: jimmidyson/configmap-reload:v0.4.0
      securityContext:
        runAsUser: 501
        allowPrivilegeEscalation: false        
apiVersion: v1
kind: Pod
metadata:
  name: myrelease-helm-02-prometheus-server
spec:
  securityContext:
    runAsUser: 500
    runAsGroup: 500
    fsGroup: 500
  volumes:
    - name: prometheusserver
      # bind to the PVC created by the chart
      persistentVolumeClaim:
        claimName: myrelease-helm-02-prometheus-server
  containers:
    - name: prometheusserver
      image: jimmidyson/configmap-reload:v0.4.0
      securityContext:
        runAsUser: 500
        allowPrivilegeEscalation: false
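
As an aside, depending on the chart version the equivalent uid/gid settings may also be exposed as Helm values rather than patching the pods afterwards; something along these lines (the value names are an assumption, check the chart's values.yaml):

helm upgrade --install myrelease-helm-02 prometheus-community/prometheus \
    --namespace prometheus \
    --set server.persistentVolume.storageClass="efs-sc" \
    --set server.securityContext.runAsUser=500 \
    --set server.securityContext.runAsGroup=500 \
    --set server.securityContext.fsGroup=500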
jmkite
  • Hello jmkite thank you for all the details you shared, these are really great. Would you mind sharing where to define the json method "Create 2x EFS access points something like"? I mean pls share the steps to create the access points? I am not getting gist of it – Nitin G Jul 19 '21 at 11:24
  • You can create them in the web GUI, with terraform, CLI, whatever. This JSON isn't creating the access points, it's the output of the CLI describing them – jmkite Jul 20 '21 at 16:59
  • Hello jmkite ... we have found that prometheus doesnt support AWS efs or rather any NFS ... are you not facing any issues with data corruption and all? – Nitin G Aug 15 '21 at 07:27
  • I'm sorry but this post is about a single issue, as per title. Whilst you are correct that EFS/NFS is unsupported, I'm afraid I'm not up for a general discussion regarding Prometheus here – jmkite Aug 17 '21 at 07:42
  • @jmkite When I run last step to update permissions (kubectl apply -f setpermissions/), it fails with missing "Missing volume-dir". Any idea what is volume-dir to be passed? – Ankush T Sep 07 '21 at 11:07
  • It's presented above – jmkite Sep 08 '21 at 15:29