So finally https://serverfault.com/questions/976764/kubernetes-run-aws-s3-sync-rsync-against-persistent-volume-on-demand pointed me in the right direction.
This is an extract of the deployment.yaml descriptor, which works as expected:
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: {{K8S_NAMESPACE}}
  name: {{K8S_DEPLOYMENT_NAME}}
spec:
  selector:
    matchLabels:
      name: {{K8S_DEPLOYMENT_NAME}}
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        name: {{K8S_DEPLOYMENT_NAME}}
        version: v1
    spec:
      containers:
      - name: {{AWSCLI_NAME}}
        image: {{IMAGE_AWSCLI}}
        env:
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: {{SECRET_NAME}}
              key: accesskey
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: {{SECRET_NAME}}
              key: secretkey
        command: [ "/bin/bash",
          "-c",
          "aws --endpoint-url {{ENDPOINT_URL}} s3 sync s3://{{BUCKET}} /data; while true; do aws --endpoint-url {{ENDPOINT_URL}} s3 sync /data s3://{{BUCKET}}; sleep 60; done" ]
        volumeMounts:
        - name: pushgw-data
          mountPath: /data
      - name: {{PUSHGATEWAY_NAME}}
        image: {{IMAGE_PUSHGATEWAY}}
        command: [ '/bin/sh', '-c' ]
        args: [ 'sleep 10; /bin/pushgateway --persistence.file=/data/metric.store' ]
        ports:
        - containerPort: 9091
        volumeMounts:
        - name: pushgw-data
          mountPath: /data
      volumes:
      - name: pushgw-data
        emptyDir: {}
      - name: config-volume
        configMap:
          name: {{K8S_DEPLOYMENT_NAME}}
      imagePullSecrets:
      - name: harbor-bot
      restartPolicy: Always
Note the overridden entrypoint for the pushgateway Docker image. In my case I added a 10-second startup delay; you might need to tune it to suit your needs. The delay is needed because the pushgateway container boots faster than the sidecar (the network exchange with S3 also plays a part, I suppose).
If the pushgateway starts before the metric store file is present, the file is not picked up when it arrives later. It gets worse: the first time you send data to the pushgateway, it overwrites the file. At that point, the sync from the sidecar container also overwrites the original copy in the bucket, so make sure you have a backup of the metrics file before experimenting with this delay value.
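
If tuning a fixed delay feels too fragile, one possible alternative is to have the pushgateway container wait for a marker file that the sidecar creates once the initial download has finished. This is only a sketch and not something I have run in the deployment above: the function name `wait_for_marker`, the marker path `/data/.restored`, and the 120-second timeout are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: block until a marker file exists, or give up after
# a timeout. The name, marker path, and timeout are illustrative assumptions.
wait_for_marker() {
  marker="$1"
  timeout="${2:-120}"   # seconds to wait before giving up
  waited=0
  until [ -f "$marker" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for $marker" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Possible use in the pushgateway container, replacing 'sleep 10':
#   wait_for_marker /data/.restored 120 && \
#     exec /bin/pushgateway --persistence.file=/data/metric.store
```

For this to work, the sidecar would have to run something like `touch /data/.restored` right after its first `s3 sync s3://{{BUCKET}} /data`, and the upload loop would probably need `--exclude ".restored"` so that the marker does not end up in the bucket and come back on the next restore before the metrics file does. The backup advice above still applies while you try this out.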