I'm trying to deploy a mainnet archive node to a GKE cluster using the erigon Docker image (thorax/erigon). I have successfully deployed a Geth node with a configuration similar to the one below, but the same approach has not worked for erigon.
Below is my YAML deployment file:
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: erigon-mainnet
  namespace: erigon-mainnet
spec:
  selector:
    matchLabels:
      app: erigon-mainnet
  replicas: 2
  serviceName: erigon-mainnet
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: erigon-mainnet
    spec:
      terminationGracePeriodSeconds: 300
      containers:
        - name: erigon
          image: docker.io/thorax/erigon
          ports:
            - containerPort: 8545
            - containerPort: 8546
            - { containerPort: 30303, protocol: TCP }
            - { containerPort: 30303, protocol: UDP }
          args:
            [
              "--datadir=/mainnet",
              "--chain=mainnet",
              "--http",
              "--http.addr=0.0.0.0",
              "--http.api=eth,net,web3",
              "--http.vhosts=*",
              " --http.corsdomain=*",
              "--ws",
              "--ws.addr=0.0.0.0",
              "--ws.api=eth,net,web3",
              "--ws.origins=*",
            ]
          resources:
            requests:
              memory: 2G
              cpu: 1000m
            limits:
              memory: 16G
              cpu: 8000m
          livenessProbe:
            initialDelaySeconds: 10
            timeoutSeconds: 10
            httpGet:
              path: /
              port: 8545
          readinessProbe:
            httpGet:
              path: /
              port: 8545
          volumeMounts:
            - name: mainnet
              mountPath: /mainnet
      nodeSelector:
        chain: mainnet
  volumeClaimTemplates:
    - metadata:
        name: "mainnet"
      spec:
        storageClassName: premium-rwo
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 4Ti
---
apiVersion: v1
kind: Service
metadata:
  name: erigon-mainnet
  namespace: erigon-mainnet
spec:
  ports:
    - protocol: TCP
      targetPort: 8545
      port: 8545
      name: http
    - protocol: TCP
      targetPort: 8546
      port: 8546
      name: websoket
  clusterIP: None
  selector:
    app: erigon-mainnet
Running kubectl describe pod yields:
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  53s                default-scheduler  Successfully assigned erigon-mainnet/erigon-mainnet-0 to gke-node-cluster-polygon-a017195b-fwhs
  Normal   Pulled     49s                kubelet            Successfully pulled image "docker.io/thorax/erigon" in 430.462783ms
  Normal   Pulled     48s                kubelet            Successfully pulled image "docker.io/thorax/erigon" in 399.71813ms
  Normal   Pulling    30s (x3 over 50s)  kubelet            Pulling image "docker.io/thorax/erigon"
  Normal   Created    29s (x3 over 49s)  kubelet            Created container erigon-mainnet
  Warning  Failed     29s (x3 over 49s)  kubelet            Error: failed to create containerd task: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "--datadir=/mainnet": stat --datadir=/mainnet: no such file or directory: unknown
  Normal   Pulled     29s                kubelet            Successfully pulled image "docker.io/thorax/erigon" in 417.260296ms
  Warning  BackOff    10s (x8 over 48s)  kubelet            Back-off restarting failed container
My assumption is that I am mounting the SSD to the wrong directory. I have tried leaving the --datadir flag blank and mounting the volume at erigon's default datadir directory, but I still run into crash loops. With my Geth node, I mounted the volume at /chaindata using exactly the same logic as above, and the node ran fine. If anyone knows what the problem could be, any help is appreciated. I am fairly new to GKE and erigon, so it might be a simple fix I'm overlooking.
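One observation about the error itself: the message exec: "--datadir=/mainnet": stat --datadir=/mainnet: no such file or directory shows the runtime trying to execute the first argument as the program, which suggests the failure happens before the data directory is ever touched. In Kubernetes, args replaces the image's CMD, so if the image defines no ENTRYPOINT, the first entry of args is treated as the binary. Below is a minimal sketch of naming the binary explicitly. It assumes the thorax/erigon image ships an executable called erigon on its PATH, which is not confirmed anywhere in this post; only the first two flags are repeated for brevity.

```yaml
# Fragment of spec.template.spec from the StatefulSet above; not a full manifest.
containers:
  - name: erigon
    image: docker.io/thorax/erigon
    # Assumption: the image provides an executable named "erigon" on its PATH.
    # "command" overrides the image ENTRYPOINT, so the flags in "args" are
    # passed to this binary instead of being executed themselves.
    command: ["erigon"]
    args:
      - "--datadir=/mainnet"
      - "--chain=mainnet"
```

If the binary lives elsewhere in the image, the command entry would need its full path; inspecting the image (for example with docker inspect) would show whether an ENTRYPOINT is defined at all.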