We want to deploy a k8s cluster that will run ~100 IO-heavy pods at the same time. All of them need to access the same volume (i.e. a ReadWriteMany mount).
What we tried so far:
- CephFS
  - was very complicated to set up and hard to troubleshoot. In the end it crashed frequently, and the cause was never entirely clear.
- Helm NFS Server Provisioner
  - runs pretty well, but when IO peaks a single replica is not enough, and we could not get multiple replicas to work at all.
- MinIO
  - is a great tool for creating storage buckets in k8s, but our operations require filesystem mounts. That is theoretically possible with s3fs, but since we run ~100 pods, we would need ~100 additional s3fs sidecars. That seems like a bad idea.
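For reference, the per-pod sidecar pattern we are trying to avoid would look roughly like this (image names, bucket name, and mount path are made up for illustration):

```yaml
# Hypothetical sketch of one pod with an s3fs FUSE sidecar --
# this whole structure would have to be repeated in ~100 pods.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-s3fs
spec:
  volumes:
    - name: s3-data
      emptyDir: {}
  containers:
    - name: s3fs-sidecar
      image: example/s3fs            # hypothetical image
      securityContext:
        privileged: true             # FUSE mounts need elevated privileges
      command: ["s3fs", "my-bucket", "/data", "-f"]
      volumeMounts:
        - name: s3-data
          mountPath: /data
          mountPropagation: Bidirectional
    - name: app
      image: example/app             # hypothetical image
      volumeMounts:
        - name: s3-data
          mountPath: /data
          mountPropagation: HostToContainer
```

Multiplied across ~100 pods, that is ~100 extra privileged FUSE containers to run and babysit.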
There has to be some way to mount 2 TB of data in a GKE cluster with relatively high availability?
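Concretely, what we are after is a single claim that all ~100 pods can mount at once, i.e. a PersistentVolumeClaim with the ReadWriteMany access mode (the storage class name below is a placeholder for whatever backend can provide it):

```yaml
# The shape of what we need: one RWX claim shared by all pods.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany                  # many pods, same volume
  resources:
    requests:
      storage: 2Ti
  storageClassName: some-rwx-class   # placeholder for an RWX-capable backend
```

Each pod would then simply reference `claimName: shared-data` in its volume spec.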
Firestorage seems to work, but it's an order of magnitude more expensive than the other solutions, and with a lot of IO operations it quickly becomes infeasible.
I contemplated posting this question on Server Fault instead, but the k8s community there is a lot smaller than Stack Overflow's.