What is the preferred Kubernetes storageClass for a PersistentVolume used by a PostgreSQL database? Which factors should I consider when choosing the storageClass if I have the choice between S3 (Minio), NFS and HostPath?

mxcd

2 Answers

When you choose a storage option for PostgreSQL in Kubernetes, you should take the following into account:

  1. NFS and Minio are not the preferred storage for databases if your application is latency-sensitive. They are more commonly used for a download folder or a logging/backup folder.
    They do, however, give you flexibility in designing a k8s cluster and make it easy to move to a cloud-based solution in the future (AWS EFS or S3, for example).

  2. HostPath is a better option for databases, but the Kubernetes documentation restricts it to development and testing:

Kubernetes supports hostPath for development and testing on a single-node cluster. A hostPath PersistentVolume uses a file or directory on the Node to emulate network-attached storage.

In a production cluster, you would not use hostPath. Instead a cluster administrator would provision a network resource like a Google Compute Engine persistent disk, an NFS share, or an Amazon Elastic Block Store volume. Cluster administrators can also use StorageClasses to set up dynamic provisioning.

  3. As you mentioned, Longhorn is quite a good option for non-cloud k8s clusters:

Longhorn is a lightweight, reliable, and powerful distributed block storage system for Kubernetes.
Longhorn implements distributed block storage using containers and microservices. Longhorn creates a dedicated storage controller for each block device volume and synchronously replicates the volume across multiple replicas stored on multiple nodes. The storage controller and replicas are themselves orchestrated using Kubernetes.

  4. Also, check the Bitnami PostgreSQL Helm chart:

It offers a PostgreSQL Helm chart that comes pre-configured for security, scalability and data replication. It's a great combination: all the open source goodness of PostgreSQL (foreign keys, joins, views, triggers, stored procedures…) together with the consistency, portability and self-healing features of Kubernetes.
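Whichever backend you pick, the database only sees it through a PersistentVolumeClaim that names a StorageClass. Below is a minimal sketch in Python that builds such a claim as JSON. The helper `build_pvc`, the claim name `postgres-data` and the size are hypothetical, and the class name `longhorn` is an assumption about what your cluster exposes; check `kubectl get storageclass` for the real names.

```python
import json


def build_pvc(name, storage_class, size="10Gi"):
    """Return a PersistentVolumeClaim manifest as a plain dict.

    name, storage_class and size are illustrative values supplied by
    the caller; none of them come from the original answer.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # A single PostgreSQL instance mounts its volume from one node.
            "accessModes": ["ReadWriteOnce"],
            # Selects the provisioner; "longhorn" below is an assumed name.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    pvc = build_pvc("postgres-data", "longhorn")
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(pvc, indent=2))
```

If the script is saved as, say, `pvc.py`, its output can be piped to `kubectl apply -f -`. Switching to a cloud provider's disk class later should only require changing the `storage_class` argument.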

mozello

You should make sure you get dynamically provisioned block storage.

HostPath is kind of what you want, but it's not dynamic, meaning the volume can't move between nodes. So if your node goes down, you have a problem.

If your cluster is managed by a cloud vendor, there should be a premade storage class that covers this, e.g. Azure Disk.

NFS and S3 don't make sense for database data. You are not dealing with files/objects in that sense.
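The reasoning above boils down to two questions per option: does it behave like a disk, and can the volume follow the pod to another node? A rough sketch of that checklist in Python follows; the class, field names and the yes/no values are my simplified reading of this answer, not an authoritative classification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageOption:
    name: str
    disk_like: bool  # behaves like a block device rather than files/objects
    movable: bool    # volume can be re-attached when the pod changes node


# Values mirror the points made in this answer (simplified assumption).
OPTIONS = [
    StorageOption("HostPath", disk_like=True, movable=False),
    StorageOption("NFS", disk_like=False, movable=True),
    StorageOption("S3 (Minio)", disk_like=False, movable=True),
    StorageOption("dynamic block storage", disk_like=True, movable=True),
]


def suitable_for_database(option):
    """A database wants both properties at once."""
    return option.disk_like and option.movable


if __name__ == "__main__":
    for option in OPTIONS:
        verdict = "ok" if suitable_for_database(option) else "avoid"
        print(f"{option.name}: {verdict}")
```

Only the last entry passes both checks, which is why a vendor-provided disk class is the suggestion here.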

The Fool
  • Outside of an IaaS: Ceph. GlusterFS is no better than NFS. I would even argue it's worse. Technology of the past. Similar limitations to NFS. On top of which: FUSE/userland client mounting devices, poor performance in general, especially with small I/O (everything database, git, ...). Meanwhile gluster-block is a sad attempt at block devices: storing some sort of image file on top of a GlusterFS share. – SYN Mar 13 '22 at 20:20
  • @SYN, good point. I took that part about GlusterFS out of my answer. – The Fool Mar 13 '22 at 20:24
  • @SYN Did I get your comment right, that for a self-hosted environment with just a single storage server, you'd go with NFS for simplicity reasons? I am also looking at Rancher Longhorn which is dead-simple to set up – mxcd Mar 14 '22 at 13:36
  • @mxcd, the comment was about my suggestion to use GlusterFS in my original answer. This suggestion has been removed, as it's just as bad as NFS, if not worse, as per SYN's comment. That doesn't mean NFS is a good option though. It's still a bad idea. – The Fool Mar 14 '22 at 13:38