
I am building a platform on top of Kubernetes that, among other requirements, should:

  • Be OS agnostic: any Linux with a sane kernel and cgroup mounts.
  • Offer persistent storage by leveraging the cluster nodes' disk(s).
  • Offer ReadWriteMany volumes, or a way to implement shared storage.
  • Not bind Pods to a specific node (as local persistent volumes do).
  • Reattach volumes automatically when Pods are migrated (e.g. due to a node drain or a lost node).
  • Offer data replication at the storage level.
  • Not assume a dedicated raw block device is available on each node.

I'm addressing the first point by using static binaries for the Kubernetes components and the container engine, coupled with minimal host tooling that also consists of static binaries.

I'm still looking for a solution for persistent storage.

What I evaluated/used so far:

So the question is: what other options do I have for Kubernetes persistent storage that uses the cluster nodes' disks?

Laurentiu Soica
  • Are you looking to run it on-prem or in the cloud (GCP/AWS)? – Nick Nov 29 '19 at 13:36
  • On-prem. No dedicated storage system assumed. Just local disks. – Laurentiu Soica Nov 29 '19 at 13:41
  • Have you been checking glusterfs? The official documentation on the topic says that ReadWriteMany is supported by gcePersistentDisk (which is not the case at all here), glusterfs, and nfs. I'm checking if glusterfs suits your requirements. – Nick Nov 29 '19 at 13:54
  • Checked glusterfs on k8s. It requires O/S specific tooling and a dedicated raw block device attached on each node. https://github.com/gluster/gluster-kubernetes/blob/master/docs/setup-guide.md#infrastructure-requirements – Laurentiu Soica Nov 29 '19 at 14:32
  • "dedicated raw block device attached on each node", i was under impression that you can have sda for OS and sdb for glusterfs on the same raid. "O/S specific tooling" they say that it's possible accessing it via NFS v3 https://docs.gluster.org/en/latest/Administrator%20Guide/Setting%20Up%20Clients/ . Additionally, I made a typo in my previous comment and mentioned gcePersistentDisk instead of CephFS – Nick Nov 29 '19 at 14:50
  • I'm not in control of the underlying infrastructure, so I cannot assume I'll have a dedicated raw block device available (question updated). That's why Rook with Ceph was my first option. By "O/S specific tooling" I meant a distribution-dependent setup; sorry for the confusion. – Laurentiu Soica Nov 29 '19 at 15:15
  • @LaurentiuSoica Is using a K8s DaemonSet to install the dependencies an option? For instance, in EKS (Amazon Linux), which doesn't come with iSCSI installed, the following can be used to set up iSCSI: https://github.com/openebs/charts/blob/master/docs/openebs-amazonlinux-setup.yaml – Kiran Mova Dec 06 '19 at 02:50
  • Sounds good. Thanks! – Laurentiu Soica Dec 07 '19 at 06:09

4 Answers


The options below can be considered:

  1. Kubernetes 1.14.0 onwards supports local persistent volumes. You can make use of local PVs using node labels. You might have to run stateful workloads in HA (master-slave) mode so that the data remains available in case of node failures. (See the sketch after this list.)

  2. You can install an NFS server on one of the cluster nodes and use it as storage for your workloads. NFS storage supports ReadWriteMany. This might work well if you set up the cluster on bare metal.

  3. Rook, which you have already tried, is also a good option, but it is not production ready yet.
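
As an illustration of option 1, a statically created local PersistentVolume pinned to a node might look like the following. This is a minimal sketch; the volume name, capacity, path, storage class and node name are placeholders:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-example
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    # Placeholder path to a disk, partition or directory on the node
    path: /mnt/disks/vol1
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                # Placeholder node name
                - node-1

A Pod claiming this volume will be scheduled onto node-1, which is exactly the node-binding drawback discussed in the question.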

Among the three, the first option suits your requirements. I would like to hear other options from the community.

P Ekambaram
  • Thanks for your answer. So local PVs could be an option. There's another (nice to have) requirement that I forgot to mention: I would prefer to have data replication at the storage level instead of the application level. I'll update the question. But yes, if there's no better option with replication included, local PVs could be a solution. – Laurentiu Soica Nov 29 '19 at 12:40
  • glusterfs supports replication at the storage level. But again, it is external to the k8s cluster. – P Ekambaram Nov 29 '19 at 12:46

Two and a half years have passed, but this may help those who wind up here through a Google search.
OpenEBS provides a solution named rawfile-localpv that leverages the node disks to create PersistentVolumes. Install it in your cluster, create a StorageClass like the one below, and then provision your PersistentVolumeClaims using that StorageClass:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-sc
provisioner: rawfile.csi.openebs.io
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
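
A PersistentVolumeClaim against that StorageClass could then look like this (a sketch; the claim name and requested size are placeholders):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  # References the StorageClass defined above
  storageClassName: my-sc
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi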

Keep in mind that with this solution your Pods are still bound to a specific node (the one where the PV resides), and you have to handle any migration yourself when needed. But it provides a neat and easy way to use high-performance storage inside a Kubernetes cluster.

Link to project on Github: https://github.com/openebs/rawfile-localpv

abexamir

According to the official documentation, as of now (v1.16) K8s supports ReadWriteMany on a few different types of volumes.

Namely, these are: cephfs, glusterfs and nfs.

In general, with all of these the content of a volume is preserved and the volume is merely unmounted when a Pod is removed. This means that a volume can be pre-populated with data, and that data can be "handed off" between Pods. These filesystems can be mounted by multiple writers simultaneously.
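
For example, an NFS-backed PersistentVolume offering ReadWriteMany might be declared like this (a sketch; the server address, export path and capacity are placeholders):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    # Placeholder NFS server and export path
    server: nfs-server.example.com
    path: /exports/shared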

Among these, glusterfs can be deployed on the Kubernetes cluster nodes themselves (at least 3 nodes are required). The data can be accessed in different ways, one of which is NFS.

A persistentVolumeClaim volume is used to mount a PersistentVolume into a Pod. PersistentVolumes are a way for users to "claim" durable storage (such as a GCE PersistentDisk or an iSCSI volume) without knowing the details of the particular cloud environment. ReadWriteMany is supported by the following volume types:

  • AzureFile
  • CephFS
  • Glusterfs
  • Quobyte
  • NFS
  • PortworxVolume

However, that's not an option when you have no control over the underlying infrastructure.

The local volume option represents a mounted local storage device such as a disk, partition or directory. Local volumes can only be used as statically created PersistentVolumes. The drawback is that if a node becomes unhealthy, the local volume also becomes inaccessible, and a Pod using it will not be able to run.

So at the moment there is no solution that suits all the requirements out of the box.

Nick
  • The old saying that storage on Kubernetes is hard still holds, as long as you're not using a managed solution from a public cloud vendor. – Laurentiu Soica Nov 29 '19 at 19:03
  • Did you take into account OpenEBS with open-iscsi? (They say that the Open-iSCSI initiator requires a host running Linux with kernel version 2.6.16, which matches my requirements; I just don't know if it's suitable for a production deployment.) – Laurentiu Soica Nov 29 '19 at 19:17
  • I haven't been considering OpenEBS, because it isn't mentioned in the official Kubernetes documentation and, as far as I know (I might be wrong here), OpenEBS doesn't tick the "No dedicated storage system assumed. Just local disks." box. However, if you are going to build a k8s cluster from scratch, glusterfs looks interesting. – Nick Dec 02 '19 at 08:29
  • My understanding is that OpenEBS configured with Jiva does support local disks (through the container image disks), but I have zero experience with it. – Laurentiu Soica Dec 02 '19 at 15:39
  • Nick, OpenEBS doesn't need an in-tree custom provisioner. The Kubernetes documentation only mentions storage providers that were written in-tree. You should check out the ones supported via CSI drivers: https://kubernetes-csi.github.io/docs/drivers.html – Kiran Mova Dec 06 '19 at 02:37

You can use OpenEBS Local PV, which can consume an entire disk for an application using the default storage class openebs-device, or consume a mounted disk shared across multiple applications using the default storage class openebs-hostpath. More information is provided in the OpenEBS documentation under the User Guide section. This does not require open-iscsi.

If you are using a direct device, the OpenEBS Node Disk Manager will automatically detect and consume the disk.

For the RWM use case, you can consume the provisioned Local PV volume as the underlying volume for multiple applications through an NFS provisioner. The implementation is described in the OpenEBS documentation under the Stateful Applications section.
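
For instance, a claim against the openebs-hostpath storage class might look like this (a sketch; the claim name and requested size are placeholders, while openebs-hostpath is one of the default OpenEBS storage classes):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-hostpath-pvc
spec:
  # Default OpenEBS storage class backed by a host path on the node
  storageClassName: openebs-hostpath
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi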