
Assume a basic replica 3 arbiter 1 configuration with glusterfs-server 4.0.2 and glusterfs-client 4.0.2; glusterfs-client is installed on Ubuntu 18.04.

While trying to verify that read/write operations are still permitted when one storage node is down, as stated in the docs, I got an unexpected result. After killing the gluster processes on one of the non-arbiter nodes (using pkill ^gluster*), the client mount point fails with 'Client quorum is not met.' (see the glusterfs-client log file).
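Roughly, the steps look like this (a sketch of my setup: the mount point /mnt/brick01 and the matching client log file name are just examples, yours may differ):

# on proxmoxVE-2, take the node down by killing all gluster processes
pkill ^gluster*

# on the client, any write on the mount then fails
touch /mnt/brick01/testfile

# and the client mount log reports the quorum error
tail -n 20 /var/log/glusterfs/mnt-brick01.log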

Gluster volume info:

Volume Name: brick01
Type: Replicate
Volume ID: 2310c6f4-f83d-4691-97a7-cbebc01b3cf7
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: proxmoxVE-1:/mnt/gluster/bricks/brick01
Brick2: proxmoxVE-2:/mnt/gluster/bricks/brick01
Brick3: arbiter01:/mnt/gluster/bricks/brick01 (arbiter)

The volume was created with the following command:

gluster volume create brick01 replica 3 arbiter 1 \
proxmoxVE-1:/mnt/gluster/bricks/brick01 \
proxmoxVE-2:/mnt/gluster/bricks/brick01 \
arbiter01:/mnt/gluster/bricks/brick01
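For what it's worth, here is a sketch of how the quorum-related options can be inspected with gluster volume get; I haven't changed any of them from their defaults:

# client-side quorum settings for the volume
gluster volume get brick01 cluster.quorum-type
gluster volume get brick01 cluster.quorum-count

# server-side quorum setting, for comparison
gluster volume get brick01 cluster.server-quorum-type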

As stated in the docs, file operations should still be allowed when one brick is down (as long as the arbiter agrees), so why do I get 'Client quorum is not met' on the client side? After a significant amount of time reading the official glusterfs docs, I couldn't find an explanation for why this happens, so I also filed a bug report on Red Hat Bugzilla.

Any help on the topic will be very much appreciated!


1 Answer


Running gluster volume heal brick01 enable resolved the issue.

This eventually added the reconfigured option cluster.self-heal-daemon: enable to the volume. It seems that, by default, the arbiter brick fails to heal (sync) at the same time a file operation occurs, and blames the other brick that is still up.
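For reference, a minimal sketch of the commands involved (using the volume name from the question; output will vary):

# enable the self-heal daemon for the volume
gluster volume heal brick01 enable

# confirm the option is now set on the volume
gluster volume get brick01 cluster.self-heal-daemon

# check whether any entries are still pending heal
gluster volume heal brick01 info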
