I am having an almost identical issue, so I can provide more details on how my setup looks:
Two servers, one replica 2 Gluster volume built from two bricks:

Brick IMG-01:/images/storage/brick1    49152  0    Y  3497
Brick IMG-02:/images/storage/brick1    49152  0    Y  3512
NFS Server on localhost                N/A    N/A  N  N/A
Self-heal Daemon on localhost          N/A    N/A  Y  3490
NFS Server on IMG-02                   N/A    N/A  N  N/A
Self-heal Daemon on IMG-02             N/A    N/A  Y  3505

Task Status of Volume gv1
------------------------------------------------------------------------------
There are no active volume tasks

To get HA, I mount the volume on the Gluster clients with this fstab entry:
IMG-01:/gv1 /mnt/glustervol1 glusterfs _netdev,backupvolfile-server=IMG-02,direct-io-mode=disable,log-level=WARNING,log-file=/var/log/gluster.log 0 0
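
When reproducing the failover, the same mount can be done by hand instead of waiting for a reboot. The following is only a minimal sketch, assuming the values from the mount entry above; the RUN switch is a hypothetical convenience added here, not a Gluster feature, and by default the script only prints the command.

```shell
#!/bin/sh
# Sketch: build the manual mount command equivalent to the mount entry above.
# All values are copied from that entry. _netdev is left out because it only
# affects boot-time ordering, not a manual mount.
VOLUME="IMG-01:/gv1"
MOUNTPOINT="/mnt/glustervol1"
OPTS="backupvolfile-server=IMG-02,direct-io-mode=disable,log-level=WARNING,log-file=/var/log/gluster.log"

MOUNT_CMD="mount -t glusterfs -o $OPTS $VOLUME $MOUNTPOINT"

# Print by default; set RUN=1 (as root, on a client) to actually mount.
if [ "${RUN:-0}" = "1" ]; then
    $MOUNT_CMD
else
    echo "$MOUNT_CMD"
fi
```

Printing first makes it easy to compare the options against the fstab line before touching a live client.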
The servers run glusterfs-server 3.7 on Ubuntu 16.04 and the clients run glusterfs 3.8 on Ubuntu 14.04. The Gluster servers communicate with each other over a direct InfiniBand connection on a /30 subnet, while the clients connect through a 1G Ethernet interface.
Now, whenever one of the servers is out for any reason, say a reboot or a service outage, the clients keep their connections but fail to read or write, and eventually the clients freeze as well. If the servers are replicas of each other and if th