I have a set of two Gluster peers and a Gluster volume replicated across them.

Everything works fine with the native GlusterFS client (reading, writing, automatic replication, and automatic failover) as long as both peers are up during the initial mount.

But if one of the peers is down while mounting from the client side, the mount takes a long time: specifically, 2 minutes when the Gluster domain name resolves to the peer that is up, and 4 minutes when it resolves to the peer that is down.

I know that while mounting with the native GlusterFS client, it tries to connect to all brick nodes specified in the volfile.

Is there any way to set a timeout for the mount, or can we make the client use any one of the up nodes instead of trying to connect to all brick nodes?
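For reference, a minimal sketch of the kind of mount invocation I mean, with placeholder hostnames peer1 and peer2 (my actual command, visible in the logs below, passes the same DNS name for both volfile servers):

    # Sketch with placeholder hosts: name the second peer as a fallback
    # volfile server so the volfile fetch can fail over if peer1 is down.
    mount -t glusterfs -o backup-volfile-servers=peer2 peer1:/test1 /root/test1_mount

Even with a fallback volfile server, the client still tries to connect to every brick listed in the fetched volfile, which is where the delay occurs.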

These are the logs for both scenarios. First, when both peers are up:

[2016-02-11 07:07:12.559354] I [rpc-clnt.c:1847:rpc_clnt_reconfig] 0-test1-client-1: changing port to 49160 (from 0)

[2016-02-11 07:07:12.560528] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-test1-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)

[2016-02-11 07:07:12.561447] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-test1-client-0: Connected to test1-client-0, attached to remote volume '/nfs_magnolia/test1'.

[2016-02-11 07:07:12.561461] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-test1-client-0: Server and Client lk-version numbers are not same, reopening the fds

[2016-02-11 07:07:12.561508] I [MSGID: 108005] [afr-common.c:3841:afr_notify] 0-test1-replicate-0: Subvolume 'test1-client-0' came back up; going online.

[2016-02-11 07:07:12.562281] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-test1-client-0: Server lk version = 1

[2016-02-11 07:07:12.563523] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-test1-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)

[2016-02-11 07:07:12.564454] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-test1-client-1: Connected to test1-client-1, attached to remote volume '/nfs_magnolia/test1'.

[2016-02-11 07:07:12.564466] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-test1-client-1: Server and Client lk-version numbers are not same, reopening the fds

Logs when one peer is down:

[2016-02-11 07:11:25.173459] I [MSGID: 100030] [glusterfsd.c:2318:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.6 (args: /usr/sbin/glusterfs --volfile-max-fetch-attempts=2 --volfile-server=nfs_cluster_storage.preprod.ngp.tesco.com --volfile-server=nfs_cluster_storage.preprod.ngp.tesco.com --volfile-id=/test1 /root/test1_mount)

[2016-02-11 07:11:25.187780] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1

[2016-02-11 07:13:32.455019] E [socket.c:2278:socket_connect_finish] 0-glusterfs: connection to 10.1.8.191:24007 failed (Connection timed out)

[2016-02-11 07:13:32.455089] E [glusterfsd-mgmt.c:1818:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: nfs_cluster_storage.preprod.ngp.tesco.com (Transport endpoint is not connected)

[2016-02-11 07:13:32.455103] I [glusterfsd-mgmt.c:1845:mgmt_rpc_notify] 0-glusterfsd-mgmt: connecting to next volfile server nfs_cluster_storage.preprod.ngp.tesco.com

[2016-02-11 07:13:35.209045] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2

[2016-02-11 07:13:35.209634] I [MSGID: 114020] [client.c:2118:notify] 0-test1-client-0: parent translators are ready, attempting connect on transport

[2016-02-11 07:13:35.213970] I [MSGID: 114020] [client.c:2118:notify] 0-test1-client-1: parent translators are ready, attempting connect on transport

[2016-02-11 07:13:35.216469] I [rpc-clnt.c:1847:rpc_clnt_reconfig] 0-test1-client-0: changing port to 49160 (from 0)

Abhijit

1 Answer

This is a known issue; I'm working on getting the fix http://review.gluster.org/#/c/11113/ merged.
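In the meantime, one client-side observation that may help (not part of the fix above): in your log, the gap between the mount starting (07:11:25) and the connect error (07:13:32) is 127 seconds, which matches the kernel's default TCP connect timeout (tcp_syn_retries=6). One possible workaround, a sketch assuming the stall really is that TCP connect timeout rather than a Gluster-level setting, is to lower it on the client so each unreachable peer fails fast:

    # Assumption: the stall is the kernel TCP connect timeout, not a
    # gluster option. 3 SYN retries is roughly 15 s instead of ~127 s.
    sysctl -w net.ipv4.tcp_syn_retries=3

Note this is a system-wide setting, so weigh it against other TCP clients running on the same host.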

itisravi