1

I am unable to start a 3 node universe with yb-master. I am following the docs here: https://docs.yugabyte.com/latest/deploy/manual-deployment/start-masters/#verify-health

I created 3 master.conf files for 3 separate ips.

For 10.0.0.185:

--master_addresses=10.0.0.185:7100,10.0.0.141:7100,10.0.0.119:7100
--rpc_bind_addresses=10.0.0.185:7100
--fs_data_dirs=/home/mark/yuga/y1

For 10.0.0.141:

--master_addresses=10.0.0.141:7100,10.0.0.185:7100,10.0.0.119:7100
--rpc_bind_addresses=10.0.0.141:7100
--fs_data_dirs=/home/mark/yuga/y1

For 10.0.0.119:

--master_addresses=10.0.0.119:7100,10.0.0.141:7100,10.0.0.185:7100
--rpc_bind_addresses=10.0.0.119:7100
--fs_data_dirs=/home/mark/yuga/y1

I started each node up with the command ./bin/yb-master --flagfile master.conf >& ./y1/yb-master.out &

What seems to happen is that the first 2 nodes start up fine but as soon as I try to spin up the third node, the first node crashes and I end up with the error: enter image description here

At first I thought may be this has to do with the servers I've got so I changed up the order I spin up the yb-masters, but it's always the first one I spin up first that dies.

Looking at the yb-master.INFO for each ip from yb1/yb-data/master/logs/yb-master.INFO with the command cat y1/yb-data/master/logs/yb-master.INFO | grep master I see:

The one that crashes:

This master's current role is: FOLLOWER

And the other two show:

I0110 00:02:56.565732  3292 client-internal.cc:2384] New master addresses: [10.0.0.141:7100,10.0.0.185:7100,10.0.0.119:7100, 10.0.0.141:7100, 10.0.0.185:7100, 10.0.0.119:7100]
E0110 00:02:58.069311  3162 async_initializer.cc:99] Failed to initialize client: Timed out (yb/rpc/rpc.cc:224): Could not locate the leader master: GetLeaderMasterRpc(addrs: [10.0.0.141:7100, 10.0.0.185:7100, 10.0.0.119:7100, 10.0.0.141:7100, 10.0.0.185:7100, 10.0.0.119:7100], num_attempts: 46) passed its deadline 1101.945s (passed: 1.504s): Network error (yb/util/net/socket.cc:551): recvmsg error: Connection refused (system error 111)
I0110 00:02:59.071501  3293 client-internal.cc:2355] Reinitialize master addresses from file: master.conf
I0110 00:02:59.071782  3293 client-internal.cc:2384] New master addresses: [10.0.0.141:7100,10.0.0.185:7100,10.0.0.119:7100, 10.0.0.141:7100, 10.0.0.185:7100, 10.0.0.119:7100]

and

I0110 00:02:57.610631  2128 master_service.cc:531] Patching role from leader to follower because of: Leader not ready to serve requests (yb/master/scoped_leader_shared_lock.cc:123): Leader not yet ready to serve requests: leader_ready_term_ = -1; cstate.current_term = 1 [suppressed 77 similar messages]
I0110 00:02:58.072002  2144 client-internal.cc:2355] Reinitialize master addresses from file: master.conf
I0110 00:02:58.072276  2144 client-internal.cc:2384] New master addresses: [10.0.0.119:7100,10.0.0.141:7100,10.0.0.185:7100, 10.0.0.119:7100, 10.0.0.141:7100, 10.0.0.185:7100]

I'm not sure why I'm seeing those errors, am I missing something while attempting to start up the 3 yb-masters?

I should also mention that I've ensured all 3 nodes have the correct system configurations, as mentioned here: https://docs.yugabyte.com/latest/deploy/manual-deployment/system-config/#setting-system-wide-ulimits

Mark
  • 113
  • 7
  • Are there any logs on the master that crashes ? Can you try setting up a cluster on a single machine with yugabyted cli? Also look at .ERROR,.WARNING logs. – dh YB Jan 12 '22 at 09:04

0 Answers0