1

I have three machines: M1, M2, and M3. I deployed mesos-master, ZooKeeper, and Marathon on M1, and mesos-slave on M2 and M3. However, the Mesos GUI shows zero slaves. I then ran `mesos-resolve $(cat /etc/mesos/zk)` to check whether a slave discovers the correct master. It does not: the command incorrectly resolves the master to 127.0.0.1:5050. Below is the output of that command:

2015-07-31 15:38:02,522:17271(0x7f538b7cf700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=M1_IP:2181 sessionTimeout=10000 watcher=0x7f5392b130b0 sessionId=0 sessionPasswd=<null> context=0x7f5378003960 flags=0
2015-07-31 15:38:02,525:17271(0x7f5386dba700):ZOO_INFO@check_events@1703: initiated connection to server [M1_IP:2181]
2015-07-31 15:38:02,541:17271(0x7f5386dba700):ZOO_INFO@check_events@1750: session establishment complete on server [M1_IP:2181], sessionId=0x14ee590e0ec0008, negotiated timeout=10000
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0731 15:38:02.541931 17273 group.cpp:313] Group process (group(1)@127.0.0.1:53978) connected to ZooKeeper
I0731 15:38:02.542022 17273 group.cpp:787] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I0731 15:38:02.542045 17273 group.cpp:385] Trying to create path '/mesos' in ZooKeeper
I0731 15:38:02.545756 17273 detector.cpp:138] Detected a new leader: (id='1')
I0731 15:38:02.545891 17273 group.cpp:656] Trying to get '/mesos/info_0000000001' in ZooKeeper
W0731 15:38:02.547034 17273 detector.cpp:444] Leading master master@127.0.0.1:5050 is using a Protobuf binary format when registering with ZooKeeper (info): this will be deprecated as of Mesos 0.24 (see MESOS-2340)
I0731 15:38:02.547114 17273 detector.cpp:481] A new leading master (UPID=master@127.0.0.1:5050) is detected

Following the log, I looked up the value of the node /mesos/info_0000000001 in ZooKeeper on M1. It looked something like this:

!20150801-152910-16777343-5050-765???'"master@127.0.0.1:5050*
marathon-120.23.0

Mesos master setting (cat /etc/mesos/zk):

zk://M1_IP:2181/mesos

So it looks like the Mesos master on M1 is somehow not storing its real (non-loopback) IP address in the ZooKeeper node. Can anyone explain this strange behaviour?

Jacek Laskowski
Nitin

4 Answers

3

You may want to explicitly tell the master which IP to bind to; see the --ip flag.
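A minimal sketch of starting the master with an explicit bind address. It assumes a single-master setup with ZooKeeper on the same machine, as in the question; 192.0.2.10 is a placeholder for M1's routable address, and the work directory is only an example.

```shell
# Placeholder address for M1; replace it with the machine's real, routable IP.
MASTER_IP=192.0.2.10

# --ip sets the address the master binds to and advertises,
# so it no longer registers itself in ZooKeeper as 127.0.0.1.
mesos-master \
  --ip="$MASTER_IP" \
  --zk="zk://$MASTER_IP:2181/mesos" \
  --quorum=1 \
  --work_dir=/var/lib/mesos
```

After a restart, the "A new leading master" line in the mesos-resolve output should show this address instead of 127.0.0.1.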

rukletsov
  • You can also check your `/etc/hosts` file, see what `hostname` returns, and what you get when you try to `ping ` – Adam Aug 04 '15 at 03:23
1

In the /etc/mesos/zk file, specify your machine's IP address.

Example:

zk://192.168.0.1:2181/mesos

Make the same change on the Mesos slaves.

Rajiv Reddy
0

It's good practice to put the external interface IP in /etc/mesos-master/ip, so that this address is published to ZooKeeper instead of the localhost IP. You should do the same for the slaves as well.
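As a sketch of that setup: this assumes the Mesosphere packages, whose init wrapper turns each file under /etc/mesos-master/ and /etc/mesos-slave/ into the matching command-line flag. `set_mesos_ip` is a hypothetical helper, not part of Mesos, and the 192.0.2.x addresses are placeholders.

```shell
# Hypothetical helper: write an address to <conf_dir>/ip so that the
# service is started with the matching --ip flag.
set_mesos_ip() {
  conf_dir=$1
  addr=$2
  printf '%s\n' "$addr" > "$conf_dir/ip"
}

# Run as root on the master (placeholder address, use the machine's real IP):
#   set_mesos_ip /etc/mesos-master 192.0.2.10
# and on each slave, with that slave's own address:
#   set_mesos_ip /etc/mesos-slave 192.0.2.11
```

Restart the mesos-master or mesos-slave service afterwards so that the new value is picked up.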

Eren Güven
0

In my case, the problem was solved by replacing the loopback address (127.0.1.1) in /etc/hosts with the correct IP for eth0 (so that hostname -i returns the correct IP address). Then I restarted all the services and everything started working. Of course this will break if the IP address changes.
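The check described above can be sketched as a small script. It is only an illustration: `is_loopback` is a hypothetical helper name, and the script assumes a Linux host where `hostname -i` is available.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the address is a loopback address
# (anything in 127.0.0.0/8, or the IPv6 loopback ::1).
is_loopback() {
  case "$1" in
    127.*|::1) return 0 ;;
    *)         return 1 ;;
  esac
}

# Address(es) the local hostname resolves to, as reported by `hostname -i`.
addrs=$(hostname -i 2>/dev/null || true)

for addr in $addrs; do
  if is_loopback "$addr"; then
    echo "WARNING: hostname resolves to loopback address $addr; check /etc/hosts"
  else
    echo "OK: hostname resolves to $addr"
  fi
done
```

If the script prints a warning on the master, that matches the symptom in the question, where the master registered itself as 127.0.0.1:5050.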

I didn't see anything about this in the Mesos install instructions (maybe I overlooked it) but I've had to do this same thing for Hadoop installs to work correctly.

Clark Updike