
I set up a Mesos cluster on virtual machines: machine A (master + agent) and machine B (agent only). Marathon and Mesos-DNS also run on machine A, and both agents support Docker.

I started a web application in Docker via Marathon; the Docker container runs in bridge network mode.

When I run one instance, the container starts normally and Mesos-DNS resolves it to the container's internal Docker IP (for example, 172.17.0.2). But because there is only one instance and the cluster has two agents, only one agent gets the container and the other runs nothing. If a client accesses the Mesos agent that has nothing running, it gets an error.

Concretely, the container is running on machine B and not on machine A. My Docker application is named test and listens on port 5000. When I run "curl http://test.marathon.mesos:5000/" on machine B, I get the correct response, but when I run the same command on machine A, I get the error "curl: (7) Failed to connect to test.marathon.mesos port 5000: No route to host". Mesos-DNS resolves the domain to the Docker internal IP 172.17.0.2, but that IP does not exist on machine A, because no container is running there.

I can also run many instances on the same agent node without any problems. But as far as I know, Mesos and Marathon place applications on agent nodes at random, so any agent node behind a load balancer could be accessed. If a client reaches an agent node without the container via the load balancer, that is a problem for the client.

My Mesos-DNS config file is below:

{
  "zk": "zk://10.11.54.103:2181,10.11.54.103:2182,10.11.54.103:2183/mesos",
  "masters": ["10.11.54.103:5050"],
  "refreshSeconds": 60,
  "ttl": 60,
  "domain": "mesos",
  "port": 53,
  "resolvers": ["10.11.255.1","10.11.255.2","4.2.2.2"],
  "timeout": 5,
  "httpon": true,
  "dnson": true,
  "httpport": 8123,
  "externalon": true,
  "listener": "10.11.54.103",
  "SOAMname": "ns1.mesos",
  "SOARname": "root.ns1.mesos",
  "SOARefresh": 60,
  "SOARetry": 600,
  "SOAExpire": 86400,
  "SOAMinttl": 60,
  "IPSources": ["netinfo", "mesos", "host"]
}

I would like Mesos-DNS to resolve the domain across the whole Mesos cluster. Any ideas?

Sam Ho

1 Answer


From the documentation:

Mesos-DNS with Docker

If you choose to use Mesos-DNS with Docker, with a version of Mesos after 0.25, be aware that there are some caveats. By default the Docker executor publishes the IP of the Docker container into the NetworkInfo field. Unfortunately, unless you're running some kind of SDN solution, bridged, or host networking with Docker, this can prove to make the containers unreachable.

The default configuration that Mesos-DNS ships with in config.json.sample omits netinfo from the sources. The default options if you omit this field from the configuration includes netinfo. If you have trouble with Docker, ensure you check the IPSources field to omit netinfo.

IPSources defines a fallback list of IP sources for task records, sorted by priority. If you use Docker, and enable the netinfo IPSource, it may cause tasks to become unreachable, because after Mesos 0.25, the Docker executor publishes the container's internal IP in NetworkInfo.
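
Applied to the config in the question, this would mean dropping netinfo from IPSources and leaving every other field unchanged. A minimal sketch showing only the changed field, assuming the two remaining sources from the question's list are kept in their original order:

```json
{
  "IPSources": ["mesos", "host"]
}
```

Mesos-DNS would likely need a restart to pick up the change. If the host source ends up being used, test.marathon.mesos should then resolve to the IP of the agent running the container rather than to 172.17.0.2. Note that with bridge networking the application would then be reached on whichever host port is mapped to container port 5000, which may not be 5000 itself.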

janisz