
I'm facing some trouble trying to run mesos-dns dockerized on a mesos cluster.

I've set up 2 virtual machines with Ubuntu Trusty on a Windows 8.1 host. My VMs are called docker-vm and docker-sl-vm; the first one runs mesos-master and the second one runs mesos-slave.

Each VM has 2 network cards: one using NAT to access the internet through the host, and one Host-only adapter for internal communication.

The IPs for the VMs are:

  • 192.168.56.101 for docker-vm
  • 192.168.56.102 for docker-sl-vm

The Mesos cluster itself is running fine.

I am trying to follow this tutorial. So, I am running mesos-dns with the following marathon description:

{
    "args": [
        "/mesos-dns",
        "-config=/config.json"
    ],
    "container": {
        "docker": {
            "image": "mesosphere/mesos-dns",
            "network": "HOST"
        },
        "type": "DOCKER",
        "volumes": [
            {
                "containerPath": "/config.json",
                "hostPath": "/usr/local/mesos-dns/config.json",
                "mode": "RO"
            }
        ]
    },
    "cpus": 0.5,
    "mem": 256,
    "id": "mesos-dns",
    "instances": 1,
    "constraints": [["hostname", "CLUSTER", "docker-sl-vm"]]
}

and this config.json:

{
    "zk": "zk://192.168.56.101:2181/mesos",
    "refreshSeconds": 60,
    "ttl": 60,
    "domain": "mesos",
    "port": 53,
    "resolvers": ["8.8.8.8"],
    "timeout": 5,
    "email": "root.mesos-dns.mesos"
}

I am also running a test application called peek with the following description:

{
  "id": "peek",
  "cmd": "env >env.txt && python3 -m http.server 8080",
  "cpus": 0.5,
  "mem": 32.0,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "python:3",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 8080, "hostPort": 0 }
      ]
    }
  }
}

PROBLEM

In the tutorial, a dig command such as dig _peek._tcp.marathon.mesos SRV returns the following answer:

; <<>> DiG 9.9.5-3ubuntu0.5-Ubuntu <<>> _peek._tcp.marathon.mesos SRV
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57329
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; QUESTION SECTION:
;_peek._tcp.marathon.mesos. IN  SRV

;; ANSWER SECTION:
_peek._tcp.marathon.mesos. 60   IN  SRV 0 0 31000 peek-27346-s0.marathon.mesos.

;; ADDITIONAL SECTION:
peek-27346-s0.marathon.mesos. 60 IN A   10.141.141.10

;; Query time: 4 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Sat Oct 24 23:21:15 UTC 2015
;; MSG SIZE  rcvd: 160

There we can clearly see the port and IP bound to _peek._tcp.marathon.mesos, but when I run the same command on my slave machine, which is the one running this container, I get this result:

docker@docker-sl-vm:~$ dig _peek._tcp.marathon.mesos SRV

; <<>> DiG 9.9.5-3ubuntu0.5-Ubuntu <<>> _peek._tcp.marathon.mesos SRV
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 33415
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1280
;; QUESTION SECTION:
;_peek._tcp.marathon.mesos. IN  SRV

;; AUTHORITY SECTION:
.           10791   IN  SOA a.root-servers.net. nstld.verisign-grs.com. 2015102801 1800 900 604800 241

;; Query time: 1 msec
;; SERVER: 10.10.11.1#53(10.10.11.1)
;; WHEN: Wed Oct 28 17:06:30 BRT 2015
;; MSG SIZE  rcvd: 129

It looks like mesos-dns can't resolve _peek._tcp.marathon.mesos SRV.

Does anyone know why and how to fix it?

Thank you in advance...

UPDATE

Output of cat /etc/resolv.conf:

nameserver 10.10.11.1
nameserver 10.10.10.7
RafaelTSCS
  • I'm the author of said tutorial. Sorry to see you having issues. What does `cat /etc/resolv.conf` give you? – Michael Hausenblas Oct 29 '15 at 05:38
  • Hey Michael! Thank you for your tutorial and for your answer! I just updated the question with your request. – RafaelTSCS Oct 29 '15 at 15:25
  • I've read about some people having trouble with Ubuntu, Docker and DNS together: problems related to dnsmasq using port 53. So I turned it off, but still got no results. – RafaelTSCS Oct 29 '15 at 18:55
  • Can you try setting the first entry to `127.0.0.1` (see also http://stackoverflow.com/questions/30524236/new-to-mesos-marathon-how-to-deploy-a-new-self-defined-docker/30575078#30575078) and let me know if that changes anything? – Michael Hausenblas Oct 30 '15 at 08:40

1 Answer


Have a look at the Mesos DNS docs regarding Slave Setup:

To allow Mesos tasks to use Mesos-DNS as the primary DNS server, you must edit the file /etc/resolv.conf in every slave and add a new nameserver. For instance, if mesos-dns runs on the server with IP address 10.181.64.13, you should add the line nameserver 10.181.64.13 at the beginning of /etc/resolv.conf on every slave node.

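I think the local IP address (192.168.56.102) is missing from your /etc/resolv.conf. Your dig output points the same way: the reply came from SERVER: 10.10.11.1, so the query never reached mesos-dns.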
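As a rough sketch of the edit the docs describe, something like the following could add the entry. The function name prepend_nameserver is made up for illustration, and the IP is the slave address from the question, assuming mesos-dns really listens there on port 53:

```shell
#!/bin/sh
# Hypothetical helper, not part of Mesos-DNS: puts "nameserver IP" on the
# first line of FILE (default /etc/resolv.conf) unless IP is already the
# first nameserver listed there.
prepend_nameserver() {
    ip="$1"
    file="${2:-/etc/resolv.conf}"

    # Find the first nameserver currently configured, if any.
    first=$(awk '$1 == "nameserver" { print $2; exit }' "$file")
    if [ "$first" = "$ip" ]; then
        return 0
    fi

    # Build the new content in a temp file, then write it back in place
    # so the original file's permissions (and any symlink) are kept.
    tmp=$(mktemp)
    { printf 'nameserver %s\n' "$ip"; cat "$file"; } > "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Run as root on the slave, prepend_nameserver 192.168.56.102 would put the new entry first. Before touching the file at all, you can check that mesos-dns answers by querying it directly with dig @192.168.56.102 _peek._tcp.marathon.mesos SRV. Also keep in mind that on Ubuntu Trusty /etc/resolv.conf is normally generated by resolvconf, so a manual edit may be overwritten on the next network change.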

Otherwise, you can also try my minimal Mesos DNS image, but you'd still have to edit the above file.

Tobi