We have installed the Spark service from the Marathon catalogue on a DC/OS cluster.

The JSON configuration of the service is as follows:

{
  "service": {
    "name": "spark",
    "cpus": 1,
    "mem": 1024,
    "role": "*",
    "service_account": "",
    "service_account_secret": "",
    "user": "root",
    "docker-image": "mesosphere/spark:2.3.1-2.2.1-2-hadoop-2.6",
    "log-level": "INFO",
    "spark-history-server-url": "http://internal-tfef5a-int-master-elb-1145533197.eu-east-1.elb.amazonaws.com/service/spark-history",
    "UCR_containerizer": false,
    "use_bootstrap_for_IP_detect": false
  },
  "hdfs": {
    "config-url": "http://api.hdfs.marathon.l4lb.thisdcos.directory/v1/endpoints"
  }
}

Given that the Marathon UI does not directly allow editing of the service port, what is the proper way to expose the dispatcher so that one can run spark-submit from their own workstation?

spark-mesos

1 Answer

You should run your dispatcher behind an edge proxy. See here for how to do it.
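A common edge proxy on DC/OS is Marathon-LB. Purely as an illustration (the app id, service port, and label values below are assumptions, not taken from the question), the dispatcher's Marathon app definition would carry a HAPROXY_GROUP label and a service port for Marathon-LB to expose on the public agents:

{
  "id": "/spark-dispatcher",
  "labels": {
    "HAPROXY_GROUP": "external"
  },
  "portDefinitions": [
    {
      "port": 17077,
      "protocol": "tcp",
      "name": "dispatcher"
    }
  ]
}

With something along these lines in place, spark-submit can target the public agent (or the load balancer in front of it) on the chosen service port instead of an internal cluster address.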

Another way is to run it on a public agent. This can be done by setting "acceptedResourceRoles": ["slave_public"], see here. A minimal sketch is shown below.
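For illustration only, this is roughly where the setting would sit in the dispatcher's Marathon app definition (the app id and resource values are placeholders, not taken from the installed service):

{
  "id": "/spark",
  "acceptedResourceRoles": ["slave_public"],
  "cpus": 1,
  "mem": 1024,
  "instances": 1
}

This only tells Marathon to place the task on a public agent; as the comments below note, it does not by itself give you a stable, routable endpoint.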

  • if I edit my JSON service configuration (for the `spark` service I installed from the catalogue) and add `"acceptedResourceRoles": ["slave_public"]`, will this make it automatically exposed / routable from my workstation? – pkaramol Nov 15 '18 at 13:45
  • Nope. It will only be available via an IP and port obtained manually from Marathon. To get a routable endpoint on one port you need to set up a proxy. – janisz Nov 15 '18 at 13:48
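To make the manual approach in the last comment concrete: the dispatcher task's host and ports can be read from Marathon's API (GET /v2/apps/spark, reachable on DC/OS under /service/marathon). A trimmed, purely illustrative excerpt of such a response (the host and port values are made up, and which port is the dispatcher's RPC port depends on the service):

{
  "app": {
    "id": "/spark",
    "tasks": [
      {
        "host": "10.0.3.15",
        "ports": [31002, 31003, 31004]
      }
    ]
  }
}

The resulting mesos://<host>:<port> address is what spark-submit would point at, but it changes whenever the task is rescheduled, which is why a proxy is the recommended way to get a stable endpoint.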