
I have a Spark application and I run it with spark-submit like this:

/opt/spark/bin/spark-submit \
--master mesos://dispatcher_ip:7077 \
--driver-memory xG \
--executor-memory xG \
--total-executor-cores x \
--deploy-mode cluster \
<application jar and arguments>

When I run this Spark application from a remote machine or directly on a Mesos slave, it works as expected: I can see the framework/driver running on the Mesos master, and the logs show tasks running. I want to run the same application through Marathon, but when I do, the task starts on a Mesos slave, its state goes to FINISHED, and the Spark app dies quickly with "Executor asked to shutdown". I don't understand why the Spark app is not running. Can someone help me understand why Marathon is not able to launch the Spark app on Mesos?

Marathon config for the application:

{
  "id": "/zzzzzzzzz333",
  "cmd": "sh path_to/spark_app.sh",
  "cpus": 2,
  "mem": 2048,
  "disk": 0,
  "instances": 1,
  "constraints": [
    [
      "hostname",
      "CLUSTER",
      "mesos_slave_ip"
    ]
  ],
  "portDefinitions": [
    {
      "port": 10000,
      "protocol": "tcp",
      "labels": {}
    }
  ]
}

When I deploy the application from Marathon and check the task status on Mesos, the task state is FINISHED and the output is:

{
  "action" : "CreateSubmissionResponse",
  "serverSparkVersion" : "1.6.1",
  "submissionId" : "driver-20160917213046-0142",
  "success" : true
}

Output from the Mesos framework for the driver application:

I0917 22:20:10.152683 13370 exec.cpp:143] Version: 0.28.2
I0917 22:20:10.162206 13378 exec.cpp:390] Executor asked to shutdown

1 Answer


I could be mistaken, but my understanding is that the spark-submit task here only asks Spark to execute the job. The output you see is Spark more or less saying "job accepted", after which your command exits as expected. Because Marathon is intended to keep long-running applications alive, it will keep re-executing this command continuously. For something like this you may want to investigate Metronome (DC/OS Jobs) or Chronos instead, or just submit Spark jobs directly to the cluster, since this isn't quite what Marathon is designed for.
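
For reference, a one-shot submission could be expressed as a Metronome job instead of a Marathon app. This is only a minimal sketch assuming Metronome's job definition format (the id, resource values, and script path are illustrative, reusing the script from your Marathon config):

{
  "id": "spark-submit-once",
  "description": "Run spark-submit a single time (illustrative example)",
  "run": {
    "cmd": "sh path_to/spark_app.sh",
    "cpus": 1,
    "mem": 1024,
    "disk": 0
  }
}

Unlike a Marathon app, a job defined this way is expected to run to completion per trigger (or on a schedule), so a FINISHED state is the normal outcome rather than a failure to be restarted.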