
I've set up an environment with one Mesos master and three Mesos slaves. Marathon is running on it as a framework for Mesos jobs. I am trying to deploy a simple job:

{
    "id": "basic-0", 
    "cmd": "while [ true ] ; do echo 'Hello Marathon' ; sleep 5 ; done",
    "cpus": 0.1,
    "mem": 10.0,
    "instances": 1
}

But the job hangs in the Marathon web UI for a long time. I have also tried creating a Marathon job manually, but that one stays in the deployment state forever as well. I have no idea why it's not running. Any ideas?

Marcelo Camargo
Pankaj Saha
  • Is your marathon connected to the cluster? Mind sharing Marathon logs for triaging? – rukletsov Aug 13 '15 at 20:35
  • This could also happen if there are connectivity issues between the slave and the master. Could you please verify that you can actually execute tasks on the slaves? You can test that by running `GLOG_v=1 mesos-execute --command="/bin/ls/" --name=testTask --master=MASTER_IP`. – hartem Aug 13 '15 at 23:28
  • Yes, Marathon is connected to the Mesos cluster. I installed Marathon on the mesos-master node and can see the Marathon portal on port 8080. Sharing logs is a problem because my department will not allow me to share network information. I have tried pinging the Mesos slaves from the Mesos master, and they are reachable over SSH from the master node. – Pankaj Saha Aug 14 '15 at 17:45
  • In the logs for the mesos-master, do you see entries like `Sending # offers to framework UUID (marathon)...`? If yes then there is something about the offers that is insufficient for marathon to deploy the job. If no then marathon is just waiting around for a resource. If no then are you using Mesos roles by any chance? I'm having problems there. Also if no then marathon might be being starved by the Mesos resource sharing algorithm though that doesn't seem like the case. Also if no, have you tried restarting mesos-master and marathon? I've seen what I think are similar intermittent problems. – mab Aug 17 '15 at 16:01
  • I am also getting the same problem. Even after deploying the project, the Mesos framework does not show the project as activated. – Chintamani Jan 09 '16 at 12:00
  • This could be related to resources not being available. Check that your Mesos cluster is healthy and the slaves are properly connected. Then check the sandbox logs for any clue. – Salimane Adjao Moustapha Mar 24 '16 at 22:23
  • @hartem did you mean `--command="/bin/ls"`? That worked for me. Thanks – Peter Becich Nov 16 '16 at 20:55

1 Answer


Please check your Marathon logs; they state the reason why the job is still in the waiting state. It can be due to several issues, such as slaves deregistering frequently, insufficient resources, or constraint requirements that are not met.
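
To illustrate the "insufficient resources" case, here is a minimal Python sketch that checks whether any slave has enough unused CPU and memory for the job from the question. It assumes each slave entry carries `resources` and `used_resources` maps, similar to the `slaves` list in the Mesos master's state output; the hostnames and numbers are made up, and `agents_that_fit` is a hypothetical helper, not part of Mesos or Marathon. It ignores roles, constraints, disk, and ports, so treat it as a rough first check only.

```python
def agents_that_fit(app, agents):
    """Return hostnames of agents whose unused cpus and mem cover the app's request."""
    fits = []
    for agent in agents:
        total = agent.get("resources", {})
        used = agent.get("used_resources", {})
        # Unused capacity = total advertised minus what running tasks hold.
        free_cpus = total.get("cpus", 0.0) - used.get("cpus", 0.0)
        free_mem = total.get("mem", 0.0) - used.get("mem", 0.0)
        if free_cpus >= app["cpus"] and free_mem >= app["mem"]:
            fits.append(agent.get("hostname", "unknown"))
    return fits


# The job definition from the question (only the fields used here).
app = {"id": "basic-0", "cpus": 0.1, "mem": 10.0, "instances": 1}

# Hypothetical slave data, shaped like the assumed state output.
agents = [
    {"hostname": "slave-1",
     "resources": {"cpus": 2.0, "mem": 1024.0},
     "used_resources": {"cpus": 2.0, "mem": 512.0}},   # no CPU left
    {"hostname": "slave-2",
     "resources": {"cpus": 2.0, "mem": 1024.0},
     "used_resources": {"cpus": 0.5, "mem": 0.0}},     # enough of both
    {"hostname": "slave-3",
     "resources": {"cpus": 2.0, "mem": 8.0},
     "used_resources": {"cpus": 0.0, "mem": 0.0}},     # too little memory
]

candidates = agents_that_fit(app, agents)
print("Slaves that could run", app["id"], ":", candidates)
```

If the list comes back empty for your real cluster data, the job is most likely waiting for an offer that is large enough; if it is not empty, the logs are the place to look for constraint or connectivity problems.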

sathee005