DC/OS with Chronos on a local Vagrant setup is very unreliable

Question

I have a local DC/OS deployment on which I also installed Chronos. My setup is one master, one agent, and the boot image: m1, a1, boot.

The problem is that the jobs I send to Chronos either don't get queued, don't seem to execute, or get executed very late, even though I specified that I want them to run right away. I always end up restarting Chronos, which gives me about 10 minutes of a responsive stack.
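For context, here is a minimal sketch of what a "run right away" job definition can look like. It assumes Chronos's ISO 8601 schedule format, in which an empty start time means "start immediately". The job name, command, owner address, and resource values are hypothetical placeholders, not the jobs from my setup.

```python
import json


def build_run_now_job(name, command, cpus=0.1, mem=128):
    """Build a Chronos-style job definition meant to run once, as soon as possible.

    All values here are illustrative assumptions, not taken from a real deployment.
    """
    return {
        "name": name,
        "command": command,
        # ISO 8601 repeating interval: R1 = one repetition,
        # empty start time = start immediately, PT1M = period between runs.
        "schedule": "R1//PT1M",
        # How late the job may still be started if the scheduled time is missed.
        "epsilon": "PT5M",
        "owner": "ops@example.com",  # hypothetical address
        "cpus": cpus,
        "mem": mem,
    }


job = build_run_now_job("hello-now", "echo hello")
payload = json.dumps(job, indent=2)
print(payload)
```

A payload like this would be sent as JSON in an HTTP POST to the Chronos scheduler's ISO 8601 job endpoint. The exact path prefix depends on the Chronos version, so check it against the installed release.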

I also tried multiple masters and multiple agents, with the same results. Raising the RAM and CPUs on both the master and the agent did not help either. There seems to be a time window after which the stack lags badly.

My second issue came up after some testing. I added jobs to Chronos that keep the agent's CPU pinned at 100% for a while, to see how it performs under load. After 2 minutes Chronos crashed and all my jobs failed at once. Is this also something I should expect in production?

I'm asking in the hope that this is only an issue with the local test deployment under Vagrant, before I go ahead with my project and move to production, which will cost quite a bit of money.

I kind of think it's a little difficult to judge from a local Vagrant-based installation how it would behave in PROD on dedicated hardware/VMs... Apart from that, it'd be great if you'd add some more details, like Chronos version, service logs etc. — Tobi, Dec 19 '16 at 10:45
It's unlikely that dcos-vagrant is the root cause here. It's more likely to be a Chronos or DC/OS issue. That said, it's impossible to know without actually having Chronos or DC/OS component logs to debug. — KarlKFI, Dec 19 '16 at 22:16

0 Answers