Marathon has first-class support for performing rolling (zero-downtime) upgrades on your applications. But what if you need to upgrade or reconfigure Mesos itself?
More specifically, is it possible to upgrade or reconfigure Mesos Master and Slave instances without causing any downtime?
Reconfiguring slaves in a rolling fashion should be trivial, since you can run redundant slave instances.
Would it be safe to upgrade a slave to a later version than the master? In other words, is the master kept forwards compatible with respect to the slaves?
According to the operational guide, it looks like it would be possible to take down one master node at a time in High Availability mode: http://mesos.apache.org/documentation/latest/operational-guide/
However, I wonder whether masters running different versions would be compatible with each other.
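To make the question concrete, this is roughly the rolling procedure I have in mind for both masters and slaves. It is only a sketch: `upgrade` and `is_healthy` are hypothetical placeholders that I would have to supply myself (for example, restarting the service on a host and then checking that it has rejoined the cluster), not anything provided by Mesos or Marathon.

```python
from typing import Callable, Iterable, List


def rolling_upgrade(
    nodes: Iterable[str],
    upgrade: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Upgrade nodes one at a time and stop at the first unhealthy node.

    Returns the nodes that were upgraded and passed their health check.
    """
    done: List[str] = []
    for node in nodes:
        # Hypothetical step: stop the node, install the new version, restart it.
        upgrade(node)
        if not is_healthy(node):
            # Halt here so the untouched nodes keep running the old version.
            raise RuntimeError(
                f"{node} is unhealthy after upgrade; upgraded so far: {done}"
            )
        done.append(node)
    return done
```

The point of going one node at a time is that the rest of the cluster keeps serving while a single node is down, which only works if old and new versions can coexist during the rollout.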
I suppose you could spin up a new Mesos cluster and migrate your existing workload across, but this seems like a pain.