5

We are looking into using Docker plus either Mesos/Marathon or Kubernetes for hosting a cluster. However, the one issue that we haven't really seen any answers for is how to allow clustered services to connect to each other correctly. All of the ones that I have seen need to know about at least one other node before they can join the cluster. Some need to know about every node. However, in Kubernetes and Mesos, there's no way to know what those IP addresses are ahead of time.

So, are there any best practices for this? If it helps, some technologies we're looking into deploying as containers are ElasticSearch, ActiveMQ, and MongoDB. There may be others.

blockcipher

4 Answers

2

However, the one issue that we haven't really seen any answers for is how to allow clustered services to connect to each other correctly.

I think you're talking about HA/replicated/sharded apps here.

At the moment, in Kubernetes, you can accomplish this by making an API call that lists all the "endpoints" of the service; that will tell you where your peers are running.
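A minimal sketch of that lookup, not an official client. It assumes the v1 Endpoints schema (`subsets[].addresses[].ip` and `subsets[].ports[].port`), which is newer than the API that existed when this answer was written, plus the usual in-cluster environment variables and service account files. The function names, the `default` namespace, and the sample data are made up for illustration.

```python
import json
import os
import ssl
import urllib.request

# Assumed in-cluster service account paths; adjust for your setup.
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def fetch_endpoints(service, namespace="default"):
    """GET the Endpoints object for a service from the in-cluster API server."""
    host = os.environ["KUBERNETES_SERVICE_HOST"]
    port = os.environ["KUBERNETES_SERVICE_PORT"]
    url = f"https://{host}:{port}/api/v1/namespaces/{namespace}/endpoints/{service}"
    with open(os.path.join(SA_DIR, "token")) as f:
        token = f.read().strip()
    ctx = ssl.create_default_context(cafile=os.path.join(SA_DIR, "ca.crt"))
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req, context=ctx) as resp:
        return json.load(resp)


def peer_addresses(endpoints, own_ip=None):
    """Return sorted 'ip:port' strings for all listed addresses except own_ip."""
    peers = set()
    for subset in endpoints.get("subsets") or []:
        ports = [p["port"] for p in subset.get("ports") or []]
        for addr in subset.get("addresses") or []:
            if addr["ip"] == own_ip:
                continue  # skip ourselves so we only list peers
            for port in ports:
                peers.add(f"{addr['ip']}:{port}")
    return sorted(peers)


# Hypothetical sample response, trimmed to the fields used above.
SAMPLE = {
    "subsets": [
        {
            "addresses": [{"ip": "10.0.0.1"}, {"ip": "10.0.0.2"}],
            "ports": [{"port": 9300}],
        }
    ]
}

if __name__ == "__main__":
    print(peer_addresses(SAMPLE, own_ip="10.0.0.1"))
```

A container entrypoint could call `peer_addresses(fetch_endpoints("my-service"), own_ip=...)` before starting the clustered service and pass the result in as its seed list.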

We'd eventually like to support the use case you describe in a more first-class manner.

I filed https://github.com/GoogleCloudPlatform/kubernetes/issues/3419 to maybe get something more standardized started here.

lavalamp
  • Exactly. I just used "clustered" since they run as a cluster of machines. We actually did something similar, but it seems kind of hackish. There's also the concern about leadership, what happens if nodes get added or removed, etc. Hopefully your idea helps. – blockcipher Jan 13 '15 at 13:14
1

I also wanted to set up an ElasticSearch cluster using Mesos/Marathon. As the existing "solutions" were either undocumented or not working/outdated, I set up my own container.

If you like, have a look at https://github.com/tobilg/docker-elasticsearch-marathon

If you have a running Marathon installation (I use v0.8.1), then setting up an ElasticSearch cluster should be a matter of a few minutes.

UPDATE:

The container now uses Elasticsearch v1.5.2 and is able to run on the latest Marathon v0.8.2.

Tobi
0

As for Kubernetes, it currently requires kube-controller-manager to be started with a --machines argument that lists the minion IPs or hostnames.

errordeveloper
  • It probably wouldn't be too difficult to replace this with an etcd- or mDNS-backed discovery mechanism, and the maintainers would surely welcome such an enhancement. – errordeveloper Jan 14 '15 at 13:27
  • Something like that already exists for Kubernetes, however that's after the container starts. Even then, the services themselves would have to support those particular forms of discovery, which may or may not happen depending on the project. – blockcipher Jan 15 '15 at 14:13
  • I don't think we are talking about the same thing... I read your question as being about forming the Kubernetes cluster itself, which currently requires you to provide the IPs/hostnames of all participating machines. It turns out you are asking about apps that run on Kubernetes, right? – errordeveloper Jan 15 '15 at 14:51
  • Actually, that was never my question. My question was about creating a service cluster, such as an ElasticSearch cluster, within a Kubernetes or Mesos environment. The issue is that for such a cluster to work, the nodes need to find each other, which becomes very difficult if you don't know their IP addresses until after they are started. – blockcipher Jan 15 '15 at 19:30
  • For that, ElasticSearch would use its multicast-based [Zen discovery](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html). If you use Kubernetes with Weave, as I showed in [my blog post](http://weaveblog.com/2014/11/11/weave-for-kubernetes/), you will get multicast working anywhere. – errordeveloper Jan 16 '15 at 11:35
0

I don't see an easy way to handle this correctly in Kubernetes right now. Yes, you could call the API that returns the list of endpoints, but you must also watch for changes and take action when the endpoints change...
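To illustrate the extra work, here is a rough sketch of such a watcher. It polls for simplicity, and everything in it is hypothetical: `fetch` stands for whatever call returns the current peer list, and `on_change` for whatever reconfigures the clustered service.

```python
import time


def diff_peers(previous, current):
    """Return (added, removed) peers between two snapshots, each sorted."""
    prev, cur = set(previous), set(current)
    return sorted(cur - prev), sorted(prev - cur)


def watch_peers(fetch, on_change, interval=5.0, max_polls=None, sleep=time.sleep):
    """Poll fetch() and call on_change(added, removed) whenever the peer set changes.

    max_polls=None polls forever; a finite value is handy for testing.
    """
    known = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        current = set(fetch())
        added, removed = diff_peers(known, current)
        if added or removed:
            on_change(added, removed)
            known = current
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval)


if __name__ == "__main__":
    # Hypothetical snapshots standing in for successive API responses.
    snapshots = iter([["10.0.0.1:9300"], ["10.0.0.1:9300", "10.0.0.2:9300"]])
    watch_peers(
        fetch=lambda: next(snapshots),
        on_change=lambda added, removed: print("added", added, "removed", removed),
        interval=0.1,
        max_polls=2,
    )
```

Even with this in place, the service itself still has to support adding and removing peers at runtime, which is the harder part.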

I would prefer Mesos/Marathon, which is well prepared for this scenario. You would implement a custom framework for Mesos. There is already a framework for ElasticSearch, listed here: http://mesos.apache.org/documentation/latest/mesos-frameworks/

Augi