
I am trying to connect an external app to Cassandra which is running dockerized on a mesos cluster.

These are the apps I have running on Mesos:

CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                                                        NAMES
137760ce852a        cassandra:latest    "/docker-entrypoint.s"   15 minutes ago      Up 15 minutes       7000-7001/tcp, 7199/tcp, 9160/tcp, 0.0.0.0:31634->9042/tcp   mesos-1b65f33a-3d36-4bf4-8a77-32077d8d234a-S1.0db174cc-2e0c-4790-9cd7-1f142d08c6e2
fec5fc93ccfd        cassandra:latest    "/docker-entrypoint.s"   22 minutes ago      Up 22 minutes       7000-7001/tcp, 7199/tcp, 9160/tcp, 0.0.0.0:31551->9042/tcp   mesos-1b65f33a-3d36-4bf4-8a77-32077d8d234a-S1.0022a3d2-d695-43c4-b22f-f5274cbd03ce
ca729ee628bb        tobilg/mesos-dns    "./bootstrap.sh"         About an hour ago   Up About an hour                                                                 mesos-1b65f33a-3d36-4bf4-8a77-32077d8d234a-S1.12593777-2295-42fa-a56d-1d3cc9fc70ff
3921002a8a5b        python:3            "/bin/sh -c 'env >env"   About an hour ago   Up About an hour    0.0.0.0:31295->8080/tcp                                      mesos-1b65f33a-3d36-4bf4-8a77-32077d8d234a-S1.b101ab59-2538-416f-80cf-29215794bd37

The app called peek is used only for testing purposes. I can access it at http://192.168.56.101:10001 with no problems.

The two Cassandra instances are a seed node and a second node added for scaling up; together they form a cluster.

The JSON descriptions for deploying the Cassandra applications on Marathon are as follows:

/cassandra-seed

{
    "id": "cassandra-seed",
    "constraints": [["hostname", "CLUSTER", "docker-sl-vm"]],
    "container": {
        "type": "DOCKER",
        "docker": {
            "image": "cassandra:latest",
            "network": "BRIDGE",
            "portMappings": [ {"containerPort": 9042,"hostPort": 0,"servicePort": 0,"protocol": "tcp"} ]
        }
    },
    "cpus": 0.5,
    "mem": 512.0,
    "instances": 1,
    "backoffSeconds": 1,
    "backoffFactor": 1.15,
    "maxLaunchDelaySeconds": 3600
}

/cassandra

{
    "id": "cassandra",
    "constraints": [["hostname", "CLUSTER", "docker-sl-vm"]],
    "container": {
        "type": "DOCKER",
        "docker": {
            "image": "cassandra:latest",
            "network": "BRIDGE",
            "portMappings": [ {"containerPort": 9042,"hostPort": 0,"servicePort": 0,"protocol": "tcp"} ]
        }
    },
    "env": {
        "CASSANDRA_SEED_COUNT": "1",
        "CASSANDRA_SEEDS": "cassandra-seed.marathon.mesos"
    },
    "cpus": 0.5,
    "mem": 512.0,
    "instances": 1,
    "backoffSeconds": 1,
    "backoffFactor": 1.15,
    "maxLaunchDelaySeconds": 3600
}

The HAProxy configuration is as follows:

global
  daemon
  log 127.0.0.1 local0
  log 127.0.0.1 local1 notice
  maxconn 4096
  tune.ssl.default-dh-param 2048

defaults
  log               global
  retries           3
  maxconn           2000
  timeout connect   5s
  timeout client    50s
  timeout server    50s

listen stats
  bind 127.0.0.1:9090
  balance
  mode http
  stats enable
  stats auth admin:admin

frontend marathon_http_in
  bind *:80
  mode http

frontend marathon_http_appid_in
  bind *:81
  mode http

frontend marathon_https_in
  bind *:443 ssl crt /etc/ssl/xip.io/xip.io.pem
  mode http

frontend cassandra_10003
  bind *:10003
  mode tcp
  use_backend cassandra_10003

frontend cassandra-seed_10002
  bind *:10002
  mode tcp
  use_backend cassandra-seed_10002

frontend dns_10000
  bind *:10000
  mode tcp
  use_backend dns_10000

frontend peek_10001
  bind *:10001
  mode tcp
  use_backend peek_10001

backend cassandra_10003
  balance roundrobin
  mode tcp
  server docker-sl-vm_31634 192.168.56.102:31634

backend cassandra-seed_10002
  balance roundrobin
  mode tcp
  server docker-sl-vm_31551 192.168.56.102:31551

backend dns_10000
  balance roundrobin
  mode tcp
  server docker-sl-vm_31314 192.168.56.102:31314

backend peek_10001
  balance roundrobin
  mode tcp
  server docker-sl-vm_31295 192.168.56.102:31295

The application I am trying to connect to Cassandra is a Play application. I am setting it like this:

akka.persistence {
  journal.plugin = "cassandra-journal"
  snapshot-store.plugin = "cassandra-snapshot-store"
}

cassandra-journal.contact-points = ["192.168.56.101:10003"]
cassandra-snapshot-store.contact-points = ["192.168.56.101:10003"]

The app starts up OK, but when I try to access it, I get the following error:

! @6o380dcg9 - Internal server error, for (GET) [/issues/list] ->

play.api.Application$$anon$1: Execution exception[[TimeoutException: deadline passed]]
        at play.api.Application$class.handleError(Application.scala:296) ~[play_2.11-2.3.10.jar:2.3.10]
        at play.api.DefaultApplication.handleError(Application.scala:402) [play_2.11-2.3.10.jar:2.3.10]
        at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$14$$anonfun$apply$1.applyOrElse(PlayDefaultUpstreamHandler.scala:205) [play_2.11-2.3.10.jar:2.3.10]
        at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$14$$anonfun$apply$1.applyOrElse(PlayDefaultUpstreamHandler.scala:202) [play_2.11-2.3.10.jar:2.3.10]
        at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36) [scala-library-2.11.7.jar:na]
Caused by: java.util.concurrent.TimeoutException: deadline passed
        at akka.actor.dsl.Inbox$InboxActor$$anonfun$receive$1.applyOrElse(Inbox.scala:117) ~[akka-actor_2.11-2.4.0.jar:na]
        at scala.PartialFunction$AndThen.applyOrElse(PartialFunction.scala:189) ~[scala-library-2.11.7.jar:na]
        at akka.actor.Actor$class.aroundReceive(Actor.scala:480) ~[akka-actor_2.11-2.4.0.jar:na]
        at akka.actor.dsl.Inbox$InboxActor.aroundReceive(Inbox.scala:62) ~[akka-actor_2.11-2.4.0.jar:na]
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:525) ~[akka-actor_2.11-2.4.0.jar:na]
[error] c.d.d.c.Session - Error creating pool to /172.17.0.2:9042
com.datastax.driver.core.TransportException: [/172.17.0.2:9042] Cannot connect
        at com.datastax.driver.core.Connection.<init>(Connection.java:109) ~[cassandra-driver-core-2.1.5.jar:na]
        at com.datastax.driver.core.PooledConnection.<init>(PooledConnection.java:32) ~[cassandra-driver-core-2.1.5.jar:na]
        at com.datastax.driver.core.Connection$Factory.open(Connection.java:586) ~[cassandra-driver-core-2.1.5.jar:na]
        at com.datastax.driver.core.SingleConnectionPool.<init>(SingleConnectionPool.java:76) ~[cassandra-driver-core-2.1.5.jar:na]
        at com.datastax.driver.core.HostConnectionPool.newInstance(HostConnectionPool.java:35) ~[cassandra-driver-core-2.1.5.jar:na]
Caused by: org.jboss.netty.channel.ConnectTimeoutException: connection timed out: /172.17.0.2:9042
        at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:139) ~[netty-3.9.9.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83) ~[netty-3.9.9.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) ~[netty-3.9.9.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42) ~[netty-3.9.9.Final.jar:na]
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) ~[netty-3.9.9.Final.jar:na]
[error] c.d.d.c.Session - Error creating pool to /172.17.0.2:9042

Does anyone know how to fix this? What am I doing wrong?

Thank you in advance...

UPDATE =============================

Interestingly, the keyspaces for my application were created (akka and akka_snapshot):

cqlsh> describe keyspaces;
akka_snapshot  system_auth  system  system_distributed  system_traces  akka

UPDATE 2 =============================

I have just noticed that I can't even connect the app directly to the running cassandra (without going through the haproxy). So, I've changed the portMapping to:

"portMappings": [ {"containerPort": 9042,"hostPort": 0,"servicePort": 9042,"protocol": "tcp"} ]

and it worked. HOWEVER, it only allows me to start up one instance, because of the servicePort declaration.

The problem seems to lie in the port mapping. Any clue?

RafaelTSCS
  • What happens if you remove the `hostPort` and `servicePort` definitions from the Marathon app JSON? – Tobi Nov 06 '15 at 09:49
  • Hey Tobi! thanks for answering. I did what you asked, but nothing changed. – RafaelTSCS Nov 06 '15 at 11:43
  • Regarding your update 2), what happens if you use `"portMappings": [ {"containerPort": 9042,"protocol": "tcp"} ]` as I suggested? – Tobi Nov 09 '15 at 08:16
  • Nothing happens. It remains the same. – RafaelTSCS Nov 09 '15 at 16:18
  • That's not very specific :-/ I bet it's not exactly the same, because now the port 9042 shouldn't be already blocked. Please post some error messages, or it will be impossible to help. – Tobi Nov 09 '15 at 17:17
  • I mean, both instances come up, but I can't connect to either of them. It seems the service ports are unknown between host and container. I get no error messages other than the stack trace mentioned above. – RafaelTSCS Nov 09 '15 at 17:36
  • @RafaelTSCS: Hi, I see (in a comment below) that you have managed to make the nodes communicate between them. I am trying to make a similar deployment, however, the nodes cannot gossip. While in bridge mode, the Cassandra service starts in the docker container IP and not the host IP. However, dig shows me that the IP for Cassandra-seed.marathon.mesos is the host IP. Thus, I cannot provide a valid IP for seed in the new nodes. I would be grateful if you could post the final json that you used for deploying the seed and non-seed nodes? – Manolis Jan 10 '17 at 16:43
  • Hey @Manolis. I am sorry. I'd like to help you, but I don't have it anymore. It happened that we are not using it yet, as long as the system is not in production yet. I think my problem was something about the port and the physical machine. – RafaelTSCS Jan 11 '17 at 12:40
  • No problem @RafaelTSCS! Thank you! – Manolis Jan 11 '17 at 13:00

1 Answer


I understand you're using HAProxy for the service discovery of the Cassandra cluster. If so, it won't work unless you have a mechanism that updates the HAProxy configuration whenever the Marathon tasks change (scaling etc.).
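As a rough illustration of such an update mechanism, the sketch below renders HAProxy `backend` stanzas from task data. It assumes each task is a dict with `appId`, `host`, `ports`, and `servicePorts` fields, in the style of what Marathon's tasks API reports; the function name `render_backends` is hypothetical, and fetching the tasks and reloading HAProxy are deliberately left out.

```python
from collections import defaultdict


def render_backends(tasks):
    """Render HAProxy TCP backends from Marathon-style task dicts.

    Assumed task shape (an assumption, not taken from the question):
    {"appId": "/cassandra", "host": "192.168.56.102",
     "ports": [31634], "servicePorts": [10003]}
    """
    # Group (host, host_port) pairs by (app name, service port).
    servers = defaultdict(list)
    for task in tasks:
        app = task["appId"].strip("/").replace("/", "_")
        for host_port, service_port in zip(task["ports"], task["servicePorts"]):
            servers[(app, service_port)].append((task["host"], host_port))

    lines = []
    for (app, service_port), endpoints in sorted(servers.items()):
        lines.append(f"backend {app}_{service_port}")
        lines.append("  balance roundrobin")
        lines.append("  mode tcp")
        for host, host_port in sorted(endpoints):
            lines.append(f"  server {host}_{host_port} {host}:{host_port}")
        lines.append("")  # blank line between stanzas
    return "\n".join(lines)
```

Regenerating the stanzas on every task change, rather than editing them by hand, keeps the host ports in the backends in step with whatever Marathon assigned after a restart or a scale-up.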

The reason your Cassandra nodes can't talk to each other is presumably that the /cassandra app has no reference to /cassandra-seed.

According to the Cassandra Docker image docs, you should be able to configure the CASSANDRA_SEEDS env parameter dynamically.

To be able to use the service name cassandra-seed.marathon.mesos, it would be necessary to resolve it to an IP address first, IMHO:

"CASSANDRA_SEEDS": "$(host cassandra-seed.marathon.mesos | awk '/has address/ { print $4 }')"

would theoretically work (e.g. if your app has just one instance).

As you seem to use Mesos DNS, there can be a problem because currently (v0.4.0) only internal IP addresses are advertised (see Issue). You might have to fall back to a "real" Mesos DNS client which can resolve SRV records and correctly map them to Mesos Slave IP addresses and ports.

Or, you can parse the dig results yourself and use this as an input for the CASSANDRA_SEEDS env parameter:

dig _cassandra-seed._tcp.marathon.mesos SRV

See the Mesos DNS docs.
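Parsing the dig results yourself could look roughly like the sketch below. It assumes the query is run with `+short`, so that each answer line has the form `priority weight port target`; the function name `parse_srv_short` and the sample target name are made up for illustration.

```python
def parse_srv_short(dig_output):
    """Parse `dig +short ... SRV` lines into (target, port) tuples.

    Each usable line is expected to be: priority weight port target
    """
    records = []
    for line in dig_output.splitlines():
        fields = line.split()
        if len(fields) != 4 or not fields[2].isdigit():
            continue  # skip blank or unexpected lines
        _priority, _weight, port, target = fields
        # Drop the trailing dot of the fully qualified target name.
        records.append((target.rstrip("."), int(port)))
    return records


# Hypothetical sample line; real target names come from Mesos DNS.
sample = "0 0 31551 cassandra-seed-abc123-s1.marathon.slave.mesos.\n"
for target, port in parse_srv_short(sample):
    print(f"{target}:{port}")
```

Whether the resulting target still has to be resolved to an IP address before it goes into CASSANDRA_SEEDS depends on what the image accepts, so treat this only as the extraction step.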

mesosdns-cli can handle this, but it requires a Node.js runtime in the Docker container where it is used. You'd therefore have to create your own derivative of the cassandra Docker image.

Tobi
  • You've got a point. But how it comes that I can't even connect to cassandra using the slave's IP and service port? e.g. 192.168.56.102:10003. I can only do this if I bind the servicePort to 9042 and connect the app to 192.168.56.102:9042. – RafaelTSCS Nov 11 '15 at 15:12
  • Well, I don't know, but I also think that this is maybe a side issue. First, you have to make sure you can connect your Cassandra instances – Tobi Nov 11 '15 at 15:18
  • If I run "nodetool status" inside any of the 2 containers, it gives me 2 nodes. So, it means they can talk. But the external app can't. – RafaelTSCS Nov 11 '15 at 15:20
  • But if you connect to `192.168.56.102:31634` directly, it works...? – Tobi Nov 13 '15 at 07:48
  • Well. Sorry for the delay. I was sent to another issue. Answering to your question. It doesn't work on 192.168.56.102:31634. It only works if I define service port as 9042 on marathon. – RafaelTSCS Dec 02 '15 at 13:17
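To narrow down which of the endpoints discussed in these comments is reachable from the machine running the external app, a plain TCP check can help. This is a minimal sketch using only the standard library; the helper name `can_connect` is hypothetical.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling it for 192.168.56.101:10003, 192.168.56.102:31634, and 172.17.0.2:9042 would show which hop fails. Note that a successful connect to an HAProxy port only shows that HAProxy is listening, not that the Cassandra backend behind it answered.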