When I publish a service with a VIP, the advertised address does not route properly to the advertised port. For example, for a MariaDB Galera 3-node cluster service with a VIP specified as:
"labels": {
"VIP_0": "/mariadb-galera:3306"
}
On the configuration tab of the service page (and according to the docs), the load balanced address is:
mariadb-galera.marathon.l4lb.thisdcos.directory:3306
I can ping the DNS name just fine, but when I try to connect a front-end service (Drupal7, WordPress) to this load balanced address:port combination, I get numerous connection failures and timeouts. It isn't that it never works, but that it works only sporadically, if at all. Drupal7 dies almost immediately and starts returning Bad Gateway errors.
What I have found through experimentation is that if I specify a hostPort for the service in question, the load balanced address will work as long as I use the hostPort value, and not the advertised load balanced service port as above. In this specific case I specified a hostPort of 3310.
"network": "USER",
"portMappings": [
  {
    "containerPort": 3306,
    "hostPort": 3310,
    "servicePort": 10000,
    "name": "mariadb-galera",
    "labels": {
      "VIP_0": "/mariadb-galera:3306"
    }
  }
]
Then if I use the load balanced address (mariadb-galera.marathon.l4lb.thisdcos.directory) with the host port value (3310) in my Drupal7 settings.php, the front end connects and works fine.
I've noticed similar behaviour with custom applications connecting to MongoDB backends, also in a DC/OS environment. The load balanced address/port combination specified never seems to work reliably, but if you substitute the hostPort value, it does.
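To put a number on "works sporadically", a minimal sketch like the one below could be run from a container on the same network as the front end. It only counts how many plain TCP connects succeed against a given address and port; it does not speak the MySQL protocol. The function name `probe` and the attempt and timeout values are my own choices for illustration, and it assumes Python 3 is available where it runs.

```python
import socket


def probe(host, port, attempts=20, timeout=2.0):
    """Return how many of `attempts` TCP connects to host:port succeed."""
    ok = 0
    for _ in range(attempts):
        try:
            # create_connection resolves the name and opens a TCP connection;
            # the with-block closes the socket again immediately.
            with socket.create_connection((host, port), timeout=timeout):
                ok += 1
        except OSError:
            # Covers refused connections, timeouts and DNS failures alike.
            pass
    return ok


# Example comparison (hostname and ports taken from the setup described above):
#   host = "mariadb-galera.marathon.l4lb.thisdcos.directory"
#   print("VIP port 3306:", probe(host, 3306), "of 20")
#   print("hostPort 3310:", probe(host, 3310), "of 20")
```

Comparing the two counts would show whether the VIP port fails at the TCP level or only later, at the application level.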
The docs clearly state that:
address and port is load balanced as a pair rather than individually.
(from https://docs.mesosphere.com/1.9/networking/dns-overview/)
Yet I am unable to connect reliably when I specify the VIP-designated port, while it DOES WORK when I use the hostPort (and it will not work at all unless I designate a specific hostPort in the service definition JSON). Whether or not this approach is actually load balanced remains an open question to me, given the wording in the documentation.
I must be doing something wrong, but I am at a loss... any help is appreciated.
My cluster nodes are VMWare virtual machines.