I deployed a MongoDB replica set as a StatefulSet in Kubernetes. I'm running a bare-metal Kubernetes cluster, so I use MetalLB to expose Services of type LoadBalancer. For my MongoDB replica set, the exposed Services look like this:
NAME      TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)           AGE
mongo-0   LoadBalancer   10.43.199.127   172.16.24.151   27017:31118/TCP   55m
mongo-1   LoadBalancer   10.43.180.131   172.16.24.152   27017:31809/TCP   55m
mongo-2   LoadBalancer   10.43.156.124   172.16.24.153   27017:30312/TCP   55m
This works as expected, but the problem appears when I connect to the replica set from an external client:
➜ ~ mongo "mongodb://172.16.24.151:27017,172.16.24.152:27017,172.16.24.153:27017/?replicaSet=rs0"
MongoDB shell version v4.0.10
connecting to: mongodb://172.16.24.151:27017,172.16.24.152:27017,172.16.24.153:27017/?gssapiServiceName=mongodb&replicaSet=rs0
2019-07-05T10:47:27.058+0200 I NETWORK [js] Starting new replica set monitor for rs0/172.16.24.151:27017,172.16.24.152:27017,172.16.24.153:27017
2019-07-05T10:47:27.106+0200 I NETWORK [js] Successfully connected to 172.16.24.153:27017 (1 connections now open to 172.16.24.153:27017 with a 5 second timeout)
2019-07-05T10:47:27.106+0200 I NETWORK [ReplicaSetMonitor-TaskExecutor] Successfully connected to 172.16.24.151:27017 (1 connections now open to 172.16.24.151:27017 with a 5 second timeout)
2019-07-05T10:47:27.136+0200 I NETWORK [ReplicaSetMonitor-TaskExecutor] changing hosts to rs0/10.42.2.155:27017,10.42.3.147:27017,10.42.4.108:27017 from rs0/172.16.24.151:27017,172.16.24.152:27017,172.16.24.153:27017
2019-07-05T10:47:52.654+0200 W NETWORK [js] Unable to reach primary for set rs0
2019-07-05T10:47:52.654+0200 I NETWORK [js] Cannot reach any nodes for set rs0. Please check network connectivity and the status of the set. This has happened for 1 checks in a row.
2019-07-05T10:47:52.654+0200 E QUERY [js] Error: connect failed to replica set rs0/172.16.24.151:27017,172.16.24.152:27017,172.16.24.153:27017 :
connect@src/mongo/shell/mongo.js:344:17
At some point the log says "changing hosts to rs0/10.42.2.155:27017,10.42.3.147:27017,10.42.4.108:27017". Since those IPs are cluster-internal, an external client cannot reach them, and the connection fails from that point on.
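To make the failure mode concrete, here is a minimal Python sketch of what I understand the replica set monitor to be doing, based only on the log above: it contacts the seed addresses, then swaps them for the host list the set reports about itself. This is a simulation, not the shell's actual code. The helper names `discover_hosts` and `usable_hosts` and the `reachable` set are hypothetical and exist only for illustration.

```python
# Simulation only: no MongoDB connection is made.

# Seed list from my connection string (the MetalLB external IPs).
SEEDS = ["172.16.24.151:27017", "172.16.24.152:27017", "172.16.24.153:27017"]

# Host list taken from the "changing hosts to" log line.
ADVERTISED = ["10.42.2.155:27017", "10.42.3.147:27017", "10.42.4.108:27017"]


def discover_hosts(seeds, advertised, reachable):
    """Hypothetical helper: hosts the client uses after discovery.

    If at least one seed is reachable, the client learns the advertised
    member list and switches to it; otherwise it stays on the seeds.
    """
    if any(seed in reachable for seed in seeds):
        return list(advertised)
    return list(seeds)


def usable_hosts(hosts, reachable):
    """Hypothetical helper: the subset of hosts the client can connect to."""
    return [host for host in hosts if host in reachable]


# Assumption: from outside the cluster, only the external IPs are reachable.
external_client_can_reach = set(SEEDS)

hosts = discover_hosts(SEEDS, ADVERTISED, external_client_can_reach)
print(hosts)                                           # the 10.42.x.x list
print(usable_hosts(hosts, external_client_can_reach))  # []
```

In this model the initial connection succeeds, but after the switch none of the hosts are reachable from outside, which matches the "Unable to reach primary for set rs0" warning.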
Any suggestions on what I could do?