
I am trying to deploy MongoDB with Kubernetes (GKE, more precisely). This database is used by a microservice that only needs to read from it, so I thought of deploying multiple pods, each running a MongoDB container, so that the read load is shared between them. To do that I built a MongoDB image into which I copied the database I previously used with a single Docker container.

(Here is its Dockerfile; a single deployment of this image works in k8s, so I guess this may not be linked to the issue.)

FROM mongo:latest
EXPOSE 27017
# copy the pre-populated database files from the build context into the image
COPY /mdb/ /data/db

As the number of requests to the database varies during the day, I want to use GKE horizontal autoscaling for those "mongodb pods". Autoscaling itself works: new pods are created when CPU utilization goes over the target I set in my HorizontalPodAutoscaler (a sketch of such a manifest is shown below). But these new pods are not used by the Service I created for the Deployment that deploys those pods, and that's my issue.
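
For context, here is a minimal sketch of the kind of HorizontalPodAutoscaler meant above; the name, replica bounds and CPU target are illustrative placeholders, not my actual values:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mongodb-hpa            # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mongodb-deployment   # the Deployment shown below
  minReplicas: 1
  maxReplicas: 5               # assumed upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70 # assumed CPU target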

Something strange to me is that the new pods' local IP addresses do appear in my service's endpoints, and when I delete the initial pod, which is the only one doing any work at that moment, the other pods created by the autoscaler start being used, so I finally get a performance improvement. However, this is obviously not a solution for me, and moreover pods created after I deleted the initial one don't get used either.

Here are the YAML files for my MongoDB Deployment and Service:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb-deployment
  labels:
    app: mongodb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
      - name: mongodb
        image: "$my_mongodb_image_in_which_I_have_my_db"
        ports:
        - containerPort: 27017
        resources:
          requests:
            memory: "1800Mi"
            cpu: "3000m"
---
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
spec:
  type: LoadBalancer
  loadBalancerIP: $IP_reserved_for_this_service
  selector:
    app: mongodb
  ports:
    - protocol: TCP
      port: 80
      targetPort: 27017

And I am accessing those MongoDB pods through pymongo, in programs that run in another pod in the same GKE cluster:

from pymongo import MongoClient

def get_db(database: str):
    # connect through the LoadBalancer service; port 80 is forwarded to 27017
    client = MongoClient(host="$IP_reserved_for_this_service",
                         port=80,
                         username="...",
                         password="...",
                         authSource="admin")
    return client.get_database(database)
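
For example, this helper is then used along these lines (the database name here is just a placeholder):

db = get_db("mydatabase")          # placeholder database name
print(db.list_collection_names())  # simple read-only call to check connectivity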

This way of using and autoscaling MongoDB might be unusual and quite impractical, but it's only a first model for me and I would like to make it work (or understand why it can't).

Here are screenshots showing the behaviour of those pods:

State 1: only the initial pod is working, but all pod IPs appear in the service endpoints.

State 2: initial pod deleted; the others work now (except the new one created by the autoscaler after the deletion), and the endpoints are updated in the service (the update is in the "+ 1 more ...", I checked in the Google console).

I feel that the problem might come either from the configuration of my mongodb-service or from the way k8s or GKE deals with MongoDB images (since I'm new to k8s, I might be completely wrong about that too).

Any help or comment will be appreciated, and if you need more information let me know.

JujuPat

1 Answer


This is the sticky-connection behaviour of Kubernetes, a common and well-known characteristic. Kubernetes doesn't balance packets, it balances connections: once your app has established a connection to a Service, all requests to this Service go through that same connection, and Kubernetes doesn't guarantee that the next one will go via another connection. One option to solve this is a headless service (sketched below): https://kubernetes.io/docs/concepts/services-networking/service/#headless-services Another is a service mesh, but that's too much for your case ))
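
For illustration, a headless version of the Service from the question would look roughly like this (a minimal sketch based on the question's manifests; with clusterIP: None there is no virtual IP, and the DNS name mongodb-service resolves directly to the pod IPs):

apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
spec:
  clusterIP: None        # headless: no virtual IP, DNS returns the pod IPs
  selector:
    app: mongodb
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017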

pluk
  • 99
  • 4
  • Thanks for your answer. As I mentioned in my question, I want to access this service to connect to one of the MongoDB instances that I have deployed, i.e. to one of the pods created by my deployment file. However, it seems that a headless service has neither an external IP nor a cluster IP. I tried to get the DNS server address with `nslookup mongodb-service`, but it returns the error `server can't find mongodb-service: NXDOMAIN`, for which I haven't found an explanation or a solution in my case. Do you have an idea of where that comes from? – JujuPat Jul 19 '22 at 09:34
  • I found a way to get the server address by running busybox with `kubectl run temporary --image=radial/busyboxplus:curl -i --tty`, as mentioned here: [Setup a Headless Service in Kubernetes](https://blog.knoldus.com/what-is-headless-service-setup-a-service-in-kubernetes/). However, using this address in MongoClient the same way I used the service IP before (either the external or the cluster IP; both worked since my Python program runs in the same cluster) doesn't work, and I get a `500 Internal Server Error` when I send a request to my microservice. I probably don't understand well enough what this server address is. – JujuPat Jul 19 '22 at 11:24
  • As I understand it, you need to communicate with mongo inside your cluster, so why do you need an external IP? – pluk Jul 20 '22 at 08:12
  • Yes, that's what I want to do. But since I want to have multiple MongoDB instances, by having multiple pods, and I want a single access point to them (I don't mind which MongoDB my request goes to, since they are all the same), I am using a service for my deployment. But with a headless service I can't find any IP (external or internal) to send a request to. – JujuPat Jul 20 '22 at 08:34