
I am very new to k8s and Docker, but I have a task on k8s, and now I'm stuck on a use case:

If a container is busy with requests, then incoming requests should be redirected to another container.

deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: twopoddeploy
  namespace: twopodns
spec:
  selector:
    matchLabels:
      app: twopod
  replicas: 1
  template:
    metadata:
      labels:
        app: twopod
    spec:
      containers:
      - name: secondcontainer
        image: "docker.io/tamilpugal/angmanualbuild:latest"
        env:
        - name: "PORT"
          value: "24244"
      - name: firstcontainer
        image: "docker.io/tamilpugal/angmanualbuild:latest"
        env:
        - name: "PORT"
          value: "24243"

          

service.yaml

apiVersion: v1
kind: Service
metadata:
  name: twopodservice
spec:
  type: NodePort
  selector:
    app: twopod
  ports:
  - nodePort: 31024
    protocol: TCP
    port: 82
    targetPort: 24243

From deployment.yaml, I created a pod with two containers from the same image. The idea behind our use case is that if firstcontainer is not reachable or is busy, then secondcontainer should handle the incoming requests. So (only to check the use case) I deleted firstcontainer using docker container rm -f id_of_firstcontainer. Now the site is not reachable until Docker recreates firstcontainer. But I need k8s to redirect the requests to secondcontainer instead of waiting for firstcontainer.

Then I googled for a solution and found Ingress and liveness/readiness probes. But Ingress routes requests based on the path, not on container status, and liveness/readiness probes just recreate the container. There are also some related questions where people use an nginx server, but I had no luck. That's why I'm creating a new question.

So my question is: how do I configure the two containers to reduce the downtime?

What keywords should I search on Google to find a solution and try it myself?

Thanks,

Pugal.

Have you explored load balancing (https://kubernetes.io/docs/concepts/services-networking/) in k8s? This should give you a clue on how to handle the requests in such scenarios. – ashu Dec 12 '20 at 18:02

1 Answer


A service can load-balance between multiple pods. You should delete the second copy of the container inside the deployment spec, but also change it to have replicas: 2. Now your deployment will launch two identical pods, but the service will match both of them, and requests will go to both.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: onepoddeploy
  namespace: twopodns
spec:
  selector: { ... }
  replicas: 2 # not 1
  template:
    metadata: { ... }
    spec:
      containers:
      - name: firstcontainer
        image: "docker.io/tamilpugal/angmanualbuild:latest"
        env:
        - name: "PORT"
          value: "24243"
      # no secondcontainer

This means if two pods aren't enough to handle your load, you can kubectl scale deployment -n twopodns onepoddeploy --replicas=3 to increase the replica count. If you can tell from CPU utilization or another metric when you're getting to "not enough", you can configure the horizontal pod autoscaler to adjust the replica count for you.
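If you want Kubernetes to manage the replica count for you, a minimal HorizontalPodAutoscaler sketch might look like the following (assumptions: the metrics-server addon is installed, the container declares CPU resource requests, and the name onepoddeploy-hpa plus the 70% target are illustrative, not values from the question):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: onepoddeploy-hpa   # hypothetical name
  namespace: twopodns
spec:
  scaleTargetRef:          # points at the deployment to scale
    apiVersion: apps/v1
    kind: Deployment
    name: onepoddeploy
  minReplicas: 2
  maxReplicas: 5           # illustrative upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # assumed target; tune for your workload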

(You will usually want only one container per pod. There's no way to share traffic between containers in the way you're suggesting here, and it's helpful to be able to independently scale components. This is doubly true if you're looking at a stateful component like a database and a stateless component like an HTTP service. Typical uses for multiple containers are things like log forwarders and network proxies, that are somewhat secondary to the main operation of the pod, and can scale or be terminated along with the primary container.)

There are two important caveats to running multiple pod replicas behind a service. The service load balancer isn't especially clever, so if one of your replicas winds up working on intensive jobs and the other is more or less idle, they'll still each get about half the requests. Also, if your pods are configured for HTTP health checks (recommended), then a pod that is backed up to the point where it can't handle requests will also be unable to answer its health checks, and Kubernetes will kill it off.
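For reference, a readiness/liveness probe sketch on the single container from the deployment above (the probe path / is an assumption; point httpGet at a real health endpoint if your app exposes one):

      containers:
      - name: firstcontainer
        image: "docker.io/tamilpugal/angmanualbuild:latest"
        env:
        - name: "PORT"
          value: "24243"
        readinessProbe:        # failing pods are removed from the service endpoints
          httpGet:
            path: /            # assumption: the app answers on its root path
            port: 24243
          periodSeconds: 5
          failureThreshold: 3
        livenessProbe:         # persistently failing containers get restarted
          httpGet:
            path: /
            port: 24243
          initialDelaySeconds: 10
          periodSeconds: 10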

You can help Kubernetes here by trying hard to answer all HTTP requests promptly (always aiming for under 1000 ms is probably a good target). This can mean returning a "not ready yet" response to a request that triggers a large amount of computation. It can also mean rearranging your main request handler so that an HTTP request thread isn't tied up waiting for some task to complete.

– David Maze