
I have a specific scenario where I'd like to have a Deployment controlled by horizontal pod autoscaling (HPA). To handle database migrations when pushing a new deployment, I followed this excellent tutorial by Andrew Lock.

In short, you must define an initContainer that waits for a Kubernetes Job to complete a process (like running db migrations) before the new pods can run.

This works well. However, I'm not sure how to handle HPA after the initial deployment: if the system decides to add another Pod to my node, the initContainer defined in my Deployment requires the Job to exist and complete, but since Jobs are one-off processes, the new Pod cannot initialize and run properly (and the ttlSecondsAfterFinished attribute removes the Job anyway).

How can I define an initContainer that waits for my database migration Job when I deploy my app, while still allowing HPA to dynamically add Pods without them needing the Job (or the initContainer) at all?

Here's what my deployment looks like:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: graphql-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: graphql-pod
  template:
    metadata:
      labels:
        app: graphql-pod
    spec:
      initContainers:
        - name: wait-for-graphql-migration-job
          image: groundnuty/k8s-wait-for:v1.4 # This is an image that waits for a process to complete
          args:
            - job
            - graphql-migration-job # this job is defined next
      containers:
        - name: graphql-container
          image: image(graphql):tag(graphql)

The following Job is also deployed:

apiVersion: batch/v1
kind: Job
metadata:
  name: graphql-migration-job
spec:
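  # Clean up the Job (and its Pods) 30 seconds after it finishes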
  ttlSecondsAfterFinished: 30
  template:
    spec:
      containers:
      - name: graphql-migration-container
        image: image(graphql):tag(graphql)
        command: ["npm", "run", "migrate:reset"]
      restartPolicy: Never

So basically what happens is:

  • I deploy these two resources to my node
  • Job is initialized
  • initContainer on Pod waits for Job to complete using an image called groundnuty/k8s-wait-for:v1.4
  • Job completes
  • initContainer completes
  • Pod initializes
  • (after the 30-second TTL) Job is removed from the node

(LOTS OF TRAFFIC)

  • HPA realizes a need for another pod
  • initContainer for the NEW pod starts, but can't complete because the Job no longer exists
  • ...CrashLoopBackOff

Would love any insight on the proper way to handle this scenario!

Jordan Lewallen
  • Is it required to run the job for every new pod, or is this a one-time thing only? – papanito Aug 09 '21 at 05:16
  • @papanito Good question, it's only important to run the Job at the initial deployment. Other than that, no, the Job simply runs my DB migration script, which doesn't need to run when horizontally autoscaling. The useful feature of the initContainer and `wait-for` is that none of the new pods run until the Job is complete; then the new pods are scaled up and the old ones removed, so there's no downtime – Jordan Lewallen Aug 09 '21 at 05:28

1 Answer


There is, unfortunately, no simple Kubernetes feature to resolve your issue.

I recommend extending your deployment tooling/scripts to separate the migration Job from your Deployment. During the deploy process, you first execute the migration Job, wait for it to complete, and then roll out the Deployment. With no Job dependency in the Pod spec, the HPA can scale your Pods freely.

There are a number of ways to achieve this:

  • Have a bash (or similar) script that first creates the Job, waits for it to complete, and then updates your Deployment
  • Leverage more complex deployment tooling like Helm, which lets you add a 'pre-install' (and 'pre-upgrade') hook to your Job so it is executed when you deploy your application, as sketched below
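
A minimal sketch of the Helm route, assuming you package these manifests in a Helm chart: the helm.sh/hook and helm.sh/hook-delete-policy annotations are the standard Helm hook keys, and the rest mirrors the Job from your question.

apiVersion: batch/v1
kind: Job
metadata:
  name: graphql-migration-job
  annotations:
    # Run this Job before the rest of the release is installed/upgraded
    "helm.sh/hook": pre-install,pre-upgrade
    # Delete the previous hook Job before creating a new one on the next deploy
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  template:
    spec:
      containers:
      - name: graphql-migration-container
        image: image(graphql):tag(graphql)
        command: ["npm", "run", "migrate:reset"]
      restartPolicy: Never

Helm waits for a pre-install/pre-upgrade hook Job to succeed before applying the rest of the chart, so the initContainer (and the k8s-wait-for image) can be dropped from the Deployment entirely, and newly autoscaled Pods no longer depend on the Job existing.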
Lukas Eichler
  • Hi Lukas - thanks for the realistic response, as I figured this might be the scenario. I'll keep this open for a little bit longer just in case there are other options but a pre-install hook seems like the best compromise. – Jordan Lewallen Aug 11 '21 at 17:38