I have a MySQL StatefulSet with just one replica, so there will always be exactly one container running MySQL. The first time this container runs, it should create the database schema and insert all the data (possibly via a Python script). Since everything is stored on persistent storage, no data should be lost on a restart, so the "database insert script" should not run again. Also, every now and then another script should run that searches for new records and inserts them. I thought about using a ConfigMap for the schema generation, but I think that would not be the right approach, since it would recreate the schema each time, right? So, what is the best way to do this?
1 Answer
I recommend that you use initContainers for housekeeping tasks such as restoring data or downloading initial content.
InitContainers are specialized containers that run before app containers in a Pod. Init containers can contain utilities or setup scripts not present in an app image.
You can specify init containers in the Pod specification alongside the containers array (which describes app containers). A Pod can have multiple containers running apps within it, but it can also have one or more init containers, which are run before the app containers are started.
Init containers are exactly like regular containers, except:
- Init containers always run to completion.
- Each init container must complete successfully before the next one starts.
If a Pod's init container fails, the kubelet repeatedly restarts that init container until it succeeds. However, if the Pod has a restartPolicy of Never, and an init container fails during startup of that Pod, Kubernetes treats the overall Pod as failed.
To specify an init container for a Pod, add the initContainers field into the Pod specification, as an array of container items (similar to the app containers field and its contents).
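As a minimal sketch of that layout (the Pod name, container names, images, and commands here are illustrative assumptions, not part of the original setup), the two arrays sit side by side under `spec`:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: init-demo                 # hypothetical name
spec:
  initContainers:                 # run one after another, each to completion
    - name: prepare
      image: busybox:1.36
      command: ["sh", "-c", "echo preparing && sleep 2"]
  containers:                     # started only after every init container succeeded
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "echo app is running && sleep 3600"]
```

If the `prepare` step exits with a non-zero status, the app container is not started until a retry of the init container succeeds.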
Refer to the use of initContainers in your context here: https://gist.github.com/hossainemruz/7926eb2660cc8a1bb214019b623e72ea
Below, init.sql contains your insert statements:
initContainers: # this init container download init.sql file using "curl -o <downloaded file name with path> <download url>" command.
  - name: init-script-downloader
    image: appropriate/curl
    args:
      - "-o"
      - "/tmp/data/init.sql" # we are saving downloaded file as init.sql in /tmp/data directory
      - "https://raw.githubusercontent.com/kubedb/mysql-init-scripts/master/init.sql" # download url
    volumeMounts:
      - name: init-script # mount the volume where downloaded file will be saved
        mountPath: /tmp/data
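The init container only downloads the file; something still has to execute it. One way this can work, sketched here under the assumption that you run the official `mysql` image, is to mount the same volume into the MySQL container at `/docker-entrypoint-initdb.d`. That image runs the scripts in this directory only when it initializes an empty data directory, so with a persistent volume the script should not run again after a restart. The StatefulSet name, labels, Secret name and key, storage size, and the `subPath` are assumptions for illustration:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql              # assumes a headless Service named "mysql" exists
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      initContainers:
        - name: init-script-downloader
          image: appropriate/curl
          args:
            - "-o"
            - "/tmp/data/init.sql"
            - "https://raw.githubusercontent.com/kubedb/mysql-init-scripts/master/init.sql"
          volumeMounts:
            - name: init-script
              mountPath: /tmp/data
      containers:
        - name: mysql
          image: mysql:8.0
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret        # hypothetical Secret
                  key: root-password        # hypothetical key
          ports:
            - containerPort: 3306
          volumeMounts:
            - name: data                    # persistent MySQL data directory
              mountPath: /var/lib/mysql
              subPath: mysql                # keeps the data in a subdirectory of the volume
            - name: init-script             # same volume the init container wrote to
              mountPath: /docker-entrypoint-initdb.d
      volumes:
        - name: init-script
          emptyDir: {}                      # scratch space shared within the Pod
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi                    # assumed size
```

Because `init-script` is an `emptyDir`, the file is downloaded again on every Pod start, but it is only executed on the first start, when the data directory is still empty.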
As for "Also, every now and then it should run another script, which will search for new records to insert and add them"
I recommend using a CronJob.
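A minimal sketch of such a CronJob follows. The image, script path, schedule, environment variable names, and Secret are all hypothetical placeholders; it also assumes the MySQL Pod is reachable as `mysql-0.mysql` through a headless Service in the same namespace.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: import-new-records                  # hypothetical name
spec:
  schedule: "0 * * * *"                     # assumed: once an hour, on the hour
  concurrencyPolicy: Forbid                 # do not start a run while the previous one is active
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: importer
              image: registry.example.com/record-importer:1.0   # hypothetical image with your script
              command: ["python", "/app/import_new_records.py"] # hypothetical script path
              env:
                - name: MYSQL_HOST          # hypothetical variable read by the script
                  value: mysql-0.mysql
                - name: MYSQL_PASSWORD      # hypothetical variable read by the script
                  valueFrom:
                    secretKeyRef:
                      name: mysql-secret    # hypothetical Secret
                      key: root-password    # hypothetical key
```

The script itself should be safe to run repeatedly, for example by only inserting records that are not yet present.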

-
As an alternative to using init containers, some Helm charts (like bitnami/mysql) allow you to provide your own SQL (DDL) scripts to initialize the db during installation. Or, if your app has its own Helm chart, another possibility is to use Helm hooks in your own chart to perform similar tasks. Not really sure which one is the best option, but I wanted to list these alternatives. – murtiko Jan 26 '22 at 17:07
-
Thank you very much for your answer! I understand what you are saying in general, but I have a few questions. First of all, once the SQL file is downloaded, when is it actually executed? Also, although I am not planning to scale this in any way, I am trying to create a StatefulSet so that the pod basically remains the same even if something happens and it crashes. So I think it would not have to re-download the file each time, since the MySQL data should be stored on the persistent volume, right? Again, thank you very much, and pardon me if I said something silly; I am very new to k8s. – Jan 26 '22 at 19:15
-
In a StatefulSet, your application state is maintained on the persistent volume. Every pod gets its own separate volume and has its own state. Init containers are executed before the actual containers are created. So what you will end up doing is mounting a volume into the pod, putting your init.sql there, and using this path in your init container to execute it inside the pod. Alternatively, you can download init.sql from an HTTP/HTTPS endpoint if you don't want to mount a volume. Once the init container has run, you can create a flag file to record that initialization is already done. – Rakesh Gupta Jan 26 '22 at 20:55
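The flag-file idea could look roughly like the fragment below, which goes under the Pod template's `spec`. It is only a sketch: the container name, the `busybox` image, the marker path, and the `init-state` subdirectory are assumptions, and the `echo` line stands in for whatever one-time work you actually need. It assumes a persistent volume named `data` whose MySQL data lives in a different subdirectory, so the marker does not end up inside the MySQL data directory.

```yaml
initContainers:
  - name: run-once-setup                    # hypothetical name
    image: busybox:1.36
    command:
      - sh
      - -c
      - |
        set -e                              # stop before writing the marker if a step fails
        if [ -f /state/initialized ]; then
          echo "already initialized, skipping"
          exit 0
        fi
        echo "running one-time setup"       # placeholder for the real one-time work
        touch /state/initialized            # marker survives restarts on the persistent volume
    volumeMounts:
      - name: data                          # the persistent volume of the StatefulSet
        mountPath: /state
        subPath: init-state                 # separate subdirectory for the marker file
```

On every later Pod start the init container finds the marker and exits immediately, so the one-time step is skipped.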