My Katib trials have very unbalanced (exponentially growing) memory requirements. Smaller trials are perfectly fine to run 16 in parallel on my 4-node cluster, but the larger ones use a lot of memory and their pods get OOMKilled by Kubernetes.
Ideally, I would like to control the degree of parallelism based on the hyperparameters chosen for each trial, but this doesn't seem to be possible in Katib.
Is there another way to prevent those trial pods from being scheduled in parallel and keep them in "Pending" until the resources are free again? Maybe at the Kubernetes level?