I'd like to use spare CPU capacity in our Kubernetes cluster for low-priority jobs (specifically ML training with TensorFlow, in this case) without depriving higher-priority services of CPU when they suddenly spike, akin to what OS process priority gives you. Currently we configure our autoscaler to add nodes once CPU usage exceeds 60%, which means as much as 40% of our CPU sits unused at all times.
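For concreteness, the training workload looks roughly like the sketch below (the names, image, and entrypoint are placeholders, not our actual manifest). It requests very little CPU and sets no CPU limit, so the idea is that it schedules onto existing nodes and bursts into whatever CPU happens to be free:

```yaml
# Hypothetical TensorFlow training Job intended to soak up otherwise-idle CPU.
apiVersion: batch/v1
kind: Job
metadata:
  name: tf-training-low-prio                    # placeholder name
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: trainer
          image: tensorflow/tensorflow:latest   # placeholder image
          command: ["python", "train.py"]       # placeholder entrypoint
          resources:
            requests:
              cpu: "100m"   # tiny request so the scheduler places it without reserving real capacity
            # no CPU limit, so the container can burst into idle CPU
```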
Questions: (1) Is this possible with K8s? After some experimentation (sketched below) it seems that Pod priority is not quite the same thing, since my lower-priority deployment does not instantly yield CPU back to my higher-priority deployment. (2) If it isn't possible, is there another commonly used strategy for utilizing intentionally overprovisioned CPU capacity while still yielding immediately to higher-priority services?
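For reference, the Pod priority experiment was roughly the following (the class names and priority values are illustrative, not our real ones): a low PriorityClass assigned to the training deployment via `priorityClassName`, and a much higher one assigned to the services. Even so, under CPU contention the lower-priority pods did not give CPU back immediately:

```yaml
# Illustrative PriorityClasses from the experiment; referenced from the pod
# specs via spec.priorityClassName.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority-training
value: -1000                  # well below the services' priority
preemptionPolicy: Never       # never evict other pods just to schedule training pods
globalDefault: false
description: "Opportunistic ML training; should yield to services"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-services
value: 100000
globalDefault: false
description: "Latency-sensitive services"
```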