Will enabling the ondemand governor on an HPC cluster help save power? Are sleep states (C-states) enabled on HPC platforms? If not, what is the reason behind this?
1 Answer
It's a crapshoot. There's a certain amount of spin-up/spin-down latency associated with changing the CPU frequency, and this can hurt certain workloads (typically ones that are CPU-hungry but still memory-bound, so their CPU demand is uneven but high overall). Modern CPUs seem to handle frequency changes much better than those from even just two generations ago, so I imagine frequency scaling is something we'll be seeing a lot more of in the future.
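If you want to see what a given node is doing today, here is a minimal Python sketch that reads the governor, the reported frequency transition latency, and the C-state list from the Linux cpufreq and cpuidle sysfs files. It assumes a Linux node that exposes those interfaces under `/sys/devices/system/cpu`; the function name `read_cpu_power_settings` and its `sysfs_root` parameter are made up for illustration, and any file that is missing simply comes back as `None`.

```python
from pathlib import Path


def read_cpu_power_settings(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return the governor, transition latency, and idle states for one CPU.

    sysfs_root is a parameter only so the function can be pointed at a
    copy of the tree; on a real node the default path is the usual one.
    """
    cpu_dir = Path(sysfs_root) / f"cpu{cpu}"

    def read(path):
        # Files may be absent (no cpufreq driver, cpuidle disabled, etc.).
        try:
            return path.read_text().strip()
        except OSError:
            return None

    idle_states = []
    idle_dir = cpu_dir / "cpuidle"
    if idle_dir.is_dir():
        # Sort numerically so state10 comes after state2.
        state_dirs = sorted(
            idle_dir.glob("state[0-9]*"), key=lambda p: int(p.name[5:])
        )
        for state_dir in state_dirs:
            idle_states.append(
                {
                    "name": read(state_dir / "name"),
                    # The "disable" file holds 1 when the state is turned off.
                    "disabled": read(state_dir / "disable") == "1",
                }
            )

    return {
        "governor": read(cpu_dir / "cpufreq" / "scaling_governor"),
        # Reported by the kernel in nanoseconds.
        "transition_latency_ns": read(
            cpu_dir / "cpufreq" / "cpuinfo_transition_latency"
        ),
        "idle_states": idle_states,
    }


if __name__ == "__main__":
    print(read_cpu_power_settings())
```

An empty `idle_states` list or states marked as disabled would suggest that C-states are turned off on that node, either in the kernel or in firmware.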
Many organizations instead fully power off unused nodes and turn them back on dynamically when more nodes are needed, which may explain a lot of why frequency scaling isn't as widely used as it could be.
What it comes down to is that you need to split off a couple of frequency-scaled nodes into a separate queue, benchmark their execution times and power usage against the baseline nodes, and see whether it saves you money.
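As a rough illustration of that comparison, here is a small Python sketch that turns a measured runtime and average power draw for each kind of node into energy per job, slowdown, and energy saved. The function name `compare_nodes` and the numbers in the example are invented for illustration, not measurements; it also assumes you already have an average power figure per job (from a PDU, the BMC, or similar) and that electricity is the only cost you care about.

```python
def compare_nodes(base_runtime_s, base_avg_watts,
                  scaled_runtime_s, scaled_avg_watts,
                  price_per_kwh=0.0):
    """Compare one benchmark job on a baseline node vs a frequency-scaled node."""
    joules_per_kwh = 3.6e6  # 1 kWh = 3.6 million joules

    # Energy per job = average power * runtime.
    base_kwh = base_avg_watts * base_runtime_s / joules_per_kwh
    scaled_kwh = scaled_avg_watts * scaled_runtime_s / joules_per_kwh

    return {
        "base_kwh": base_kwh,
        "scaled_kwh": scaled_kwh,
        # Positive means the scaled node took longer to finish.
        "slowdown_fraction": scaled_runtime_s / base_runtime_s - 1.0,
        # Positive means the scaled node used less energy per job.
        "energy_saved_fraction": 1.0 - scaled_kwh / base_kwh,
        # Electricity cost difference per job (positive = money saved).
        "cost_saved_per_job": (base_kwh - scaled_kwh) * price_per_kwh,
    }


if __name__ == "__main__":
    # Made-up example figures: the scaled node is slower but draws less power.
    result = compare_nodes(
        base_runtime_s=1000, base_avg_watts=300,
        scaled_runtime_s=1100, scaled_avg_watts=250,
        price_per_kwh=0.12,
    )
    for key, value in result.items():
        print(f"{key}: {value:.4f}")
```

Note that a job which runs longer also ties up the node for longer, so whether a given energy saving is worth the slowdown depends on how you value node-hours on your cluster.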
