I recently inherited a neglected cluster and am trying to do some sanity checks on it. Running a benchmark on node X and then running 'top' shows high CPU usage from the MPI processes (as expected), but on node Y 'top' shows 0% usage.

Is this normal? Is there another utility I can use that can monitor system resources correctly on a node?

Dan Anderson

1 Answer

It's not normal. The 'cluster' I inherited is actually a bunch of boxes connected through an InfiniBand switch without any load sharing, i.e. not a cluster at all.

A useful utility for monitoring cluster load is Ganglia. Configuration took a little finagling, but it works great if you're not already using other cluster management tools like Conga.
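
For a quick per-node sanity check before setting up Ganglia, something like the sketch below may help. It assumes Linux nodes that expose /proc/stat, and the function names (`parse_cpu_line`, `cpu_percent`, `sample_cpu_percent`) are my own for illustration, not part of any tool mentioned here. It measures overall CPU busy time between two samples, which is roughly what the summary line of 'top' reports.

```python
import time


def parse_cpu_line(stat_text):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Field order: user nice system idle iowait irq softirq steal ...
            # Only the first eight are summed; guest time is already counted in user.
            total = sum(values[:8])
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            return total - idle, total
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percent(before, after):
    """Percent of CPU time spent busy between two /proc/stat snapshots."""
    busy0, total0 = parse_cpu_line(before)
    busy1, total1 = parse_cpu_line(after)
    delta = total1 - total0
    return 0.0 if delta <= 0 else 100.0 * (busy1 - busy0) / delta


def sample_cpu_percent(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart (assumes a Linux node)."""
    with open(path) as f:
        first = f.read()
    time.sleep(interval)
    with open(path) as f:
        second = f.read()
    return cpu_percent(first, second)
```

Calling `sample_cpu_percent()` on each node while the benchmark runs should show quickly whether a node is doing any work at all. A node that stays near 0% is probably not receiving any MPI processes.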

Dan Anderson