I've got an Intel Linux cluster with IPMI interfaces on the nodes. Lately, the IPMI interfaces have been acting flaky. For example, I can no longer use IPMI commands to get the nodes to PXE boot, and rebooting them via IPMI works only some of the time.

I recently discovered that you can test the chassis and BMC with ipmitool, and this was the output:

# ipmitool chassis selftest
Self Test Results    : device error
                       [FRU Internal Use Area corrupted]

# ipmitool bmc selftest
Selftest: device corrupted
Internal Use Area corrupted

It looks like something has gone wrong. Is there any way to restore the IPMI interfaces to their original state? (Note that I don't know what the specific IPMI hardware is here, or how to query it to find out.)

Lorin Hochstein

3 Answers

1

Reflashing the firmware/BIOS might help, or hard power cycling the machine by unplugging it completely. The BMC runs on standby power, so a normal shutdown or reboot of the host does not restart it.

What type of machines are they?

James

I would first try to verify this on other machines of the same type. It's quite possible that all of your systems of that type share some IPMI oddity that ipmitool fails to handle correctly.
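One way to run that comparison is to ask every node's BMC for its self-test result over the LAN and line the answers up. This is only a rough sketch: `check_nodes` is a hypothetical helper, the host names and the `IPMI_USER` default are placeholders, and it assumes the BMCs accept the `lanplus` (IPMI v2.0) interface; older boards may need `-I lan` instead.

```shell
# Hypothetical helper: print the first line of the BMC self-test for each host.
# Assumes IPMI_PASSWORD is exported (read by -E) and IPMI_USER names a valid
# BMC account; both are placeholders, not values from this thread.
check_nodes() {
    for host in "$@"; do
        first_line=$(ipmitool -I lanplus -H "$host" -U "${IPMI_USER:-admin}" -E \
            mc selftest 2>&1 | head -n 1)
        printf '%s: %s\n' "$host" "$first_line"
    done
}
```

Calling it as `check_nodes node01 node02` (made-up names) gives one line per node, which makes it easy to see whether the failures follow a hardware or firmware revision. Running `ipmitool mc info` against a good and a bad node shows the manufacturer ID and firmware revision for that comparison.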

Phil Hollenback
  • Out of ten nodes, it works on two of them and fails on eight. It used to work, but started to fail a few months ago, and I don't think there was a software change. – Lorin Hochstein Jan 05 '10 at 14:28

Running ipmitool mc reset warm or ipmitool mc reset cold can sometimes help. Both restart the BMC itself rather than the host.
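In ipmitool the reset lives under the `mc` subcommand (`bmc` is accepted as an alias). A minimal sketch of resetting the controller and then re-running the self-test from the question follows. `reset_and_recheck` and `BMC_WAIT` are hypothetical names, the 60-second default is a guess at how long the BMC needs to come back, and the commands assume local access through the kernel IPMI driver.

```shell
# Hypothetical helper: cold-reset the local BMC, wait, then re-run the self-test.
# BMC_WAIT (seconds) is an assumed settling time; adjust it for your hardware.
reset_and_recheck() {
    ipmitool mc reset cold || return 1   # restarts the BMC, not the host
    sleep "${BMC_WAIT:-60}"              # give the controller time to reboot
    ipmitool mc selftest                 # same check as in the question
}
```

If the self-test still reports a corrupted Internal Use Area after a cold reset, the reset alone has not helped and the firmware reflash suggested in the other answer is the next thing to try.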

mgorven
  • Interestingly, many implementations log to RAM, and they log every access. If the RAM fills up, things may start to behave oddly (at the very least). Resetting those beasts usually clears the RAM. – U. Windl Apr 19 '21 at 09:38