6

I'm purchasing a Dell R610 with redundant power supplies. What is the best way to be alerted in the event that one of the power supplies fails? I'll be running Windows Server 2008 R2. Since this machine will be at a colocation facility, I won't hear the alarm.

4 Answers

9

Dell OpenManage generates events in the event log when it detects issues with a PSU. You can use any piece of software for notification that is capable of detecting specific events in the event log. You can also configure alert actions to run a program of your choice, which I suppose could be an emailer, etc.

I believe notifications are built into the IT Assistant component of Dell OpenManage, but it's a multicomponent suite and I'm not sure whether that's in the baseline piece. We use Microsoft's System Center Operations Manager for notifications, which is obviously overkill for a single server in a colo facility. IT Assistant would need to run on a separate system, IIRC. Depending on the colo facility, they may have an IT Assistant instance set up that you can hook into to receive alerts.

phoebus
7

ipmitool can probe the power supplies. I do this mainly on Linux machines, but ipmitool exists for Windows too.

# ipmitool sdr type "Power Supply"
Status           | 64h | ok  | 10.1 | Presence detected
Status           | 65h | ok  | 10.2 | Presence detected
PS Redundancy    | 74h | ok  |  7.1 | Fully Redundant

Just write a script to parse the output and send the result to your central monitoring host (or have it email you).
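As a rough illustration of the parsing step, here is a minimal Python sketch. It assumes the five-column, pipe-separated layout shown above and treats any status other than "ok", or a PS Redundancy reading other than "Fully Redundant", as a problem. The function names are hypothetical, and the exact strings ipmitool prints for a failed supply may differ on your hardware, so check them before relying on this.

```python
import subprocess

def parse_sdr(output):
    """Parse `ipmitool sdr type "Power Supply"` output into a list of dicts."""
    rows = []
    for line in output.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) != 5:
            continue  # skip blank lines and anything not in the 5-column layout
        name, sensor_id, status, entity, reading = fields
        rows.append({"name": name, "id": sensor_id, "status": status,
                     "entity": entity, "reading": reading})
    return rows

def find_problems(rows):
    """Return the rows that look unhealthy (assumed criteria, see lead-in)."""
    problems = []
    for row in rows:
        if row["status"] != "ok":
            problems.append(row)
        elif row["name"] == "PS Redundancy" and row["reading"] != "Fully Redundant":
            problems.append(row)
    return problems

def read_sdr():
    """Run ipmitool locally and return its text output."""
    result = subprocess.run(["ipmitool", "sdr", "type", "Power Supply"],
                            capture_output=True, text=True, check=True)
    return result.stdout

# Sample taken from the output shown above.
SAMPLE = """\
Status           | 64h | ok  | 10.1 | Presence detected
Status           | 65h | ok  | 10.2 | Presence detected
PS Redundancy    | 74h | ok  |  7.1 | Fully Redundant
"""
```

Calling `find_problems(parse_sdr(read_sdr()))` from a scheduled task and emailing any non-empty result would cover the basic case; how you deliver the alert is up to you.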

toppledwagon
0

If you're looking for a monitoring solution that's independent of the R610 itself, you could use a product such as the ITWatchDogs' WeatherGoose-II paired with a couple of CT-30-60-120 current transformers, one transformer for each of the two AC power connections on the back of the server. Then, if either power supply failed, its current draw would drop to zero (or close to it) and you could set the WeatherGoose-II to send an e-mail or SNMP trap when that occurs.

However, you would need to get an electrician to open up the power cord and separate the current-carrying "hot" wire from the trio inside the cord so that it can pass through the center of the transformer by itself. If you just clamp the CT around the entire power cord, the opposing currents in the hot and neutral wires will cancel out each other's magnetic fields and the CT will always "see" zero current draw.

Another possibility would be the RCU-H (which is made by another branch of IT Watchdogs' parent company, but should still be available through them), which is basically a "smart" rack-mount power strip that can individually monitor and control each outlet. Simply plug both power cords from the Dell server into two sockets on the RCU-H, and it too can monitor the current draw of each one and alert you if either power supply suddenly stops drawing current.

splattne
0

You can take a look at Nagios.

If you only want to monitor the power supply, then it would be overkill.

However, setting it up will allow you to monitor any alerts generated by OpenManage, such as RAID failures, memory issues, chassis fans, etc., as there is a Nagios plugin that queries OpenManage.
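To give a feel for what such a plugin does, here is a small Python sketch of the standard Nagios plugin convention: print one status line and exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). This is not the actual OpenManage plugin. The component names and the "ok"/"warning"/"critical" status strings are made up for illustration, and a real check would fill the dictionary from OpenManage.

```python
# Nagios plugin exit codes (standard convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

# Hypothetical status strings; anything not listed here maps to UNKNOWN.
STATUS_TO_CODE = {"ok": OK, "warning": WARNING, "critical": CRITICAL}

# The most severe state present decides the overall result.
PRIORITY = [CRITICAL, WARNING, UNKNOWN, OK]

def summarize(components):
    """Map {component name: status string} to (exit_code, status_line)."""
    if not components:
        return UNKNOWN, "UNKNOWN: no components reported"
    codes = {name: STATUS_TO_CODE.get(status, UNKNOWN)
             for name, status in components.items()}
    worst = next(code for code in PRIORITY if code in codes.values())
    if worst == OK:
        return OK, "OK: all %d components healthy" % len(codes)
    affected = sorted(name for name, code in codes.items() if code == worst)
    return worst, "%s: %s" % (LABELS[worst], ", ".join(affected))
```

A plugin script would print the returned status line and pass the code to `sys.exit()`, which is how Nagios decides whether to raise an alert.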

On top of that, you can monitor for excessive RAM usage, hard disks filling up, CPU %, etc.

You can also monitor the services you are providing on that box such as HTTP, SMTP, FTP etc.

Nagios is best run on a separate box, ideally one that is offsite from the equipment you are monitoring, so that you still get alerts during an outage. It doesn't require a lot of power and can easily be an older box sitting in your office or home.

You can set up alerts to email, cell phones, Firefox plugins, etc. You can configure escalation groups so that it sends an email first; if the problem isn't addressed, it texts guy #1, and if it's still not addressed, it texts guy #2, and so on.
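The escalation idea can be sketched in a few lines of Python. This only illustrates the logic, not how Nagios is configured (Nagios expresses escalations in its own configuration files). The contact strings and the tier table are hypothetical.

```python
# Hypothetical escalation tiers: (first notification number, contact).
# Notification 1 goes to email, 2 to the first on-call phone, 3+ to the second.
ESCALATION = [
    (1, "email:admins@example.com"),
    (2, "sms:oncall-1"),
    (3, "sms:oncall-2"),
]

def contact_for(notification_number, tiers=ESCALATION):
    """Return the contact for the nth unacknowledged notification (1-based)."""
    if notification_number < 1:
        raise ValueError("notification_number must be >= 1")
    chosen = tiers[0][1]
    for first, contact in tiers:
        # Tiers are sorted, so the last tier whose threshold is met wins.
        if notification_number >= first:
            chosen = contact
    return chosen
```

Each time a problem stays unacknowledged, the notification counter goes up by one and `contact_for` picks the next person in the chain.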

Basically, what you can do with Nagios runs pretty deep, and it is a great tool for any sysadmin.

ManiacZX