* * * * *
The monitoring of uninterruptible power supplies
I've been dealing with UPS (Uninterruptible Power Supply) problems [1] for a
week and a half now, and things have finally calmed down a bit. Bunny's UPS
has been replaced, and I'm waiting for Smirk to order replacement batteries
for my UPS, so in the meantime I'm using a spare UPS from The Company.
Bunny suspects the power situation here at Chez Boca is due to some overgrown
trees interfering with the power lines, causing momentary fluctuations in the
power and basically playing hell with not only the UPSes but the DVRs as
well. This past Wednesday was particularly bad: the UPS would take a hit and
drop power to my computers, and by the time I got up and running again, I
would take another hit (three times, all within half an hour). It got so bad I
ended up climbing around underneath the desks rerunning power cables with the
hope of keeping my computers powered for more than ten minutes.
It wasn't helping matters that I was fighting my syslogd replacement [2]
during each reboot (but that's another post [3]).
So Smirk dropped off a replacement UPS, and had I just used the thing,
yesterday might have been better. But nooooooooooooooooo! I want to monitor
the device (because, hey, I can), but since it's not an APC [4], I can't use
apcupsd [5] to monitor it (Bunny's new UPS is an APC, and the one I have with
the dead battery is an APC). In searching for some software to monitor the
Cyber Power 1000AVR LCD [6] UPS, I came across NUT (Network UPS Tools) [7],
which supports a whole host of UPSes [8], and it looks like it can support
monitoring multiple UPSes on a single computer (functionality that apcupsd
lacks).
It's nice, but it does have its quirks (and caused me to have nuclear
meltdowns yesterday). I did question the need for five configuration files
and its own user accounting system, but upon reflection, the user accounting
system is probably warranted (maybe), given that you can remotely command the
UPSes to shut down. And the configuration files aren't that complex; I just
found them annoying. I also found the one process per UPS, plus two processes
for monitoring, a bit excessive, but the authors of the program were
following the Unix philosophy of small tools collectively working together.
Okay, I can deal.
The one quirk that drove me towards nuclear meltdown was that the USB
(Universal Serial Bus) “driver” (the program that actually queries the UPS
over USB) fails in “explore” mode (used to query the UPS for all its
information) when a particular directive is present in the configuration
file. So I have the following in the UPS configuration file:
> [apc1000]
> driver = usbhid-ups
> port = auto
> desc = "APC Back UPS XS 1000"
> vendorid = 051D
>
I try to run usbhid-ups in explore mode, and it fails. Comment out the
vendorid in the file, pass it on the command line instead, and it works. But
without the vendorid in the configuration file, usbhid-ups won't function in
normal operation (it's the interface between the monitoring processes and the
UPS).
It's bad enough that you can only use the explore mode when the rest of the
UPS monitoring software isn't running, but this? It took me about three hours
to figure out what was (or wasn't) going on.
[You can obviously generate kilowatt usage, yet I can't query for it over
USB? Not even as a vendor extension? You suck!] [9]
Then there was the patch I made to keep NUT from logging to syslogd every
second (I changed one line from “if result > 0 return else log error” to “if
result >= 0 return else log error”, since 0 isn't an error code). After that
I found this bug report [10] in the mailing list archive, and yes, that bug
was affecting me as well; after I applied the patch, I was able to get more
information from the Cyber Power UPS (and it didn't affect the monitoring of
the APC).
And their logging program, upslog, doesn't log to syslogd. It's not even an
option. I could, however, have it output to stdout and pipe that into logger,
but that's an additional four processes (two per UPS) just to log some stats
into syslogd. Fortunately, the protocol used to communicate with the UPS
monitoring software is well documented and easy to implement, so it was an
easy thing to write a script (Lua, of course) to query the information I
wanted to log to syslogd and run that every five minutes via cron.
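For illustration, here is a minimal sketch of that kind of query script,
written in Python rather than Lua (it is not the actual script). It assumes
upsd is listening on localhost on NUT's default TCP port 3493 and uses the
protocol's LIST VAR command. The function names, the “ups” syslog tag and the
choice of logged variables are made up for the example.

```python
import socket


def parse_list_var(text, ups):
    """Turn a LIST VAR response into a dict of variable name -> value."""
    values = {}
    prefix = f"VAR {ups} "
    for line in text.splitlines():
        if line.startswith("ERR "):
            raise RuntimeError(line)
        if not line.startswith(prefix):
            continue  # skips the BEGIN LIST / END LIST framing lines
        name, _, quoted = line[len(prefix):].partition(" ")
        # Values arrive in double quotes; backslash escapes inside the
        # value are NOT decoded in this sketch.
        if len(quoted) >= 2 and quoted[0] == '"' and quoted[-1] == '"':
            quoted = quoted[1:-1]
        values[name] = quoted
    return values


def query_ups(ups, host="localhost", port=3493, timeout=5.0):
    """Ask upsd for every variable of one UPS (host and port are assumptions)."""
    end = f"END LIST VAR {ups}"
    lines = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(f"LIST VAR {ups}\n".encode("ascii"))
        reader = sock.makefile("r", encoding="ascii", newline="\n")
        for line in reader:
            line = line.rstrip("\n")
            lines.append(line)
            if line == end or line.startswith("ERR "):
                break
    return parse_list_var("\n".join(lines), ups)


def log_ups(ups, wanted=("ups.status", "battery.charge",
                         "input.voltage", "ups.load")):
    """Write one syslog line with the selected variables for one UPS."""
    import syslog  # Unix-only; imported here so the parser works anywhere

    values = query_ups(ups)
    fields = " ".join(f"{k}={values[k]}" for k in wanted if k in values)
    syslog.openlog("ups")
    syslog.syslog(syslog.LOG_INFO, f"{ups}: {fields}")
```

A cron job that runs every five minutes could then call log_ups("apc1000")
once per configured UPS, which replaces the upslog-plus-logger pipelines with
a single short-lived process.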
Now, the information you get is impressive. apcupsd gives out rather terse
information like (from Bunny's system, which is still running apcupsd):
> APC : 001,038,0997
> DATE : Sat Apr 17 22:23:25 EDT 2010
> HOSTNAME : bunny-desktop
> VERSION : 3.14.6 (16 May 2009) debian
> UPSNAME : apc-xs900
> CABLE : USB Cable
> MODEL : Back-UPS XS 900
> UPSMODE : Stand Alone
> STARTTIME: Thu Apr 08 23:20:10 EDT 2010
> STATUS : ONLINE
> LINEV : 118.0 Volts
> LOADPCT : 16.0 Percent Load Capacity
> BCHARGE : 084.0 Percent
> TIMELEFT : 48.4 Minutes
> MBATTCHG : 5 Percent
> MINTIMEL : 3 Minutes
> MAXTIME : 0 Seconds
> SENSE : Low
> LOTRANS : 078.0 Volts
> HITRANS : 142.0 Volts
> ALARMDEL : Always
> BATTV : 25.9 Volts
> LASTXFER : Unacceptable line voltage changes
> NUMXFERS : 6
> XONBATT : Fri Apr 16 00:40:37 EDT 2010
> TONBATT : 0 seconds
> CUMONBATT: 11 seconds
> XOFFBATT : Fri Apr 16 00:40:39 EDT 2010
> SELFTEST : NO
> STATFLAG : 0x07000008 Status Flag
> MANDATE : 2007-07-03
> SERIALNO : JB0727006727
> BATTDATE : 2143-00-36
> NOMINV : 120 Volts
> NOMBATTV : 24.0 Volts
> NOMPOWER : 540 Watts
> FIRMWARE : 830.E6 .D USB FW:E6
> APCMODEL : Back-UPS XS 900
> END APC : Sat Apr 17 22:24:00 EDT 2010
>
NUT will give back:
> battery.charge: 42
> battery.charge.low: 10
> battery.charge.warning: 50
> battery.date: 2001/09/25
> battery.mfr.date: 2003/02/18
> battery.runtime: 3330
> battery.runtime.low: 120
> battery.type: PbAc
> battery.voltage: 24.8
> battery.voltage.nominal: 24.0
> device.mfr: American Power Conversion
> device.model: Back-UPS RS 1000
> device.serial: JB0307050741
> device.type: ups
> driver.name: usbhid-ups
> driver.parameter.pollfreq: 30
> driver.parameter.pollinterval: 2
> driver.parameter.port: auto
> driver.parameter.vendorid: 051D
> driver.version: 2.4.3
> driver.version.data: APC HID 0.95
> driver.version.internal: 0.34
> input.sensitivity: high
> input.transfer.high: 138
> input.transfer.low: 97
> input.transfer.reason: input voltage out of range
> input.voltage: 121.0
> input.voltage.nominal: 120
> ups.beeper.status: disabled
> ups.delay.shutdown: 20
> ups.firmware: 7.g3 .D
> ups.firmware.aux: g3
> ups.load: 2
> ups.mfr: American Power Conversion
> ups.mfr.date: 2003/02/18
> ups.model: Back-UPS RS 1000
> ups.productid: 0002
> ups.serial: JB0307050741
> ups.status: OL CHRG
> ups.test.result: No test initiated
> ups.timer.reboot: 0
> ups.timer.shutdown: -1
> ups.vendorid: 051d
>
Same information, but better variable names, plus you can query for any
number of variables. Not all UPSes support all variables, though (and there
are plenty more variables that my UPSes don't support, like temperature). You
can also send commands to the UPS (for instance, I was able to shut off the
beeper on the failing APC) using this software.
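To make the “same information, better names” point concrete, here is a small
sketch in Python that parses apcupsd's “KEY : value” status lines and renames
a handful of fields to their NUT-style equivalents. The mapping table is an
assumption made by eyeballing the two listings above, not something either
program provides, and the function names are hypothetical.

```python
# Hypothetical mapping: apcupsd field -> (NUT-style variable, scale factor).
# Chosen by comparing the two listings; not part of apcupsd or NUT.
APC_TO_NUT = {
    "LINEV": ("input.voltage", 1),
    "LOADPCT": ("ups.load", 1),
    "BCHARGE": ("battery.charge", 1),
    "BATTV": ("battery.voltage", 1),
    "TIMELEFT": ("battery.runtime", 60),  # apcupsd: minutes, NUT: seconds
}


def parse_apcupsd(text):
    """Split apcupsd status output into a dict of field name -> raw string."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so timestamps stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def to_nut_names(fields):
    """Rename the mapped fields and convert their leading number to float."""
    out = {}
    for apc_key, (nut_name, scale) in APC_TO_NUT.items():
        if apc_key in fields:
            number = float(fields[apc_key].split()[0])  # drop the unit text
            out[nut_name] = number * scale
    return out
```

Fed the listing from Bunny's system, to_nut_names(parse_apcupsd(text)) would
return a dictionary keyed by names such as input.voltage and battery.runtime,
with the runtime converted from minutes to seconds.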
So yes, it's nice, but its quirky nature was something I wasn't expecting
after a week of electric musical chairs.
[1]
gopher://gopher.conman.org/1Phlog:2010/04/07
[2]
gopher://gopher.conman.org/0Phlog:2010/02/09.1
[3]
gopher://gopher.conman.org/0Phlog:2010/04/18.1
[4]
http://www.apc.com/
[5]
http://sourceforge.net/projects/apcupsd/
[6]
http://www.cyberpowersystems.com/products/ups-systems/browse-by-category/intelligent-lcd-ups/CP1000AVRLCD.html?selectedTabId=overview&imageI=#tab-box
[7]
http://www.networkupstools.org/
[8]
http://www.networkupstools.org/compat/stable.html
[9]
gopher://gopher.conman.org/IPhlog:2010/04/17/ups.jpg
[10]
http://lists.alioth.debian.org/pipermail/nut-upsdev/2010-March/004673.html