Tuesday 25 September 2012

Power management and NUT #1: an introduction

In this series of articles, I will be talking in depth about power management through the NUT project, its packaging on Debian, how to use it in general and how I see it being part of the GreenIT thing.

So, let's start with an introduction:

NUT is Free Software (GPL v2+ and v3, to be precise), originally created for power protection using UPSes, from homes to data centers.

To shortly describe the main features, I would say that NUT:

NUT used to stand for Network UPS Tools. That is, software for talking to your UPS and shutting down your systems when needed. This definition is a bit limited nowadays, since NUT supports 4 types of power devices:

  • UPS, obviously, since the origin.
  • PDU (power distribution units), for 4 years. These are, in a way, big intelligent power switches that you find in datacenters. They allow you to switch specific outlets on and off and / or to measure power consumption. The latter is more interesting for Green IT and PUE calculation. But this is the topic of another article ;-)
  • SCD (solar controller device), for more than 2 years. NUT only supports 1 SCD (IVT SCD series), but support for others can be added very easily.
  • PSU (power supply unit), for more than a year. This support is limited to server PSUs that are IPMI compatible.
  • meters and gensets are also new device types that are considered. Meters would provide more measurement capabilities, still mainly for PUE calculation, while gensets.

Considering this, is the name Network UPS Tools still suitable? Not really! But the acronym NUT is well known! So, for the time being, I'm just sticking with it and focusing on other, more important things until a better opportunity comes up (ideas and comments are welcome!).

That said, what can you do exactly with NUT? Currently, you can:

  • monitor and manage UPS(s) that protect(s) your system(s), with no redundancy limitation,
  • manage your PDU, to power on / off your systems (not servers directly!), and measure power consumption,
  • monitor and manage your servers' power supplies, power these on / off, and measure power consumption,
  • monitor all these links of the power chain that feed your servers,
  • discover all USB, SNMP and IPMI supported devices, locally or on the network.

All the above is available in a standardized way:

  • the manufacturer name will always be in the variable called device.mfr. The first outlet will always be outlet.1, whatever the device is (UPS or PDU here),
  • tons of command line tools, libraries / language bindings and software are available to help in NUT integration! You can even make your own NUT client implementation very easily and quickly (reported experiences are around 2 hours).
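To give an idea of what this standardization looks like from the command line, here is a minimal sketch. It assumes the upsc client shipped with NUT, which prints one "name: value" line per variable. The device name "myups" and the helper nut_var are hypothetical, for illustration only.

```shell
# Print the value of one variable from `upsc`-style "name: value" lines
# read on stdin. $1 is the variable name, e.g. device.mfr or outlet.1.status.
nut_var() {
    awk -v key="$1" 'index($0, key ": ") == 1 { print substr($0, length(key) + 3); exit }'
}

# Usage, assuming a device declared as "myups" on the local upsd:
#   upsc myups@localhost | nut_var device.mfr
# upsc can also return a single variable directly:
#   upsc myups@localhost device.mfr
```

Since the variable names do not depend on the device, the same helper should work unchanged whether "myups" is a UPS or a PDU.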

Well, this is already a long and dense post, so I will stop there for today. In the next post, we will have a deeper dive into using NUT, for various use cases: submit yours if you can ;-)

Tuesday 10 July 2012

Definitive solution to IPMI over LAN with Dell iDrac Express

I have this bunch of Dell R610 servers, with iDrac6 Express management cards. I used these, among other things, for developing IPMI support in NUT and working on Infrastructure & Cloud power management. But that's the topic of another post (still, if you're interested, check this and that).

The thing is that this "IPMI" monitoring development has been limited to local support (i.e., power supplies can't be monitored remotely by the nut-ipmipsu driver), due to an issue: any attempt to enable IPMI access over the network failed miserably!

Well, these attempts were limited to a couple of 15-minute runs, without much motivation, almost a year ago. The various firmwares were up to date (iDrac 1.70, ...), and everything was running and configured fine, locally. But still... no IPMI available through the network!

Looking on the Net, I learned that many Dell customers with iDrac Express cards were having the same issue. Dell support seems to have replaced tons of motherboards! There, I switched to other things, and time passed...

A good year later (last week), I decided that it was time to get back to this. And I've found the solution there.

Incredible: this was due to a 'bug' in the Broadcom NetXtreme II LoM (LAN on Motherboard) firmware! I've not had time to dig into this issue in depth, but here is a basic explanation, for what it's worth: some of the LoM initial self tests fail. Thus, the LoMs are not switched to managed mode, and can't actually be available for BMC management (thus no IPMI over the network). In my case, the tests were wrongly failing at 'A07', a test which tries to establish a Gigabit connection! Strangely, all these servers are connected to a Gb switch! Not a fully satisfactory answer, but that said, there is a solution, and I don't have much time to pour into this investigation (comments may always change my mind though!).

So here is a comprehensive procedure to fix this, from your Linux system, and using FreeDOS:

  • Get a USB key, at least 1.44 MB (damn!), but a good old 32 GB one will also do the trick ;-)
  • Open a terminal and format the USB key (WARNING: this will ERASE all data on the key! You've been warned. Really!)

$ mkfs.msdos /dev/sdX1

Note: 'X' is to be replaced by the exact name of your USB key. A hint: run 'tail -f /var/log/syslog' and unplug / replug your USB key. You will see some entries like "...sdb Attached SCSI removable disk". So, there, it's "sdb".
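Since formatting the wrong disk is unforgiving, a small guard may help. This is only a sketch: it assumes a Linux system where sysfs exposes a "removable" flag for each disk. The helper name is_removable is hypothetical, and some USB hard disks report 0 here even though they are external.

```shell
# Succeed only if the kernel flags the disk as removable.
# $1: disk name without /dev/ (e.g. "sdb")
# $2: sysfs root, /sys by default (only there so the check can be tried
#     against a fake directory tree)
is_removable() {
    [ "$(cat "${2:-/sys}/block/$1/removable" 2>/dev/null)" = "1" ]
}

# Example guard before formatting ('X' is still a placeholder to replace):
#   is_removable sdX && mkfs.msdos /dev/sdX1
```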

  • Download a FreeDOS image
  • Now use qemu to create the DOS boot disk on your USB key (replace 'X' again!):

$ qemu -boot a -fda balder10.img -hda /dev/sdX
A:\> sys c:
A:\> xcopy /E /N a: c:

Note that you will need "root" privileges.

  • (Optional) You can check the bootability with:

$ qemu -hda /dev/sdX

  • Download Broadcom DOS utilities there
  • Unzip this archive (self extract)

$ unzip Bcom_LAN_14.2.x_DOSUtilities_A03.exe

  • Copy only 'Userdiag/NetXtremeII/uxdiag.exe' to your USB key.
  • Plug the key in your server, and reboot it
  • Press F11 to enter the BIOS boot sequence,
  • Select the default entry, and press Enter. Once the system is booted, type:

C:\> uxdiag -t abcd -mfw 1

  • Reboot your system, and enjoy your *working* IPMI access over the network :-)
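To confirm that the BMC now answers over the network, a quick check from another machine could look like the sketch below. It assumes ipmitool is installed, which is not the tool used in this post, and it uses the IPMI 2.0 "lanplus" interface. The helper name ipmi_lan_ok, the address and the user name are hypothetical.

```shell
# Succeed if the BMC answers a "chassis status" request over the LAN.
# $1: BMC address, $2: BMC user name.
# The password is read from the IPMI_PASSWORD environment variable (-E),
# so that it does not appear on the command line.
ipmi_lan_ok() {
    ipmitool -I lanplus -H "$1" -U "$2" -E chassis status >/dev/null 2>&1
}

# Usage (placeholder address and user):
#   IPMI_PASSWORD='secret' ipmi_lan_ok 192.0.2.10 root && echo "IPMI over LAN works"
```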

For what it's worth (again), I just hope that it will be useful to others...

I will now prepare another post on using FreeIPMI to manage your servers, the GNU way...

Thanks to Jordi Clariana for his enlightening post, Daniel for this one, Aurélien for motivating me again to solve this iDrac Express issue, and Al Chu (FreeIPMI project leader) for all his invaluable help on IPMI.

Monday 27 September 2010

Penguin strikes back...

Seen during the last strike in France:

The (not so visible) slogan means "we are not penguins".

Well, we are not windows either! And why do you think that penguins are small defenseless things? For those living under a rock, the Linux mascot is a penguin. In other words, the emblem of the biggest IT revolution (I actually mean Free and Open Source software in general).

Jokes aside, the reasons for this strike are serious, and the way the government deals with this is probably suboptimal:

Having more workers on the market without having more work is simply nonsense! You will only be increasing unemployment, and so public taxes in the end.

Looking for a solution to social and economic crisis?

(thanks to Luc for the picture, Seb for his support and my brother Guillaume for the video link)