   IPMI HOWTO for Debian GNU/Linux on the Intel SR2300 (Server Board SE7501WV2)

How to set up IPMI under Linux (Debian / Sarge) on the Intel SR2300 Server Chassis (Intel Server Board SE7501WV2)

Introduction

This document describes how to set up Debian / Sarge to take advantage of the management features of the Intel SR2300. This chassis uses the Intel Server Board SE7501WV2, but nearly all of this is also relevant to other related Intel server motherboards (such as the SE7501BR2 and the SE7501HG2), and a lot of it will be relevant to other boards which implement IPMI v1.5 or later.

At the time of writing (July 2004), the Linux IPMI support is quite mature, but I found that information was on the sparse side, and getting a working system together seemed to require a lot of googling, reading of thick technical documentation, and stabbing in the dark. Hence this document - the purpose of which is to allow Semantico staff to recreate the IPMI-based installation which I carried out during July, but which will hopefully be helpful to others as well.

What is IPMI, and why should I care?

The original motivation for setting up IPMI for me was to make use of Serial Over LAN - this allows you to deploy these servers in a remote location, make only power, and Ethernet connections to each server, and yet still get nearly all of the benefits of expensive KVM, or other remote control systems - such as those built around serial concentrators, with:
  • Less wiring
  • Less hardware
  • Lower cost
IPMI stands for Intelligent Platform Management Interface and is an open standard for machine health, and control (including remote control), and is implemented by many hardware vendors - Intel is one of the originators, and early adopters of the standard.  Here are some useful things that IPMI can do on the SR2300 with Linux:
  • Check on hardware health, and report on problems (via the OS, or autonomously via the network)
  • Provide a watchdog timer (in case the OS goes away, or programs can otherwise not run, the machine will be reset)
  • Provide remote "lights out" access to both the Linux console, and the BIOS via ethernet (no serial concentrators, multi-port serial cards, or extra cabling required)
  • Provide remote, OS independent control over the reset, and power buttons via ethernet (no funny remote control power sockets, relays, or other hacks required)
  • Provide remote control of a server over a modem connection
  • Make emergency remote management possible from a variety of simple devices (e.g. PDAs)
If you would like to know more, then this document from the 2003 Linux Symposium provides more detail.  IPMI is a large standard, with a slight whiff of committee about it, so I'm just going to consider what I think are the most useful bits, and the implementation which the SE7501WV2 makes (this is supposed to be about a single type of server, after all).  Note that IPMI seems to have more than its fair share of TLAs.

How IPMI works, and jargon

It is useful to know a bit about how IPMI does its stuff - so I'll give an overview, and try to bust some weird IPMI/Intel jargon.  There is a second autonomous computer on the motherboard (or baseboard, in IPMI's politically correct / obfuscated language). This is a very simple, low power-consumption device, which should operate as long as power is connected to the machine (including when the majority of the server is powered down). In IPMI speak, this computer is called the BMC - the Baseboard Management Controller - and it uses its own firmware, which is independent of the system BIOS.  On the SE7501WV2, and particularly on the SR2300, the BMC is connected to:

  • The power (at all times)
  • The main PC via something that looks like a keyboard controller to the PC - the KCS interface
  • All of the hardware sensors on the motherboard via its i2c bus (this is why these boards show up no sensors with the normal lm_sensors drivers, whilst similar non-IPMI boards do)
  • Both of the NICs via a sneaky secondary interface to the NIC chipsets
  • In-line between Serial Port B's RJ45 connector, and the motherboard Super I/O controller
  • The "ID" button, and blue LED
  • The power, and reset switch circuitry
  • The SCSI backplane, and redundant power supplies
This means that you can talk to the BMC from the server itself under Linux, or from a remote machine via the network (if configured).  The IPMI standard allows for other interfaces as well.

Install

Getting the Software


The packages and tools that I used to gain access to IPMI functionality are:

  1. Kernel space tools:
    • The kernel.org 2.4.26 kernel (with the rmap patch, but this shouldn't make any difference to the IPMI side of things)
    • OpenIPMI - this provides local machine communication with the BMC - I had a crash with the kcs driver included in 2.4.26, so I updated to v30, there are now newer versions available.  If you didn't want to (or couldn't) use this, you could use the Intel bootable CD that comes with the board to set up IPMI for LAN access only, and do all of your access via this, instead - you would only be able to access IPMI from other machines via the LAN, in that case.
    • i2c v2.8.7, and lm-sensors v2.8.7 - with a minor patch to get it to work with newer versions of OpenIPMI - not needed if you do not want lm_sensors integration (not required to check sensor values, but probably good if you want to use the various other user-land utilities which have been written to the lm_sensors interface).
    • If you like, you can then add the patch to fix RTS/CTS serial console support to the kernel - I was unable to find an up to date version of the patch, but it could probably be manually patched in using the version for 2.4.22 (this is left as an exercise for the reader) - without this patch, you face either:
    1. Risking losing serial data from the console output, if you have not configured it for RTS/CTS hardware serial flow control - this happens if the BMC cannot send serial data over the LAN quickly enough, and fills up its buffers, thus dropping data.
    2. Having the kernel block (very bad - especially during a reboot) whilst timing out sending kernel output to the serial port, if the serial over LAN session is down
  2. User space tools:
    • ipmitool - a reasonable command line utility to interact with the BMC from a Linux box. It supports both LAN and OpenIPMI interfaces. It is not currently part of Debian, but includes the necessary files to build a Debian package in the tar ball (untar, chdir to the ipmitool top-level directory and run "dpkg-buildpackage")
    • The Intel DPC (Direct Platform Control) command line utilities (dpccli and dpcproxy) for Linux - these can be downloaded from the Intel website, under the support/downloads for the SE7501WV2 as part of the ISM (Intel Server Management) suite.  The downloaded ism*.exe is a self extracting zip file, and can be extracted on Debian using the unzip command.  The alien tool can be used to convert the Redhat8.0 rpm - Software/linux/cli/8.0/CLI-2.0-1.i386.rpm to a .deb

Setup

Setting up IPMI

ipmitool + OpenIPMI

I will assume here that you will want local access to the BMC from Linux, using the OpenIPMI drivers. The advantages of doing this, over using Intel's bootable CD, are:
  • Can alter BMC settings (e.g. passwords etc.) from within the OS
  • Can access the BMC from the local machine (if you only use the LAN interface, this is otherwise a tricky proposition)
  • Can make use of features such as the IPMI watchdog driver to automatically reset the machine on OS failure
  • Can be automated over many machines, and carried out remotely (e.g. using ssh)
The disadvantages are:
  • You may need to compile your own kernel (the OpenIPMI which shipped with 2.4.26 didn't seem reliable to me, YMMV)
Once you have what you feel is a suitable kernel installed, you will want to load the appropriate modules, on my machine these are:

ipmi_si_drv
ipmi_devintf
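
A minimal sketch of loading these modules by hand, and of listing them in /etc/modules so that Debian loads them at boot. The module names are the ones given above (other OpenIPMI versions may name them differently); the helper name ensure_module_listed is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: append a module name to a Debian-style modules
# file (one module per line) unless it is already listed there.
ensure_module_listed() {
    module=$1
    file=${2:-/etc/modules}
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    grep -qxF "$module" "$file" 2>/dev/null || echo "$module" >> "$file"
}

# Typical use, as root, on the target machine:
#   modprobe ipmi_si_drv && modprobe ipmi_devintf
#   ensure_module_listed ipmi_si_drv
#   ensure_module_listed ipmi_devintf
```

The helper is idempotent, so it can be re-run from an install script without duplicating entries.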

The kernel should say something like this (check /var/log/kern.log):

IPMI System Interface driver version v30, KCS version v30, SMIC version v30, BT version v30
ipmi_si: Found SMBIOS-specified state machine at I/O address 0xca2
 IPMI kcs interface initialized

If you aren't using devfs, ensure that you have an /dev/ipmi0 device for ipmitool to talk to:

# mknod -m 0600 /dev/ipmi0 c 254 0

Note that, as far as I know, the IPMI device is most likely to end up at device major number 254, but it will take devices from the 240-254 block, which according to linux/Documentation/devices.txt is "Allocated for local/experimental use. For devices not assigned official numbers, these ranges should be used in order to avoid conflicting with future assignments." I believe that this is because device numbers are no longer being "officially" assigned, in preparation for the introduction of fully dynamic device number allocation. So, if you have another driver which uses character device numbers from this block, and this other driver gets there first, then ipmi will end up at c 253 0 or lower...

To check where your ipmidev has ended up,

# cat /proc/devices
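
Since the major number can move, the lookup can be scripted rather than read by eye. The following is only a sketch: it assumes the driver registers itself under the name "ipmidev" in /proc/devices, as described above, and the function name ipmi_major is made up for this example.

```shell
#!/bin/sh
# Print the character-device major number registered under the name
# "ipmidev" in a /proc/devices-style listing (default: /proc/devices).
# Prints nothing if the driver is not loaded.
ipmi_major() {
    awk '$2 == "ipmidev" { print $1; exit }' "${1:-/proc/devices}"
}

# Typical use, as root, instead of hard-coding 254:
#   major=$(ipmi_major)
#   [ -n "$major" ] && mknod -m 0600 /dev/ipmi0 c "$major" 0
```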


Testing ipmitool + OpenIPMI

You should now be able to speak to the BMC using ipmitool locally, e.g.

# ipmitool -I open chassis status
System Power         : on
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : previous
Last Power Event     : ac-failed
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false

If this works, you may want to try out the following, and/or have a look through the manual page to see what else you can do -

# ipmitool -I open sdr list
# ipmitool -I open sel list
# ipmitool -I open chassis identify 1
# ipmitool -I open chassis identify 0
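
For unattended health checks, the sensor listing can be filtered down to the sensors that are reporting trouble. This is a sketch under assumptions: it supposes that "sdr list" prints one sensor per line as "name | reading | status", and that a healthy sensor shows "ok" and an unreadable one "ns". Check both against the output of your ipmitool version before relying on it. The function name sdr_problems is made up for this example.

```shell
#!/bin/sh
# Hypothetical filter: read "ipmitool sdr list" output on stdin and
# print only sensors whose status column is neither "ok" nor "ns"
# (no reading). ASSUMPTION: fields are separated by "|" and the status
# is the third field.
sdr_problems() {
    awk -F'|' 'NF >= 3 {
        status = $3
        gsub(/^[ \t]+|[ \t]+$/, "", status)   # trim surrounding blanks
        if (status != "ok" && status != "ns") print
    }'
}

# Typical use:
#   ipmitool -I open sdr list | sdr_problems
```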

ipmitool + IPMI over LAN

As shipped, the BMC in the WV2 boards doesn't listen on the LAN interfaces - it must be configured to do so. There are at least two ways of doing this; the ones I know about are:

  • The bootable CD that ships with the Intel motherboards - I haven't used this, but there is documentation in Intel's Platform Guide document
  • Using ipmitool, with the OpenIPMI driver to set up LAN access on the local machine
    • The /usr/bin/bmcautoconf.sh script which comes with ipmitool will automate the majority of the setup - this currently needs some editing to select the correct ethernet interface - also note that unpatched versions of this script will need to be altered on Debian to use gawk instead of awk (you may need to install gawk), and a few binary paths may be wrong
    • Unless you are feeling very trusting, you will need to set up passworded access - IPMI includes the concept of multiple users, and privilege levels, but I have not looked into this closely.  To set up a single password which gives full access (this is partially set up by the bmcautoconf.sh script), run these commands to set the password for both of the SE7501WV2's interfaces:
      • ipmitool -I open lan set 6 password <your password here>
      • ipmitool -I open lan set 7 password <your password here>
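
The two password commands above can be wrapped in a small loop when setting up many machines. This is a sketch only: the function name set_lan_password and the IPMITOOL override variable are made up for this example, and channels 6 and 7 are simply the ones used above for the SE7501WV2.

```shell
#!/bin/sh
# Set the same LAN password on both BMC LAN channels (6 and 7 on the
# SE7501WV2, as in the commands above). IPMITOOL can be overridden,
# e.g. IPMITOOL=echo for a dry run that only prints the arguments.
set_lan_password() {
    password=$1
    for channel in 6 7; do
        ${IPMITOOL:-ipmitool} -I open lan set "$channel" password "$password" || return 1
    done
}

# Typical use, as root, on the target machine:
#   set_lan_password 'your password here'
```

Bear in mind that a password given on the command line is visible to other local users in the process list while the command runs.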

Testing ipmitool + IPMI over LAN

You will now want to test IPMI over LAN, two things are worth pointing out at this stage:
  • You will need to do this from another machine (due to the way that the BMC conspires with the NIC to intercept packets - if you try to send the packets from the local machine, Linux will deliver the packets locally, without touching the NIC, so the BMC doesn't get a chance to steal the packets)
  • Any local firewall on the target machine will not need altering (for the same reason as above, if the BMC is configured correctly, the Linux kernel on the target machine doesn't get to see the packets at all)
Install ipmitool on another machine (this machine needn't have OpenIPMI, or an IPMI equipped board), then, from that machine, try the following:

$ IPMI_PASSWORD=<your password here> ipmitool -I lan -H <target hostname, or IP address> -E chassis status

This should give you similar output to running the command locally.  Now for something more interesting (well, I think it is more interesting, anyway), shutdown the target machine (e.g. # shutdown -h now), you should still be able to run the previous command, except amongst the output you should see:

System Power         : off

Now you can do this:

$ IPMI_PASSWORD=<your password here> ipmitool -I lan -H <target hostname, or IP address> -E chassis power on

The machine should power up, and boot automatically.  You can also use "power reset" to effect a remote hardware reset of the target machine.
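
The status check and the power-on command can be combined so that a script only presses the virtual power button when the machine is actually off. This is a sketch: it assumes the "System Power" line looks like the output shown earlier, and the function names power_state and power_on_if_off are made up for this example.

```shell
#!/bin/sh
# Extract "on" or "off" from the "System Power" line of
# "ipmitool chassis status" output read on stdin.
power_state() {
    awk -F: '/^System Power/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Hypothetical wrapper: power a remote machine on only if it is off.
# Expects IPMI_PASSWORD to be exported in the environment, as above.
power_on_if_off() {
    host=$1
    state=$(ipmitool -I lan -H "$host" -E chassis status | power_state)
    if [ "$state" = "off" ]; then
        ipmitool -I lan -H "$host" -E chassis power on
    fi
}

# Typical use, from the management machine:
#   export IPMI_PASSWORD='your password here'
#   power_on_if_off target.example.com
```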

Setting up Serial Over LAN (SOL)

If you like, you can set up the BMC on these Intel boards to talk to the main PC via its serial port B (the rj45 port), and relay the input, and output to another machine, over the LAN interface.

The IPMI v2.0 specification says that RTS/CTS flow control should be used between the main PC, and the BMC during serial communication - this is necessary because the BMC needs a way to tell the main PC to stop sending it data, in the case that it is about to run out of buffer space - because it hasn't been able to send data to the SOL client quickly enough.  As previously mentioned, the Linux kernel currently has a problem with RTS/CTS on serial consoles (although serial log-ins are unaffected, since the console output seems to be independent of the settings that the getty sets on the same port).

At the time of writing, the BMC code on the SE7501WV2 implements IPMI v1.5 - and IPMI v1.5 does not define SOL support, so the SOL implementation on these boards is proprietary, and Intel is not currently releasing details except under an NDA for some strange reason (boo, hiss).  The SOL support in IPMI v2.0 is believed to be based on the Intel implementation, so it may be possible to reverse engineer the implementation, but in the mean time you must use Intel's "dpccli / dpcproxy" programs to use the SOL functionality. Unfortunately, at the time of writing, these:

  • Are closed source
  • Are only available in an RPM, as part of a large download
  • Are flaky (they seem to crash on me quite frequently)
  • Have a sucky user interface - which also makes them difficult to script
  • Include a dpccli program which munges the enter key, so that it is unusable with serial BIOSes - you must use the telnet interface instead
But they are all that can be used at the moment, and they work well enough to be useful. 

A Quick Note About Security

Although the dpcproxy program must be spoken to using telnet (if you need any security at all on the LAN, then I recommend making the dpcproxy bind to the loopback interface only, and ssh <dpcproxyserver> telnet localhost 623), the SOL session itself (between dpcproxy, and the BMC) is encrypted by default (although Intel gives no details of the encryption), so passwords typed over the SOL session are (probably) not trivially interceptable.

IPMI includes some security, but I have not looked into what strength of security ipmitool is able to use with the SE7501 boards. Any answers welcome.

The Recipe

The basic setup procedure is as follows:

On the machine(s) from which you will manage the other servers:
  • Download the Intel Server Management suite from http://support.intel.com/support/motherboards/server/SE7501WV2/ - I used v5.5.7 - a version should have shipped with the motherboard, you could try using this instead if you like
  • Use the unzip program (apt-get install unzip) to extract the archive (which is a Windows self-extracting zip file) e.g. unzip /path/to/ism557_build2.exe 'Software/linux/*'
  • Use alien to convert the Redhat8 binary rpm to a .deb - # alien -d Software/linux/cli/8.0/CLI-2.0-1.i386.rpm
  • Install the .deb - dpkg -i cli_2.0-2_i386.deb
  • # ln -s /usr/local/cli/dpccli /usr/local/bin
  • # mv /etc/rc.d/init.d/cliservice /etc/init.d/
  • Optionally edit the init script to start dpcproxy with the '-L' switch to bind to the loop back address (127.0.0.1) only
  • # /etc/init.d/cliservice start
On the target machine(s), you will need to set up serial console support - this is covered in detail in http://www.tldp.org/HOWTO/Remote-Serial-Console-HOWTO/ but I will include a brief recipe here:
  • In /etc/inittab, put a line like this "T1:23:respawn:/sbin/getty -h -L ttyS1 19200 vt100"
  • # killall -HUP init
  • Skip forward to the testing section now if you like, then come back to complete the setup once you are satisfied that it is working
  • Setup the BIOS for console redirection - 19200 baud, 8n1, RTS/CTS hardware flow control
  • Set up the Linux kernel for serial console operation, e.g. in /boot/grub/menu.lst "kernel /boot/vmlinuz root=/dev/md2 ro console=tty0 console=ttyS1,19200n8"
  • If and only if you have patched the kernel for better RTS/CTS support, add the 'r' flag to enable hardware flow control: "kernel /boot/vmlinuz root=/dev/md2 ro console=tty0 console=ttyS1,19200n8r". With the 'r' flag on an unpatched kernel, the kernel will block whilst it times out on each character of serial output when the SOL session is not running - this is bad. Without the 'r' flag (i.e. without hardware flow control), you will lose some kernel output - this is also bad, but nowhere near as bad.
  • On these machines, do not set up your boot loader (e.g. grub) for serial output - as the BIOS redirects the boot loader's video console output, and they will tread on each other's toes - once the linux kernel is loaded, and enters protected mode, the BIOS doesn't get a look in, so the kernel output is OK.
  • Ensure ttyS1 is mentioned in /etc/securetty - otherwise you will not be able to log in as root on the serial console (this should be in there by default on Debian)
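
The target-machine steps above can be sanity-checked with a few greps before you try a SOL session. This is a rough sketch, not a complete validator: the function names are made up for this example, and the patterns only look for ttyS1 as used in this recipe.

```shell
#!/bin/sh
# Hypothetical checks for the target-machine recipe above. Each function
# takes the file to inspect so it can be tried against copies first.

# True if an uncommented inittab line runs a getty on ttyS1.
has_serial_getty() {
    grep -Eq '^[^#].*getty.*ttyS1' "${1:-/etc/inittab}"
}

# True if root logins are permitted on ttyS1.
root_allowed_on_serial() {
    grep -qx 'ttyS1' "${1:-/etc/securetty}"
}

# True if the running kernel was booted with a ttyS1 console.
kernel_console_on_serial() {
    grep -q 'console=ttyS1' "${1:-/proc/cmdline}"
}

# Typical use on the target machine:
#   has_serial_getty          || echo "no getty on ttyS1 in /etc/inittab"
#   root_allowed_on_serial    || echo "ttyS1 missing from /etc/securetty"
#   kernel_console_on_serial  || echo "kernel not using ttyS1 as console"
```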

Enabling and Testing SOL

If you like, you can test the above configuration, using a real serial null-modem cable, and a terminal program such as "minicom", or "gkermit", in either case, you should then do the following:

On the management machine (you could also do the first step on the target machine, using the OpenIPMI interface)

# IPMI_PASSWORD=<your password here> ipmitool -I lan -H <target hostname, or IP address> -E chassis sol setup

Note that the ipmitool "sol" command is likely to be renamed when IPMI v2.0 SOL support is added to the program.  I have read documentation which hints that it might be possible to set SOL support to "always on" using the Intel bootable CD - using the ipmitool "sol setup" command, the SOL session can be interrupted by some actions (e.g. local keyboard activity, as simulated by some KVMs), but I have not verified this.

# telnet localhost 623
Trying 127.0.0.1...
Connected to localhost.localdomain.
Escape character is '^]'.
Server: <your server name>
Username:
Password: *********
Login successful


dpccli> console

myservername login:

You should then be able to log in as root, and reboot the machine, following the entire boot process on the serial console.  To get out of the SOL session, you need to send the sequence "~." to the dpcproxy program.  This clashes nicely with the ssh escape sequence ("<cr>~." tells ssh to terminate an ssh session - so that you may need to remember to type "~~." if you are typing the sequence after a new line), and also means that you cannot type the tilde character on the console (at least I have been unable to figure out how to, and there is no man page).  Nice one Intel.

Other defects include the fact that you cannot send a serial break (for sysrq), and that it quite often seems to exit, and get confused when there is a lot of I/O going on.

Using the Serial BIOS

The serial BIOS interface is a bit brain damaged in that it does not recognise the "F11", and "F12" key escape codes that most terminal programs send, instead you can send "Esc-!", and "Esc-@" (yes very logical, as long as the '@' key is normally typed using 'Shift-2' - as on US keyboards, not miles away from the '2' key, as on many non-US keyboards).  These escapes from HP and Dell serial BIOSes may or may not be useful:

Defined As     F1     F2     F3     F4     F5     F6     F7     F8     F9     F10    F11    F12
Keyboard Entry <ESC>1 <ESC>2 <ESC>3 <ESC>4 <ESC>5 <ESC>6 <ESC>7 <ESC>8 <ESC>9 <ESC>0 <ESC>! <ESC>@

Defined As     Home   End    Insert Delete PageUp PageDn
Keyboard Entry <ESC>h <ESC>k <ESC>+ <ESC>- <ESC>? <ESC>/

Use the <ESC><Ctrl><M> key sequence for <Ctrl><M>
Use the <ESC><Ctrl><H> key sequence for <Ctrl><H>
Use the <ESC><Ctrl><I> key sequence for <Ctrl><I>
Use the <ESC><Ctrl><J> key sequence for <Ctrl><J>
Use the <ESC><X><X> key sequence for <Alt><x>, where x is any letter key, and X is the upper case of that key
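
If you drive the serial BIOS from a script (e.g. via expect), the table above can be turned into a lookup function. This sketch only encodes the function and navigation keys listed in the table; as noted, apart from F11 and F12 these come from HP and Dell serial BIOSes and may or may not work on the Intel board. The function name serial_bios_key is made up for this example.

```shell
#!/bin/sh
# Print the escape sequence from the table above for a named key.
# Returns non-zero for keys that are not in the table.
serial_bios_key() {
    case $1 in
        F[1-9])  suffix=${1#F} ;;   # F1..F9 -> ESC 1..ESC 9
        F10)     suffix=0 ;;
        F11)     suffix='!' ;;
        F12)     suffix='@' ;;
        Home)    suffix=h ;;
        End)     suffix=k ;;
        Insert)  suffix=+ ;;
        Delete)  suffix=- ;;
        PageUp)  suffix='?' ;;
        PageDn)  suffix=/ ;;
        *)       return 1 ;;
    esac
    printf '\033%s' "$suffix"       # ESC followed by the key's suffix
}

# Typical use: send F12 to whatever is reading this script's output
#   serial_bios_key F12
```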

Adding Remote Logging of the Serial Output

Interactive use of the SOL consoles is very useful, but the addition of unattended logging of the output is even more useful - it can be used to catch kernel panics, and pre-cleaned kernel output, in the case that a box is compromised, to pick two examples.

The conserver program can carry out this kind of logging for other types of serial consoles, and with the addition of the "solsession" expect script from the IPMI_on_Debian_files/ directory - it can be made to speak to the Intel dpcproxy program. An example conserver configuration is also provided.

Conserver will also automatically restart dead dpcproxy connections, and continue to log output during interactive use (unless you tell it not to).

Setting up the IPMI watchdog

In this context, a watchdog is a device which has the job of resetting a computer system, if it thinks the software running on the system has hung, or is otherwise not operating as it should. A watchdog is often implemented as follows: the software running on the computer system must carry out a regular task (such as writing a character to a device file, on Linux) in order to reassure the watchdog that everything is as it should be - if the software fails to carry out the task, the watchdog assumes that the computer has hung, and will reset it.

A watchdog is usually implemented so that it is likely to survive problems that might otherwise take out the operating system (a partial exception to this is the Linux "softdog" watchdog module, which runs in the kernel - it is still useful if the kernel is partially, but not entirely knackered). The IPMI standard includes a watchdog, and the OpenIPMI Linux drivers include a module which provides an implementation of the Linux watchdog interface, which is backed by the IPMI BMC - such that it will reset (reset is the default behaviour) the computer if the watchdog device is not attended to in a timely fashion.

A simple recipe for setting up the IPMI watchdog on Debian/Sarge is presented below:

  • Ensure that the kernel ipmi_watchdog module has been built.
  • # echo '# Watchdog' >> /etc/modutils/aliases
  • # echo 'alias char-major-10-130 ipmi_watchdog' >> /etc/modutils/aliases
  • # echo 'options ipmi_watchdog timeout=40' >> /etc/modutils/watchdog
  • # update-modules
  • # apt-get install watchdog
  • # echo 'watchdog-device = /dev/watchdog' > /etc/watchdog.conf
  • # /etc/init.d/watchdog start
  • # tail /var/log/kern.log
  • # tail /var/log/daemon.log
Note that the default time-out for the ipmi_watchdog module is 10 seconds, this is also the default write interval for the watchdog daemon which is included in the Debian "watchdog" package, so make sure you change one of them, otherwise there will be no margin for error at all (in the example, I've upped the kernel module time-out).
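
The margin between the two settings can be checked with a little arithmetic. This is a sketch: it reads the time-out from the options line written in the recipe above, and takes the daemon's write interval as an argument (10 seconds being the default mentioned above). The function names module_timeout and watchdog_margin are made up for this example.

```shell
#!/bin/sh
# Extract the timeout=N value from the ipmi_watchdog options line in a
# modutils file such as /etc/modutils/watchdog.
module_timeout() {
    sed -n 's/^options ipmi_watchdog .*timeout=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Seconds of slack between the module time-out ($1) and the daemon's
# write interval ($2); zero or negative means there is no margin for
# error and spurious resets are likely.
watchdog_margin() {
    echo $(( $1 - $2 ))
}

# Typical use:
#   watchdog_margin "$(module_timeout /etc/modutils/watchdog)" 10
```

With the values from the recipe (time-out 40, interval 10) this leaves 30 seconds of slack.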

TODO

The IPMI system event log will currently (AFAIK) fill up, and stop being appended to (or maybe old entries will be nuked, I am not sure, as it hasn't happened yet..). Ipmitool will allow you to clear the log, so the cron jobs should probably be modified to do this, and archive old entries under /var/log (so that e.g. logrotate can take care of them). Contributions welcome!
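
One possible shape for such a job is sketched below; it has not been tested against a full event log. It uses the "sel list" command shown earlier and ipmitool's "sel clear" command; the function name archive_sel, the IPMITOOL override variable, and the log path /var/log/ipmi-sel.log are made up for this example.

```shell
#!/bin/sh
# Hypothetical cron job body: append the current system event log to a
# file, and clear the BMC's copy only if the append succeeded.
archive_sel() {
    logfile=${1:-/var/log/ipmi-sel.log}
    ${IPMITOOL:-ipmitool} -I open sel list >> "$logfile" &&
        ${IPMITOOL:-ipmitool} -I open sel clear
}

# Typical use, as root, from cron on the target machine:
#   archive_sel /var/log/ipmi-sel.log
```

Note that an event logged by the BMC between the "list" and the "clear" would be lost, so this is a starting point rather than a finished solution.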

Credits

The original version of this document was written by Tim Small for a client of WPAD Ltd. - Semantico Ltd. Semantico support the publishing of this document as part of their backing of open source software. Thank you, Semantico.

Thanks also to:
  • Erwan Velu at Mandrakesoft, for an email which prompted the movable character device major number note.

Feedback

Please send feedback, patches (e.g. remote modem support), etc. to tim@buttersideup.com