                          The Linux IPMI Driver
                          ---------------------
                              Corey Minyard
                          <minyard@mvista.com>
                            <minyard@acm.org>

The Intelligent Platform Management Interface, or IPMI, is a
standard for controlling intelligent devices that monitor a system.
It provides for dynamic discovery of sensors in the system and the
ability to monitor the sensors and be informed when the sensors'
values change or go outside certain boundaries. It also has a
standardized database for field-replaceable units (FRUs) and a watchdog
timer.

To use this, you need an interface to an IPMI controller in your
system (called a Baseboard Management Controller, or BMC) and
management software that can use the IPMI system.

This document describes how to use the IPMI driver for Linux. If you
are not familiar with IPMI itself, see the web site at
http://www.intel.com/design/servers/ipmi/index.htm. IPMI is a big
subject and I can't cover it all here!

Configuration
-------------

The Linux IPMI driver is modular, which means you have to pick several
things to have it work right depending on your hardware. Most of
these are available in the 'Character Devices' menu.

No matter what, you must pick 'IPMI top-level message handler' to use
IPMI. What you do beyond that depends on your needs and hardware.

The message handler does not provide any user-level interfaces.
Kernel code (like the watchdog) can still use it. If you need access
from userland through a device driver, select 'Device interface for
IPMI'. Another interface is also available: you may select 'IPMI
sockets' in the 'Networking Support' main menu. This provides a
socket interface to IPMI. You may select both of these at the same
time; they will work together.

The driver interface depends on your hardware. If you have a board
with a standard interface (these will generally be "KCS", "SMIC", or
"BT"; consult your hardware manual), choose the 'IPMI SI handler'
option. A driver also exists for direct I2C access to the IPMI
management controller. Some boards support this, but it is unknown
whether it will work on every board. For this, choose 'IPMI SMBus
handler', but be ready to do some experimenting to see if it will
work.

There is also a KCS-only driver interface supplied, but it is
deprecated in favor of the SI interface.

You should generally enable ACPI on your system, as systems with IPMI
should have ACPI tables describing them.

If you have a standard interface and the board manufacturer has done
their job correctly, the IPMI controller should be automatically
detected (via ACPI or SMBIOS tables) and should just work. Sadly, many
boards do not have this information. The driver attempts standard
defaults, but they may not work. If you fall into this situation, you
need to read the section below named 'The SI Driver' on how to
hand-configure your system.

IPMI defines a standard watchdog timer. You can enable this with the
'IPMI Watchdog Timer' config option. If you compile the driver into
the kernel, then via a kernel command-line option you can have the
watchdog timer start as soon as it initializes. It also has a lot
of other options; see the 'Watchdog' section below for more details.
Note that you can also have the watchdog continue to run if it is
closed (by default it is disabled on close). Go into the 'Watchdog
Cards' menu, enable 'Watchdog Timer Support', and enable the option
'Disable watchdog shutdown on close'.


Basic Design
------------

The Linux IPMI driver is designed to be very modular and flexible; you
only need to take the pieces you need and you can use it in many
different ways. Because of that, it's broken into many chunks of
code. These chunks are:

ipmi_msghandler - This is the central piece of software for the IPMI
system. It handles all messages, message timing, and responses. The
IPMI users tie into this, and the IPMI physical interfaces (called
System Management Interfaces, or SMIs) also tie in here. This
provides the kernelland interface for IPMI, but does not provide an
interface for use by application processes.

ipmi_devintf - This provides a userland IOCTL interface for the IPMI
driver. Each open file for this device ties in to the message handler
as an IPMI user.

ipmi_si - A driver for various system interfaces. This supports
KCS, SMIC, and may support BT in the future. Unless you have your own
custom interface, you probably need to use this.

ipmi_smb - A driver for accessing BMCs on the SMBus. It uses the
I2C kernel driver's SMBus interfaces to send and receive IPMI messages
over the SMBus.

af_ipmi - A network socket interface to IPMI. This doesn't take up
a character device in your system.

Note that the KCS-only interface has been removed.

Much documentation for the interface is in the include files. The
IPMI include files are:

net/af_ipmi.h - Contains the socket interface.

linux/ipmi.h - Contains the user interface and IOCTL interface for IPMI.

linux/ipmi_smi.h - Contains the interface for system management interfaces
(things that interface to IPMI controllers) to use.

linux/ipmi_msgdefs.h - General definitions for base IPMI messaging.


Addressing
----------

The IPMI addressing works much like IP addresses, you have an overlay
to handle the different address types. The overlay is:

  struct ipmi_addr
  {
        int   addr_type;
        short channel;
        char  data[IPMI_MAX_ADDR_SIZE];
  };

The addr_type determines what the address really is. The driver
currently understands two different types of addresses.

"System Interface" addresses are defined as:

  struct ipmi_system_interface_addr
  {
        int   addr_type;
        short channel;
  };

and the type is IPMI_SYSTEM_INTERFACE_ADDR_TYPE. This is used for talking
straight to the BMC on the current card. The channel must be
IPMI_BMC_CHANNEL.

Messages that are destined to go out on the IPMB bus use the
IPMI_IPMB_ADDR_TYPE address type. The format is

  struct ipmi_ipmb_addr
  {
        int           addr_type;
        short         channel;
        unsigned char slave_addr;
        unsigned char lun;
  };

The "channel" here is generally zero, but some devices support more
than one channel, it corresponds to the channel as defined in the IPMI
spec.
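
As a rough illustration of the overlay, the sketch below fills in an
IPMB address and stores it in a generic struct ipmi_addr. The
structures are local copies of the ones shown above, and the numeric
values of the two constants are assumptions made so the sketch stands
alone; real code should take all of these from linux/ipmi.h. The
helper name make_ipmb_addr is hypothetical.

```c
#include <string.h>

/* Local stand-ins for definitions that normally come from linux/ipmi.h.
   The numeric values are assumptions for this sketch only. */
#define IPMI_MAX_ADDR_SIZE  32
#define IPMI_IPMB_ADDR_TYPE 0x01

struct ipmi_addr {
	int   addr_type;
	short channel;
	char  data[IPMI_MAX_ADDR_SIZE];
};

struct ipmi_ipmb_addr {
	int           addr_type;
	short         channel;
	unsigned char slave_addr;
	unsigned char lun;
};

/* Hypothetical helper: build an IPMB address inside the generic overlay. */
static void make_ipmb_addr(struct ipmi_addr *addr, short channel,
			   unsigned char slave_addr, unsigned char lun)
{
	struct ipmi_ipmb_addr ipmb;

	memset(addr, 0, sizeof(*addr));
	ipmb.addr_type  = IPMI_IPMB_ADDR_TYPE;
	ipmb.channel    = channel;
	ipmb.slave_addr = slave_addr;
	ipmb.lun        = lun;

	/* The specific address type is never larger than the overlay, and
	   both begin with addr_type and channel. */
	memcpy(addr, &ipmb, sizeof(ipmb));
}
```

Code that receives a struct ipmi_addr would look at addr_type first and
only then interpret the rest of the address as the matching structure.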


Messages
--------

Messages are defined as:

  struct ipmi_msg
  {
        unsigned char netfn;
        unsigned char lun;
        unsigned char cmd;
        unsigned char *data;
        int           data_len;
  };

The driver takes care of adding/stripping the header information. The
data portion is just the data to be sent (do NOT put addressing info
here) or the response. Note that the completion code of a response is
the first item in "data"; it is not stripped out because that is how
all the messages are defined in the spec (and thus makes counting the
offsets a little easier :-).

When using the IOCTL interface from userland, you must provide a block
of data for "data", fill it, and set data_len to the length of the
block of data, even when receiving messages. Otherwise the driver
will have no place to put the message.
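
Since the completion code stays in the first byte of "data", code that
handles a response usually peels it off before looking at the rest.
The sketch below shows one way to do that, using a local copy of the
structure shown above. The helper names are hypothetical, and the
meaning of 0x00 ("completed normally") comes from the IPMI spec, not
from the driver.

```c
#include <stddef.h>

/* Local copy of the message structure shown above. */
struct ipmi_msg {
	unsigned char  netfn;
	unsigned char  lun;
	unsigned char  cmd;
	unsigned char *data;
	int            data_len;
};

/* Hypothetical helper: return the completion code of a response, or -1
   if the response carries no data at all. */
static int response_completion_code(const struct ipmi_msg *rsp)
{
	if (rsp->data == NULL || rsp->data_len < 1)
		return -1;
	return rsp->data[0];	/* 0x00 means "completed normally" */
}

/* Hypothetical helper: point at the payload that follows the completion
   code and report its length through *len. */
static const unsigned char *response_payload(const struct ipmi_msg *rsp,
					     int *len)
{
	if (response_completion_code(rsp) < 0) {
		*len = 0;
		return NULL;
	}
	*len = rsp->data_len - 1;
	return rsp->data + 1;
}
```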

Messages coming up from the message handler in kernelland will come in
as:

  struct ipmi_recv_msg
  {
        struct list_head link;

        /* The type of message as defined in the "Receive Types"
           defines above. */
        int              recv_type;

        ipmi_user_t      *user;
        struct ipmi_addr addr;
        long             msgid;
        struct ipmi_msg  msg;

        /* Call this when done with the message. It will presumably free
           the message and do any other necessary cleanup. */
        void (*done)(struct ipmi_recv_msg *msg);

        /* Place-holder for the data, don't make any assumptions about
           the size or existence of this, since it may change. */
        unsigned char    msg_data[IPMI_MAX_MSG_LENGTH];
  };

You should look at the receive type and handle the message
appropriately.


The Upper Layer Interface (Message Handler)
-------------------------------------------

The upper layer of the interface provides the users with a consistent
view of the IPMI interfaces. It allows multiple SMI interfaces to be
addressed (because some boards actually have multiple BMCs on them)
and the user should not have to care what type of SMI is below them.


Creating the User

To use the message handler, you must first create a user using
ipmi_create_user. The interface number specifies which SMI you want
to connect to, and you must supply callback functions to be called
when data comes in. The callback function can run at interrupt level,
so be careful using the callbacks. This also allows you to pass in a
piece of data, the handler_data, that will be passed back to you on
all calls.

Once you are done, call ipmi_destroy_user() to get rid of the user.

From userland, opening the device automatically creates a user, and
closing the device automatically destroys the user.


Messaging

To send a message from kernel-land, the ipmi_request() call does
pretty much all message handling. Most of the parameters are
self-explanatory. However, it takes a "msgid" parameter. This is NOT
the sequence number of messages. It is simply a long value that is
passed back when the response for the message is returned. You may
use it for anything you like.

Responses come back in the function pointed to by the ipmi_recv_hndl
field of the "handler" that you passed in to ipmi_create_user().
Remember again, these may be running at interrupt level. Remember to
look at the receive type, too.

From userland, you fill out an ipmi_req_t structure and use the
IPMICTL_SEND_COMMAND ioctl. For incoming stuff, you can use select()
or poll() to wait for messages to come in. However, you cannot use
read() to get them; you must call the IPMICTL_RECEIVE_MSG ioctl with
the ipmi_recv_t structure to actually get the message. Remember that
you must supply a pointer to a block of data in the msg.data field, and
you must fill in the msg.data_len field with the size of the data.
This gives the receiver a place to actually put the message.

If the message cannot fit into the data you provide, you will get an
EMSGSIZE error and the driver will leave the data in the receive
queue. If you want to get it and have it truncate the message, use
the IPMICTL_RECEIVE_MSG_TRUNC ioctl.
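
The userland sequence described above might look roughly like the
following. This is only a sketch: it assumes the struct ipmi_req and
struct ipmi_recv spellings of the request and receive structures, as
found in recent versions of linux/ipmi.h, and that fd is an already
opened IPMI device. The Get Device ID netfn and command values are
taken from the IPMI spec and given local, hypothetical names, as are
the two helper functions.

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/ipmi.h>

/* Values from the IPMI specification, named locally for this sketch. */
#define NETFN_APP_REQUEST 0x06
#define CMD_GET_DEVICE_ID 0x01

/* Hypothetical helper: send a Get Device ID command to the local BMC.
   Returns 0 on success, -1 on failure with errno set. */
static int send_get_device_id(int fd, long msgid)
{
	struct ipmi_system_interface_addr bmc;
	struct ipmi_req req;

	memset(&bmc, 0, sizeof(bmc));
	bmc.addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE;
	bmc.channel   = IPMI_BMC_CHANNEL;

	memset(&req, 0, sizeof(req));
	req.addr         = (unsigned char *) &bmc;
	req.addr_len     = sizeof(bmc);
	req.msgid        = msgid;	/* handed back with the response */
	req.msg.netfn    = NETFN_APP_REQUEST;
	req.msg.cmd      = CMD_GET_DEVICE_ID;
	req.msg.data     = NULL;	/* this command has no request data */
	req.msg.data_len = 0;

	return ioctl(fd, IPMICTL_SEND_COMMAND, &req);
}

/* Hypothetical helper: fetch one queued message into buf.  Returns the
   number of data bytes, or -1 with errno set.  An errno of EMSGSIZE
   means the message is still queued; the caller can retry with a
   larger buffer or with IPMICTL_RECEIVE_MSG_TRUNC. */
static int receive_msg(int fd, struct ipmi_addr *addr, long *msgid,
		       unsigned char *buf, unsigned short buf_len)
{
	struct ipmi_recv rcv;

	memset(&rcv, 0, sizeof(rcv));
	rcv.addr         = (unsigned char *) addr;
	rcv.addr_len     = sizeof(*addr);
	rcv.msg.data     = buf;		/* the driver copies the data here */
	rcv.msg.data_len = buf_len;

	if (ioctl(fd, IPMICTL_RECEIVE_MSG, &rcv) == -1)
		return -1;

	*msgid = rcv.msgid;
	return rcv.msg.data_len;
}
```

A caller would normally wait in select() or poll() on fd between the
two calls, then compare the returned msgid with the one it sent.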

When you send a command (which is defined by the lowest-order bit of
the netfn per the IPMI spec) on the IPMB bus, the driver will
automatically assign the sequence number to the command and save the
command. If the response is not received in the IPMI-specified 5
seconds, it will generate a response automatically saying the command
timed out. If an unsolicited response comes in (if it was after 5
seconds, for instance), that response will be ignored.
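
The netfn rule mentioned above is simple enough to state in code. In
the IPMI spec, request (command) network functions are even and the
matching response network function is the same value with the lowest
bit set. The helper names below are hypothetical; this is a sketch of
the rule, not the driver's own code.

```c
/* Hypothetical helper: true if netfn denotes a command (request). */
static int netfn_is_command(unsigned char netfn)
{
	return (netfn & 1) == 0;
}

/* Hypothetical helper: the response netfn that pairs with a request. */
static unsigned char netfn_response_for(unsigned char request_netfn)
{
	return (unsigned char) (request_netfn | 1);
}
```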

In kernelland, after you receive a message and are done with it, you
MUST call ipmi_free_recv_msg() on it, or you will leak messages. Note
that you should NEVER mess with the "done" field of a message, that is
required to properly clean up the message.

Note that when sending, there is an ipmi_request_supply_msgs() call
that lets you supply the smi and receive message. This is useful for
pieces of code that need to work even if the system is out of buffers
(the watchdog timer uses this, for instance). You supply your own
buffer and own free routines. This is not recommended for normal use,
though, since it is tricky to manage your own buffers.


Events and Incoming Commands

The driver takes care of polling for IPMI events and receiving
commands (commands are messages that are not responses, they are
commands that other things on the IPMB bus have sent you). To receive
these, you must register for them, they will not automatically be sent
to you.

To receive events, you must call ipmi_set_gets_events() and set the
"val" to non-zero. Any events that have been received by the driver
since startup will immediately be delivered to the first user that
registers for events. After that, if multiple users are registered
for events, they will all receive all events that come in.

For receiving commands, you have to individually register commands you
want to receive. Call ipmi_register_for_cmd() and supply the netfn
and command name for each command you want to receive. Only one user
may be registered for each netfn/cmd, but different users may register
for different commands.

From userland, equivalent IOCTLs are provided to do these functions.
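
For illustration, the userland side might look like the sketch below.
The ioctl names IPMICTL_SET_GETS_EVENTS_CMD and
IPMICTL_REGISTER_FOR_CMD and the struct ipmi_cmdspec type are taken
from linux/ipmi.h as I understand it; this document does not name
them, so check your header. The wrapper function names are
hypothetical, and fd is assumed to be an open IPMI device.

```c
#include <sys/ioctl.h>
#include <linux/ipmi.h>

/* Hypothetical wrapper: turn delivery of IPMI events to this open
   device on (enable != 0) or off (enable == 0). */
static int set_gets_events(int fd, int enable)
{
	int val = enable ? 1 : 0;

	return ioctl(fd, IPMICTL_SET_GETS_EVENTS_CMD, &val);
}

/* Hypothetical wrapper: ask to receive one incoming command,
   identified by its netfn and command number. */
static int register_for_cmd(int fd, unsigned char netfn, unsigned char cmd)
{
	struct ipmi_cmdspec spec;

	spec.netfn = netfn;
	spec.cmd   = cmd;
	return ioctl(fd, IPMICTL_REGISTER_FOR_CMD, &spec);
}
```

Both wrappers return -1 with errno set on failure, which is what you
should expect if another user already holds the same netfn/cmd pair.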


The Lower Layer (SMI) Interface
-------------------------------

As mentioned before, multiple SMI interfaces may be registered to the
message handler, each of these is assigned an interface number when
they register with the message handler. They are generally assigned
in the order they register, although if an SMI unregisters and then
another one registers, all bets are off.

The ipmi_smi.h defines the interface for management interfaces, see
that for more details.


The SI Driver
-------------

The SI driver allows up to 4 KCS or SMIC interfaces to be configured
in the system. By default, the driver scans the ACPI tables for
interfaces, and if it doesn't find any it will attempt to register one
KCS interface at the spec-specified I/O port 0xca2 without interrupts.
You can change this at module load time (for a module) with:

  modprobe ipmi_si type=<type1>,<type2>...
       ports=<port1>,<port2>... addrs=<addr1>,<addr2>...
       irqs=<irq1>,<irq2>... trydefaults=[0|1]
       regspacings=<sp1>,<sp2>,... regsizes=<size1>,<size2>,...
       regshifts=<shift1>,<shift2>,...
       slave_addrs=<addr1>,<addr2>,...

Each of these except trydefaults is a list, the first item for the
first interface, second item for the second interface, etc.

The type may be either "kcs", "smic", or "bt". If you leave it blank, it
defaults to "kcs".

If you specify addrs as non-zero for an interface, the driver will
use the memory address given as the address of the device. This
overrides ports.

If you specify ports as non-zero for an interface, the driver will
use the I/O port given as the device address.

If you specify irqs as non-zero for an interface, the driver will
attempt to use the given interrupt for the device.

trydefaults sets whether the standard IPMI interface at 0xca2 and
any interfaces specified by ACPI are tried. By default, the driver
tries them; set this value to zero to turn this off.

The next three parameters have to do with register layout. The
registers used by the interfaces may not appear at successive
locations and they may not be in 8-bit registers. These parameters
allow the layout of the data in the registers to be more precisely
specified.

The regspacings parameter gives the number of bytes between successive
register start addresses. For instance, if the regspacing is set to 4
and the start address is 0xca2, then the address for the second
register would be 0xca6. This defaults to 1.

The regsizes parameter gives the size of a register, in bytes. The
data used by IPMI is 8 bits wide, but it may be inside a larger
register. This parameter allows the read and write type to be
specified. It may be 1, 2, 4, or 8. The default is 1.

Since the register size may be larger than 8 bits, the IPMI data may not
be in the lower 8 bits. The regshifts parameter gives the amount to shift
the data to get to the actual IPMI data.
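
To make the register layout arithmetic concrete, here is a small
sketch. It is not the driver's code; the structure and function names
are hypothetical, and the fields simply mirror the regspacings and
regshifts parameters described above.

```c
#include <stdint.h>

/* Hypothetical description of one interface's register layout. */
struct si_layout {
	unsigned long base;	  /* port or memory address of register 0 */
	unsigned int  regspacing; /* bytes between register start addresses */
	unsigned int  regshift;	  /* bits to shift to reach the IPMI byte */
};

/* Address of the reg'th register (0 is the first one). */
static unsigned long si_reg_addr(const struct si_layout *l, unsigned int reg)
{
	return l->base + (unsigned long) reg * l->regspacing;
}

/* Pull the 8-bit IPMI value out of a raw register read. */
static unsigned char si_extract(const struct si_layout *l, uint64_t raw)
{
	return (unsigned char) ((raw >> l->regshift) & 0xff);
}
```

With base 0xca2 and regspacing 4, si_reg_addr() gives 0xca2 for the
first register and 0xca6 for the second, matching the example above.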

The slave_addrs parameter specifies the IPMI address of the local BMC.
This is usually 0x20 and the driver defaults to that, but in case it's
not, it can be specified when the driver starts up.

When compiled into the kernel, the addresses can be specified on the
kernel command line as:

  ipmi_si.type=<type1>,<type2>...
       ipmi_si.ports=<port1>,<port2>... ipmi_si.addrs=<addr1>,<addr2>...
       ipmi_si.irqs=<irq1>,<irq2>... ipmi_si.trydefaults=[0|1]
       ipmi_si.regspacings=<sp1>,<sp2>,...
       ipmi_si.regsizes=<size1>,<size2>,...
       ipmi_si.regshifts=<shift1>,<shift2>,...
       ipmi_si.slave_addrs=<addr1>,<addr2>,...

These work the same as the module parameters of the same names.

By default, the driver will attempt to detect any device specified by
ACPI, and if it finds none of those, a KCS device at the spec-specified
0xca2. If you want to turn this off, set the "trydefaults" option to
0.

If you have high-res timers compiled into the kernel, the driver will
use them to provide much better performance. Note that if you do not
have high-res timers enabled in the kernel and you don't have
interrupts enabled, the driver will run VERY slowly. Don't blame me,
these interfaces suck.


The SMBus Driver
----------------

The SMBus driver allows up to 4 SMBus devices to be configured in the
system. By default, the driver will register any SMBus interfaces it finds
in the I2C address range of 0x20 to 0x4f on any adapter. You can change this
at module load time (for a module) with:

  modprobe ipmi_smb
       addr=<adapter1>,<i2caddr1>[,<adapter2>,<i2caddr2>[,...]]
       dbg=<flags1>,<flags2>...
       [defaultprobe=0] [dbg_probe=1]

The addresses are specified in pairs; the first is the adapter ID and the
second is the I2C address on that adapter.

The debug flags are bit flags for each BMC found. They are:
IPMI messages: 1, driver state: 2, timing: 4, I2C probe: 8
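
Because these are bit flags, a dbg value is just the OR of the bits you
want. The sketch below spells that out; the enum and function names are
hypothetical, and only the numeric values come from the list above.

```c
/* Hypothetical names for the documented dbg bit values. */
enum smb_dbg_flag {
	SMB_DBG_MSG    = 1,	/* IPMI messages */
	SMB_DBG_STATE  = 2,	/* driver state */
	SMB_DBG_TIMING = 4,	/* timing */
	SMB_DBG_PROBE  = 8	/* I2C probe */
};

/* True if the given dbg value asks for the given kind of output. */
static int smb_dbg_enabled(unsigned int dbg, enum smb_dbg_flag flag)
{
	return (dbg & (unsigned int) flag) != 0;
}
```

For example, dbg=5 is SMB_DBG_MSG | SMB_DBG_TIMING, so it turns on
message and timing output for that BMC and nothing else.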

Setting defaultprobe to zero disables the default probing of SMBus
interfaces at address range 0x20 to 0x4f. This means that only the
BMCs specified with the addr parameter will be detected.

Setting dbg_probe to 1 will enable debugging of the probing and
detection process for BMCs on the SMBuses.

Discovering the IPMI compliant BMC on the SMBus can cause devices
on the I2C bus to fail. The SMBus driver writes a "Get Device ID" IPMI
message as a block write to the I2C bus and waits for a response.
This action can be detrimental to some I2C devices. It is highly recommended
that the known I2C address be given to the SMBus driver in the addr
parameter. The default address range will not be used when an addr
parameter is provided.

When compiled into the kernel, the addresses can be specified on the
kernel command line as:

  ipmi_smb.addr=<adapter1>,<i2caddr1>[,<adapter2>,<i2caddr2>[,...]]
       ipmi_smb.dbg=<flags1>,<flags2>...
       ipmi_smb.defaultprobe=0 ipmi_smb.dbg_probe=1

These are the same options as on the module command line.

Note that you might need some I2C changes if CONFIG_IPMI_PANIC_EVENT
is enabled along with this, so the I2C driver knows to run to
completion while sending a panic event.


Other Pieces
------------

Watchdog
--------

A watchdog timer is provided that implements the Linux-standard
watchdog timer interface. It has the following module parameters that
can be used to control it:

  modprobe ipmi_watchdog timeout=<t> pretimeout=<t> action=<action type>
      preaction=<preaction type> preop=<preop type> start_now=x
      nowayout=x

The timeout is the number of seconds to the action, and the pretimeout
is the number of seconds before the reset that the pre-timeout panic will
occur (if pretimeout is zero, then pretimeout will not be enabled). Note
that the pretimeout is the time before the final timeout. So if the
timeout is 50 seconds and the pretimeout is 10 seconds, then the pretimeout
will occur in 40 seconds (10 seconds before the timeout).
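
The relationship between the two values can be written down as a small
function. This is only a sketch of the arithmetic; the function name
is hypothetical, and treating a pretimeout that is not smaller than
the timeout as invalid is my assumption, not documented driver
behavior.

```c
/* Seconds after the timer is started (or reset) at which the
   pretimeout occurs, or -1 if the pretimeout is disabled (zero) or
   does not fit inside the timeout (an assumption of this sketch). */
static int pretimeout_fires_at(int timeout, int pretimeout)
{
	if (pretimeout <= 0 || pretimeout >= timeout)
		return -1;
	return timeout - pretimeout;
}
```

With timeout=50 and pretimeout=10 this returns 40, as in the example
above.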

The action may be "reset", "power_cycle", or "power_off", and
specifies what to do when the timer times out. It defaults to
"reset".

The preaction may be "pre_smi" for an indication through the SMI
interface, "pre_int" for an indication through the SMI with an
interrupt, and "pre_nmi" for an NMI on a preaction. This is how
the driver is informed of the pretimeout.

The preop may be set to "preop_none" for no operation on a pretimeout,
"preop_panic" to set the preoperation to panic, or "preop_give_data"
to provide data to read from the watchdog device when the pretimeout
occurs. A "pre_nmi" setting CANNOT be used with "preop_give_data"
because you can't do data operations from an NMI.

When preop is set to "preop_give_data", one byte comes ready to read
on the device when the pretimeout occurs. Select and fasync work on
the device, as well.

If start_now is set to 1, the watchdog timer will start running as
soon as the driver is loaded.

If nowayout is set to 1, the watchdog timer will not stop when the
watchdog device is closed. The default value of nowayout is true
if the CONFIG_WATCHDOG_NOWAYOUT option is enabled, or false if not.

When compiled into the kernel, the kernel command line is available
for configuring the watchdog:

  ipmi_watchdog.timeout=<t> ipmi_watchdog.pretimeout=<t>
      ipmi_watchdog.action=<action type>
      ipmi_watchdog.preaction=<preaction type>
      ipmi_watchdog.preop=<preop type>
      ipmi_watchdog.start_now=x
      ipmi_watchdog.nowayout=x

The options are the same as the module parameter options.

The watchdog will panic and start a 120 second reset timeout if it
gets a pre-action. During a panic or a reboot, the watchdog will
start a 120 second timer if it is running to make sure the reboot
occurs.

Note that if you use the NMI preaction for the watchdog, you MUST
NOT use nmi watchdog mode 1. If you use the NMI watchdog, you
must use mode 2.

Once you open the watchdog timer, you must write a 'V' character to the
device to close it, or the timer will not stop. This is a new semantic
for the driver, but makes it consistent with the rest of the watchdog
drivers in Linux.
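
A userland program that wants the timer to stop when it exits could
close the device along these lines. This is a minimal sketch; the
function name is hypothetical, and fd is assumed to be the descriptor
returned by opening the watchdog device (normally /dev/watchdog).

```c
#include <unistd.h>

/* Hypothetical helper: write the 'V' character and then close the
   watchdog device.  Returns 0 if both steps worked, -1 otherwise.
   The descriptor is closed either way; if the write failed, expect
   the timer to keep running. */
static int watchdog_release(int fd)
{
	ssize_t n = write(fd, "V", 1);
	int rv = close(fd);

	return (n == 1 && rv == 0) ? 0 : -1;
}
```

Keep in mind that if nowayout is set, the timer keeps running after the
close regardless of what was written.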