Most, if not all, Intel-MP compliant SMP boards have a so-called 'IO-APIC',
which is an enhanced interrupt controller. It enables us to route
hardware interrupts to multiple CPUs, or to CPU groups. Without an
IO-APIC, interrupts from hardware will be delivered only to the
CPU which boots the operating system (usually CPU#0).

Linux supports all variants of compliant SMP boards, including ones with
multiple IO-APICs. Multiple IO-APICs are used in high-end servers to
distribute IRQ load further.

There are a few known breakages in certain older boards; such bugs are
usually worked around by the kernel. If your MP-compliant SMP board does
not boot Linux, then consult the linux-smp mailing list archives first.

If your box boots fine with IO-APIC IRQs enabled, then your
/proc/interrupts will look like this one:

   ---------------------------->
   hell:~> cat /proc/interrupts
              CPU0
     0:    1360293    IO-APIC-edge  timer
     1:          4    IO-APIC-edge  keyboard
     2:          0          XT-PIC  cascade
    13:          1          XT-PIC  fpu
    14:       1448    IO-APIC-edge  ide0
    16:      28232   IO-APIC-level  Intel EtherExpress Pro 10/100 Ethernet
    17:      51304   IO-APIC-level  eth0
   NMI:          0
   ERR:          0
   hell:~>
   <----------------------------

Some interrupts are still listed as 'XT-PIC', but this is not a problem;
none of those IRQ sources is performance-critical.

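As a rough way to check such a listing, the following Python sketch groups
the numbered IRQ lines by the controller named in the third column. It
assumes the single-CPU layout of the listing above (with several CPUs there
is one count column per CPU, so the controller column moves to the right).
The names SAMPLE and irqs_by_controller are made up for this illustration.

```python
from collections import defaultdict

# Sample text taken from the listing above. On a live system one could
# read /proc/interrupts instead (assumption: single-CPU column layout).
SAMPLE = """\
           CPU0
  0:    1360293    IO-APIC-edge  timer
  1:          4    IO-APIC-edge  keyboard
  2:          0          XT-PIC  cascade
 13:          1          XT-PIC  fpu
 14:       1448    IO-APIC-edge  ide0
 16:      28232   IO-APIC-level  Intel EtherExpress Pro 10/100 Ethernet
 17:      51304   IO-APIC-level  eth0
NMI:          0
ERR:          0
"""


def irqs_by_controller(text):
    """Group numeric IRQ numbers by the controller in the third column."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Skip the CPU header and the NMI:/ERR: counters: they have no
        # numeric IRQ label or no controller column.
        if len(fields) < 3 or not fields[0].rstrip(":").isdigit():
            continue
        irq = int(fields[0].rstrip(":"))
        groups[fields[2]].append(irq)
    return dict(groups)


groups = irqs_by_controller(SAMPLE)
for controller, irqs in sorted(groups.items()):
    print(controller, irqs)
```

Any IRQ that ends up in the XT-PIC group is still handled by the legacy
interrupt controller rather than by the IO-APIC.
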
In the unlikely case that your board does not create a working mp-table,
you can use the pirq= boot parameter to 'hand-construct' IRQ entries. This
is non-trivial though and cannot be automated. One sample /etc/lilo.conf
entry:

	append="pirq=15,11,10"

The actual numbers depend on your system, on your PCI cards and on their
PCI slot position. Usually PCI slots are 'daisy chained' before they are
connected to the PCI chipset IRQ routing facility (the incoming PIRQ1-4
lines):

               ,-.        ,-.        ,-.        ,-.        ,-.
     PIRQ4 ----| |-.    ,-| |-.    ,-| |-.    ,-| |--------| |
               |S|  \  /  |S|  \  /  |S|  \  /  |S|        |S|
     PIRQ3 ----|l|-. `/---|l|-. `/---|l|-. `/---|l|--------|l|
               |o|  \/    |o|  \/    |o|  \/    |o|        |o|
     PIRQ2 ----|t|-./`----|t|-./`----|t|-./`----|t|--------|t|
               |1|  /\    |2|  /\    |3|  /\    |4|        |5|
     PIRQ1 ----| |-  `----| |-  `----| |-  `----| |--------| |
               `-'        `-'        `-'        `-'        `-'

Every PCI card emits a PCI IRQ, which can be INTA, INTB, INTC or INTD:

                               ,-.
                         INTD--| |
                               |S|
                         INTC--|l|
                               |o|
                         INTB--|t|
                               |x|
                         INTA--| |
                               `-'

These INTA-D PCI IRQs are always 'local to the card'; their real meaning
depends on which slot the card is in. If you look at the daisy chaining
diagram, a card in Slot4 issuing INTA will end up as a signal on PIRQ2 of
the PCI chipset. Most cards issue INTA, so the daisy chaining spreads them
evenly across the PIRQ lines. (Distributing IRQ sources properly is not a
necessity, since PCI IRQs can be shared at will, but non-shared interrupts
are good for performance.) Slot5 should be used for video cards: they do
not normally use interrupts, so that slot is not daisy chained either.

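The rotation in the daisy-chaining diagram can be sketched in a few lines of
Python. This is an illustration only: it assumes Slot1 is wired straight
through (INTA to PIRQ1 ... INTD to PIRQ4) and that each further slot shifts
the wiring by one line, which is the reading that matches the "Slot4 INTA
ends up on PIRQ2" case above. The function name pirq_for is hypothetical,
and real boards may be wired differently.

```python
INT_PINS = ["INTA", "INTB", "INTC", "INTD"]


def pirq_for(slot, pin, chained_slots=4):
    """Return the PIRQ line number (1-4) reached by a card pin in a slot.

    Assumption: Slot1 is wired straight through and every later slot in
    the chain is rotated by one line relative to the previous one.
    """
    if not 1 <= slot <= chained_slots:
        raise ValueError("slot is outside the daisy chain")
    index = INT_PINS.index(pin)  # INTA -> 0 ... INTD -> 3
    return (index - (slot - 1)) % 4 + 1


for slot in range(1, 5):
    print("Slot%d INTA -> PIRQ%d" % (slot, pirq_for(slot, "INTA")))
```

Under these assumptions, four cards that all issue INTA land on four
different PIRQ lines, which is the point of the daisy chaining.
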
So if you have your SCSI card (IRQ11) in Slot1 and your Tulip card (IRQ9)
in Slot2, then you'll have to specify this pirq= line:

	append="pirq=11,9"

The following script tries to figure out such a default pirq= line from
your PCI configuration:

  echo -n pirq=; echo `scanpci | grep T_L | cut -c56-` | sed 's/ /,/g'

Note that this script won't work if you have skipped a few slots, if your
board does not do default daisy-chaining, or if the IO-APIC has the PIRQ
pins connected in some strange way. E.g. if in the above case you have your
SCSI card (IRQ11) in Slot3, and have Slot1 empty:

	append="pirq=0,9,11"

[value '0' is a generic 'placeholder', reserved for empty (or non-IRQ emitting)
slots.]

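The slot-by-slot rule used in the examples above, including the '0'
placeholder, can be written down as a small Python sketch. The helper name
pirq_parameter and the slot-to-IRQ dictionary are invented for this
illustration; the sketch only formats the string and does not probe any
hardware.

```python
def pirq_parameter(slot_irqs):
    """Build a pirq= string from a {slot number: IRQ} mapping.

    Slots are numbered from 1. Slots missing from the mapping get the
    placeholder value 0 (empty or non-IRQ emitting slot).
    """
    if not slot_irqs:
        raise ValueError("need at least one slot with an IRQ")
    last = max(slot_irqs)
    values = [slot_irqs.get(slot, 0) for slot in range(1, last + 1)]
    return "pirq=" + ",".join(str(value) for value in values)


print(pirq_parameter({1: 11, 2: 9}))   # SCSI in Slot1, Tulip in Slot2
print(pirq_parameter({2: 9, 3: 11}))   # Slot1 empty, SCSI in Slot3
```

The two calls print pirq=11,9 and pirq=0,9,11, matching the append= lines
shown above.
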
Generally, it's always possible to find out the correct pirq= settings: just
permute all IRQ numbers properly. It will take some time, though. An
'incorrect' pirq line will cause the boot process to hang, or a device
won't function properly (e.g. if it's inserted as a module).

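To make the permutation search less error-prone, the candidate lines can be
listed up front. The Python sketch below is one possible way to do that; the
name candidate_pirq_lines and its arguments are hypothetical. It pads the
known IRQs with '0' placeholders and yields each distinct ordering once.
Each candidate still has to be tried by hand at boot time.

```python
from itertools import permutations


def candidate_pirq_lines(irqs, positions):
    """Yield each distinct pirq= line placing `irqs` into `positions` values.

    Unused positions are filled with the placeholder 0.
    """
    if positions < len(irqs):
        raise ValueError("more IRQs than pirq positions")
    padded = list(irqs) + [0] * (positions - len(irqs))
    seen = set()
    for order in permutations(padded):
        # Several placeholders produce duplicate orderings; emit each once.
        if order in seen:
            continue
        seen.add(order)
        yield "pirq=" + ",".join(str(value) for value in order)


for line in candidate_pirq_lines([9, 11], 3):
    print(line)
```

For two IRQs and three positions this lists six candidates, one of which is
the pirq=0,9,11 line from the example above.
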
If you have 2 PCI buses, then you can use up to 8 pirq values, although such
boards tend to have a good configuration.

Be prepared that you might need some strange pirq line:

	append="pirq=0,0,0,0,0,0,9,11"

Use smart trial-and-error techniques to find out the correct pirq line ...

Good luck, and mail linux-smp@vger.kernel.org or
linux-kernel@vger.kernel.org if you have any problems that are not covered
by this document.

-- mingo