Hollis Blanchard <hollis@austin.ibm.com>
5 Jun 2002

This document describes the system (including self-modifying code) used in the
PPC Linux kernel to support a variety of PowerPC CPUs without requiring
compile-time selection.

Early in the boot process the ppc32 kernel detects the current CPU type and
chooses a set of features accordingly. Examples include Altivec support,
split instruction and data caches, and support for the DOZE and NAP sleep
modes.

Detection of the feature set is simple. A list of processors can be found in
arch/ppc/kernel/cputable.c. The PVR register is masked and compared with each
value in the list. If a match is found, the cpu_features field of cur_cpu_spec
is set to the feature bitmask for this processor and a __setup_cpu function is
called.
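
The lookup described above can be sketched in C as follows. This is a minimal
illustration, not the contents of cputable.c: the feature bit values, the PVR
values, the catch-all entry, and the names identify_cpu, setup_generic and
setup_calls are all assumptions made for the example, and cur_cpu_spec is
simplified to a single pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits; the real definitions live in cputable.h. */
#define FTR_SPLIT_ID_CACHE 0x1u
#define FTR_ALTIVEC        0x2u
#define FTR_CAN_DOZE       0x4u
#define FTR_CAN_NAP        0x8u

struct cpu_spec {
    uint32_t pvr_mask;       /* which PVR bits identify this processor */
    uint32_t pvr_value;      /* expected PVR value after masking */
    const char *cpu_name;
    uint32_t cpu_features;   /* feature bitmask for this processor */
    void (*setup_cpu)(void); /* stands in for the __setup_cpu hook */
};

int setup_calls;             /* counts hook calls, for demonstration only */
void setup_generic(void) { setup_calls++; }

/* Illustrative entries only; the names and PVR values are made up. */
const struct cpu_spec cpu_table[] = {
    { 0xffff0000u, 0x00110000u, "cpu-a",
      FTR_SPLIT_ID_CACHE | FTR_CAN_DOZE, setup_generic },
    { 0xffff0000u, 0x00220000u, "cpu-b",
      FTR_SPLIT_ID_CACHE | FTR_ALTIVEC | FTR_CAN_NAP, setup_generic },
    /* Catch-all: a zero mask matches any PVR (an assumption of this sketch). */
    { 0x00000000u, 0x00000000u, "unknown", 0, setup_generic },
};

const struct cpu_spec *cur_cpu_spec;

/* Mask the PVR, compare it with each entry, and stop at the first match. */
const struct cpu_spec *identify_cpu(uint32_t pvr)
{
    size_t n = sizeof(cpu_table) / sizeof(cpu_table[0]);

    for (size_t i = 0; i < n; i++) {
        const struct cpu_spec *s = &cpu_table[i];

        if ((pvr & s->pvr_mask) == s->pvr_value) {
            assert(s->setup_cpu != NULL);
            cur_cpu_spec = s;   /* record the feature set in use */
            s->setup_cpu();     /* per-CPU setup hook */
            return s;
        }
    }
    return NULL; /* not reached while the catch-all entry is present */
}
```

With this table, identify_cpu(0x00221234) selects the "cpu-b" entry, because
only the upper 16 bits of the PVR take part in the comparison.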

C code may test 'cur_cpu_spec[smp_processor_id()]->cpu_features' for a
particular feature bit. This is done in quite a few places, for example
in ppc_setup_l2cr().
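
A feature test in C might look like the sketch below. The bit value, NR_CPUS,
the current_cpu variable and the stubbed smp_processor_id() are placeholders
chosen for the example; only the cur_cpu_spec[...]->cpu_features expression
comes from the text above.

```c
#include <stdint.h>

#define CPU_FTR_ALTIVEC 0x2u   /* hypothetical bit value */
#define NR_CPUS         2      /* hypothetical CPU count */

struct cpu_spec {
    uint32_t cpu_features;     /* feature bitmask for this processor */
};

struct cpu_spec spec_with_altivec = { CPU_FTR_ALTIVEC };

/* One pointer per CPU, as the indexing in the text suggests. */
struct cpu_spec *cur_cpu_spec[NR_CPUS] = {
    &spec_with_altivec, &spec_with_altivec,
};

int current_cpu;               /* stand-in for the running CPU's number */
int smp_processor_id(void) { return current_cpu; }

/* Returns 1 if the CPU executing this code advertises Altivec, else 0. */
int cpu_has_altivec(void)
{
    uint32_t features = cur_cpu_spec[smp_processor_id()]->cpu_features;

    return (features & CPU_FTR_ALTIVEC) != 0;
}
```

Each call performs an array index, a pointer dereference and a bit test, and
the caller then branches on the result.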

Implementing cpufeatures in assembly is a little more involved. There are
several paths that are performance-critical and would suffer if an array
index, structure dereference, and conditional branch were added. To avoid the
performance penalty but still allow for runtime (rather than compile-time) CPU
selection, unused code is replaced by 'nop' instructions. This nop'ing is
based on CPU 0's capabilities, so a multi-processor system with non-identical
processors will not work (but such a system would likely have other problems
anyway).

After detecting the processor type, the kernel patches out sections of code
that shouldn't be used by writing nop's over them. Using cpufeatures requires
just two macros (found in include/asm-ppc/cputable.h), as seen in
transfer_to_handler in head.S:

	#ifdef CONFIG_ALTIVEC
	BEGIN_FTR_SECTION
		mfspr	r22,SPRN_VRSAVE		/* if G4, save vrsave register value */
		stw	r22,THREAD_VRSAVE(r23)
	END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
	#endif /* CONFIG_ALTIVEC */

If CPU 0 supports Altivec, the code is left untouched. If it doesn't, both
instructions are replaced with nop's.

The END_FTR_SECTION macro has two simpler variations: END_FTR_SECTION_IFSET
and END_FTR_SECTION_IFCLR. These simply test whether a flag is set (in
cur_cpu_spec[0]->cpu_features) or is cleared, respectively. These two macros
should be used in the majority of cases.

The END_FTR_SECTION macros are implemented by storing information about the
enclosed code in the '__ftr_fixup' ELF section. When do_cpu_ftr_fixups
(arch/ppc/kernel/misc.S) is invoked, it will iterate over the records in
__ftr_fixup, and if the required feature is not present it will loop writing
nop's from each BEGIN_FTR_SECTION to END_FTR_SECTION.
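
The fixup pass can be modelled in C as a rough sketch. The real routine is
written in assembly and walks the __ftr_fixup section; here the record layout
(a mask, a required value, and begin/end pointers), the FTR_IFSET and
FTR_IFCLR helper macros, the feature bit value, and the text[] buffer standing
in for kernel code are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PPC_NOP     0x60000000u /* "ori 0,0,0", the PowerPC nop encoding */
#define FTR_ALTIVEC 0x2u        /* hypothetical feature bit */

/* One record per BEGIN_FTR_SECTION/END_FTR_SECTION pair (layout assumed). */
struct ftr_fixup {
    uint32_t mask;    /* feature bits to examine */
    uint32_t value;   /* value those bits must have to keep the code */
    uint32_t *begin;  /* first instruction of the section */
    uint32_t *end;    /* one past the last instruction */
};

/* IFSET keeps the code when the bits are set, IFCLR when they are clear. */
#define FTR_IFSET(msk, b, e) { (msk), (msk), (b), (e) }
#define FTR_IFCLR(msk, b, e) { (msk), 0, (b), (e) }

/* Placeholder words standing in for four instructions of kernel text. */
uint32_t text[4] = { 0x11111111u, 0x22222222u, 0x33333333u, 0x44444444u };

/* The two middle words form one section that requires Altivec. */
const struct ftr_fixup fixups[] = {
    FTR_IFSET(FTR_ALTIVEC, &text[1], &text[3]),
};

/* Overwrite every section whose feature requirement is not met with nops. */
void do_ftr_fixups(uint32_t cpu_features, const struct ftr_fixup *fix, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(fix[i].begin <= fix[i].end);

        if ((cpu_features & fix[i].mask) == fix[i].value)
            continue;           /* requirement met: leave the code alone */

        for (uint32_t *p = fix[i].begin; p < fix[i].end; p++)
            *p = PPC_NOP;       /* a real kernel must also keep the
                                   instruction cache coherent here */
    }
}
```

Calling do_ftr_fixups(0, fixups, 1) on a CPU without the Altivec bit turns
text[1] and text[2] into nops and leaves text[0] and text[3] alone; calling it
with FTR_ALTIVEC set changes nothing.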