Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | 1 | ================================= |
| 2 | FR451 MMU LINUX MEMORY MANAGEMENT |
| 3 | ================================= |
| 4 | |
| 5 | ============ |
| 6 | MMU HARDWARE |
| 7 | ============ |
| 8 | |
| 9 | FR451 MMU Linux puts the MMU into EDAT mode whilst running. This means that it uses both the SAT |
| 10 | registers and the DAT TLB to perform address translation. |
| 11 | |
| 12 | There are 8 IAMLR/IAMPR register pairs and 16 DAMLR/DAMPR register pairs for SAT mode. |
| 13 | |
| 14 | In DAT mode, there is also a TLB organised in cache format as 64 lines x 2 ways. Each line spans a |
| 15 | 16KB range of addresses, but can match a larger region. |
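
As an illustration of how an address might select a TLB line, the sketch below assumes the line is chosen by the six virtual-address bits immediately above the 16KB page offset. That indexing is an assumption: it is consistent with the sample _tlb dump later in this document, but the text does not state it. The names FRV_PAGE_SHIFT, FRV_TLB_LINES and tlb_line_index() are made up for this example.

```c
#include <stdint.h>

#define FRV_PAGE_SHIFT 14u  /* each TLB line spans 16KB (from the text) */
#define FRV_TLB_LINES  64u  /* 64 lines x 2 ways (from the text) */

/*
 * Assumed indexing: the line is selected by address bits 14..19.
 * Both ways of the selected line are then candidates for a match.
 */
static inline unsigned int tlb_line_index(uint32_t vaddr)
{
    return (vaddr >> FRV_PAGE_SHIFT) & (FRV_TLB_LINES - 1u);
}
```

Under this assumption, virtual address 0x010f4000 selects line 0x3d and 0x001fc000 selects line 0x3f, which agrees with where those addresses appear in the _tlb sample output.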
| 16 | |
| 17 | |
| 18 | =========================== |
| 19 | MEMORY MANAGEMENT REGISTERS |
| 20 | =========================== |
| 21 | |
| 22 | Certain control registers are used by the kernel memory management routines: |
| 23 | |
| 24 | REGISTERS               USAGE |
| 25 | ======================  ================================================== |
| 26 | IAMR0, DAMR0            Kernel image and data mappings |
| 27 | IAMR1, DAMR1            First-chance TLB lookup mapping |
| 28 | DAMR2                   Page attachment for cache flush by page |
| 29 | DAMR3                   Current PGD mapping |
| 30 | SCR0, DAMR4             Instruction TLB PGE/PTD cache |
| 31 | SCR1, DAMR5             Data TLB PGE/PTD cache |
| 32 | DAMR6-10                kmap_atomic() mappings |
| 33 | DAMR11                  I/O mapping |
| 34 | CXNR                    mm_struct context ID |
| 35 | TTBR                    Page directory (PGD) pointer (physical address) |
| 36 | |
| 37 | |
| 38 | ===================== |
| 39 | GENERAL MEMORY LAYOUT |
| 40 | ===================== |
| 41 | |
| 42 | The physical memory layout is as follows: |
| 43 | |
| 44 | PHYSICAL ADDRESS     CONTROLLER      DEVICE |
| 45 | ===================  ==============  ======================================= |
| 46 | 00000000 - BFFFFFFF  SDRAM           SDRAM area |
| 47 | E0000000 - EFFFFFFF  L-BUS CS2#      VDK SLBUS/PCI window |
| 48 | F0000000 - F0FFFFFF  L-BUS CS5#      MB93493 CSC area (DAV daughter board) |
| 49 | F1000000 - F1FFFFFF  L-BUS CS7#      (CB70 CPU-card PCMCIA port I/O space) |
| 50 | FC000000 - FC0FFFFF  L-BUS CS1#      VDK MB86943 config space |
| 51 | FC100000 - FC1FFFFF  L-BUS CS6#      DM9000 NIC I/O space |
| 52 | FC200000 - FC2FFFFF  L-BUS CS3#      MB93493 CSR area (DAV daughter board) |
| 53 | FD000000 - FDFFFFFF  L-BUS CS4#      (CB70 CPU-card extra flash space) |
| 54 | FE000000 - FEFFFFFF                  Internal CPU peripherals |
| 55 | FF000000 - FF1FFFFF  L-BUS CS0#      Flash 1 |
| 56 | FF200000 - FF3FFFFF  L-BUS CS0#      Flash 2 |
| 57 | FFC00000 - FFC0001F  L-BUS CS0#      FPGA |
| 58 | |
| 59 | The virtual memory layout is: |
| 60 | |
| 61 | VIRTUAL ADDRESS    PHYSICAL  TRANSLATOR      FLAGS    SIZE     OCCUPATION |
| 62 | =================  ========  ==============  =======  =======  =================================== |
| 63 | 00004000-BFFFFFFF  various   TLB,xAMR1       D-N-??V  3GB      Userspace |
| 64 | C0000000-CFFFFFFF  00000000  xAMPR0          -L-S--V  256MB    Kernel image and data |
| 65 | D0000000-D7FFFFFF  various   TLB,xAMR1       D-NS??V  128MB    vmalloc area |
| 66 | D8000000-DBFFFFFF  various   TLB,xAMR1       D-NS??V  64MB     kmap() area |
| 67 | DC000000-DCFFFFFF  various   TLB                      1MB      Secondary kmap_atomic() frame |
| 68 | DD000000-DD27FFFF  various   DAMR                     160KB    Primary kmap_atomic() frame |
| 69 | DD040000                     DAMR2/IAMR2     -L-S--V  page     Page cache flush attachment point |
| 70 | DD080000                     DAMR3           -L-SC-V  page     Page Directory (PGD) |
| 71 | DD0C0000                     DAMR4           -L-SC-V  page     Cached insn TLB Page Table lookup |
| 72 | DD100000                     DAMR5           -L-SC-V  page     Cached data TLB Page Table lookup |
| 73 | DD140000                     DAMR6           -L-S--V  page     kmap_atomic(KM_BOUNCE_READ) |
| 74 | DD180000                     DAMR7           -L-S--V  page     kmap_atomic(KM_SKB_SUNRPC_DATA) |
| 75 | DD1C0000                     DAMR8           -L-S--V  page     kmap_atomic(KM_SKB_DATA_SOFTIRQ) |
| 76 | DD200000                     DAMR9           -L-S--V  page     kmap_atomic(KM_USER0) |
| 77 | DD240000                     DAMR10          -L-S--V  page     kmap_atomic(KM_USER1) |
| 78 | E0000000-FFFFFFFF  E0000000  DAMR11          -L-SC-V  512MB    I/O region |
| 79 | |
| 80 | IAMPR1 and DAMPR1 are used as an extension to the TLB. |
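
The attachment points listed for DAMR2 through DAMR10 in the table above are evenly spaced at 0x40000 intervals from the start of the primary kmap_atomic() frame. A minimal sketch of that arithmetic follows; the macro and function names are hypothetical and are not taken from the kernel source.

```c
#include <stdint.h>

#define KMAP_PRIMARY_FRAME 0xDD000000u  /* start of the primary frame (from the table) */
#define DAMR_SLOT_STRIDE   0x00040000u  /* spacing between attachment points (from the table) */

/*
 * Return the virtual attachment address for DAMRn, n = 2..10,
 * or 0 if n has no slot in the primary frame.
 */
static inline uint32_t damr_slot_vaddr(unsigned int n)
{
    if (n < 2u || n > 10u)
        return 0u;
    return KMAP_PRIMARY_FRAME + (n - 1u) * DAMR_SLOT_STRIDE;
}
```

For example, damr_slot_vaddr(3) gives 0xDD080000, the PGD attachment point, and damr_slot_vaddr(10) gives 0xDD240000, the KM_USER1 slot.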
| 81 | |
| 82 | |
| 83 | ==================== |
| 84 | KMAP AND KMAP_ATOMIC |
| 85 | ==================== |
| 86 | |
| 87 | To access pages in the page cache (which may not be directly accessible if highmem is available), |
| 88 | the kernel calls kmap(), does the access and then calls kunmap(); or it calls kmap_atomic(), does |
| 89 | the access and then calls kunmap_atomic(). |
| 90 | |
| 91 | kmap() creates an attachment between an arbitrary inaccessible page and a range of virtual |
| 92 | addresses by installing a PTE in a special page table. The kernel can then access this page as it |
| 93 | wills. When it's finished, the kernel calls kunmap() to clear the PTE. |
| 94 | |
| 95 | kmap_atomic() does something slightly different. In the interests of speed, it chooses one of two |
| 96 | strategies: |
| 97 | |
| 98 | (1) If possible, kmap_atomic() attaches the requested page to one of the DAMPR5 through DAMPR10 |
| 99 | register pairs; and the matching kunmap_atomic() clears the DAMPR. This makes high memory |
| 100 | support really fast as there's no need to flush the TLB or modify the page tables. The DAMLR |
| 101 | registers being used for this are preset during boot and don't change over the lifetime of the |
| 102 | process. There's a direct mapping between the first few kmap_atomic() types, DAMR number and |
| 103 | virtual address slot. |
| 104 | |
| 105 | However, there are more kmap_atomic() types defined than there are DAMR registers available, |
| 106 | so we fall back to: |
| 107 | |
| 108 | (2) kmap_atomic() uses a slot in the secondary frame (determined by the type parameter), and then |
| 109 | locks an entry in the TLB to translate that slot to the specified page. The number of slots is |
| 110 | obviously limited, and their positions are controlled such that each slot is matched by a |
| 111 | different line in the TLB. kunmap_atomic() ejects the entry from the TLB. |
| 112 | |
| 113 | Note that the first three kmap_atomic() types are really just declared as placeholders. The DAMPR |
| 114 | registers involved are actually modified directly. |
| 115 | |
| 116 | Also note that kmap() itself may sleep, kmap_atomic() may never sleep and both always succeed; |
| 117 | furthermore, a driver using kmap() may sleep before calling kunmap(), but may not sleep before |
| 118 | calling kunmap_atomic() if it had previously called kmap_atomic(). |
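
The requirement in (2) that every secondary-frame slot be matched by a different TLB line could be met by placing the slots on consecutive 16KB pages, provided the TLB line is selected by the address bits just above the page offset. The sketch below assumes exactly that placement and that line selection. Neither is stated in this document, and all of the names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECONDARY_FRAME 0xDC000000u  /* start of the secondary kmap_atomic() frame */
#define SLOT_SIZE       0x4000u      /* one 16KB page per slot (assumed placement) */
#define TLB_LINES       64u          /* 64 lines x 2 ways */

/* Virtual address of secondary-frame slot number 'slot' (assumed layout). */
static inline uint32_t secondary_slot_vaddr(unsigned int slot)
{
    return SECONDARY_FRAME + slot * SLOT_SIZE;
}

/* TLB line that would match 'vaddr' under the assumed indexing. */
static inline unsigned int slot_tlb_line(uint32_t vaddr)
{
    return (vaddr >> 14) & (TLB_LINES - 1u);
}

/* True if the first 'nslots' slots all fall on distinct TLB lines. */
static inline bool slots_use_distinct_lines(unsigned int nslots)
{
    bool seen[TLB_LINES] = { false };
    unsigned int i;

    for (i = 0; i < nslots; i++) {
        unsigned int line = slot_tlb_line(secondary_slot_vaddr(i));

        if (seen[line])
            return false;
        seen[line] = true;
    }
    return true;
}
```

With this placement, up to 64 slots can be in use without any two competing for the same line; a 65th slot would wrap around onto line 0, which illustrates why the number of slots is limited.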
| 119 | |
| 120 | |
| 121 | =============================== |
| 122 | USING MORE THAN 256MB OF MEMORY |
| 123 | =============================== |
| 124 | |
| 125 | The kernel cannot access more than 256MB of memory directly. The physical layout, however, permits |
| 126 | up to 3GB of SDRAM (possibly 3.25GB) to be made available. By using CONFIG_HIGHMEM, the kernel can |
| 127 | allow userspace (by way of page tables) and itself (by way of kmap) to deal with the memory |
| 128 | allocation. |
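
The 256MB limit follows from the fixed mapping of C0000000-CFFFFFFF onto physical address 00000000 shown in the virtual memory layout. A minimal sketch of the resulting address arithmetic is given below; the helper names are illustrative only and are not the kernel's own macros.

```c
#include <stdbool.h>
#include <stdint.h>

#define KERNEL_VBASE 0xC0000000u  /* virtual base of the kernel image and data mapping */
#define LOWMEM_SIZE  0x10000000u  /* 256MB covered by that mapping */

/* True if the physical address lies in directly mapped memory. */
static inline bool phys_is_lowmem(uint32_t paddr)
{
    return paddr < LOWMEM_SIZE;
}

/* Only meaningful when phys_is_lowmem(paddr) is true. */
static inline uint32_t lowmem_phys_to_virt(uint32_t paddr)
{
    return paddr + KERNEL_VBASE;
}

/* Only meaningful for addresses inside C0000000-CFFFFFFF. */
static inline uint32_t lowmem_virt_to_phys(uint32_t vaddr)
{
    return vaddr - KERNEL_VBASE;
}
```

Any page for which phys_is_lowmem() would be false has no permanent kernel virtual address and has to be reached through kmap() or kmap_atomic().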
| 129 | |
| 130 | External devices can, of course, still DMA to and from all of the SDRAM, even if the kernel can't |
| 131 | see it directly. The kernel translates page references into real addresses for communicating to the |
| 132 | devices. |
| 133 | |
| 134 | |
| 135 | =================== |
| 136 | PAGE TABLE TOPOLOGY |
| 137 | =================== |
| 138 | |
| 139 | The page tables are arranged in 2-layer format. There is a middle layer (PMD) that would be used in |
| 140 | 3-layer format tables but that is folded into the top layer (PGD) and so consumes no extra memory |
| 141 | or processing power. |
| 142 | |
| 143 | +------+  PGD    PMD |
| 144 | | TTBR |--->+-------------------+ |
| 145 | +------+    |      |      : STE | |
| 146 |             | PGE0 | PME0 : STE | |
| 147 |             |      |      : STE | |
| 148 |             +-------------------+              Page Table |
| 149 |             |      |      : STE -------------->+--------+ +0x0000 |
| 150 |             | PGE1 | PME0 : STE -----------+   | PTE0   | |
| 151 |             |      |      : STE -------+   |   +--------+ |
| 152 |             +-------------------+      |   |   | PTE63  | |
| 153 |             |      |      : STE |      |   +-->+--------+ +0x0100 |
| 154 |             | PGE2 | PME0 : STE |      |       | PTE64  | |
| 155 |             |      |      : STE |      |       +--------+ |
| 156 |             +-------------------+      |       | PTE127 | |
| 157 |             |      |      : STE |      +------>+--------+ +0x0200 |
| 158 |             | PGE3 | PME0 : STE |              | PTE128 | |
| 159 |             |      |      : STE |              +--------+ |
| 160 |             +-------------------+              | PTE191 | |
| 161 |                                                +--------+ +0x0300 |
| 162 | |
| 163 | Each Page Directory (PGD) is 16KB (page size) in size and is divided into 64 entries (PGEs). Each |
| 164 | PGE contains one Page Mid Directory (PMD). |
| 165 | |
| 166 | Each PMD is 256 bytes in size and contains a single entry (PME). Each PME holds 64 FR451 MMU |
| 167 | segment table entries of 4 bytes apiece. Each PME "points to" a page table. In practice, each STE |
| 168 | points to a subset of the page table, the first to PT+0x0000, the second to PT+0x0100, the third to |
| 169 | PT+0x0200, and so on. |
| 170 | |
| 171 | Each PGE and PME covers 64MB of the total virtual address space. |
| 172 | |
| 173 | Each Page Table (PTD) is 16KB (page size) in size, and is divided into 4096 entries (PTEs). Each |
| 174 | entry can point to one 16KB page. In practice, each Linux page table is subdivided into 64 FR451 |
| 175 | MMU page tables. These are all grouped together to make management easier; in particular, rmap |
| 176 | support is then trivial. |
| 177 | |
| 178 | Grouping page tables in this fashion makes PGE caching in SCR0/SCR1 more efficient because the |
| 179 | coverage of the cached item is greater. |
| 180 | |
| 181 | Page tables for the vmalloc area are allocated at boot time and shared between all mm_structs. |
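
The sizes given in this section imply a fixed split of a 32-bit virtual address: 6 bits of PGE index (64 entries of 64MB), 12 bits of PTE index (4096 entries of 16KB) and 14 bits of page offset, with the top 6 bits of the PTE index also selecting one of the 64 STEs. The sketch below derives those fields; the struct and function names are hypothetical and the bit positions are inferred from the sizes above, not copied from the kernel headers.

```c
#include <stdint.h>

struct frv_va_parts {
    unsigned int pge;         /* index into the PGD: 64 entries of 64MB each */
    unsigned int ste;         /* which of the 64 STEs inside the PME */
    unsigned int pte;         /* index into the 4096-entry Linux page table */
    uint32_t     ste_pt_off;  /* byte offset of that STE's sub-table in the page table */
    uint32_t     page_off;    /* offset within the 16KB page */
};

static inline struct frv_va_parts frv_split_va(uint32_t va)
{
    struct frv_va_parts p;

    p.pge        = va >> 26;              /* bits 31..26 */
    p.ste        = (va >> 20) & 0x3fu;    /* bits 25..20: 1MB per STE */
    p.pte        = (va >> 14) & 0xfffu;   /* bits 25..14 */
    p.ste_pt_off = p.ste * 0x100u;        /* 64 PTEs x 4 bytes per sub-table */
    p.page_off   = va & 0x3fffu;          /* bits 13..0 */
    return p;
}
```

For example, the start of the vmalloc area at 0xD0000000 falls in PGE 52, which is the second populated PGE in the _pgd sample output later in this document.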
| 182 | |
| 183 | |
| 184 | ================= |
| 185 | USER SPACE LAYOUT |
| 186 | ================= |
| 187 | |
| 188 | For MMU-capable Linux, the regions that userspace code is allowed to access are kept entirely separate |
| 189 | from those dedicated to the kernel: |
| 190 | |
| 191 | VIRTUAL ADDRESS    SIZE   PURPOSE |
| 192 | =================  =====  =================================== |
| 193 | 00000000-00003fff  16KB   NULL pointer access trap |
| 194 | 00004000-01ffffff  ~32MB  lower mmap space (grows up) |
| 195 | 02000000-021fffff  2MB    Stack space (grows down from top) |
| 196 | 02200000-nnnnnnnn         Executable mapping |
| 197 | nnnnnnnn-                 brk space (grows up) |
| 198 |         -bfffffff         upper mmap space (grows down) |
| 199 | |
| 200 | This layout is arranged to make best use of the 16KB page tables and the way in which PGEs/PMEs |
| 201 | are cached by the TLB handler. The lower mmap space is filled first, and then the upper mmap space |
| 202 | is filled. |
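
The fixed boundaries in the table above can be expressed as a simple range check. The sketch below is illustrative only: the enum and function names are invented, and because the boundary marked nnnnnnnn depends on the executable, the executable mapping, brk space and upper mmap space are reported as a single region.

```c
#include <stdint.h>

enum user_region {
    UR_NULL_TRAP,            /* 00000000-00003fff */
    UR_LOWER_MMAP,           /* 00004000-01ffffff */
    UR_STACK,                /* 02000000-021fffff */
    UR_EXEC_BRK_UPPER_MMAP,  /* 02200000-bfffffff (not subdivided here) */
    UR_NOT_USERSPACE         /* c0000000 and above belongs to the kernel */
};

static inline enum user_region classify_user_addr(uint32_t va)
{
    if (va < 0x00004000u)
        return UR_NULL_TRAP;
    if (va < 0x02000000u)
        return UR_LOWER_MMAP;
    if (va < 0x02200000u)
        return UR_STACK;
    if (va < 0xC0000000u)
        return UR_EXEC_BRK_UPPER_MMAP;
    return UR_NOT_USERSPACE;
}
```

Note that the NULL trap, lower mmap space and stack all lie below 0x04000000, so they would share the single PGE that covers the first 64MB.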
| 203 | |
| 204 | |
| 205 | =============================== |
| 206 | GDB-STUB MMU DEBUGGING SERVICES |
| 207 | =============================== |
| 208 | |
| 209 | The gdb-stub included in this kernel provides a number of services to aid in the debugging of |
| 210 | MMU-related kernel services: |
| 211 | |
| 212 | (*) Every time the kernel stops, certain state information is dumped into __debug_mmu. This |
| 213 | variable is defined in arch/frv/kernel/gdb-stub.c. Note that the gdbinit file in this |
| 214 | directory has some useful macros for dealing with this. |
| 215 | |
| 216 | (*) __debug_mmu.tlb[] |
| 217 | |
| 218 | This receives the current TLB contents. This can be viewed with the _tlb GDB macro: |
| 219 | |
| 220 | (gdb) _tlb |
| 221 | tlb[0x00]: 01000005 00718203 01000002 00718203 |
| 222 | tlb[0x01]: 01004002 006d4201 01004005 006d4203 |
| 223 | tlb[0x02]: 01008002 006d0201 01008006 00004200 |
| 224 | tlb[0x03]: 0100c006 007f4202 0100c002 0064c202 |
| 225 | tlb[0x04]: 01110005 00774201 01110002 00774201 |
| 226 | tlb[0x05]: 01114005 00770201 01114002 00770201 |
| 227 | tlb[0x06]: 01118002 0076c201 01118005 0076c201 |
| 228 | ... |
| 229 | tlb[0x3d]: 010f4002 00790200 001f4002 0054ca02 |
| 230 | tlb[0x3e]: 010f8005 0078c201 010f8002 0078c201 |
| 231 | tlb[0x3f]: 001fc002 0056ca01 001fc005 00538a01 |
| 232 | |
| 233 | (*) __debug_mmu.iamr[] |
| 234 | (*) __debug_mmu.damr[] |
| 235 | |
| 236 | These receive the current IAMR and DAMR contents. These can be viewed with the _amr |
| 237 | GDB macro: |
| 238 | |
| 239 | (gdb) _amr |
| 240 | AMRx DAMR IAMR |
| 241 | ==== ===================== ===================== |
| 242 | amr0 : L:c0000000 P:00000cb9 : L:c0000000 P:000004b9 |
| 243 | amr1 : L:01070005 P:006f9203 : L:0102c005 P:006a1201 |
| 244 | amr2 : L:d8d00000 P:00000000 : L:d8d00000 P:00000000 |
| 245 | amr3 : L:d8d04000 P:00534c0d : L:00000000 P:00000000 |
| 246 | amr4 : L:d8d08000 P:00554c0d : L:00000000 P:00000000 |
| 247 | amr5 : L:d8d0c000 P:00554c0d : L:00000000 P:00000000 |
| 248 | amr6 : L:d8d10000 P:00000000 : L:00000000 P:00000000 |
| 249 | amr7 : L:d8d14000 P:00000000 : L:00000000 P:00000000 |
| 250 | amr8 : L:d8d18000 P:00000000 |
| 251 | amr9 : L:d8d1c000 P:00000000 |
| 252 | amr10: L:d8d20000 P:00000000 |
| 253 | amr11: L:e0000000 P:e0000ccd |
| 254 | |
| 255 | (*) The current task's page directory is bound to DAMR3. |
| 256 | |
| 257 | This can be viewed with the _pgd GDB macro: |
| 258 | |
| 259 | (gdb) _pgd |
| 260 | $3 = {{pge = {{ste = {0x554001, 0x554101, 0x554201, 0x554301, 0x554401, |
| 261 | 0x554501, 0x554601, 0x554701, 0x554801, 0x554901, 0x554a01, |
| 262 | 0x554b01, 0x554c01, 0x554d01, 0x554e01, 0x554f01, 0x555001, |
| 263 | 0x555101, 0x555201, 0x555301, 0x555401, 0x555501, 0x555601, |
| 264 | 0x555701, 0x555801, 0x555901, 0x555a01, 0x555b01, 0x555c01, |
| 265 | 0x555d01, 0x555e01, 0x555f01, 0x556001, 0x556101, 0x556201, |
| 266 | 0x556301, 0x556401, 0x556501, 0x556601, 0x556701, 0x556801, |
| 267 | 0x556901, 0x556a01, 0x556b01, 0x556c01, 0x556d01, 0x556e01, |
| 268 | 0x556f01, 0x557001, 0x557101, 0x557201, 0x557301, 0x557401, |
| 269 | 0x557501, 0x557601, 0x557701, 0x557801, 0x557901, 0x557a01, |
| 270 | 0x557b01, 0x557c01, 0x557d01, 0x557e01, 0x557f01}}}}, {pge = {{ |
| 271 | ste = {0x0 <repeats 64 times>}}}} <repeats 51 times>, {pge = {{ste = { |
| 272 | 0x248001, 0x248101, 0x248201, 0x248301, 0x248401, 0x248501, |
| 273 | 0x248601, 0x248701, 0x248801, 0x248901, 0x248a01, 0x248b01, |
| 274 | 0x248c01, 0x248d01, 0x248e01, 0x248f01, 0x249001, 0x249101, |
| 275 | 0x249201, 0x249301, 0x249401, 0x249501, 0x249601, 0x249701, |
| 276 | 0x249801, 0x249901, 0x249a01, 0x249b01, 0x249c01, 0x249d01, |
| 277 | 0x249e01, 0x249f01, 0x24a001, 0x24a101, 0x24a201, 0x24a301, |
| 278 | 0x24a401, 0x24a501, 0x24a601, 0x24a701, 0x24a801, 0x24a901, |
| 279 | 0x24aa01, 0x24ab01, 0x24ac01, 0x24ad01, 0x24ae01, 0x24af01, |
| 280 | 0x24b001, 0x24b101, 0x24b201, 0x24b301, 0x24b401, 0x24b501, |
| 281 | 0x24b601, 0x24b701, 0x24b801, 0x24b901, 0x24ba01, 0x24bb01, |
| 282 | 0x24bc01, 0x24bd01, 0x24be01, 0x24bf01}}}}, {pge = {{ste = { |
| 283 | 0x0 <repeats 64 times>}}}} <repeats 11 times>} |
| 284 | |
| 285 | (*) The PTD last used by the instruction TLB miss handler is attached to DAMR4. |
| 286 | (*) The PTD last used by the data TLB miss handler is attached to DAMR5. |
| 287 | |
| 288 | These can be viewed with the _ptd_i and _ptd_d GDB macros: |
| 289 | |
| 290 | (gdb) _ptd_d |
| 291 | $5 = {{pte = 0x0} <repeats 127 times>, {pte = 0x539b01}, { |
| 292 | pte = 0x0} <repeats 896 times>, {pte = 0x719303}, {pte = 0x6d5303}, { |
| 293 | pte = 0x0}, {pte = 0x0}, {pte = 0x0}, {pte = 0x0}, {pte = 0x0}, { |
| 294 | pte = 0x0}, {pte = 0x0}, {pte = 0x0}, {pte = 0x0}, {pte = 0x6a1303}, { |
| 295 | pte = 0x0} <repeats 12 times>, {pte = 0x709303}, {pte = 0x0}, {pte = 0x0}, |
| 296 | {pte = 0x6fd303}, {pte = 0x6f9303}, {pte = 0x6f5303}, {pte = 0x0}, { |
| 297 | pte = 0x6ed303}, {pte = 0x531b01}, {pte = 0x50db01}, { |
| 298 | pte = 0x0} <repeats 13 times>, {pte = 0x5303}, {pte = 0x7f5303}, { |
| 299 | pte = 0x509b01}, {pte = 0x505b01}, {pte = 0x7c9303}, {pte = 0x7b9303}, { |
| 300 | pte = 0x7b5303}, {pte = 0x7b1303}, {pte = 0x7ad303}, {pte = 0x0}, { |
| 301 | pte = 0x0}, {pte = 0x7a1303}, {pte = 0x0}, {pte = 0x795303}, {pte = 0x0}, { |
| 302 | pte = 0x78d303}, {pte = 0x0}, {pte = 0x0}, {pte = 0x0}, {pte = 0x0}, { |
| 303 | pte = 0x0}, {pte = 0x775303}, {pte = 0x771303}, {pte = 0x76d303}, { |
| 304 | pte = 0x0}, {pte = 0x765303}, {pte = 0x7c5303}, {pte = 0x501b01}, { |
| 305 | pte = 0x4f1b01}, {pte = 0x4edb01}, {pte = 0x0}, {pte = 0x4f9b01}, { |
| 306 | pte = 0x4fdb01}, {pte = 0x0} <repeats 2992 times>} |