| |
| PAT (Page Attribute Table) |
| |
| x86 Page Attribute Table (PAT) allows for setting the memory attribute at the |
| page level granularity. PAT is complementary to the MTRR settings which allows |
| for setting of memory types over physical address ranges. However, PAT is |
| more flexible than MTRR due to its capability to set attributes at page level |
| and also due to the fact that there are no hardware limitations on number of |
| such attribute settings allowed. Added flexibility comes with guidelines for |
| not having memory type aliasing for the same physical memory with multiple |
| virtual addresses. |
| |
| PAT allows for different types of memory attributes. The most commonly used |
| ones that will be supported at this time are Write-back, Uncached, |
| Write-combined and Uncached Minus. |
| |
| |
| PAT APIs |
| -------- |
| |
| There are many different APIs in the kernel that allows setting of memory |
| attributes at the page level. In order to avoid aliasing, these interfaces |
| should be used thoughtfully. Below is a table of interfaces available, |
| their intended usage and their memory attribute relationships. Internally, |
| these APIs use a reserve_memtype()/free_memtype() interface on the physical |
| address range to avoid any aliasing. |
| |
| |
| ------------------------------------------------------------------- |
| API | RAM | ACPI,... | Reserved/Holes | |
| -----------------------|----------|------------|------------------| |
| | | | | |
| ioremap | -- | UC- | UC- | |
| | | | | |
| ioremap_cache | -- | WB | WB | |
| | | | | |
| ioremap_nocache | -- | UC- | UC- | |
| | | | | |
| ioremap_wc | -- | -- | WC | |
| | | | | |
| set_memory_uc | UC- | -- | -- | |
| set_memory_wb | | | | |
| | | | | |
| set_memory_wc | WC | -- | -- | |
| set_memory_wb | | | | |
| | | | | |
| pci sysfs resource | -- | -- | UC- | |
| | | | | |
| pci sysfs resource_wc | -- | -- | WC | |
| is IORESOURCE_PREFETCH| | | | |
| | | | | |
| pci proc | -- | -- | UC- | |
| !PCIIOC_WRITE_COMBINE | | | | |
| | | | | |
| pci proc | -- | -- | WC | |
| PCIIOC_WRITE_COMBINE | | | | |
| | | | | |
| /dev/mem | -- | WB/WC/UC- | WB/WC/UC- | |
| read-write | | | | |
| | | | | |
| /dev/mem | -- | UC- | UC- | |
| mmap SYNC flag | | | | |
| | | | | |
| /dev/mem | -- | WB/WC/UC- | WB/WC/UC- | |
| mmap !SYNC flag | |(from exist-| (from exist- | |
| and | | ing alias)| ing alias) | |
| any alias to this area| | | | |
| | | | | |
| /dev/mem | -- | WB | WB | |
| mmap !SYNC flag | | | | |
| no alias to this area | | | | |
| and | | | | |
| MTRR says WB | | | | |
| | | | | |
| /dev/mem | -- | -- | UC- | |
| mmap !SYNC flag | | | | |
| no alias to this area | | | | |
| and | | | | |
| MTRR says !WB | | | | |
| | | | | |
| ------------------------------------------------------------------- |
| |
| Notes: |
| |
| -- in the above table mean "Not suggested usage for the API". Some of the --'s |
| are strictly enforced by the kernel. Some others are not really enforced |
| today, but may be enforced in future. |
| |
| For ioremap and pci access through /sys or /proc - The actual type returned |
| can be more restrictive, in case of any existing aliasing for that address. |
| For example: If there is an existing uncached mapping, a new ioremap_wc can |
| return uncached mapping in place of write-combine requested. |
| |
| set_memory_[uc|wc] and set_memory_wb should be used in pairs, where driver will |
| first make a region uc or wc and switch it back to wb after use. |
| |
| Over time writes to /proc/mtrr will be deprecated in favor of using PAT based |
| interfaces. Users writing to /proc/mtrr are suggested to use above interfaces. |
| |
| Drivers should use ioremap_[uc|wc] to access PCI BARs with [uc|wc] access |
| types. |
| |
| Drivers should use set_memory_[uc|wc] to set access type for RAM ranges. |
| |
| |
| PAT debugging |
| ------------- |
| |
| With CONFIG_DEBUG_FS enabled, PAT memtype list can be examined by |
| |
| # mount -t debugfs debugfs /sys/kernel/debug |
| # cat /sys/kernel/debug/x86/pat_memtype_list |
| PAT memtype list: |
| uncached-minus @ 0x7fadf000-0x7fae0000 |
| uncached-minus @ 0x7fb19000-0x7fb1a000 |
| uncached-minus @ 0x7fb1a000-0x7fb1b000 |
| uncached-minus @ 0x7fb1b000-0x7fb1c000 |
| uncached-minus @ 0x7fb1c000-0x7fb1d000 |
| uncached-minus @ 0x7fb1d000-0x7fb1e000 |
| uncached-minus @ 0x7fb1e000-0x7fb25000 |
| uncached-minus @ 0x7fb25000-0x7fb26000 |
| uncached-minus @ 0x7fb26000-0x7fb27000 |
| uncached-minus @ 0x7fb27000-0x7fb28000 |
| uncached-minus @ 0x7fb28000-0x7fb2e000 |
| uncached-minus @ 0x7fb2e000-0x7fb2f000 |
| uncached-minus @ 0x7fb2f000-0x7fb30000 |
| uncached-minus @ 0x7fb31000-0x7fb32000 |
| uncached-minus @ 0x80000000-0x90000000 |
| |
| This list shows physical address ranges and various PAT settings used to |
| access those physical address ranges. |
| |
| Another, more verbose way of getting PAT related debug messages is with |
| "debugpat" boot parameter. With this parameter, various debug messages are |
| printed to dmesg log. |
| |