Keshavamurthy, Anil S | ba39592 | 2007-10-21 16:41:49 -0700

Linux IOMMU Support
===================

The architecture specification can be obtained from the location below.

http://www.intel.com/technology/virtualization/

This guide is a quick cheat sheet covering the basics.

Some Keywords
-------------

DMAR - DMA remapping
DRHD - DMA Remapping Hardware Unit Definition
RMRR - Reserved Memory Region Reporting Structure
ZLR  - Zero length reads from PCI devices
IOVA - IO virtual address

Basic stuff
-----------

ACPI enumerates the DMA engines in the platform and describes the device
scope of each one, i.e. which PCI devices are controlled by which DMA
engine.

What is RMRR?
-------------

Some devices are controlled by the BIOS, e.g. USB devices used for PS/2
emulation. The regions of memory used by these devices are marked
reserved in the e820 map. Once DMA translation is turned on, DMA to those
regions would fail. Hence the BIOS uses RMRR to specify these regions along
with the devices that need to access them. The OS is expected to set up
unity mappings for these regions so that those devices can keep accessing
them.

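As a rough illustration of what a unity (1-1) mapping means, the sketch
below builds an identity page map for one RMRR region. It is a toy model in
Python, not kernel code: the name unity_map(), the dictionary used as a page
table, and the 4 KiB page size are all assumptions made for the example. The
region bounds are the first RMRR shown in the boot message sample in this
document.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages


def unity_map(base, end):
    """Return {iova_page: phys_page} identity-mapping the inclusive range [base, end]."""
    first = base & ~(PAGE_SIZE - 1)   # round the start down to a page boundary
    last = end & ~(PAGE_SIZE - 1)     # page that contains the last byte
    return {addr: addr for addr in range(first, last + PAGE_SIZE, PAGE_SIZE)}


# RMRR region as printed in the boot log: 0xed000 - 0xeffff
rmrr_mapping = unity_map(0xED000, 0xEFFFF)
```

Every key equals its value, so a device doing DMA to an address in the
region reaches the same physical address it used before translation was
enabled.
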
How is IOVA generated?
----------------------

Well-behaved drivers call pci_map_*() before sending a command to a device
that needs to perform DMA. Once the DMA has completed and the mapping is no
longer required, the driver calls pci_unmap_*() to unmap the region.

The Intel IOMMU driver allocates a virtual address space per domain. Each
PCIE device has its own domain (hence protection). Devices under p2p bridges
share the virtual address space with all other devices under the same p2p
bridge, due to transaction id aliasing for p2p bridges.

IOVA generation is pretty generic. We use the same technique as vmalloc(),
except that the address spaces are not global but separate for each domain.
Different DMA engines may support different numbers of domains.

We also allocate guard pages with each mapping, so we can attempt to catch
any overflow that might happen.

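The idea of per-domain address spaces with guard pages can be sketched as
follows. This is a toy model, not the driver's allocator: it uses a plain
first-fit search as a stand-in for the vmalloc()-style technique, and the
class name Domain, the start and limit values, and the 4 KiB page size are
assumptions made for the example.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages


class Domain:
    """Toy per-domain IOVA space; every domain has its own independent allocator."""

    def __init__(self, start=0x100000, limit=0xFFFFF000):
        self.start = start   # lowest IOVA handed out (assumed value)
        self.limit = limit   # exclusive upper bound (assumed value)
        self.ranges = []     # sorted (begin, end) pairs, guard page included

    def alloc(self, size):
        """First-fit allocation of `size` bytes plus one trailing guard page."""
        pages = -(-size // PAGE_SIZE)        # round up to whole pages
        need = (pages + 1) * PAGE_SIZE       # +1 page that is never mapped
        cursor = self.start
        for i, (begin, end) in enumerate(self.ranges):
            if begin - cursor >= need:       # the gap before this range fits
                self.ranges.insert(i, (cursor, cursor + need))
                return cursor
            cursor = end
        if self.limit - cursor < need:
            raise MemoryError("IOVA space exhausted")
        self.ranges.append((cursor, cursor + need))
        return cursor

    def free(self, iova):
        """Release the mapping that starts at `iova`, guard page included."""
        self.ranges = [r for r in self.ranges if r[0] != iova]


dom_a, dom_b = Domain(), Domain()
first = dom_a.alloc(0x1800)    # two data pages plus one guard page
second = dom_a.alloc(1)        # placed after the guard page of `first`
other = dom_b.alloc(0x1000)    # a different domain may reuse the same IOVA
```

Because the guard page is left unmapped, a device that runs past the end of
its buffer faults instead of silently writing into the next mapping.
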
Graphics Problems?
------------------
If you encounter issues with graphics devices, try adding the option
intel_iommu=igfx_off to turn off DMA remapping for the integrated graphics
engine.

If the graphics device is a PCI device included in the INCLUDE_ALL engine,
try enabling CONFIG_DMAR_GFX_WA to set up a 1-1 map. We hear graphics
drivers may start using the DMA APIs in the near future, at which point
this option can be removed.

Some exceptions to IOVA
-----------------------
Interrupt ranges (0xfee00000 - 0xfeefffff) are not address translated.
The same is true for peer to peer transactions. Hence we reserve the
interrupt range and the PCI MMIO ranges so that they are never allocated
as IOVA addresses.

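A minimal sketch of such a reservation check is shown below. The interrupt
range comes from this document; the function names and the example MMIO
window are hypothetical and only there to make the example runnable.

```python
INTERRUPT_RANGE = (0xFEE00000, 0xFEEFFFFF)  # inclusive bounds, from this document


def overlaps(a, b):
    """True if the inclusive ranges a and b share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


def usable_as_iova(start, end, reserved):
    """True if the inclusive range [start, end] avoids every reserved range."""
    return not any(overlaps((start, end), r) for r in reserved)


# Hypothetical PCI MMIO window, made up for the example.
EXAMPLE_MMIO = (0xE0000000, 0xE00FFFFF)
reserved_ranges = [INTERRUPT_RANGE, EXAMPLE_MMIO]
```

An allocator would run a check like usable_as_iova() (or pre-populate its
allocation tree with the reserved ranges) before handing out an address.
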
Fault reporting
---------------
When errors are reported, the DMA engine signals via an interrupt. The
fault reason and the device that caused the fault are printed on the
console.

See below for a sample.

Boot Message Sample
-------------------

A message like the following is printed when DMAR tables are present
in ACPI.

ACPI: DMAR (v001 A M I OEMDMAR 0x00000001 MSFT 0x00000097) @ 0x000000007f5b5ef0

While DMAR is being processed and initialized by ACPI, the DMAR locations
and any RMRRs processed are printed.

ACPI DMAR:Host address width 36
ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed90000
ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed91000
ACPI DMAR:DRHD (flags: 0x00000001)base: 0x00000000fed93000
ACPI DMAR:RMRR base: 0x00000000000ed000 end: 0x00000000000effff
ACPI DMAR:RMRR base: 0x000000007f600000 end: 0x000000007fffffff

When DMAR is enabled for use, you will see:

PCI-DMA: Using DMAR IOMMU

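If you want to pull the DRHD and RMRR entries out of a saved boot log, a
small script along these lines may help. It is only a sketch: the regular
expressions are written against the sample lines above and may need
adjusting for other kernel versions, the function name is made up, and
treating flag bit 0 as INCLUDE_ALL is an assumption based on the DRHD flags
definition in the architecture spec.

```python
import re

DRHD_RE = re.compile(r"ACPI DMAR:DRHD \(flags: (0x[0-9a-f]+)\)base: (0x[0-9a-f]+)")
RMRR_RE = re.compile(r"ACPI DMAR:RMRR base: (0x[0-9a-f]+) end: (0x[0-9a-f]+)")


def parse_dmar_log(lines):
    """Return (drhds, rmrrs) extracted from DMAR boot messages."""
    drhds, rmrrs = [], []
    for line in lines:
        line = line.strip()
        m = DRHD_RE.match(line)
        if m:
            flags = int(m.group(1), 16)
            # Assumption: bit 0 of the flags marks the INCLUDE_ALL engine.
            drhds.append({"base": int(m.group(2), 16), "include_all": bool(flags & 1)})
            continue
        m = RMRR_RE.match(line)
        if m:
            rmrrs.append((int(m.group(1), 16), int(m.group(2), 16)))
    return drhds, rmrrs


boot_log = """\
ACPI DMAR:Host address width 36
ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed90000
ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed91000
ACPI DMAR:DRHD (flags: 0x00000001)base: 0x00000000fed93000
ACPI DMAR:RMRR base: 0x00000000000ed000 end: 0x00000000000effff
ACPI DMAR:RMRR base: 0x000000007f600000 end: 0x000000007fffffff
""".splitlines()

drhds, rmrrs = parse_dmar_log(boot_log)
```
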
Fault reporting
---------------

DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
DMAR:[fault reason 05] PTE Write access is not set
DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
DMAR:[fault reason 05] PTE Write access is not set

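When many faults are logged, it can be handy to pair each request line with
the fault reason that follows it. The sketch below does that for the two
line shapes shown above. The patterns are derived from this sample only;
accepting "Read" as well as "Write" is an assumption, and the function and
field names are made up for the example.

```python
import re

REQUEST_RE = re.compile(
    r"DMAR:\[DMA (?P<op>Read|Write)\] Request device \[(?P<dev>[0-9a-f:.]+)\] "
    r"fault addr (?P<addr>[0-9a-f]+)"
)
REASON_RE = re.compile(r"DMAR:\[fault reason (?P<code>\d+)\] (?P<text>.+)")


def parse_faults(lines):
    """Pair each request line with the fault-reason line that follows it."""
    faults, pending = [], None
    for line in lines:
        line = line.strip()
        m = REQUEST_RE.match(line)
        if m:
            pending = {"op": m["op"], "device": m["dev"], "addr": int(m["addr"], 16)}
            continue
        m = REASON_RE.match(line)
        if m and pending is not None:
            pending["reason_code"] = int(m["code"])
            pending["reason"] = m["text"]
            faults.append(pending)
            pending = None
    return faults


fault_log = """\
DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
DMAR:[fault reason 05] PTE Write access is not set
""".splitlines()

faults = parse_faults(fault_log)
```
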
TBD
---

- For compatibility testing, we could use a unity map domain for all
  devices, i.e. provide a 1-1 mapping of all useful memory under a single
  domain shared by all devices.
- API for paravirt ops to abstract functionality for VMM folks.