Linux IOMMU Support
===================

The architecture specification is available at the location below.

http://www.intel.com/technology/virtualization/

This guide is a quick cheat sheet covering the basics.

Some Keywords
-------------

DMAR - DMA remapping
DRHD - DMA Remapping Hardware Unit Definition
RMRR - Reserved Memory Region Reporting Structure
ZLR  - Zero length reads from PCI devices
IOVA - I/O virtual address

Basic stuff
-----------

ACPI enumerates the DMA engines in the platform and, through device scope
entries, describes which DMA engine controls which PCI devices.

What is RMRR?
-------------

The BIOS controls some devices itself, e.g. USB devices used for PS/2
emulation. The memory regions used by these devices are marked reserved in
the e820 map. Once DMA translation is turned on, DMA to those regions would
fail. The BIOS therefore uses RMRR to list these regions together with the
devices that need to access them. The OS is expected to set up unity (1:1)
mappings of these regions for those devices.
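
As a rough illustration of what a unity mapping means, the C sketch below
builds 1:1 entries for one RMRR region. It is a userspace toy, not the
driver's code: struct toy_mapping, rmrr_unity_map() and the 4 KiB page size
are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12                         /* assumption: 4 KiB pages */
#define TOY_PAGE_MASK  (((uint64_t)1 << TOY_PAGE_SHIFT) - 1)

/* Hypothetical stand-in for one translation entry. */
struct toy_mapping {
    uint64_t iova;  /* address the device issues */
    uint64_t phys;  /* address it is translated to */
};

/*
 * Describe the region [base, end] (end is inclusive, as in the RMRR boot
 * messages) with unity mappings, one per page. At most 'max' entries are
 * written to 'out'. Returns the number of pages the region spans.
 */
size_t rmrr_unity_map(uint64_t base, uint64_t end,
                      struct toy_mapping *out, size_t max)
{
    uint64_t first = base & ~TOY_PAGE_MASK;   /* page containing base */
    uint64_t last = end & ~TOY_PAGE_MASK;     /* page containing end */
    size_t pages, i;

    if (end < base)
        return 0;

    pages = (size_t)((last - first) >> TOY_PAGE_SHIFT) + 1;
    for (i = 0; i < pages && i < max; i++) {
        uint64_t addr = first + ((uint64_t)i << TOY_PAGE_SHIFT);

        out[i].iova = addr;   /* unity mapping: IOVA equals ... */
        out[i].phys = addr;   /* ... the physical address */
    }
    return pages;
}
```

For the first RMRR shown in the boot message sample in this document (base
0xed000, end 0xeffff), the function reports three pages, each with iova equal
to phys.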

How is IOVA generated?
----------------------

Well-behaved drivers call pci_map_*() before sending a command to a device
that needs to perform DMA. Once the DMA has completed and the mapping is no
longer required, the driver calls pci_unmap_*() to unmap the region.

The Intel IOMMU driver allocates a virtual address space per domain. Each
PCIe device has its own domain (hence protection). Devices under a p2p
bridge share one virtual address space with all other devices under that
bridge, because p2p bridges alias transaction IDs.

IOVA generation is fairly generic. It uses the same technique as vmalloc(),
except that the address space is not global but separate for each domain.
Different DMA engines may support different numbers of domains.

The driver also allocates guard pages with each mapping, to try to catch any
overflow that might happen.
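
The two ideas above, a private address space per domain and a guard page
after each mapping, can be sketched in userspace C. This is deliberately a
simple bump allocator that never frees anything, not the vmalloc()-style
technique the driver uses. struct toy_domain, toy_domain_init(),
toy_iova_alloc(), the 4 KiB page size and the single guard page are all
assumptions made for this example.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT  12  /* assumption: 4 KiB pages */
#define TOY_GUARD_PAGES 1   /* assumption: one guard page per mapping */

/* Hypothetical per-domain state; every domain has its own address space. */
struct toy_domain {
    uint64_t next_pfn;   /* first page frame not yet handed out */
    uint64_t limit_pfn;  /* one past the last usable page frame */
};

/* Give the domain the IOVA range [start, end); both are page aligned. */
void toy_domain_init(struct toy_domain *d, uint64_t start, uint64_t end)
{
    d->next_pfn = start >> TOY_PAGE_SHIFT;
    d->limit_pfn = end >> TOY_PAGE_SHIFT;
}

/*
 * Reserve enough pages for 'size' bytes plus the guard page(s).
 * Stores the IOVA in *iova and returns 0, or returns -1 if size is zero
 * or the domain has no room left.
 */
int toy_iova_alloc(struct toy_domain *d, uint64_t size, uint64_t *iova)
{
    uint64_t page_mask = ((uint64_t)1 << TOY_PAGE_SHIFT) - 1;
    /* Round up to whole pages without overflowing on huge sizes. */
    uint64_t pages = (size >> TOY_PAGE_SHIFT) + ((size & page_mask) != 0);
    uint64_t need = pages + TOY_GUARD_PAGES;

    if (size == 0 || d->next_pfn > d->limit_pfn ||
        d->limit_pfn - d->next_pfn < need)
        return -1;

    *iova = d->next_pfn << TOY_PAGE_SHIFT;
    d->next_pfn += need;  /* the guard page after the mapping stays unmapped */
    return 0;
}
```

Two domains initialized with the same range hand out the same IOVA values
independently of each other, which is the sense in which the address spaces
are separate rather than global.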


Graphics Problems?
------------------
If you encounter issues with graphics devices, try adding the option
intel_iommu=igfx_off to the kernel command line. This turns off DMA
remapping for the integrated graphics engine.
If this fixes anything, please file a bug reporting the problem.

Some exceptions to IOVA
-----------------------
Interrupt ranges (0xfee00000 - 0xfeefffff) are not address translated.
The same is true for peer-to-peer transactions. Hence the interrupt range
and the PCI MMIO ranges are reserved so that they are never allocated as
IOVAs.
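
A minimal userspace C sketch of such a reservation check follows. The
interrupt range is the one given above; struct toy_range and
iova_is_reserved() are hypothetical names, and any PCI MMIO windows are
platform specific and would have to be supplied by the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Interrupt range from the text above; the macro names are made up. */
#define TOY_INTR_RANGE_START 0xfee00000ULL
#define TOY_INTR_RANGE_END   0xfeefffffULL

/* Hypothetical reserved range, with an inclusive end address. */
struct toy_range {
    uint64_t start;
    uint64_t end;
};

/*
 * Return true if the candidate IOVA range [start, end] (inclusive)
 * overlaps any of the 'n' reserved ranges in 'resv'.
 */
bool iova_is_reserved(uint64_t start, uint64_t end,
                      const struct toy_range *resv, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        /* Two inclusive ranges overlap if each starts before the other ends. */
        if (start <= resv[i].end && resv[i].start <= end)
            return true;
    }
    return false;
}
```

An allocator would run a check like this against every candidate range and
skip those that overlap, so that a device is never handed an address the
hardware will not translate.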


Fault reporting
---------------
When errors are reported, the DMA engine signals via an interrupt. The
device that caused the fault and the fault reason are printed on the console.

See below for a sample.


Boot Message Sample
-------------------

Something like this is printed, indicating the presence of DMAR tables
in ACPI.

ACPI: DMAR (v001 A M I OEMDMAR 0x00000001 MSFT 0x00000097) @ 0x000000007f5b5ef0

While ACPI processes and initializes DMAR, it prints the DMAR locations
and any RMRRs processed.

ACPI DMAR:Host address width 36
ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed90000
ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed91000
ACPI DMAR:DRHD (flags: 0x00000001)base: 0x00000000fed93000
ACPI DMAR:RMRR base: 0x00000000000ed000 end: 0x00000000000effff
ACPI DMAR:RMRR base: 0x000000007f600000 end: 0x000000007fffffff

When DMAR is enabled for use, you will see:

PCI-DMA: Using DMAR IOMMU

Fault reporting
---------------

DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
DMAR:[fault reason 05] PTE Write access is not set
DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
DMAR:[fault reason 05] PTE Write access is not set
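
For anyone scanning logs for these messages, the C sketch below pulls the
fields out of the two line shapes shown above with sscanf(). The format
strings are inferred only from this sample, so they are an assumption and
other kernels may print faults differently. struct toy_dmar_fault,
parse_dmar_request() and parse_dmar_reason() are hypothetical names.

```c
#include <stdio.h>

/* Fields of a "Request device" line, as they appear in the sample. */
struct toy_dmar_fault {
    char type[16];               /* e.g. "Write" */
    unsigned int bus, dev, fn;   /* PCI bus:device.function */
    unsigned long long addr;     /* faulting address */
};

/* Parse a "DMAR:[DMA ...] Request device ..." line; 0 on success, else -1. */
int parse_dmar_request(const char *line, struct toy_dmar_fault *f)
{
    /* %15[^]] reads up to 15 characters that are not ']'. */
    int n = sscanf(line,
                   "DMAR:[DMA %15[^]]] Request device [%x:%x.%u] fault addr %llx",
                   f->type, &f->bus, &f->dev, &f->fn, &f->addr);

    return n == 5 ? 0 : -1;
}

/* Parse a "DMAR:[fault reason NN] text" line; 0 on success, else -1. */
int parse_dmar_reason(const char *line, unsigned int *code, char text[64])
{
    /* %63[^\n] reads the rest of the line, bounded by the buffer size. */
    int n = sscanf(line, "DMAR:[fault reason %u] %63[^\n]", code, text);

    return n == 2 ? 0 : -1;
}
```

Each function returns -1 for a line that does not match its shape, so a
caller can try both on every log line and ignore everything else.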

TBD
---

- For compatibility testing, a unity-map domain could be used for all
  devices: provide a 1:1 mapping of all useful memory under a single domain
  shared by all devices.
- API for paravirt ops for abstracting functionality for VMM folks.