==============
DMA attributes
==============

This document describes the semantics of the DMA attributes that are
defined in linux/dma-mapping.h.

DMA_ATTR_WRITE_BARRIER
----------------------

DMA_ATTR_WRITE_BARRIER is a (write) barrier attribute for DMA. DMA
to a memory region with the DMA_ATTR_WRITE_BARRIER attribute forces
all pending DMA writes to complete, and thus provides a mechanism to
strictly order DMA from a device across all intervening buses and
bridges. This barrier is not specific to a particular type of
interconnect; it applies to the system as a whole, and so its
implementation must account for the idiosyncrasies of the system all
the way from the DMA device to memory.

As an example of a situation where DMA_ATTR_WRITE_BARRIER would be
useful, suppose that a device does a DMA write to indicate that data is
ready and available in memory. The DMA of the "completion indication"
could race with the data DMA. Mapping the memory used for completion
indications with DMA_ATTR_WRITE_BARRIER would prevent the race.

DMA_ATTR_WEAK_ORDERING
----------------------

DMA_ATTR_WEAK_ORDERING specifies that reads and writes to the mapping
may be weakly ordered; that is, reads and writes may pass each other.

Since it is optional for platforms to implement DMA_ATTR_WEAK_ORDERING,
those that do not will simply ignore the attribute and exhibit default
behavior.

DMA_ATTR_WRITE_COMBINE
----------------------

DMA_ATTR_WRITE_COMBINE specifies that writes to the mapping may be
buffered to improve performance.

Since it is optional for platforms to implement DMA_ATTR_WRITE_COMBINE,
those that do not will simply ignore the attribute and exhibit default
behavior.

DMA_ATTR_NON_CONSISTENT
-----------------------

DMA_ATTR_NON_CONSISTENT lets the platform choose to return either
consistent or non-consistent memory as it sees fit. By using this API,
you are guaranteeing to the platform that you have all the correct and
necessary sync points for this memory in the driver.

DMA_ATTR_NO_KERNEL_MAPPING
--------------------------

DMA_ATTR_NO_KERNEL_MAPPING lets the platform avoid creating a kernel
virtual mapping for the allocated buffer. On some architectures, creating
such a mapping is a non-trivial task and consumes very limited resources
(like kernel virtual address space or DMA consistent address space).
Buffers allocated with this attribute can only be passed to user space
by calling dma_mmap_attrs(). By using this API, you are guaranteeing
that you won't dereference the pointer returned by dma_alloc_attrs(). You
can treat it as a cookie that must be passed to dma_mmap_attrs() and
dma_free_attrs(). Make sure that both of these also get this attribute
set on each call.

Since it is optional for platforms to implement
DMA_ATTR_NO_KERNEL_MAPPING, those that do not will simply ignore the
attribute and exhibit default behavior.
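
The cookie discipline described above for DMA_ATTR_NO_KERNEL_MAPPING can be
sketched in userspace as follows. This is only an illustration under stated
assumptions: the mock_* functions are hypothetical stand-ins with simplified
signatures, not the kernel's dma_alloc_attrs(), dma_mmap_attrs() and
dma_free_attrs(); struct demo_buf and the demo_buf_* helpers are invented
for the example; and the bit value of the attribute is a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Placeholder value for this sketch; the real definition is in
 * linux/dma-mapping.h. */
#define DMA_ATTR_NO_KERNEL_MAPPING (1UL << 4)

/* Counts how many mock calls carried the attribute. */
int calls_with_attr;

static void note_attrs(unsigned long attrs)
{
	if (attrs & DMA_ATTR_NO_KERNEL_MAPPING)
		calls_with_attr++;
}

/* Hypothetical stand-in for dma_alloc_attrs(): returns an opaque cookie. */
static void *mock_dma_alloc_attrs(size_t size, unsigned long attrs)
{
	note_attrs(attrs);
	return malloc(size);
}

/* Hypothetical stand-in for dma_mmap_attrs(): only checks its arguments. */
static int mock_dma_mmap_attrs(void *cookie, size_t size, unsigned long attrs)
{
	assert(cookie != NULL && size > 0);
	note_attrs(attrs);
	return 0;
}

/* Hypothetical stand-in for dma_free_attrs(). */
static void mock_dma_free_attrs(void *cookie, size_t size, unsigned long attrs)
{
	(void)size;
	note_attrs(attrs);
	free(cookie);
}

/* Driver-side record: the attrs are stored so every later call matches. */
struct demo_buf {
	void *cookie;        /* opaque; never dereferenced by the driver */
	size_t size;
	unsigned long attrs;
};

int demo_buf_alloc(struct demo_buf *b, size_t size)
{
	b->attrs = DMA_ATTR_NO_KERNEL_MAPPING;
	b->size = size;
	b->cookie = mock_dma_alloc_attrs(size, b->attrs);
	return b->cookie ? 0 : -1;
}

int demo_buf_mmap(const struct demo_buf *b)
{
	/* Same cookie, same attrs as at allocation time. */
	return mock_dma_mmap_attrs(b->cookie, b->size, b->attrs);
}

void demo_buf_free(struct demo_buf *b)
{
	mock_dma_free_attrs(b->cookie, b->size, b->attrs);
	b->cookie = NULL;
}
```

Storing the attributes next to the cookie is one way to make it hard to
forget the attribute on the later mmap and free calls; a real driver might
keep them in its own per-buffer structure instead.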

DMA_ATTR_SKIP_CPU_SYNC
----------------------

By default, the dma_map_{single,page,sg} family of functions transfers a
given buffer from the CPU domain to the device domain. Some advanced use
cases might require sharing a buffer between more than one device. This
requires having a mapping created separately for each device and is usually
performed by calling a dma_map_{single,page,sg} function more than once
for the given buffer, with the device pointer of each device taking part in
the buffer sharing. The first call transfers the buffer from the 'CPU' domain
to the 'device' domain, which synchronizes the CPU caches for the given region
(usually this means that the cache is flushed or invalidated,
depending on the DMA direction). However, subsequent calls to
dma_map_{single,page,sg}() for other devices will perform exactly the
same synchronization operation on the CPU cache. CPU cache synchronization
might be a time-consuming operation, especially if the buffers are
large, so it is highly recommended to avoid it if possible.
DMA_ATTR_SKIP_CPU_SYNC allows platform code to skip synchronization of
the CPU cache for the given buffer, assuming that it has already been
transferred to the 'device' domain. This attribute can also be used with
the dma_unmap_{single,page,sg} family of functions to force the buffer to
stay in the device domain after releasing a mapping for it. Use this
attribute with care!
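
The buffer-sharing pattern described above can be modelled in userspace as
follows. This is a sketch under stated assumptions, not kernel code:
mock_dma_map_single_attrs() is a hypothetical stand-in that merely counts
cache synchronizations, struct mock_device and share_buffer() are invented
for the example, and the bit value of the attribute is a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder value for this sketch; the real definition is in
 * linux/dma-mapping.h. */
#define DMA_ATTR_SKIP_CPU_SYNC (1UL << 5)

/* Hypothetical stand-in for struct device. */
struct mock_device {
	const char *name;
};

/* Number of (modelled) CPU cache flush/invalidate operations. */
int cache_syncs;

/* Hypothetical stand-in for a dma_map_single()-style call that takes
 * attributes. It "synchronizes" the cache unless told to skip it. */
static uintptr_t mock_dma_map_single_attrs(struct mock_device *dev, void *buf,
					   size_t size, unsigned long attrs)
{
	assert(dev != NULL && buf != NULL && size > 0);
	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
		cache_syncs++;
	return (uintptr_t)buf;	/* pretend bus address */
}

/* Map one buffer for several devices. Only the first mapping moves the
 * buffer to the device domain; the rest skip the redundant sync. */
void share_buffer(struct mock_device *devs, size_t ndevs, void *buf,
		  size_t size, uintptr_t *handles)
{
	for (size_t i = 0; i < ndevs; i++) {
		unsigned long attrs = (i == 0) ? 0UL : DMA_ATTR_SKIP_CPU_SYNC;

		handles[i] = mock_dma_map_single_attrs(&devs[i], buf, size,
						       attrs);
	}
}
```

In this model, mapping a buffer for three devices performs a single cache
synchronization instead of three. The sketch relies on the CPU not touching
the buffer between the mappings, which is the assumption the attribute asks
the caller to guarantee.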

DMA_ATTR_FORCE_CONTIGUOUS
-------------------------

By default, the DMA-mapping subsystem is allowed to assemble the buffer
allocated by the dma_alloc_attrs() function from individual pages if it can
be mapped as a contiguous chunk into the device's DMA address space. By
specifying this attribute, the allocated buffer is forced to be contiguous
in physical memory as well.

DMA_ATTR_ALLOC_SINGLE_PAGES
---------------------------

This is a hint to the DMA-mapping subsystem that it's probably not worth
the time to try to allocate memory in a way that gives better TLB
efficiency (that is, it's not worth trying to build the mapping out of
larger pages). You might want to specify this if:

- You know that the accesses to this memory won't thrash the TLB.
  You might know that the accesses are likely to be sequential or
  that they aren't sequential but it's unlikely you'll ping-pong
  between many addresses that are likely to be in different physical
  pages.
- You know that the penalty of TLB misses while accessing the
  memory will be small enough to be inconsequential. If you are
  doing a heavy operation like decryption or decompression this
  might be the case.
- You know that the DMA mapping is fairly transitory. If you expect
  the mapping to have a short lifetime then it may be worth it to
  optimize allocation (avoid coming up with large pages) instead of
  getting the slight performance win of larger pages.

Setting this hint doesn't guarantee that you won't get huge pages, but it
means that we won't try quite as hard to get them.

.. note:: At the moment DMA_ATTR_ALLOC_SINGLE_PAGES is only implemented on ARM,
          though ARM64 patches will likely be posted soon.

DMA_ATTR_NO_WARN
----------------

This tells the DMA-mapping subsystem to suppress allocation failure reports
(similarly to __GFP_NOWARN).

On some architectures, allocation failures are reported with error messages
in the system logs. Although this can help to identify and debug problems,
drivers that handle failures (e.g., by retrying later) have no problem with
them, and can actually flood the system logs with error messages that do not
indicate any real problem, depending on how the retry mechanism is
implemented.

So, this provides a way for drivers to avoid those error messages on calls
where allocation failures are not a problem and should not clutter the logs.

.. note:: At the moment DMA_ATTR_NO_WARN is only implemented on PowerPC.
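
The log-flooding scenario described for DMA_ATTR_NO_WARN can be modelled in
userspace as follows. This is a sketch under stated assumptions:
mock_dma_alloc_attrs() is a hypothetical allocator that fails a configurable
number of times and counts the warnings it would have logged,
alloc_with_retry() is an invented helper, and the bit value of the attribute
is a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Placeholder value for this sketch; the real definition is in
 * linux/dma-mapping.h. */
#define DMA_ATTR_NO_WARN (1UL << 8)

int failures_left;	/* how many upcoming allocations should fail */
int warnings_logged;	/* how many failure reports were "logged" */

/* Hypothetical allocator: reports a failure unless DMA_ATTR_NO_WARN is set. */
static void *mock_dma_alloc_attrs(size_t size, unsigned long attrs)
{
	assert(size > 0);
	if (failures_left > 0) {
		failures_left--;
		if (!(attrs & DMA_ATTR_NO_WARN))
			warnings_logged++;
		return NULL;
	}
	return malloc(size);
}

/* A driver that tolerates failures simply retries, so each failed attempt
 * would otherwise add one more message to the logs. */
void *alloc_with_retry(size_t size, int max_tries, unsigned long attrs)
{
	for (int i = 0; i < max_tries; i++) {
		void *p = mock_dma_alloc_attrs(size, attrs);

		if (p)
			return p;
	}
	return NULL;
}
```

With three injected failures, a retry loop that passes DMA_ATTR_NO_WARN
produces no warnings in this model, while the same loop without the
attribute produces one warning per failed attempt.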

DMA_ATTR_PRIVILEGED
-------------------

Some advanced peripherals such as remote processors and GPUs perform
accesses to DMA buffers in both privileged "supervisor" and unprivileged
"user" modes. This attribute is used to indicate to the DMA-mapping
subsystem that the buffer is fully accessible at the elevated privilege
level (and ideally inaccessible or at least read-only at the
lesser-privileged levels).