DMA attributes
==============

This document describes the semantics of the DMA attributes that are
defined in linux/dma-mapping.h.

DMA_ATTR_WRITE_BARRIER
----------------------

DMA_ATTR_WRITE_BARRIER is a (write) barrier attribute for DMA. DMA
to a memory region with the DMA_ATTR_WRITE_BARRIER attribute forces
all pending DMA writes to complete, and thus provides a mechanism to
strictly order DMA from a device across all intervening buses and
bridges. This barrier is not specific to a particular type of
interconnect; it applies to the system as a whole, and so its
implementation must account for the idiosyncrasies of the system all
the way from the DMA device to memory.

As an example of a situation where DMA_ATTR_WRITE_BARRIER would be
useful, suppose that a device does a DMA write to indicate that data is
ready and available in memory. The DMA of the "completion indication"
could race with the data DMA: the indication might reach memory before
the data it announces. Mapping the memory used for completion
indications with DMA_ATTR_WRITE_BARRIER would prevent the race.
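
The completion-indication case might look roughly like the sketch
below. It assumes the unsigned long attrs form of
dma_map_single_attrs(). The struct rx_ring and rx_ring_map() names are
hypothetical, the attribute value is an assumption, and the stand-in
definitions and the stub exist only so that the fragment builds in user
space; a driver would take all of them from linux/dma-mapping.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch builds outside the kernel. */
struct device { const char *name; };
typedef uint64_t dma_addr_t;
enum dma_data_direction { DMA_BIDIRECTIONAL, DMA_TO_DEVICE, DMA_FROM_DEVICE };
#define DMA_ATTR_WRITE_BARRIER	(1UL << 0)	/* value assumed */

/* Stub: records the attributes it was given and fakes a bus address. */
static unsigned long last_attrs;
static dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
				       size_t size,
				       enum dma_data_direction dir,
				       unsigned long attrs)
{
	(void)dev; (void)size; (void)dir;
	last_attrs = attrs;
	return (dma_addr_t)(uintptr_t)ptr;
}

/* Hypothetical receive ring: a data buffer plus a completion word. */
struct rx_ring {
	void *data;
	size_t data_len;
	uint32_t *done;
	dma_addr_t data_dma;
	dma_addr_t done_dma;
};

void rx_ring_map(struct device *dev, struct rx_ring *r)
{
	/* Payload: ordinary mapping, no attributes. */
	r->data_dma = dma_map_single_attrs(dev, r->data, r->data_len,
					   DMA_FROM_DEVICE, 0);
	/* Completion word: a device write here must not pass the data. */
	r->done_dma = dma_map_single_attrs(dev, r->done, sizeof(*r->done),
					   DMA_FROM_DEVICE,
					   DMA_ATTR_WRITE_BARRIER);
}
```

Only the completion word carries the attribute; the payload mapping is
left alone. Real driver code would also check each returned address
with dma_mapping_error().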

DMA_ATTR_WEAK_ORDERING
----------------------

DMA_ATTR_WEAK_ORDERING specifies that reads and writes to the mapping
may be weakly ordered, that is, that reads and writes may pass each
other.

Since it is optional for platforms to implement DMA_ATTR_WEAK_ORDERING,
those that do not will simply ignore the attribute and exhibit default
behavior.

DMA_ATTR_WRITE_COMBINE
----------------------

DMA_ATTR_WRITE_COMBINE specifies that writes to the mapping may be
buffered to improve performance.

Since it is optional for platforms to implement DMA_ATTR_WRITE_COMBINE,
those that do not will simply ignore the attribute and exhibit default
behavior.

DMA_ATTR_NON_CONSISTENT
-----------------------

DMA_ATTR_NON_CONSISTENT lets the platform choose to return either
consistent or non-consistent memory as it sees fit. By using this API,
you are guaranteeing to the platform that you have all the correct and
necessary sync points for this memory in the driver.

DMA_ATTR_NO_KERNEL_MAPPING
--------------------------

DMA_ATTR_NO_KERNEL_MAPPING lets the platform avoid creating a kernel
virtual mapping for the allocated buffer. On some architectures creating
such a mapping is a non-trivial task and consumes very limited resources
(like kernel virtual address space or dma consistent address space).
Buffers allocated with this attribute can only be passed to user space
by calling dma_mmap_attrs(). By using this API, you are guaranteeing
that you won't dereference the pointer returned by dma_alloc_attrs(). You
can treat it as a cookie that must be passed to dma_mmap_attrs() and
dma_free_attrs(). Make sure that both of these also get this attribute
set on each call.

Since it is optional for platforms to implement
DMA_ATTR_NO_KERNEL_MAPPING, those that do not will simply ignore the
attribute and exhibit default behavior.
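
One way to keep the attribute consistent across all three calls is to
store it next to the cookie, as in the sketch below. It assumes the
unsigned long attrs signatures of dma_alloc_attrs(), dma_mmap_attrs()
and dma_free_attrs(). The struct user_buf and user_buf_*() names are
hypothetical, the attribute value is an assumption, and the stand-ins
and stubs are there only so that the fragment builds in user space.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch builds outside the kernel. */
struct device { const char *name; };
struct vm_area_struct { unsigned long vm_start; };
typedef uint64_t dma_addr_t;
typedef unsigned int gfp_t;
#define GFP_KERNEL			0u
#define DMA_ATTR_NO_KERNEL_MAPPING	(1UL << 4)	/* value assumed */

/* Stubs: record the attributes each call received. */
static unsigned long seen_alloc, seen_mmap, seen_free;
static int cookie_storage;

static void *dma_alloc_attrs(struct device *dev, size_t size,
			     dma_addr_t *dma_handle, gfp_t flag,
			     unsigned long attrs)
{
	(void)dev; (void)size; (void)flag;
	seen_alloc = attrs;
	*dma_handle = 0x1000;
	return &cookie_storage;	/* opaque to the caller */
}

static int dma_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
			  void *cpu_addr, dma_addr_t dma_addr, size_t size,
			  unsigned long attrs)
{
	(void)dev; (void)vma; (void)cpu_addr; (void)dma_addr; (void)size;
	seen_mmap = attrs;
	return 0;
}

static void dma_free_attrs(struct device *dev, size_t size, void *cpu_addr,
			   dma_addr_t dma_handle, unsigned long attrs)
{
	(void)dev; (void)size; (void)cpu_addr; (void)dma_handle;
	seen_free = attrs;
}

/* Hypothetical buffer: the attributes travel with the cookie. */
struct user_buf {
	void *cookie;		/* never dereferenced */
	dma_addr_t dma;
	size_t size;
	unsigned long attrs;
};

int user_buf_alloc(struct device *dev, struct user_buf *b, size_t size)
{
	b->size = size;
	b->attrs = DMA_ATTR_NO_KERNEL_MAPPING;
	b->cookie = dma_alloc_attrs(dev, size, &b->dma, GFP_KERNEL, b->attrs);
	return b->cookie ? 0 : -1;	/* a driver would return -ENOMEM */
}

int user_buf_mmap(struct device *dev, struct user_buf *b,
		  struct vm_area_struct *vma)
{
	return dma_mmap_attrs(dev, vma, b->cookie, b->dma, b->size, b->attrs);
}

void user_buf_free(struct device *dev, struct user_buf *b)
{
	dma_free_attrs(dev, b->size, b->cookie, b->dma, b->attrs);
	b->cookie = NULL;
}
```

Because the attributes are read back from the buffer itself, the mmap
and free paths cannot drift from what the allocation used.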

DMA_ATTR_SKIP_CPU_SYNC
----------------------

By default the dma_map_{single,page,sg} family of functions transfers a
given buffer from the CPU domain to the device domain. Some advanced use
cases might require sharing a buffer between more than one device. This
requires having a mapping created separately for each device and is
usually performed by calling a dma_map_{single,page,sg} function more
than once for the given buffer, with the device pointer of each device
taking part in the buffer sharing. The first call transfers the buffer
from the 'CPU' domain to the 'device' domain, which synchronizes the CPU
caches for the given region (usually this means that the cache is
flushed or invalidated, depending on the dma direction). However,
subsequent calls to dma_map_{single,page,sg}() for other devices will
perform exactly the same synchronization operation on the CPU cache. CPU
cache synchronization might be a time-consuming operation, especially if
the buffers are large, so it is highly recommended to avoid it if
possible. DMA_ATTR_SKIP_CPU_SYNC allows platform code to skip
synchronization of the CPU cache for the given buffer, assuming that it
has already been transferred to the 'device' domain. This attribute can
also be used with the dma_unmap_{single,page,sg} family of functions to
force the buffer to stay in the device domain after releasing a mapping
for it. Use this attribute with care!
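
The multi-device case could be sketched as follows. It assumes the
unsigned long attrs form of dma_map_single_attrs(). share_buffer() is a
hypothetical helper, the attribute value is an assumption, and the stub
only models a platform that honours the attribute by counting the cache
synchronizations it would have performed.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch builds outside the kernel. */
struct device { const char *name; };
typedef uint64_t dma_addr_t;
enum dma_data_direction { DMA_BIDIRECTIONAL, DMA_TO_DEVICE, DMA_FROM_DEVICE };
#define DMA_ATTR_SKIP_CPU_SYNC	(1UL << 5)	/* value assumed */

/* Stub: counts the cache synchronizations a platform would perform. */
static unsigned int cache_syncs;
static dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
				       size_t size,
				       enum dma_data_direction dir,
				       unsigned long attrs)
{
	(void)dev; (void)size; (void)dir;
	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
		cache_syncs++;
	return (dma_addr_t)(uintptr_t)ptr;
}

/* Hypothetical helper: map one buffer for n devices, syncing once. */
void share_buffer(struct device **devs, size_t n, void *buf, size_t len,
		  dma_addr_t *out)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/* Only the first mapping moves the buffer out of the
		 * CPU domain; the rest find it there already. */
		unsigned long attrs = i ? DMA_ATTR_SKIP_CPU_SYNC : 0;

		out[i] = dma_map_single_attrs(devs[i], buf, len,
					      DMA_TO_DEVICE, attrs);
	}
}
```

The sketch is only safe if the CPU does not touch the buffer between
the first and the last mapping; otherwise the skipped synchronization
would be needed after all.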

DMA_ATTR_FORCE_CONTIGUOUS
-------------------------

By default, the DMA-mapping subsystem is allowed to assemble the buffer
allocated by the dma_alloc_attrs() function from individual pages if it
can be mapped as a contiguous chunk into the device dma address space.
Specifying this attribute forces the allocated buffer to be contiguous
in physical memory as well.

DMA_ATTR_ALLOC_SINGLE_PAGES
---------------------------

This is a hint to the DMA-mapping subsystem that it's probably not worth
the time to try to allocate memory in a way that gives better TLB
efficiency (AKA it's not worth trying to build the mapping out of larger
pages). You might want to specify this if:
- You know that the accesses to this memory won't thrash the TLB.
  You might know that the accesses are likely to be sequential or
  that they aren't sequential but it's unlikely you'll ping-pong
  between many addresses that are likely to be in different physical
  pages.
- You know that the penalty of TLB misses while accessing the
  memory will be small enough to be inconsequential. If you are
  doing a heavy operation like decryption or decompression this
  might be the case.
- You know that the DMA mapping is fairly transitory. If you expect
  the mapping to have a short lifetime then it may be worth it to
  optimize allocation (avoid coming up with large pages) instead of
  getting the slight performance win of larger pages.
Setting this hint doesn't guarantee that you won't get huge pages, but it
means that we won't try quite as hard to get them.
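
A short-lived scratch buffer, as in the third case above, might pass
the hint as in the sketch below. It assumes the unsigned long attrs
signatures of dma_alloc_attrs() and dma_free_attrs(). scratch_alloc()
and scratch_free() are hypothetical names, the attribute value is an
assumption, and the stubs, which simply wrap malloc() and free(), are
there only so that the fragment builds in user space.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins so the sketch builds outside the kernel. */
struct device { const char *name; };
typedef uint64_t dma_addr_t;
typedef unsigned int gfp_t;
#define GFP_KERNEL			0u
#define DMA_ATTR_ALLOC_SINGLE_PAGES	(1UL << 7)	/* value assumed */

/* Stubs: back the buffer with malloc() and record the attributes. */
static unsigned long last_alloc_attrs;

static void *dma_alloc_attrs(struct device *dev, size_t size,
			     dma_addr_t *dma_handle, gfp_t flag,
			     unsigned long attrs)
{
	(void)dev; (void)flag;
	last_alloc_attrs = attrs;
	*dma_handle = 0x2000;
	return malloc(size);
}

static void dma_free_attrs(struct device *dev, size_t size, void *cpu_addr,
			   dma_addr_t dma_handle, unsigned long attrs)
{
	(void)dev; (void)size; (void)dma_handle; (void)attrs;
	free(cpu_addr);
}

/* Hypothetical: scratch space for a single decompression job. The
 * mapping is transitory, so quick allocation matters more than TLB
 * efficiency. */
void *scratch_alloc(struct device *dev, size_t size, dma_addr_t *dma)
{
	return dma_alloc_attrs(dev, size, dma, GFP_KERNEL,
			       DMA_ATTR_ALLOC_SINGLE_PAGES);
}

void scratch_free(struct device *dev, size_t size, void *cpu, dma_addr_t dma)
{
	dma_free_attrs(dev, size, cpu, dma, DMA_ATTR_ALLOC_SINGLE_PAGES);
}
```

Since the attribute is only a hint, the caller must not rely on the
resulting page size in any way.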

NOTE: At the moment DMA_ATTR_ALLOC_SINGLE_PAGES is only implemented on ARM,
though ARM64 patches will likely be posted soon.