                    DMA Buffer Sharing API Guide
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~

                             Sumit Semwal
                <sumit dot semwal at linaro dot org>
                 <sumit dot semwal at ti dot com>

This document serves as a guide for device-driver writers on what the dma-buf
buffer sharing API is, and how to use it for exporting and using shared
buffers.

Any device driver which wishes to be a part of DMA buffer sharing can do so as
either the 'exporter' of buffers or the 'user' of buffers.

Say a driver A wants to use buffers created by driver B; we then call B the
exporter, and A the buffer-user.

The exporter
- implements and manages the operations[1] for the buffer,
- allows other users to share the buffer by using the dma_buf sharing APIs,
- manages the details of buffer allocation,
- decides about the actual backing storage where this allocation happens,
- and takes care of any migration of the scatterlist for all (shared) users of
  this buffer.

The buffer-user
- is one of (many) sharing users of the buffer.
- doesn't need to worry about how the buffer is allocated, or where.
- needs a mechanism to get access to the scatterlist that makes up this buffer
  in memory, mapped into its own address space, so it can access the same area
  of memory.

*IMPORTANT*: [see https://lkml.org/lkml/2011/12/20/211 for more details]
For this first version, a buffer shared using the dma_buf sharing API:
- *may* be exported to user space using "mmap" *ONLY* by the exporter, outside
  of this framework.
- with this new iteration of the dma-buf api, cpu access from the kernel has
  been enabled; see below for the details.

dma-buf operations for device dma only
--------------------------------------

The dma_buf buffer sharing API usage contains the following steps:

1. Exporter announces that it wishes to export a buffer
2. Userspace gets the file descriptor associated with the exported buffer, and
   passes it around to potential buffer-users based on use case
3. Each buffer-user 'connects' itself to the buffer
4. When needed, buffer-user requests access to the buffer from exporter
5. When finished with its use, the buffer-user notifies end-of-DMA to exporter
6. When buffer-user is done using this buffer completely, it 'disconnects'
   itself from the buffer.


1. Exporter's announcement of buffer export

   The buffer exporter announces its wish to export a buffer. In doing this,
   it connects its own private buffer data, provides an implementation for the
   operations that can be performed on the exported dma_buf, and flags for the
   file associated with this buffer.

   Interface:
      struct dma_buf *dma_buf_export(void *priv, struct dma_buf_ops *ops,
                                     size_t size, int flags)

   If this succeeds, dma_buf_export allocates a dma_buf structure, and returns
   a pointer to it. It also associates an anonymous file with this buffer, so
   it can be exported. On failure to allocate the dma_buf object, it returns
   NULL.

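   As a minimal illustration, an exporting driver might wrap dma_buf_export
   like the sketch below; 'struct my_buffer' and 'my_dma_buf_ops' are
   hypothetical names standing in for the driver's private buffer type and
   its struct dma_buf_ops implementation:

      static struct dma_buf *my_export_buffer(struct my_buffer *buf)
      {
              struct dma_buf *dmabuf;

              /* Tie our private data and ops to the new dma_buf. */
              dmabuf = dma_buf_export(buf, &my_dma_buf_ops,
                                      buf->size, O_RDWR);
              if (!dmabuf)
                      return NULL;    /* allocation failed */

              return dmabuf;
      }
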
702. Userspace gets a handle to pass around to potential buffer-users
71
72 Userspace entity requests for a file-descriptor (fd) which is a handle to the
73 anonymous file associated with the buffer. It can then share the fd with other
74 drivers and/or processes.
75
76 Interface:
77 int dma_buf_fd(struct dma_buf *dmabuf)
78
79 This API installs an fd for the anonymous file associated with this buffer;
80 returns either 'fd', or error.
81
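   For example, a driver ioctl handler might hand the buffer to userspace
   with a sketch like this ('dmabuf' is assumed to come from a prior
   dma_buf_export call):

      int fd = dma_buf_fd(dmabuf);
      if (fd < 0)
              return fd;      /* failed to install the fd */
      /* return 'fd' to userspace, e.g. via the ioctl argument */
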
3. Each buffer-user 'connects' itself to the buffer

   Each buffer-user now gets a reference to the buffer, using the fd passed to
   it.

   Interface:
      struct dma_buf *dma_buf_get(int fd)

   This API will return a reference to the dma_buf, and increment its
   refcount.

   After this, the buffer-user needs to attach its device to the buffer, which
   helps the exporter to know of device buffer constraints.

   Interface:
      struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
                                                struct device *dev)

   This API returns a reference to an attachment structure, which is then used
   for scatterlist operations. It will optionally call the 'attach' dma_buf
   operation, if provided by the exporter.

   The dma-buf sharing framework does the bookkeeping bits related to managing
   the list of all attachments to a buffer.

Until this stage, the buffer-exporter has the option to choose not to actually
allocate the backing storage for this buffer, but wait for the first
buffer-user to request use of the buffer for allocation.

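   Put together, an importer-side 'connect' might look like the sketch below;
   'dev' is assumed to be the importing device and 'fd' the file descriptor
   received from userspace:

      struct dma_buf *dmabuf;
      struct dma_buf_attachment *attach;

      dmabuf = dma_buf_get(fd);               /* takes a reference */
      if (IS_ERR(dmabuf))
              return PTR_ERR(dmabuf);

      attach = dma_buf_attach(dmabuf, dev);   /* may call ops->attach() */
      if (IS_ERR(attach)) {
              dma_buf_put(dmabuf);
              return PTR_ERR(attach);
      }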

4. When needed, buffer-user requests access to the buffer

   Whenever a buffer-user wants to use the buffer for any DMA, it asks for
   access to the buffer using the dma_buf_map_attachment API. At least one
   attach to the buffer must have happened before map_dma_buf can be called.

   Interface:
      struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
                                              enum dma_data_direction);

   This is a wrapper to the dma_buf->ops->map_dma_buf operation, which hides
   the "dma_buf->ops->" indirection from the users of this interface.

   In struct dma_buf_ops, map_dma_buf is defined as
      struct sg_table * (*map_dma_buf)(struct dma_buf_attachment *,
                                       enum dma_data_direction);

   It is one of the buffer operations that must be implemented by the
   exporter. It should return the sg_table containing the scatterlist for this
   buffer, mapped into the caller's address space.

   If this is being called for the first time, the exporter can now choose to
   scan through the list of attachments for this buffer, collate the
   requirements of the attached devices, and choose an appropriate backing
   storage for the buffer.

   Based on enum dma_data_direction, it might be possible to have multiple
   users accessing at the same time (for reading, maybe), or any other kind of
   sharing that the exporter might wish to make available to buffer-users.

   The map_dma_buf() operation can return -EINTR if it is interrupted by a
   signal.

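   Continuing the importer sketch, mapping for DMA might look like this (the
   direction is just an example):

      struct sg_table *sgt;

      sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
      if (IS_ERR(sgt))
              return PTR_ERR(sgt);

      /* ... program the device's DMA using the entries of sgt ... */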

5. When finished, the buffer-user notifies end-of-DMA to exporter

   Once the DMA for the current buffer-user is over, it signals 'end-of-DMA'
   to the exporter using the dma_buf_unmap_attachment API.

   Interface:
      void dma_buf_unmap_attachment(struct dma_buf_attachment *,
                                    struct sg_table *);

   This is a wrapper to the dma_buf->ops->unmap_dma_buf() operation, which
   hides the "dma_buf->ops->" indirection from the users of this interface.

   In struct dma_buf_ops, unmap_dma_buf is defined as
      void (*unmap_dma_buf)(struct dma_buf_attachment *, struct sg_table *);

   unmap_dma_buf signifies the end-of-DMA for the attachment provided. Like
   map_dma_buf, this API must also be implemented by the exporter.

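   In the sketch above, this is simply:

      dma_buf_unmap_attachment(attach, sgt);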

6. When the buffer-user is done using this buffer, it 'disconnects' itself
   from the buffer.

   After the buffer-user has no more interest in using this buffer, it should
   disconnect itself from the buffer:

   - it first detaches itself from the buffer.

   Interface:
      void dma_buf_detach(struct dma_buf *dmabuf,
                          struct dma_buf_attachment *dmabuf_attach);

   This API removes the attachment from the list in dmabuf, and optionally
   calls dma_buf->ops->detach(), if provided by the exporter, for any
   housekeeping bits.

   - Then, the buffer-user returns the buffer reference to the exporter.

   Interface:
      void dma_buf_put(struct dma_buf *dmabuf);

   This API then reduces the refcount for this buffer.

   If, as a result of this call, the refcount becomes 0, the 'release' file
   operation related to this fd is called. It calls the dmabuf->ops->release()
   operation in turn, and frees the memory allocated for the dmabuf when it
   was exported.

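   The corresponding teardown in the importer sketch is:

      dma_buf_detach(dmabuf, attach);   /* may call ops->detach() */
      dma_buf_put(dmabuf);              /* drops the dma_buf_get() reference */
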
NOTES:
- Importance of attach-detach and {map,unmap}_dma_buf operation pairs
  The attach-detach calls allow the exporter to figure out backing-storage
  constraints for the currently-interested devices. This allows preferential
  allocation, and/or migration of pages across different types of storage
  available, if possible.

  Bracketing of DMA access with {map,unmap}_dma_buf operations is essential
  to allow just-in-time backing of storage, and migration mid-way through a
  use-case.

- Migration of backing storage if needed
  If, after
  - at least one map_dma_buf has happened,
  - and the backing storage has been allocated for this buffer,
  another new buffer-user intends to attach itself to this buffer, it might
  be allowed, if possible for the exporter.

  In case it is allowed by the exporter:
   if the new buffer-user has stricter 'backing-storage constraints', and the
   exporter can handle these constraints, the exporter can just stall on the
   map_dma_buf until all outstanding access is completed (as signalled by
   unmap_dma_buf).
   Once all users have finished accessing and have unmapped this buffer, the
   exporter could potentially move the buffer to the stricter backing-storage,
   and then allow further {map,unmap}_dma_buf operations from any buffer-user
   from the migrated backing-storage.

  If the exporter cannot fulfil the backing-storage constraints of the new
  buffer-user device as requested, dma_buf_attach() would return an error to
  denote non-compatibility of the new buffer-sharing request with the current
  buffer.

  If the exporter chooses not to allow an attach() operation once a
  map_dma_buf() API has been called, it simply returns an error.

Kernel cpu access to a dma-buf buffer object
--------------------------------------------

The motivations to allow cpu access from the kernel to a dma-buf object from
the importer's side are:
- fallback operations, e.g. if the device is connected to a usb bus and the
  kernel needs to shuffle the data around first before sending it away.
- full transparency for existing users on the importer side, i.e. userspace
  should not notice the difference between a normal object from that subsystem
  and an imported one backed by a dma-buf. This is really important for drm
  opengl drivers that expect to still use all the existing upload/download
  paths.

Access to a dma_buf from the kernel context involves three steps:

1. Prepare access, which invalidates any necessary caches and makes the object
   available for cpu access.
2. Access the object page-by-page with the dma_buf map apis.
3. Finish access, which will flush any necessary cpu caches and free reserved
   resources.

1. Prepare access

   Before an importer can access a dma_buf object with the cpu from the kernel
   context, it needs to notify the exporter of the access that is about to
   happen.

   Interface:
      int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
                                   size_t start, size_t len,
                                   enum dma_data_direction direction)

   This allows the exporter to ensure that the memory is actually available
   for cpu access - the exporter might need to allocate or swap-in and pin the
   backing storage. The exporter also needs to ensure that cpu access is
   coherent for the given range and access direction. The range and access
   direction can be used by the exporter to optimize the cache flushing, i.e.
   access outside of the range or with a different direction (read instead of
   write) might return stale or even bogus data (e.g. when the exporter needs
   to copy the data to temporary storage).

   This step might fail, e.g. in oom conditions.

2. Accessing the buffer

   To support dma_buf objects residing in highmem, cpu access is page-based,
   using an api similar to kmap. Accessing a dma_buf is done in aligned chunks
   of PAGE_SIZE size. Before accessing a chunk it needs to be mapped, which
   returns a pointer in kernel virtual address space. Afterwards the chunk
   needs to be unmapped again. There is no limit on how often a given chunk
   can be mapped and unmapped, i.e. the importer does not need to call
   begin_cpu_access again before mapping the same chunk again.

   Interfaces:
      void *dma_buf_kmap(struct dma_buf *, unsigned long);
      void dma_buf_kunmap(struct dma_buf *, unsigned long, void *);

   There are also atomic variants of these interfaces. Like for kmap they
   facilitate non-blocking fast-paths. Neither the importer nor the exporter
   (in the callback) is allowed to block when using these.

   Interfaces:
      void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long);
      void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *);

   For importers all the restrictions of using kmap apply, like the limited
   supply of kmap_atomic slots. Hence an importer shall only hold onto at most
   2 atomic dma_buf kmaps at the same time (in any given process context).

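   A short sketch of the atomic fast-path; 'pgoff' is a hypothetical page
   index, assumed to lie inside the range passed to begin_cpu_access:

      void *vaddr = dma_buf_kmap_atomic(dmabuf, pgoff);
      /* ... short critical section, no blocking allowed ... */
      dma_buf_kunmap_atomic(dmabuf, pgoff, vaddr);
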
   dma_buf kmap calls outside of the range specified in begin_cpu_access are
   undefined. If the range is not PAGE_SIZE aligned, kmap needs to succeed on
   the partial chunks at the beginning and end but may return stale or bogus
   data outside of the range (in these partial chunks).

   Note that these calls always need to succeed. The exporter needs to
   complete any preparations that might fail in begin_cpu_access.

3. Finish access

   When the importer is done accessing the range specified in
   begin_cpu_access, it needs to announce this to the exporter (to facilitate
   cache flushing and unpinning of any pinned resources). The result of any
   dma_buf kmap calls after end_cpu_access is undefined.

   Interface:
      void dma_buf_end_cpu_access(struct dma_buf *dma_buf,
                                  size_t start, size_t len,
                                  enum dma_data_direction dir);

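   Putting the three steps together, a kernel cpu read of a buffer range
   might look like the sketch below; 'start' and 'len' are assumed to be
   PAGE_SIZE aligned for simplicity:

      unsigned long pg;
      int err;

      err = dma_buf_begin_cpu_access(dmabuf, start, len, DMA_FROM_DEVICE);
      if (err)        /* may fail, e.g. in oom conditions */
              return err;

      for (pg = start >> PAGE_SHIFT; pg < (start + len) >> PAGE_SHIFT; pg++) {
              void *vaddr = dma_buf_kmap(dmabuf, pg);   /* must not fail */

              /* ... read up to PAGE_SIZE bytes via vaddr ... */
              dma_buf_kunmap(dmabuf, pg, vaddr);
      }

      dma_buf_end_cpu_access(dmabuf, start, len, DMA_FROM_DEVICE);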

Miscellaneous notes
-------------------

- Any exporters or users of the dma-buf buffer sharing framework must have
  a 'select DMA_SHARED_BUFFER' in their respective Kconfigs.
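
  For example (hypothetical driver name):

     config MY_DRIVER
             tristate "My buffer-sharing driver"
             select DMA_SHARED_BUFFER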

- In order to avoid fd leaks on exec, the FD_CLOEXEC flag must be set
  on the file descriptor. This is not just a resource leak, but a
  potential security hole. It could give the newly exec'd application
  access to buffers, via the leaked fd, to which it should otherwise
  not be permitted access.

  The problem with doing this via a separate fcntl() call, versus doing it
  atomically when the fd is created, is that this is inherently racy in a
  multi-threaded app[3]. The issue is made worse when it is library code
  opening/creating the file descriptor, as the application may not even be
  aware of the fds.

  To avoid this problem, userspace must have a way to request that the
  O_CLOEXEC flag be set when the dma-buf fd is created. So any API provided by
  the exporting driver to create a dmabuf fd must provide a way to let
  userspace control setting of the O_CLOEXEC flag passed in to dma_buf_fd().

References:
[1] struct dma_buf_ops in include/linux/dma-buf.h
[2] All interfaces mentioned above defined in include/linux/dma-buf.h
[3] https://lwn.net/Articles/236486/