Georg Brandl | f684272 | 2008-01-19 22:08:21 +0000 | [diff] [blame] | 1 | .. highlightlang:: c |
| 2 | |
| 3 | .. _bufferobjects: |
| 4 | |
Antoine Pitrou | 789be0c | 2009-04-02 21:18:34 +0000 | [diff] [blame] | 5 | Buffers and Memoryview Objects |
| 6 | ------------------------------ |
Georg Brandl | f684272 | 2008-01-19 22:08:21 +0000 | [diff] [blame] | 7 | |
| 8 | .. sectionauthor:: Greg Stein <gstein@lyra.org> |
Antoine Pitrou | 789be0c | 2009-04-02 21:18:34 +0000 | [diff] [blame] | 9 | .. sectionauthor:: Benjamin Peterson |
Georg Brandl | f684272 | 2008-01-19 22:08:21 +0000 | [diff] [blame] | 10 | |
| 11 | |
| 12 | .. index:: |
| 13 | object: buffer |
| 14 | single: buffer interface |
| 15 | |
| 16 | Python objects implemented in C can export a group of functions called the |
| 17 | "buffer interface." These functions can be used by an object to expose its data |
| 18 | in a raw, byte-oriented format. Clients of the object can use the buffer |
| 19 | interface to access the object data directly, without needing to copy it first. |
| 20 | |
| 21 | Two examples of objects that support the buffer interface are strings and |
| 22 | arrays. The string object exposes the character contents in the buffer |
| 23 | interface's byte-oriented form. An array can also expose its contents, but it |
| 24 | should be noted that array elements may be multi-byte values. |
| 25 | |
| 26 | An example user of the buffer interface is the file object's :meth:`write` |
| 27 | method. Any object that can export a series of bytes through the buffer |
| 28 | interface can be written to a file. There are a number of format codes to |
| 29 | :cfunc:`PyArg_ParseTuple` that operate against an object's buffer interface, |
| 30 | returning data from the target object. |
| 31 | |
Antoine Pitrou | 789be0c | 2009-04-02 21:18:34 +0000 | [diff] [blame] | 32 | Starting from version 1.6, Python has been providing Python-level buffer |
| 33 | objects and a C-level buffer API so that any builtin or user-defined type |
| 34 | can expose its characteristics. Both, however, have been deprecated because |
| 35 | of various shortcomings, and have been officially removed in Python 3.0 in |
| 36 | favour of a new C-level buffer API and a new Python-level object named |
| 37 | :class:`memoryview`. |
| 38 | |
| 39 | The new buffer API has been backported to Python 2.6, and the |
| 40 | :class:`memoryview` object has been backported to Python 2.7. It is strongly |
| 41 | advised to use them rather than the old APIs, unless you are blocked from |
| 42 | doing so for compatibility reasons. |
| 43 | |
| 44 | |
| 45 | The new-style Py_buffer struct |
| 46 | ============================== |
| 47 | |
| 48 | |
| 49 | .. ctype:: Py_buffer |
| 50 | |
| 51 | .. cmember:: void *buf |
| 52 | |
| 53 | A pointer to the start of the memory for the object. |
| 54 | |
| 55 | .. cmember:: Py_ssize_t len |
| 56 | :noindex: |
| 57 | |
| 58 | The total length of the memory in bytes. |
| 59 | |
| 60 | .. cmember:: int readonly |
| 61 | |
| 62 | An indicator of whether the buffer is read only. |
| 63 | |
| 64 | .. cmember:: const char *format |
| 65 | :noindex: |
| 66 | |
| 67 | A *NULL* terminated string in :mod:`struct` module style syntax giving the |
| 68 | contents of the elements available through the buffer. If this is *NULL*, |
| 69 | ``"B"`` (unsigned bytes) is assumed. |
| 70 | |
| 71 | .. cmember:: int ndim |
| 72 | |
| 73 | The number of dimensions the memory represents as a multi-dimensional |
| 74 | array. If it is 0, :cdata:`strides` and :cdata:`suboffsets` must be |
| 75 | *NULL*. |
| 76 | |
| 77 | .. cmember:: Py_ssize_t *shape |
| 78 | |
| 79 | An array of :ctype:`Py_ssize_t`\s the length of :cdata:`ndim` giving the |
| 80 | shape of the memory as a multi-dimensional array. Note that |
| 81 | ``shape[0] * ... * shape[ndim-1] * itemsize`` should be equal to |
| 82 | :cdata:`len`. |
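
As a rough illustration of the relationship between the shape, the item size and the total length, the sketch below multiplies the shape entries by the item size. The helper name ``expected_len`` is hypothetical, and ``Py_ssize_t`` is stood in by ``ptrdiff_t`` only so that the snippet builds without the Python headers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for Python's Py_ssize_t so the sketch builds without Python.h. */
typedef ptrdiff_t Py_ssize_t;

/* Hypothetical helper: byte length implied by a shape array and an itemsize. */
static Py_ssize_t expected_len(int ndim, const Py_ssize_t *shape,
                               Py_ssize_t itemsize)
{
    Py_ssize_t len = itemsize;
    int i;

    assert(ndim >= 0 && itemsize > 0);
    /* With ndim == 0 the loop is skipped and the result is a single item. */
    for (i = 0; i < ndim; i++) {
        len *= shape[i];
    }
    return len;
}
```

A consumer could compare this value against the ``len`` member as a sanity check before walking the memory.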
| 83 | |
| 84 | .. cmember:: Py_ssize_t *strides |
| 85 | |
| 86 | An array of :ctype:`Py_ssize_t`\s the length of :cdata:`ndim` giving the |
| 87 | number of bytes to skip to get to a new element in each dimension. |
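
To make the role of the strides concrete, here is a minimal sketch that computes the byte offset of one element from its index, assuming a buffer without suboffsets. The helper name ``element_offset`` is hypothetical, and ``Py_ssize_t`` is replaced by ``ptrdiff_t`` so the snippet compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for Python's Py_ssize_t so the sketch builds without Python.h. */
typedef ptrdiff_t Py_ssize_t;

/* Hypothetical helper: byte offset (relative to buf) of the element at
   indices[0..ndim-1], for a buffer that has strides but no suboffsets. */
static Py_ssize_t element_offset(int ndim, const Py_ssize_t *strides,
                                 const Py_ssize_t *indices)
{
    Py_ssize_t offset = 0;
    int i;

    assert(ndim >= 0);
    for (i = 0; i < ndim; i++) {
        /* Each index moves strides[i] bytes along its dimension. */
        offset += strides[i] * indices[i];
    }
    return offset;
}
```

For a 2 x 3 array of 4-byte items in C order the strides are 12 and 4, so element (1, 2) sits 20 bytes past the start. Strides may also be negative, in which case the offset moves backwards.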
| 88 | |
| 89 | .. cmember:: Py_ssize_t *suboffsets |
| 90 | |
| 91 | An array of :ctype:`Py_ssize_t`\s the length of :cdata:`ndim`. If these |
| 92 | suboffset numbers are greater than or equal to 0, then the value stored |
| 93 | along the indicated dimension is a pointer and the suboffset value |
| 94 | dictates how many bytes to add to the pointer after de-referencing. A |
| 95 | suboffset value that is negative indicates that no de-referencing should |
| 96 | occur (striding in a contiguous memory block). |
| 97 | |
| 98 | Here is a function that returns a pointer to the element in an N-D array |
| 99 | pointed to by an N-dimensional index when there are both non-NULL strides |
| 100 | and suboffsets:: |
| 101 | |
   /* Return a pointer to the element selected by indices[0..ndim-1]. */
   void *get_item_pointer(int ndim, void *buf, Py_ssize_t *strides,
                          Py_ssize_t *suboffsets, Py_ssize_t *indices) {
       char *pointer = (char*)buf;
       int i;
       for (i = 0; i < ndim; i++) {
           /* Step along dimension i. */
           pointer += strides[i] * indices[i];
           if (suboffsets[i] >= 0) {
               /* This dimension stores pointers: follow it, then offset. */
               pointer = *((char**)pointer) + suboffsets[i];
           }
       }
       return (void*)pointer;
   }
| 114 | |
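
As a usage sketch for the function above, consider a 2 x 3 array of ``int`` stored as an array of row pointers. The first dimension then holds pointers (suboffset 0) and the second is plain contiguous memory (suboffset -1). The helper ``lookup_indirect`` and its data are hypothetical, the function is repeated so the snippet is self-contained, and ``Py_ssize_t`` is stood in by ``ptrdiff_t``.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for Python's Py_ssize_t so the sketch builds without Python.h. */
typedef ptrdiff_t Py_ssize_t;

static void *get_item_pointer(int ndim, void *buf, Py_ssize_t *strides,
                              Py_ssize_t *suboffsets, Py_ssize_t *indices)
{
    char *pointer = (char *)buf;
    int i;
    for (i = 0; i < ndim; i++) {
        pointer += strides[i] * indices[i];
        if (suboffsets[i] >= 0) {
            pointer = *((char **)pointer) + suboffsets[i];
        }
    }
    return (void *)pointer;
}

/* Hypothetical helper: read element (row, col) of a 2 x 3 int array that is
   stored as two separately allocated rows reached through a pointer table. */
static int lookup_indirect(Py_ssize_t row, Py_ssize_t col)
{
    static int row0[3] = {10, 11, 12};
    static int row1[3] = {20, 21, 22};
    static int *rows[2] = {row0, row1};
    /* Dimension 0 steps over pointers, dimension 1 steps over ints. */
    Py_ssize_t strides[2] = {(Py_ssize_t)sizeof(int *), (Py_ssize_t)sizeof(int)};
    /* Dereference after dimension 0 only; -1 means "do not dereference". */
    Py_ssize_t suboffsets[2] = {0, -1};
    Py_ssize_t indices[2];

    assert(row >= 0 && row < 2 && col >= 0 && col < 3);
    indices[0] = row;
    indices[1] = col;
    return *(int *)get_item_pointer(2, rows, strides, suboffsets, indices);
}
```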
| 115 | |
| 116 | .. cmember:: Py_ssize_t itemsize |
| 117 | |
| 118 | The size in bytes of each element of the shared memory. It is technically |
| 119 | unnecessary, since it can be obtained using |
| 120 | :cfunc:`PyBuffer_SizeFromFormat`. However, an exporter may know this |
| 121 | information without parsing the format string, and the itemsize is needed |
| 122 | to interpret strides correctly. Therefore, storing it is more convenient |
| 123 | and faster. |
| 124 | |
| 125 | .. cmember:: void *internal |
| 126 | |
| 127 | This is for use internally by the exporting object. For example, this |
| 128 | might be re-cast as an integer by the exporter and used to store flags |
| 129 | about whether or not the shape, strides, and suboffsets arrays must be |
| 130 | freed when the buffer is released. The consumer should never alter this |
| 131 | value. |
| 132 | |
| 133 | |
| 134 | Buffer related functions |
| 135 | ======================== |
| 136 | |
| 137 | |
| 138 | .. cfunction:: int PyObject_CheckBuffer(PyObject *obj) |
| 139 | |
| 140 | Return 1 if *obj* supports the buffer interface, otherwise 0. |
| 141 | |
| 142 | |
| 143 | .. cfunction:: int PyObject_GetBuffer(PyObject *obj, Py_buffer *view, int flags) |
| 144 | |
| 145 | Export *obj* into a :ctype:`Py_buffer`, *view*. These arguments must |
| 146 | never be *NULL*. The *flags* argument is a bit field indicating what kind |
| 147 | of buffer the caller is prepared to deal with and therefore what kind of |
| 148 | buffer the exporter is allowed to return. The buffer interface allows for |
| 149 | complicated memory sharing possibilities, but some callers may not be able |
| 150 | to handle all the complexity and may want to see if the exporter will |
| 151 | let them take a simpler view of its memory. |
| 152 | |
| 153 | Some exporters may not be able to share memory in every possible way and |
| 154 | may need to raise errors to signal to some consumers that something is |
| 155 | just not possible. These errors should be a :exc:`BufferError` unless |
| 156 | there is another error that is actually causing the problem. The exporter |
| 157 | can use flags information to simplify how much of the :cdata:`Py_buffer` |
| 158 | structure is filled in with non-default values and/or raise an error if |
| 159 | the object can't support a simpler view of its memory. |
| 160 | |
| 161 | 0 is returned on success and -1 on error. |
| 162 | |
| 163 | The following table gives possible values for the *flags* argument. |
| 164 | |
| 165 | +------------------------------+---------------------------------------------------+ |
| 166 | | Flag | Description | |
| 167 | +==============================+===================================================+ |
| 168 | | :cmacro:`PyBUF_SIMPLE` | This is the default flag state. The returned | |
| 169 | | | buffer may or may not have writable memory. The | |
| 170 | | | format of the data will be assumed to be unsigned | |
| 171 | | | bytes. This is a "stand-alone" flag constant. It | |
| 172 | | | never needs to be '|'d to the others. The exporter| |
| 173 | | | will raise an error if it cannot provide such a | |
| 174 | | | contiguous buffer of bytes. | |
| 175 | | | | |
| 176 | +------------------------------+---------------------------------------------------+ |
| 177 | | :cmacro:`PyBUF_WRITABLE` | The returned buffer must be writable. If it is | |
| 178 | | | not writable, then raise an error. | |
| 179 | +------------------------------+---------------------------------------------------+ |
| 180 | | :cmacro:`PyBUF_STRIDES` | This implies :cmacro:`PyBUF_ND`. The returned | |
| 181 | | | buffer must provide strides information (i.e. the | |
| 182 | | | strides cannot be NULL). This would be used when | |
| 183 | | | the consumer can handle strided, discontiguous | |
| 184 | | | arrays. Handling strides automatically assumes | |
| 185 | | | you can handle shape. The exporter can raise an | |
| 186 | | | error if a strided representation of the data is | |
| 187 | | | not possible (i.e. without the suboffsets). | |
| 188 | | | | |
| 189 | +------------------------------+---------------------------------------------------+ |
| 190 | | :cmacro:`PyBUF_ND` | The returned buffer must provide shape | |
| 191 | | | information. The memory will be assumed C-style | |
| 192 | | | contiguous (last dimension varies the | |
| 193 | | | fastest). The exporter may raise an error if it | |
| 194 | | | cannot provide this kind of contiguous buffer. If | |
| 195 | | | this is not given then shape will be *NULL*. | |
| 196 | | | | |
| 197 | | | | |
| 198 | | | | |
| 199 | +------------------------------+---------------------------------------------------+ |
| 200 | |:cmacro:`PyBUF_C_CONTIGUOUS` | These flags indicate that the returned buffer | |
| 201 | |:cmacro:`PyBUF_F_CONTIGUOUS` | must be, respectively, C-contiguous (last | |
| 202 | |:cmacro:`PyBUF_ANY_CONTIGUOUS`| dimension varies the fastest), Fortran contiguous | |
| 203 | | | (first dimension varies the fastest) or either | |
| 204 | | | one. All of these flags imply | |
| 205 | | | :cmacro:`PyBUF_STRIDES` and guarantee that the | |
| 206 | | | strides buffer info structure will be filled in | |
| 207 | | | correctly. | |
| 208 | | | | |
| 209 | +------------------------------+---------------------------------------------------+ |
| 210 | | :cmacro:`PyBUF_INDIRECT` | This flag indicates the returned buffer must have | |
| 211 | | | suboffsets information (which can be NULL if no | |
| 212 | | | suboffsets are needed). This can be used when | |
| 213 | | | the consumer can handle indirect array | |
| 214 | | | referencing implied by these suboffsets. This | |
| 215 | | | implies :cmacro:`PyBUF_STRIDES`. | |
| 216 | | | | |
| 217 | | | | |
| 218 | | | | |
| 219 | +------------------------------+---------------------------------------------------+ |
| 220 | | :cmacro:`PyBUF_FORMAT` | The returned buffer must have true format | |
| 221 | | | information if this flag is provided. This would | |
| 222 | | | be used when the consumer is going to be checking | |
| 223 | | | for what 'kind' of data is actually stored. An | |
| 224 | | | exporter should always be able to provide this | |
| 225 | | | information if requested. If format is not | |
| 226 | | | explicitly requested then the format must be | |
| 227 | | | returned as *NULL* (which means ``'B'``, or | |
| 228 | | | unsigned bytes) | |
| 229 | +------------------------------+---------------------------------------------------+ |
| 230 | | :cmacro:`PyBUF_STRIDED` | This is equivalent to ``(PyBUF_STRIDES | | |
| 231 | | | PyBUF_WRITABLE)``. | |
| 232 | +------------------------------+---------------------------------------------------+ |
| 233 | | :cmacro:`PyBUF_STRIDED_RO` | This is equivalent to ``(PyBUF_STRIDES)``. | |
| 234 | | | | |
| 235 | +------------------------------+---------------------------------------------------+ |
| 236 | | :cmacro:`PyBUF_RECORDS` | This is equivalent to ``(PyBUF_STRIDES | | |
| 237 | | | PyBUF_FORMAT | PyBUF_WRITABLE)``. | |
| 238 | +------------------------------+---------------------------------------------------+ |
| 239 | | :cmacro:`PyBUF_RECORDS_RO` | This is equivalent to ``(PyBUF_STRIDES | | |
| 240 | | | PyBUF_FORMAT)``. | |
| 241 | +------------------------------+---------------------------------------------------+ |
| 242 | | :cmacro:`PyBUF_FULL` | This is equivalent to ``(PyBUF_INDIRECT | | |
| 243 | | | PyBUF_FORMAT | PyBUF_WRITABLE)``. | |
| 244 | +------------------------------+---------------------------------------------------+ |
Georg Brandl | 6cb1ff3 | 2009-04-08 16:36:39 +0000 | [diff] [blame^] | 245 | | :cmacro:`PyBUF_FULL_RO` | This is equivalent to ``(PyBUF_INDIRECT | | |
Antoine Pitrou | 789be0c | 2009-04-02 21:18:34 +0000 | [diff] [blame] | 246 | | | PyBUF_FORMAT)``. | |
| 247 | +------------------------------+---------------------------------------------------+ |
| 248 | | :cmacro:`PyBUF_CONTIG` | This is equivalent to ``(PyBUF_ND | | |
| 249 | | | PyBUF_WRITABLE)``. | |
| 250 | +------------------------------+---------------------------------------------------+ |
| 251 | | :cmacro:`PyBUF_CONTIG_RO` | This is equivalent to ``(PyBUF_ND)``. | |
| 252 | | | | |
| 253 | +------------------------------+---------------------------------------------------+ |
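
Because the flags form a bit field, "implies" in the table means that the bits of one flag are contained in another. The sketch below illustrates this with a hypothetical helper, ``request_includes``. The numeric values are an assumption: they are believed to mirror CPython's headers and are redefined here only so the snippet builds without ``Python.h``; real code should include ``Python.h`` and use the macros from there.

```c
#include <assert.h>

/* Assumed values, believed to match CPython's headers. They are only
   defined here when Python.h has not already provided them. */
#ifndef PyBUF_SIMPLE
#define PyBUF_SIMPLE    0
#define PyBUF_WRITABLE  0x0001
#define PyBUF_FORMAT    0x0004
#define PyBUF_ND        0x0008
#define PyBUF_STRIDES   (0x0010 | PyBUF_ND)
#define PyBUF_INDIRECT  (0x0100 | PyBUF_STRIDES)
#define PyBUF_RECORDS   (PyBUF_STRIDES | PyBUF_WRITABLE | PyBUF_FORMAT)
#endif

/* Hypothetical helper: 1 if every bit of `flag` is set in `flags`. */
static int request_includes(int flags, int flag)
{
    assert(flags >= 0 && flag >= 0);
    return (flags & flag) == flag;
}
```

An exporter could use a test of this shape to decide which members of the :ctype:`Py_buffer` structure it has to fill in. For example, a ``PyBUF_RECORDS`` request includes ``PyBUF_ND``, while a plain ``PyBUF_ND`` request does not include ``PyBUF_STRIDES``.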
| 254 | |
| 255 | |
| 256 | .. cfunction:: void PyBuffer_Release(Py_buffer *view) |
| 257 | |
| 258 | Release the buffer *view*. This should be called when the buffer |
| 259 | is no longer being used as it may free memory from it. |
| 260 | |
| 261 | |
| 262 | .. cfunction:: Py_ssize_t PyBuffer_SizeFromFormat(const char *) |
| 263 | |
| 264 | Return the implied :cdata:`~Py_buffer.itemsize` from the struct-style |
| 265 | :cdata:`~Py_buffer.format`. |
| 266 | |
| 267 | |
| 268 | .. cfunction:: int PyObject_CopyToObject(PyObject *obj, void *buf, Py_ssize_t len, char fortran) |
| 269 | |
| 270 | Copy *len* bytes of data from the contiguous chunk of memory pointed |
| 271 | to by *buf* into the buffer exported by *obj*. The buffer must be |
| 272 | writable. Return 0 on success; on failure, raise an error and return -1. |
| 273 | If the object does not have a writable buffer, then an error is raised. If |
| 274 | *fortran* is ``'F'`` and the object is multi-dimensional, then the data |
| 275 | will be copied into the array in Fortran-style (first dimension varies the |
| 276 | fastest). If *fortran* is ``'C'``, then the data will be copied into the |
| 277 | array in C-style (last dimension varies the fastest). If *fortran* is |
| 278 | ``'A'``, then it does not matter and the copy will be made in whatever way is |
| 279 | more efficient. |
| 280 | |
| 281 | |
| 282 | .. cfunction:: int PyBuffer_IsContiguous(Py_buffer *view, char fortran) |
| 283 | |
| 284 | Return 1 if the memory defined by the *view* is C-style (*fortran* is |
| 285 | ``'C'``) or Fortran-style (*fortran* is ``'F'``) contiguous or either one |
| 286 | (*fortran* is ``'A'``). Return 0 otherwise. |
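
The idea behind a C-contiguity test can be sketched in a few lines: starting from the last dimension, each stride must equal the item size multiplied by the extents of all later dimensions. This is an illustrative stand-in, not CPython's implementation; the name ``looks_c_contiguous`` is hypothetical and ``Py_ssize_t`` is replaced by ``ptrdiff_t``.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for Python's Py_ssize_t so the sketch builds without Python.h. */
typedef ptrdiff_t Py_ssize_t;

/* Hypothetical helper: 1 if shape/strides describe a C-contiguous block. */
static int looks_c_contiguous(int ndim, const Py_ssize_t *shape,
                              const Py_ssize_t *strides, Py_ssize_t itemsize)
{
    Py_ssize_t expected = itemsize;
    int i;

    assert(ndim >= 0 && itemsize > 0);
    /* An array with no elements is treated as trivially contiguous. */
    for (i = 0; i < ndim; i++) {
        if (shape[i] == 0) {
            return 1;
        }
    }
    /* Walk from the last (fastest-varying) dimension outwards. */
    for (i = ndim - 1; i >= 0; i--) {
        /* The stride of a length-1 dimension is never used, so skip it. */
        if (shape[i] > 1 && strides[i] != expected) {
            return 0;
        }
        expected *= shape[i];
    }
    return 1;
}
```

The Fortran-style check is the mirror image, walking from the first dimension to the last.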
| 287 | |
| 288 | |
| 289 | .. cfunction:: void PyBuffer_FillContiguousStrides(int ndim, Py_ssize_t *shape, Py_ssize_t *strides, Py_ssize_t itemsize, char fortran) |
| 290 | |
| 291 | Fill the *strides* array with byte-strides of a contiguous (C-style if |
| 292 | *fortran* is ``'C'`` or Fortran-style if *fortran* is ``'F'``) array of the |
| 293 | given shape with the given number of bytes per element. |
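
For intuition about what such strides look like, the following sketch computes them for both orders. It is an illustration under stated assumptions rather than CPython's own code: the name ``fill_contiguous_strides`` is hypothetical and ``Py_ssize_t`` is stood in by ``ptrdiff_t``.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for Python's Py_ssize_t so the sketch builds without Python.h. */
typedef ptrdiff_t Py_ssize_t;

/* Hypothetical helper: write byte strides for a contiguous array of the
   given shape, in C order ('C') or Fortran order ('F'). */
static void fill_contiguous_strides(int ndim, const Py_ssize_t *shape,
                                    Py_ssize_t *strides, Py_ssize_t itemsize,
                                    char fortran)
{
    Py_ssize_t step = itemsize;
    int i;

    assert(fortran == 'C' || fortran == 'F');
    if (fortran == 'F') {
        /* First dimension varies fastest. */
        for (i = 0; i < ndim; i++) {
            strides[i] = step;
            step *= shape[i];
        }
    }
    else {
        /* Last dimension varies fastest. */
        for (i = ndim - 1; i >= 0; i--) {
            strides[i] = step;
            step *= shape[i];
        }
    }
}
```

For a 2 x 3 x 4 array of 4-byte items this gives strides of 48, 16 and 4 bytes in C order, and 4, 8 and 24 bytes in Fortran order.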
| 294 | |
| 295 | |
| 296 | .. cfunction:: int PyBuffer_FillInfo(Py_buffer *view, void *buf, Py_ssize_t len, int readonly, int infoflags) |
| 297 | |
| 298 | Fill in a buffer-info structure, *view*, correctly for an exporter that can |
| 299 | only share a contiguous chunk of memory of "unsigned bytes" of the given |
| 300 | length. Return 0 on success and -1 (with an exception set) on error. |
| 301 | |
| 302 | |
| 303 | MemoryView objects |
| 304 | ================== |
| 305 | |
| 306 | A memoryview object is an extended buffer object that could replace the buffer |
| 307 | object (but doesn't have to, as that could be kept as a simple 1-d memoryview |
| 308 | object). Unlike :ctype:`Py_buffer`, it is a Python object (exposed as |
| 309 | :class:`memoryview` in :mod:`builtins`), so it can be used with Python code. |
| 310 | |
| 311 | .. cfunction:: PyObject* PyMemoryView_FromObject(PyObject *obj) |
| 312 | |
| 313 | Return a memoryview object from an object that defines the buffer interface. |
| 314 | |
| 315 | |
| 316 | Old-style buffer objects |
| 317 | ======================== |
| 318 | |
Georg Brandl | f684272 | 2008-01-19 22:08:21 +0000 | [diff] [blame] | 319 | .. index:: single: PyBufferProcs |
| 320 | |
Antoine Pitrou | 789be0c | 2009-04-02 21:18:34 +0000 | [diff] [blame] | 321 | More information on the old buffer interface is provided in the section |
Georg Brandl | f684272 | 2008-01-19 22:08:21 +0000 | [diff] [blame] | 322 | :ref:`buffer-structs`, under the description for :ctype:`PyBufferProcs`. |
| 323 | |
| 324 | A "buffer object" is defined in the :file:`bufferobject.h` header (included by |
| 325 | :file:`Python.h`). These objects look very similar to string objects at the |
| 326 | Python programming level: they support slicing, indexing, concatenation, and |
| 327 | some other standard string operations. However, their data can come from one of |
| 328 | two sources: from a block of memory, or from another object which exports the |
| 329 | buffer interface. |
| 330 | |
| 331 | Buffer objects are useful as a way to expose the data from another object's |
| 332 | buffer interface to the Python programmer. They can also be used as a zero-copy |
| 333 | slicing mechanism. Using their ability to reference a block of memory, it is |
| 334 | possible to expose any data to the Python programmer quite easily. The memory |
| 335 | could be a large, constant array in a C extension, it could be a raw block of |
| 336 | memory for manipulation before passing to an operating system library, or it |
| 337 | could be used to pass around structured data in its native, in-memory format. |
| 338 | |
| 339 | |
| 340 | .. ctype:: PyBufferObject |
| 341 | |
| 342 | This subtype of :ctype:`PyObject` represents a buffer object. |
| 343 | |
| 344 | |
| 345 | .. cvar:: PyTypeObject PyBuffer_Type |
| 346 | |
| 347 | .. index:: single: BufferType (in module types) |
| 348 | |
| 349 | The instance of :ctype:`PyTypeObject` which represents the Python buffer type; |
| 350 | it is the same object as ``buffer`` and ``types.BufferType`` in the Python |
| 351 | layer. |
| 352 | |
| 353 | |
| 354 | .. cvar:: int Py_END_OF_BUFFER |
| 355 | |
| 356 | This constant may be passed as the *size* parameter to |
| 357 | :cfunc:`PyBuffer_FromObject` or :cfunc:`PyBuffer_FromReadWriteObject`. It |
| 358 | indicates that the new :ctype:`PyBufferObject` should refer to the *base* object |
| 359 | from the specified *offset* to the end of its exported buffer. Using this |
| 360 | enables the caller to avoid querying the *base* object for its length. |
| 361 | |
| 362 | |
| 363 | .. cfunction:: int PyBuffer_Check(PyObject *p) |
| 364 | |
| 365 | Return true if the argument has type :cdata:`PyBuffer_Type`. |
| 366 | |
| 367 | |
| 368 | .. cfunction:: PyObject* PyBuffer_FromObject(PyObject *base, Py_ssize_t offset, Py_ssize_t size) |
| 369 | |
| 370 | Return a new read-only buffer object. This raises :exc:`TypeError` if *base* |
| 371 | doesn't support the read-only buffer protocol or doesn't provide exactly one |
| 372 | buffer segment, or it raises :exc:`ValueError` if *offset* is less than zero. |
| 373 | The buffer will hold a reference to the *base* object, and the buffer's contents |
| 374 | will refer to the *base* object's buffer interface, starting at position |
| 375 | *offset* and extending for *size* bytes. If *size* is :const:`Py_END_OF_BUFFER`, |
| 376 | then the new buffer's contents extend to the length of the *base* object's |
| 377 | exported buffer data. |
| 378 | |
| 379 | |
| 380 | .. cfunction:: PyObject* PyBuffer_FromReadWriteObject(PyObject *base, Py_ssize_t offset, Py_ssize_t size) |
| 381 | |
| 382 | Return a new writable buffer object. Parameters and exceptions are similar to |
| 383 | those for :cfunc:`PyBuffer_FromObject`. If the *base* object does not export |
| 384 | the writable buffer protocol, then :exc:`TypeError` is raised. |
| 385 | |
| 386 | |
| 387 | .. cfunction:: PyObject* PyBuffer_FromMemory(void *ptr, Py_ssize_t size) |
| 388 | |
| 389 | Return a new read-only buffer object that reads from a specified location in |
| 390 | memory, with a specified size. The caller is responsible for ensuring that the |
| 391 | memory buffer, passed in as *ptr*, is not deallocated while the returned buffer |
| 392 | object exists. Raises :exc:`ValueError` if *size* is less than zero. Note that |
| 393 | :const:`Py_END_OF_BUFFER` may *not* be passed for the *size* parameter; |
| 394 | :exc:`ValueError` will be raised in that case. |
| 395 | |
| 396 | |
| 397 | .. cfunction:: PyObject* PyBuffer_FromReadWriteMemory(void *ptr, Py_ssize_t size) |
| 398 | |
| 399 | Similar to :cfunc:`PyBuffer_FromMemory`, but the returned buffer is writable. |
| 400 | |
| 401 | |
| 402 | .. cfunction:: PyObject* PyBuffer_New(Py_ssize_t size) |
| 403 | |
| 404 | Return a new writable buffer object that maintains its own memory buffer of |
| 405 | *size* bytes. :exc:`ValueError` is raised if *size* is negative. |
| 406 | Note that the memory buffer (as returned by :cfunc:`PyObject_AsWriteBuffer`) is |
| 407 | not specifically aligned. |