:mod:`mmap` --- Memory-mapped file support
==========================================

.. module:: mmap
   :synopsis: Interface to memory-mapped files for Unix and Windows.


Memory-mapped file objects behave like both :class:`bytearray` objects and
:term:`file objects <file object>`. You can use mmap objects in most places
where :class:`bytearray` objects are expected; for example, you can use the
:mod:`re` module to search through a memory-mapped file. You can also change a
single byte by doing ``obj[index] = 97``, or change a subsequence by assigning
to a slice: ``obj[i1:i2] = b'...'``. You can also read and write data starting
at the current file position, and :meth:`seek` through the file to different
positions.

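As a minimal sketch of the sequence-like behaviour described above, the
following uses an anonymous map (so that no file on disk is needed) and the
sample contents ``key=value; other=thing``, both of which are assumptions made
only for illustration:

```python
import mmap
import re

# Anonymous 32-byte map; unused bytes are initialised to zero.
mm = mmap.mmap(-1, 32)
mm.write(b"key=value; other=thing")

# The re module can search the map directly when given a bytes pattern.
match = re.search(rb"other=(\w+)", mm)
found = match.group(1)

# Change a single byte, then a slice of the same length.
mm[0] = ord("K")
mm[4:9] = b"VALUE"
head = mm[:9]

mm.close()
```

Slice assignment cannot change the size of the map, so the replacement must
have exactly the same length as the slice it replaces.
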
A memory-mapped file is created by the :class:`mmap` constructor, which is
different on Unix and on Windows. In either case you must provide a file
descriptor for a file opened for update. If you wish to map an existing Python
file object, use its :meth:`fileno` method to obtain the correct value for the
*fileno* parameter. Otherwise, you can open the file using the
:func:`os.open` function, which returns a file descriptor directly (the file
still needs to be closed when done).

For both the Unix and Windows versions of the constructor, *access* may be
specified as an optional keyword parameter. *access* accepts one of three
values: :const:`ACCESS_READ`, :const:`ACCESS_WRITE`, or :const:`ACCESS_COPY`
to specify read-only, write-through or copy-on-write memory respectively.
If *access* is not specified, Windows mmap returns a write-through mapping.
The initial memory values for all three access types are taken from the
specified file. Assignment to an :const:`ACCESS_READ` memory map raises a
:exc:`TypeError` exception. Assignment to an :const:`ACCESS_WRITE` memory map
affects both memory and the underlying file. Assignment to an
:const:`ACCESS_COPY` memory map affects memory but does not update the
underlying file.

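The three access types can be compared with a short sketch. It assumes a
scratch file created with :func:`tempfile.mkstemp` and ten bytes of sample
data; neither is part of the :mod:`mmap` interface:

```python
import mmap
import os
import tempfile

# Scratch file for the demonstration; mkstemp opens it for reading and writing.
fd, path = tempfile.mkstemp()
os.write(fd, b"0123456789")

# ACCESS_READ: assignment is rejected with TypeError.
ro = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
try:
    ro[0] = ord("X")
    read_only_rejected = False
except TypeError:
    read_only_rejected = True
ro.close()

# ACCESS_COPY: the change is visible in memory but does not reach the file.
cow = mmap.mmap(fd, 0, access=mmap.ACCESS_COPY)
cow[0] = ord("X")
in_memory = cow[:3]
cow.close()

# ACCESS_WRITE: the change is written through to the file.
rw = mmap.mmap(fd, 0, access=mmap.ACCESS_WRITE)
rw[1] = ord("Y")
rw.flush()
rw.close()

os.close(fd)
with open(path, "rb") as f:
    on_disk = f.read()
os.remove(path)
```

After this runs, only the write made through the :const:`ACCESS_WRITE` map
should be present in the file contents read back into ``on_disk``.
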
To map anonymous memory, pass ``-1`` as *fileno* along with the length.

.. class:: mmap(fileno, length, tagname=None, access=ACCESS_DEFAULT[, offset])

   **(Windows version)** Maps *length* bytes from the file specified by the
   file handle *fileno*, and creates a mmap object. If *length* is larger
   than the current size of the file, the file is extended to contain *length*
   bytes. If *length* is ``0``, the maximum length of the map is the current
   size of the file, except that if the file is empty Windows raises an
   exception (you cannot create an empty mapping on Windows).

   *tagname*, if specified and not ``None``, is a string giving a tag name for
   the mapping. Windows allows you to have many different mappings against
   the same file. If you specify the name of an existing tag, that tag is
   opened, otherwise a new tag of this name is created. If this parameter is
   omitted or ``None``, the mapping is created without a name. Avoiding the
   use of the tag parameter will assist in keeping your code portable between
   Unix and Windows.

   *offset* may be specified as a non-negative integer offset. mmap references
   will be relative to the offset from the beginning of the file. *offset*
   defaults to 0. *offset* must be a multiple of the ALLOCATIONGRANULARITY.


.. class:: mmap(fileno, length, flags=MAP_SHARED, prot=PROT_WRITE|PROT_READ, access=ACCESS_DEFAULT[, offset])
   :noindex:

   **(Unix version)** Maps *length* bytes from the file specified by the file
   descriptor *fileno*, and returns a mmap object. If *length* is ``0``, the
   maximum length of the map will be the current size of the file when
   :class:`mmap` is called.

   *flags* specifies the nature of the mapping. :const:`MAP_PRIVATE` creates a
   private copy-on-write mapping, so changes to the contents of the mmap
   object will be private to this process, and :const:`MAP_SHARED` creates a
   mapping that's shared with all other processes mapping the same areas of
   the file. The default value is :const:`MAP_SHARED`.

   *prot*, if specified, gives the desired memory protection; the two most
   useful values are :const:`PROT_READ` and :const:`PROT_WRITE`, to specify
   that the pages may be read or written. *prot* defaults to
   :const:`PROT_READ \| PROT_WRITE`.

   *access* may be specified in lieu of *flags* and *prot* as an optional
   keyword parameter. It is an error to specify *access* together with
   *flags* or *prot*. See the description of *access* above for information
   on how to use this parameter.

   *offset* may be specified as a non-negative integer offset. mmap references
   will be relative to the offset from the beginning of the file. *offset*
   defaults to 0. *offset* must be a multiple of the PAGESIZE or
   ALLOCATIONGRANULARITY.

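The effect of *offset* can be sketched as follows. The scratch file and its
two-block layout are assumptions chosen for illustration; the sketch uses
``mmap.ALLOCATIONGRANULARITY`` so that the offset is valid on either platform:

```python
import mmap
import os
import tempfile

gran = mmap.ALLOCATIONGRANULARITY

# Scratch file holding two blocks: the first filled with b"a", the second with b"b".
fd, path = tempfile.mkstemp()
os.write(fd, b"a" * gran + b"b" * gran)

# Map only the second block; index 0 of the map is byte `gran` of the file.
mm = mmap.mmap(fd, gran, access=mmap.ACCESS_READ, offset=gran)
first = mm[:4]
mapped_len = len(mm)    # length of the mapped area
file_len = mm.size()    # length of the whole file
mm.close()

os.close(fd)
os.remove(path)
```

Here ``len(mm)`` reports only the mapped area, while ``mm.size()`` reports the
length of the underlying file, which is twice as large in this sketch.
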
This example shows a simple way of using :class:`mmap`::

   import mmap

   # write a simple example file
   with open("hello.txt", "wb") as f:
       f.write(b"Hello Python!\n")

   with open("hello.txt", "r+b") as f:
       # memory-map the file, size 0 means whole file
       mm = mmap.mmap(f.fileno(), 0)
       # read content via standard file methods
       print(mm.readline())  # prints b"Hello Python!\n"
       # read content via slice notation
       print(mm[:5])  # prints b"Hello"
       # update content using slice notation;
       # note that new content must have same size
       mm[6:] = b" world!\n"
       # ... and read again using standard file methods
       mm.seek(0)
       print(mm.readline())  # prints b"Hello  world!\n"
       # close the map
       mm.close()


:class:`mmap` can also be used as a context manager in a :keyword:`with`
statement::

   import mmap

   with mmap.mmap(-1, 13) as mm:
       mm.write(b"Hello world!")

.. versionadded:: 3.2
   Context manager support.


The next example demonstrates how to create an anonymous map and exchange
data between the parent and child processes::

   import mmap
   import os

   mm = mmap.mmap(-1, 13)
   mm.write(b"Hello world!")

   pid = os.fork()

   if pid == 0:  # in the child process
       mm.seek(0)
       print(mm.readline())

       mm.close()


Memory-mapped file objects support the following methods:

.. method:: close()

   Close the map. Subsequent calls to other methods of the object will
   result in a :exc:`ValueError` exception being raised.


.. attribute:: closed

   ``True`` if the map is closed.

   .. versionadded:: 3.2


.. method:: find(sub[, start[, end]])

   Returns the lowest index in the object where the subsequence *sub* is
   found, such that *sub* is contained in the range [*start*, *end*].
   Optional arguments *start* and *end* are interpreted as in slice notation.
   Returns ``-1`` on failure.


.. method:: flush([offset[, size]])

   Flushes changes made to the in-memory copy of a file back to disk. Without
   use of this call there is no guarantee that changes are written back before
   the object is destroyed. If *offset* and *size* are specified, only
   changes to the given range of bytes will be flushed to disk; otherwise, the
   whole extent of the mapping is flushed.

   **(Windows version)** A nonzero value returned indicates success; zero
   indicates failure.

   **(Unix version)** A zero value is returned to indicate success. An
   exception is raised when the call failed.


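A minimal sketch of :meth:`flush`, assuming a scratch file created with
:func:`tempfile.mkstemp`. The return value is deliberately ignored here because
it differs between platforms:

```python
import mmap
import os
import tempfile

# Scratch file with ten bytes of sample data.
fd, path = tempfile.mkstemp()
os.write(fd, b"draft text")

mm = mmap.mmap(fd, 0)
mm[:5] = b"FINAL"
# Ask the operating system to write the whole mapping back to the file.
mm.flush()

# Read the file through ordinary file I/O to observe the change.
with open(path, "rb") as f:
    on_disk = f.read()

mm.close()
os.close(fd)
os.remove(path)
```

To flush only part of the mapping, pass both *offset* and *size*.
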
.. method:: move(dest, src, count)

   Copy the *count* bytes starting at offset *src* to the destination index
   *dest*. If the mmap was created with :const:`ACCESS_READ`, then calls to
   move will raise a :exc:`TypeError` exception.


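For illustration, a small sketch of :meth:`move` on an anonymous 16-byte map
(the size and contents are arbitrary assumptions):

```python
import mmap

mm = mmap.mmap(-1, 16)
mm[:] = b"abcdefgh" + b"." * 8

# Copy the 4 bytes starting at index 0 to index 8; the source bytes stay in place.
mm.move(8, 0, 4)
result = mm[:]
mm.close()
```

Note the argument order: the destination comes first, then the source.
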
.. method:: read(num)

   Return a :class:`bytes` object containing up to *num* bytes starting from
   the current file position; the file position is updated to point after the
   bytes that were returned.


.. method:: read_byte()

   Returns a byte at the current file position as an integer, and advances
   the file position by 1.


.. method:: readline()

   Returns a single line, starting at the current file position and up to the
   next newline.


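The three reading methods above can be combined as in this sketch, which
assumes an anonymous 12-byte map filled with sample data:

```python
import mmap

mm = mmap.mmap(-1, 12)
mm[:] = b"AB\nline two\n"

first = mm.read_byte()   # one byte, returned as an integer
rest = mm.readline()     # remainder of the first line, including the newline
chunk = mm.read(4)       # next four bytes
pos = mm.tell()          # position after the three reads
mm.close()
```

Each call advances the file position, so the reads pick up where the previous
one stopped.
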
.. method:: resize(newsize)

   Resizes the map and the underlying file, if any. If the mmap was created
   with :const:`ACCESS_READ` or :const:`ACCESS_COPY`, resizing the map will
   raise a :exc:`TypeError` exception.


.. method:: rfind(sub[, start[, end]])

   Returns the highest index in the object where the subsequence *sub* is
   found, such that *sub* is contained in the range [*start*, *end*].
   Optional arguments *start* and *end* are interpreted as in slice notation.
   Returns ``-1`` on failure.


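A short sketch contrasting :meth:`find` and :meth:`rfind`, assuming an
anonymous 20-byte map with a repeated pattern as sample data:

```python
import mmap

mm = mmap.mmap(-1, 20)
mm[:] = b"spam eggs spam eggs!"

lowest = mm.find(b"eggs")        # first occurrence
highest = mm.rfind(b"eggs")      # last occurrence
bounded = mm.find(b"eggs", 6)    # search only from index 6 onwards
missing = mm.find(b"ham")        # not present
mm.close()
```

Both methods signal a failed search by returning ``-1`` rather than raising an
exception, so the result should be checked before it is used as an index.
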
.. method:: seek(pos[, whence])

   Set the file's current position. The *whence* argument is optional and
   defaults to ``os.SEEK_SET`` or ``0`` (absolute file positioning); other
   values are ``os.SEEK_CUR`` or ``1`` (seek relative to the current
   position) and ``os.SEEK_END`` or ``2`` (seek relative to the file's end).


.. method:: size()

   Return the length of the file, which can be larger than the size of the
   memory-mapped area.


.. method:: tell()

   Returns the current position of the file pointer.


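The three *whence* modes of :meth:`seek`, together with :meth:`tell`, can be
sketched on an anonymous 10-byte map (the contents are sample data only):

```python
import mmap
import os

mm = mmap.mmap(-1, 10)
mm[:] = b"0123456789"

mm.seek(3)                   # absolute positioning
a = mm.read(2)               # reads two bytes from position 3
mm.seek(2, os.SEEK_CUR)      # skip two bytes from the current position
b = mm.read(1)
mm.seek(-2, os.SEEK_END)     # two bytes before the end of the map
c = mm.read(2)
end_pos = mm.tell()
mm.close()
```
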
.. method:: write(bytes)

   Write the bytes in *bytes* into memory at the current position of the
   file pointer; the file position is updated to point after the bytes that
   were written. If the mmap was created with :const:`ACCESS_READ`, then
   writing to it will raise a :exc:`TypeError` exception.


.. method:: write_byte(byte)

   Write the integer *byte* into memory at the current position of the file
   pointer; the file position is advanced by ``1``. If the mmap was created
   with :const:`ACCESS_READ`, then writing to it will raise a
   :exc:`TypeError` exception.
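
As a minimal sketch, :meth:`write` and :meth:`write_byte` can be combined on
an anonymous 8-byte map (the size is an arbitrary assumption):

```python
import mmap

mm = mmap.mmap(-1, 8)
mm.write(b"abc")            # writes three bytes, position becomes 3
mm.write_byte(ord("d"))     # writes one byte given as an integer
pos = mm.tell()

mm.seek(0)
data = mm.read(4)
mm.close()
```

:meth:`write` expects a bytes object, while :meth:`write_byte` expects a
single integer, mirroring what :meth:`read` and :meth:`read_byte` return.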