:mod:`mmap` --- Memory-mapped file support
==========================================

.. module:: mmap
   :synopsis: Interface to memory-mapped files for Unix and Windows.


Memory-mapped file objects behave like both bytes objects and file objects.
Unlike normal bytes objects, however, they are mutable.  You can use mmap
objects in most places where bytes-like objects are expected; for example, you
can use the :mod:`re` module to search through a memory-mapped file.  Since
they're mutable, you can change a single byte by doing ``obj[index] = 97``, or
change a subsequence by assigning to a slice: ``obj[i1:i2] = b'...'``.  You can
also read and write data starting at the current file position, and
:meth:`seek` through the file to different positions.

A memory-mapped file is created by the :class:`mmap` constructor, which is
different on Unix and on Windows.  In either case you must provide a file
descriptor for a file opened for update.  If you wish to map an existing Python
file object, use its :meth:`fileno` method to obtain the correct value for the
*fileno* parameter.  Otherwise, you can open the file using the
:func:`os.open` function, which returns a file descriptor directly (the file
still needs to be closed when done).
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 25 |  | 
| Georg Brandl | 86def6c | 2008-01-21 20:36:10 +0000 | [diff] [blame] | 26 | For both the Unix and Windows versions of the constructor, *access* may be | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 27 | specified as an optional keyword parameter. *access* accepts one of three | 
| Christian Heimes | dae2a89 | 2008-04-19 00:55:37 +0000 | [diff] [blame] | 28 | values: :const:`ACCESS_READ`, :const:`ACCESS_WRITE`, or :const:`ACCESS_COPY` | 
 | 29 | to specify read-only, write-through or copy-on-write memory respectively. | 
 | 30 | *access* can be used on both Unix and Windows.  If *access* is not specified, | 
 | 31 | Windows mmap returns a write-through mapping.  The initial memory values for | 
 | 32 | all three access types are taken from the specified file.  Assignment to an | 
 | 33 | :const:`ACCESS_READ` memory map raises a :exc:`TypeError` exception. | 
 | 34 | Assignment to an :const:`ACCESS_WRITE` memory map affects both memory and the | 
 | 35 | underlying file.  Assignment to an :const:`ACCESS_COPY` memory map affects | 
 | 36 | memory but does not update the underlying file. | 
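
The three access modes described above can be compared side by side.  The
following is a minimal sketch rather than a reference implementation: the
scratch file created with ``tempfile.mkstemp()`` and the ``results`` dictionary
are assumptions made for illustration only.

```python
import mmap
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
fd, path = tempfile.mkstemp()
os.write(fd, b"Hello Python!\n")
os.close(fd)

results = {}

with open(path, "r+b") as f:
    # ACCESS_READ: assigning to the map is rejected with TypeError.
    ro = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    try:
        ro[0] = ord("J")
    except TypeError:
        results["read_only_rejected"] = True
    ro.close()

    # ACCESS_COPY: the change is visible in memory only.
    cow = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
    cow[0:5] = b"HELLO"
    results["copy_in_memory"] = cow[0:5]
    cow.close()

with open(path, "rb") as f:
    results["after_copy"] = f.read()

with open(path, "r+b") as f:
    # ACCESS_WRITE: the change reaches the underlying file.
    rw = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE)
    rw[0:5] = b"HELLO"
    rw.flush()
    rw.close()

with open(path, "rb") as f:
    results["after_write"] = f.read()

os.remove(path)
print(results)
```

Reading the file back after each map is closed is what separates the
copy-on-write case from the write-through case: only the second leaves the
file changed.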

To map anonymous memory, pass ``-1`` as *fileno* along with the length.
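
As a small illustration of an anonymous mapping, the sketch below maps 16
bytes with no backing file.  The length and the variable names are arbitrary
choices for the example.

```python
import mmap

# Anonymous map: no backing file, so fileno is -1 and length must be nonzero.
mm = mmap.mmap(-1, 16)

initial = mm[:4]      # freshly mapped anonymous memory is zero-filled
mm[:5] = b"hello"     # slice assignment must keep the same length
snapshot = mm[:5]
length = len(mm)

mm.close()
print(initial, snapshot, length)
```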

.. class:: mmap(fileno, length[, tagname[, access[, offset]]])

   **(Windows version)** Maps *length* bytes from the file specified by the
   file handle *fileno*, and creates a mmap object.  If *length* is larger
   than the current size of the file, the file is extended to contain *length*
   bytes.  If *length* is ``0``, the maximum length of the map is the current
   size of the file, except that if the file is empty Windows raises an
   exception (you cannot create an empty mapping on Windows).

   *tagname*, if specified and not ``None``, is a string giving a tag name for
   the mapping.  Windows allows you to have many different mappings against
   the same file.  If you specify the name of an existing tag, that tag is
   opened; otherwise, a new tag of this name is created.  If this parameter is
   omitted or ``None``, the mapping is created without a name.  Avoiding the
   use of the tag parameter will assist in keeping your code portable between
   Unix and Windows.

   *offset* may be specified as a non-negative integer offset.  mmap references
   will be relative to the offset from the beginning of the file.  *offset*
   defaults to 0.  *offset* must be a multiple of the ALLOCATIONGRANULARITY.
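
The *offset* rule can be tried out with a short, portable sketch.  It assumes
a hypothetical scratch file whose first ``mmap.ALLOCATIONGRANULARITY`` bytes
are padding, and it uses only the *access* and *offset* keyword parameters so
that it should behave the same way on Unix and Windows.

```python
import mmap
import os
import tempfile

gran = mmap.ALLOCATIONGRANULARITY

# Hypothetical scratch file: one granule of padding followed by a short tail.
fd, path = tempfile.mkstemp()
os.write(fd, b"A" * gran + b"tail data")
os.close(fd)

with open(path, "rb") as f:
    # Map only the 9 bytes that follow the first granule.
    mm = mmap.mmap(f.fileno(), 9, access=mmap.ACCESS_READ, offset=gran)
    tail = mm[:]      # index 0 of the map corresponds to file offset `gran`
    mm.close()

    # An offset that is not a multiple of the granularity is rejected.
    try:
        bad = mmap.mmap(f.fileno(), 4, access=mmap.ACCESS_READ, offset=1)
        bad.close()
        misaligned_ok = True
    except (ValueError, OSError):
        misaligned_ok = False

os.remove(path)
print(tail, misaligned_ok)
```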


.. class:: mmap(fileno, length[, flags[, prot[, access[, offset]]]])
   :noindex:

   **(Unix version)** Maps *length* bytes from the file specified by the file
   descriptor *fileno*, and returns a mmap object.  If *length* is ``0``, the
   maximum length of the map will be the current size of the file when
   :class:`mmap` is called.

   *flags* specifies the nature of the mapping.  :const:`MAP_PRIVATE` creates a
   private copy-on-write mapping, so changes to the contents of the mmap
   object will be private to this process, and :const:`MAP_SHARED` creates a
   mapping that's shared with all other processes mapping the same areas of
   the file.  The default value is :const:`MAP_SHARED`.

   *prot*, if specified, gives the desired memory protection; the two most
   useful values are :const:`PROT_READ` and :const:`PROT_WRITE`, to specify
   that the pages may be read or written.  *prot* defaults to
   :const:`PROT_READ \| PROT_WRITE`.

   *access* may be specified in lieu of *flags* and *prot* as an optional
   keyword parameter.  It is an error to specify *access* together with
   *flags* or *prot*.  See the description of *access* above for information
   on how to use this parameter.

   *offset* may be specified as a non-negative integer offset.  mmap references
   will be relative to the offset from the beginning of the file.  *offset*
   defaults to 0.  *offset* must be a multiple of the PAGESIZE or
   ALLOCATIONGRANULARITY.

   This example shows a simple way of using :class:`mmap`::

      import mmap

      # write a simple example file
      with open("hello.txt", "wb") as f:
          f.write(b"Hello Python!\n")

      with open("hello.txt", "r+b") as f:
          # memory-map the file, size 0 means whole file
          mm = mmap.mmap(f.fileno(), 0)
          # read content via standard file methods
          print(mm.readline())  # prints b"Hello Python!\n"
          # read content via slice notation
          print(mm[:5])  # prints b"Hello"
          # update content using slice notation;
          # note that new content must have same size
          mm[6:] = b" world!\n"
          # ... and read again using standard file methods
          mm.seek(0)
          print(mm.readline())  # prints b"Hello  world!\n"
          # close the map
          mm.close()


   The next example demonstrates how to create an anonymous map and exchange
   data between the parent and child processes::

      import mmap
      import os

      mm = mmap.mmap(-1, 13)
      mm.write(b"Hello world!")

      pid = os.fork()

      if pid == 0:  # in the child process
          mm.seek(0)
          print(mm.readline())

          mm.close()


   Memory-mapped file objects support the following methods:


   .. method:: close()

      Close the memory map.  Subsequent calls to other methods of the object
      will result in an exception being raised.  This does not close the
      underlying file.


   .. method:: find(string[, start[, end]])

      Returns the lowest index in the object where the subsequence *string* (a
      bytes-like object) is found, such that *string* is contained in the range
      [*start*, *end*].  Optional arguments *start* and *end* are interpreted as
      in slice notation.  Returns ``-1`` on failure.


   .. method:: flush([offset, size])

      Flushes changes made to the in-memory copy of a file back to disk.  Without
      use of this call there is no guarantee that changes are written back before
      the object is destroyed.  If *offset* and *size* are specified, only
      changes to the given range of bytes will be flushed to disk; otherwise, the
      whole extent of the mapping is flushed.

      **(Windows version)** A nonzero value returned indicates success; zero
      indicates failure.

      **(Unix version)** A zero value is returned to indicate success.  An
      exception is raised when the call fails.


   .. method:: move(dest, src, count)

      Copy the *count* bytes starting at offset *src* to the destination index
      *dest*.  If the mmap was created with :const:`ACCESS_READ`, then calls to
      move will raise a :exc:`TypeError` exception.


   .. method:: read(num)

      Return a bytes object containing up to *num* bytes starting from the
      current file position; the file position is updated to point after the
      bytes that were returned.


   .. method:: read_byte()

      Returns the byte at the current file position as an integer, and advances
      the file position by 1.


   .. method:: readline()

      Returns a single line, starting at the current file position and up to the
      next newline.


   .. method:: resize(newsize)

      Resizes the map and the underlying file, if any.  If the mmap was created
      with :const:`ACCESS_READ` or :const:`ACCESS_COPY`, resizing the map will
      raise a :exc:`TypeError` exception.


   .. method:: rfind(string[, start[, end]])

      Returns the highest index in the object where the subsequence *string* (a
      bytes-like object) is found, such that *string* is contained in the range
      [*start*, *end*].  Optional arguments *start* and *end* are interpreted as
      in slice notation.  Returns ``-1`` on failure.


   .. method:: seek(pos[, whence])

      Set the file's current position.  The *whence* argument is optional and
      defaults to ``os.SEEK_SET`` or ``0`` (absolute file positioning); other
      values are ``os.SEEK_CUR`` or ``1`` (seek relative to the current
      position) and ``os.SEEK_END`` or ``2`` (seek relative to the file's end).


   .. method:: size()

      Return the length of the file, which can be larger than the size of the
      memory-mapped area.


   .. method:: tell()

      Returns the current position of the file pointer.


   .. method:: write(string)

      Write the bytes in *string* into memory at the current position of the
      file pointer; the file position is updated to point after the bytes that
      were written.  If the mmap was created with :const:`ACCESS_READ`, then
      writing to it will raise a :exc:`TypeError` exception.


   .. method:: write_byte(byte)

      Write the integer *byte* into memory at the current position of the file
      pointer; the file position is advanced by ``1``.  If the mmap was created
      with :const:`ACCESS_READ`, then writing to it will raise a
      :exc:`TypeError` exception.