:mod:`mmap` --- Memory-mapped file support
==========================================

.. module:: mmap
   :synopsis: Interface to memory-mapped files for Unix and Windows.


Memory-mapped file objects behave like both :class:`bytes` objects and file
objects.  Unlike normal :class:`bytes` objects, however, these are mutable.
You can use mmap objects in most places where :class:`bytes` are expected; for
example, you can use the :mod:`re` module to search through a memory-mapped file.
Since they're mutable, you can change a single byte by doing ``obj[index] = 97``,
or change a subsequence by assigning to a slice: ``obj[i1:i2] = b'...'``.
You can also read and write data starting at the current file position, and
:meth:`seek` through the file to different positions.
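
As a minimal sketch of the bytes-like behaviour described above, the following
uses an anonymous map as scratch space; the sample contents and variable names
are illustrative assumptions, not part of the module.

```python
import mmap
import re

# Anonymous 14-byte map used as scratch space (no file behind it).
mm = mmap.mmap(-1, 14)
mm[:] = b"spam eggs spam"      # slice assignment must keep the length unchanged

# The re module accepts the map wherever a bytes-like object is expected.
positions = [m.start() for m in re.finditer(rb"spam", mm)]

mm[0] = 83                     # single-byte assignment takes an integer, here ord("S")
first_word = mm[:4]            # slicing returns an ordinary bytes object
mm.close()
```

Note that a slice assignment cannot grow or shrink the map: the replacement
must have exactly the same length as the slice it replaces.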

A memory-mapped file is created by the :class:`mmap` constructor, which is
different on Unix and on Windows.  In either case you must provide a file
descriptor for a file opened for update.  If you wish to map an existing Python
file object, use its :meth:`fileno` method to obtain the correct value for the
*fileno* parameter.  Otherwise, you can open the file using the
:func:`os.open` function, which returns a file descriptor directly (the file
still needs to be closed when done).
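
The :func:`os.open` route might look like the sketch below. The scratch file
and its contents are hypothetical and exist only to make the example
self-contained.

```python
import mmap
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.bin")
with open(path, "wb") as f:
    f.write(b"0123456789")

# os.open returns a raw descriptor; O_RDWR opens the file for update.
fd = os.open(path, os.O_RDWR)
try:
    mm = mmap.mmap(fd, 0)      # length 0 maps the whole file
    head = mm[:4]
    mm.close()
finally:
    os.close(fd)               # the descriptor still has to be closed explicitly
```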

For both the Unix and Windows versions of the constructor, *access* may be
specified as an optional keyword parameter.  *access* accepts one of three
values: :const:`ACCESS_READ`, :const:`ACCESS_WRITE`, or :const:`ACCESS_COPY`
to specify read-only, write-through or copy-on-write memory respectively.
If *access* is not specified, Windows mmap returns a write-through mapping.
The initial memory values for all three access types are taken from the
specified file.  Assignment to an :const:`ACCESS_READ` memory map raises a
:exc:`TypeError` exception.  Assignment to an :const:`ACCESS_WRITE` memory map
affects both memory and the underlying file.  Assignment to an
:const:`ACCESS_COPY` memory map affects memory but does not update the
underlying file.
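
The three access types can be compared side by side. This is only a sketch:
the temporary file, its contents and the variable names are assumptions made
for the illustration.

```python
import mmap
import os
import tempfile

# Hypothetical scratch file holding six known bytes.
path = os.path.join(tempfile.mkdtemp(), "access.bin")
with open(path, "wb") as f:
    f.write(b"abcdef")

with open(path, "r+b") as f:
    # Read-only: assignment is rejected with TypeError.
    ro = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    try:
        ro[0] = 65
        read_only_error = False
    except TypeError:
        read_only_error = True
    ro.close()

    # Copy-on-write: the change is visible in memory only.
    cow = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
    cow[0] = 65
    in_memory = cow[:]
    cow.close()

    # Write-through: the change reaches the underlying file.
    wt = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE)
    wt[1] = 66
    wt.flush()
    wt.close()

with open(path, "rb") as f:
    on_disk = f.read()         # only the ACCESS_WRITE change is present
```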

To map anonymous memory, pass ``-1`` as *fileno* along with the length.
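
A short sketch of an anonymous map follows. The 16-byte length is arbitrary,
and the sketch assumes the usual behaviour that the platform hands back
zero-filled memory.

```python
import mmap

mm = mmap.mmap(-1, 16)         # 16 bytes of anonymous memory, no file behind it
initial = mm[:]                # assumed to start out zero-filled
mm.write(b"scratch")           # writes at the current position (0)
mm.seek(0)
data = mm.read(7)
mm.close()
```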

.. class:: mmap(fileno, length, tagname=None, access=ACCESS_DEFAULT[, offset])

   **(Windows version)** Maps *length* bytes from the file specified by the
   file handle *fileno*, and creates a mmap object.  If *length* is larger
   than the current size of the file, the file is extended to contain *length*
   bytes.  If *length* is ``0``, the maximum length of the map is the current
   size of the file, except that if the file is empty Windows raises an
   exception (you cannot create an empty mapping on Windows).

   *tagname*, if specified and not ``None``, is a string giving a tag name for
   the mapping.  Windows allows you to have many different mappings against
   the same file.  If you specify the name of an existing tag, that tag is
   opened, otherwise a new tag of this name is created.  If this parameter is
   omitted or ``None``, the mapping is created without a name.  Avoiding the
   use of the tag parameter will assist in keeping your code portable between
   Unix and Windows.

   *offset* may be specified as a non-negative integer offset.  mmap references
   will be relative to the offset from the beginning of the file.  *offset*
   defaults to 0.  *offset* must be a multiple of the ALLOCATIONGRANULARITY.

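The *offset* parameter can be illustrated with a sketch that works the same
way with either version of the constructor. The scratch file layout (one
granularity unit of ``A`` bytes followed by 16 ``B`` bytes) is a made-up
example.

```python
import mmap
import os
import tempfile

gran = mmap.ALLOCATIONGRANULARITY

# Hypothetical scratch file: gran bytes of "A", then 16 bytes of "B".
path = os.path.join(tempfile.mkdtemp(), "offset.bin")
with open(path, "wb") as f:
    f.write(b"A" * gran + b"B" * 16)

with open(path, "r+b") as f:
    # Map 16 bytes starting one granularity unit into the file.
    mm = mmap.mmap(f.fileno(), 16, offset=gran)
    first = mm[0:1]            # index 0 of the map is file offset gran
    length = len(mm)
    mm.close()
```

Indices into the map are relative to *offset*, so ``mm[0]`` here refers to the
first ``B`` byte rather than to the start of the file.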

.. class:: mmap(fileno, length, flags=MAP_SHARED, prot=PROT_WRITE|PROT_READ, access=ACCESS_DEFAULT[, offset])
   :noindex:

   **(Unix version)** Maps *length* bytes from the file specified by the file
   descriptor *fileno*, and returns a mmap object.  If *length* is ``0``, the
   maximum length of the map will be the current size of the file when
   :class:`mmap` is called.

   *flags* specifies the nature of the mapping. :const:`MAP_PRIVATE` creates a
   private copy-on-write mapping, so changes to the contents of the mmap
   object will be private to this process, and :const:`MAP_SHARED` creates a
   mapping that's shared with all other processes mapping the same areas of
   the file.  The default value is :const:`MAP_SHARED`.

   *prot*, if specified, gives the desired memory protection; the two most
   useful values are :const:`PROT_READ` and :const:`PROT_WRITE`, to specify
   that the pages may be read or written.  *prot* defaults to
   :const:`PROT_READ \| PROT_WRITE`.

   *access* may be specified in lieu of *flags* and *prot* as an optional
   keyword parameter.  It is an error to specify *access* together with
   *flags* or *prot*.  See the description of *access* above for information
   on how to use this parameter.

   *offset* may be specified as a non-negative integer offset.  mmap references
   will be relative to the offset from the beginning of the file.  *offset*
   defaults to 0.  *offset* must be a multiple of the PAGESIZE or
   ALLOCATIONGRANULARITY.

   This example shows a simple way of using :class:`mmap`::

      import mmap

      # write a simple example file
      with open("hello.txt", "wb") as f:
          f.write(b"Hello Python!\n")

      with open("hello.txt", "r+b") as f:
          # memory-map the file, size 0 means whole file
          mm = mmap.mmap(f.fileno(), 0)
          # read content via standard file methods
          print(mm.readline())  # prints b"Hello Python!\n"
          # read content via slice notation
          print(mm[:5])  # prints b"Hello"
          # update content using slice notation;
          # note that new content must have same size
          mm[6:] = b" world!\n"
          # ... and read again using standard file methods
          mm.seek(0)
          print(mm.readline())  # prints b"Hello  world!\n"
          # close the map
          mm.close()


   The next example demonstrates how to create an anonymous map and exchange
   data between the parent and child processes::

      import mmap
      import os

      mm = mmap.mmap(-1, 13)
      mm.write(b"Hello world!")

      pid = os.fork()  # os.fork() is only available on Unix

      if pid == 0:  # In a child process
          mm.seek(0)
          print(mm.readline())

          mm.close()


   Memory-mapped file objects support the following methods:


   .. method:: close()

      Close the map.  Subsequent calls to other methods of the object will
      result in a :exc:`ValueError` exception being raised.


   .. method:: find(sub[, start[, end]])

      Returns the lowest index in the object where the subsequence *sub* is
      found, such that *sub* is contained in the range [*start*, *end*].
      Optional arguments *start* and *end* are interpreted as in slice notation.
      Returns ``-1`` on failure.


   .. method:: flush([offset[, size]])

      Flushes changes made to the in-memory copy of a file back to disk. Without
      use of this call there is no guarantee that changes are written back before
      the object is destroyed.  If *offset* and *size* are specified, only
      changes to the given range of bytes will be flushed to disk; otherwise, the
      whole extent of the mapping is flushed.

      **(Windows version)** A nonzero value returned indicates success; zero
      indicates failure.

      **(Unix version)** A zero value is returned to indicate success. An
      exception is raised when the call failed.
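
A minimal sketch of flushing a write-through map follows. The scratch file is
hypothetical, and the sketch deliberately ignores the return value of
``flush()`` because, as described above, its meaning differs between
platforms.

```python
import mmap
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "flush.bin")
with open(path, "wb") as f:
    f.write(b"hello world")

with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0)
    mm[:5] = b"HELLO"          # modify the in-memory copy
    mm.flush()                 # ask for the whole mapping to be written back
    mm.close()

with open(path, "rb") as f:
    on_disk = f.read()
```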


   .. method:: move(dest, src, count)

      Copy the *count* bytes starting at offset *src* to the destination index
      *dest*.  If the mmap was created with :const:`ACCESS_READ`, then calls to
      move will raise a :exc:`TypeError` exception.
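
For instance, ``move`` can be tried on a small anonymous map; the contents
below are arbitrary sample data.

```python
import mmap

mm = mmap.mmap(-1, 10)
mm[:] = b"abcdefghij"

# Copy 3 bytes starting at offset 0 ("abc") to offset 5.
mm.move(5, 0, 3)
result = mm[:]
mm.close()
```

The source bytes are left in place; only the destination range is
overwritten.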


   .. method:: read(num)

      Return a :class:`bytes` object containing up to *num* bytes starting from
      the current file position; the file position is updated to point after the
      bytes that were returned.


   .. method:: read_byte()

      Returns a byte at the current file position as an integer, and advances
      the file position by 1.


   .. method:: readline()

      Returns a single line, starting at the current file position and up to the
      next newline.
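
The three reading methods share one file position, as this sketch on an
anonymous map with made-up contents suggests.

```python
import mmap

mm = mmap.mmap(-1, 12)
mm[:] = b"one\ntwo\nabc\n"

line = mm.readline()           # reads through the first newline
chunk = mm.read(2)             # next two bytes
byte = mm.read_byte()          # next byte, returned as an integer
rest = mm.readline()           # remainder of the second line
pos = mm.tell()                # position after all four reads
mm.close()
```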


   .. method:: resize(newsize)

      Resizes the map and the underlying file, if any. If the mmap was created
      with :const:`ACCESS_READ` or :const:`ACCESS_COPY`, resizing the map will
      raise a :exc:`TypeError` exception.


   .. method:: rfind(sub[, start[, end]])

      Returns the highest index in the object where the subsequence *sub* is
      found, such that *sub* is contained in the range [*start*, *end*].
      Optional arguments *start* and *end* are interpreted as in slice notation.
      Returns ``-1`` on failure.
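
A brief sketch of ``find`` and ``rfind`` on an anonymous map; the sample
contents are illustrative only.

```python
import mmap

mm = mmap.mmap(-1, 14)
mm[:] = b"spam eggs spam"

first = mm.find(b"spam")            # lowest matching index
last = mm.rfind(b"spam")            # highest matching index
after_start = mm.find(b"spam", 1)   # search begins at index 1
limited = mm.rfind(b"spam", 0, 13)  # the second "spam" does not fit before 13
missing = mm.find(b"ham")           # not present
mm.close()
```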


   .. method:: seek(pos[, whence])

      Set the file's current position.  The *whence* argument is optional and
      defaults to ``os.SEEK_SET`` or ``0`` (absolute file positioning); other
      values are ``os.SEEK_CUR`` or ``1`` (seek relative to the current
      position) and ``os.SEEK_END`` or ``2`` (seek relative to the file's end).
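
The three *whence* modes can be sketched on a 10-byte anonymous map (the size
is arbitrary):

```python
import mmap
import os

mm = mmap.mmap(-1, 10)

mm.seek(4)                     # absolute positioning (os.SEEK_SET)
absolute = mm.tell()

mm.seek(2, os.SEEK_CUR)        # two bytes forward from the current position
relative = mm.tell()

mm.seek(-1, os.SEEK_END)       # one byte before the end of the map
from_end = mm.tell()
mm.close()
```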


   .. method:: size()

      Return the length of the file, which can be larger than the size of the
      memory-mapped area.
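
The difference between the file length and the mapped length shows up when
only part of a file is mapped. The 100-byte scratch file below is a
hypothetical example.

```python
import mmap
import os
import tempfile

# Hypothetical 100-byte scratch file.
path = os.path.join(tempfile.mkdtemp(), "size.bin")
with open(path, "wb") as f:
    f.write(b"x" * 100)

with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 10)   # map only the first 10 bytes
    mapped_length = len(mm)          # size of the memory-mapped area
    file_length = mm.size()          # length of the underlying file
    mm.close()
```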


   .. method:: tell()

      Returns the current position of the file pointer.


   .. method:: write(bytes)

      Write the bytes in *bytes* into memory at the current position of the
      file pointer; the file position is updated to point after the bytes that
      were written. If the mmap was created with :const:`ACCESS_READ`, then
      writing to it will raise a :exc:`TypeError` exception.


   .. method:: write_byte(byte)

      Write the integer *byte* into memory at the current
      position of the file pointer; the file position is advanced by ``1``. If
      the mmap was created with :const:`ACCESS_READ`, then writing to it will
      raise a :exc:`TypeError` exception.
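
A sketch of both writing methods on an 8-byte anonymous map follows. The
final step assumes that writing past the end of the map is rejected with
:exc:`ValueError`, which is not stated above.

```python
import mmap

mm = mmap.mmap(-1, 8)

mm.write(b"abc")               # position advances to 3
mm.write_byte(100)             # writes ord("d"); position advances to 4
pos = mm.tell()
contents = mm[:4]

# Assumption: data that does not fit in the remaining space is refused.
mm.seek(6)
try:
    mm.write(b"too long")
    overflow_rejected = False
except ValueError:
    overflow_rejected = True
mm.close()
```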