:mod:`zipfile` --- Work with ZIP archives
=========================================

.. module:: zipfile
   :synopsis: Read and write ZIP-format archive files.
.. moduleauthor:: James C. Ahlstrom <jim@interet.com>
.. sectionauthor:: James C. Ahlstrom <jim@interet.com>


.. % LaTeX markup by Fred L. Drake, Jr. <fdrake@acm.org>

The ZIP file format is a common archive and compression standard. This module
provides tools to create, read, write, append to, and list ZIP files. Any
advanced use of this module requires an understanding of the format, as
defined in `PKZIP Application Note
<http://www.pkware.com/business_and_developers/developer/appnote/>`_.

This module does not currently handle ZIP files that have appended comments, or
multi-disk ZIP files. It can handle ZIP files that use the ZIP64 extensions
(that is, ZIP files that are more than 4 GByte in size). It supports decryption
of encrypted files in ZIP archives, but it cannot currently create an encrypted
file.

The available attributes of this module are:


.. exception:: BadZipfile

   The error raised for bad ZIP files (old name: ``zipfile.error``).


.. exception:: LargeZipFile

   The error raised when a ZIP file would require ZIP64 functionality but that
   functionality has not been enabled.


.. class:: ZipFile

   The class for reading and writing ZIP files. See section
   :ref:`zipfile-objects` for constructor details.


.. class:: PyZipFile

   Class for creating ZIP archives containing Python libraries.


.. class:: ZipInfo([filename[, date_time]])

   Class used to represent information about a member of an archive. Instances
   of this class are returned by the :meth:`getinfo` and :meth:`infolist`
   methods of :class:`ZipFile` objects. Most users of the :mod:`zipfile` module
   will not need to create these, but only use those created by this
   module. *filename* should be the full name of the archive member, and
   *date_time* should be a tuple containing six fields which describe the time
   of the last modification to the file; the fields are described in section
   :ref:`zipinfo-objects`.


.. function:: is_zipfile(filename)

   Return ``True`` if *filename* is a valid ZIP file based on its magic number;
   otherwise return ``False``. This module does not currently handle ZIP files
   which have appended comments.
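
As a minimal sketch of the check described for :func:`is_zipfile`, the
following builds one small archive and one plain text file in a scratch
directory and tests both. The file names ``good.zip`` and ``notes.txt`` are
hypothetical and chosen only for this illustration.

```python
import os
import tempfile
import zipfile

# Build one real archive and one plain text file in a scratch directory.
workdir = tempfile.mkdtemp()
good_path = os.path.join(workdir, "good.zip")
bad_path = os.path.join(workdir, "notes.txt")

archive = zipfile.ZipFile(good_path, "w")
archive.writestr("hello.txt", b"hello")
archive.close()

with open(bad_path, "w") as handle:
    handle.write("this is not an archive")

# is_zipfile() inspects the file contents, not the file extension.
good_result = zipfile.is_zipfile(good_path)
bad_result = zipfile.is_zipfile(bad_path)
print(good_result, bad_result)
```

A check like this is useful before handing an untrusted path to the
:class:`ZipFile` constructor, although a positive result does not guarantee
that every member of the archive is intact.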


.. data:: ZIP_STORED

   The numeric constant for an uncompressed archive member.


.. data:: ZIP_DEFLATED

   The numeric constant for the usual ZIP compression method. This requires the
   :mod:`zlib` module. No other compression methods are currently supported.


.. seealso::

   `PKZIP Application Note <http://www.pkware.com/business_and_developers/developer/appnote/>`_
      Documentation on the ZIP file format by Phil Katz, the creator of the format and
      algorithms used.

   `Info-ZIP Home Page <http://www.info-zip.org/>`_
      Information about the Info-ZIP project's ZIP archive programs and development
      libraries.


.. _zipfile-objects:

ZipFile Objects
---------------


.. class:: ZipFile(file[, mode[, compression[, allowZip64]]])

   Open a ZIP file, where *file* can be either a path to a file (a string) or a
   file-like object. The *mode* parameter should be ``'r'`` to read an existing
   file, ``'w'`` to truncate and write a new file, or ``'a'`` to append to an
   existing file. If *mode* is ``'a'`` and *file* refers to an existing ZIP file,
   then additional files are added to it. If *file* does not refer to a ZIP file,
   then a new ZIP archive is appended to the file. This is meant for adding a ZIP
   archive to another file, such as :file:`python.exe`. Using ::

      cat myzip.zip >> python.exe

   also works, and at least :program:`WinZip` can read such files. If *mode* is
   ``'a'`` and the file does not exist at all, it is created. *compression* is the
   ZIP compression method to use when writing the archive, and should be
   :const:`ZIP_STORED` or :const:`ZIP_DEFLATED`; unrecognized values will cause
   :exc:`RuntimeError` to be raised. If :const:`ZIP_DEFLATED` is specified but the
   :mod:`zlib` module is not available, :exc:`RuntimeError` is also raised. The
   default is :const:`ZIP_STORED`. If *allowZip64* is ``True``, :mod:`zipfile` will
   create ZIP files that use the ZIP64 extensions when the ZIP file is larger than
   2 GB. If it is false (the default), :mod:`zipfile` will raise an exception when
   the ZIP file would require ZIP64 extensions. ZIP64 extensions are disabled by
   default because the default :program:`zip` and :program:`unzip` commands on Unix
   (the InfoZIP utilities) don't support these extensions.
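
The three modes can be sketched as follows. This is an illustration rather
than a reference implementation: the archive name ``example.zip`` and the
member names are hypothetical, and :const:`ZIP_STORED` is passed explicitly
so that the sketch does not depend on :mod:`zlib`.

```python
import os
import tempfile
import zipfile

archive_path = os.path.join(tempfile.mkdtemp(), "example.zip")

# 'w' truncates and writes a new archive.
archive = zipfile.ZipFile(archive_path, "w", zipfile.ZIP_STORED)
archive.writestr("first.txt", b"first member")
archive.close()

# 'a' adds members to the existing archive.
archive = zipfile.ZipFile(archive_path, "a", zipfile.ZIP_STORED)
archive.writestr("second.txt", b"second member")
archive.close()

# 'r' opens the archive for reading only.
archive = zipfile.ZipFile(archive_path, "r")
member_names = archive.namelist()
archive.close()
print(member_names)
```

Each write session ends with an explicit :meth:`close` call, since the
records that describe the archive contents are only written at that point.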


.. method:: ZipFile.close()

   Close the archive file. You must call :meth:`close` before exiting your program
   or essential records will not be written.


.. method:: ZipFile.getinfo(name)

   Return a :class:`ZipInfo` object with information about the archive member
   *name*. Calling :meth:`getinfo` for a name not currently contained in the
   archive will raise a :exc:`KeyError`.


.. method:: ZipFile.infolist()

   Return a list containing a :class:`ZipInfo` object for each member of the
   archive. The objects are in the same order as their entries in the actual ZIP
   file on disk if an existing archive was opened.


.. method:: ZipFile.namelist()

   Return a list of archive members by name.
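
A short sketch of how :meth:`namelist`, :meth:`infolist`, and
:meth:`getinfo` relate to one another. The archive is built in memory with
``io.BytesIO`` purely for convenience, and the member names are
hypothetical.

```python
import io
import zipfile

# Build a small archive in memory (a file-like object instead of a path).
buffer = io.BytesIO()
archive = zipfile.ZipFile(buffer, "w")
archive.writestr("a.txt", b"alpha")
archive.writestr("b.txt", b"beta!!")
archive.close()

archive = zipfile.ZipFile(buffer, "r")
names = archive.namelist()
sizes = [(info.filename, info.file_size) for info in archive.infolist()]
single = archive.getinfo("a.txt")

# Asking for a member that is not in the archive raises KeyError.
try:
    archive.getinfo("missing.txt")
    missing_raises = False
except KeyError:
    missing_raises = True
archive.close()
```

Use :meth:`namelist` when only the names matter, and :meth:`infolist` when
sizes, timestamps, or other per-member metadata are needed as well.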


.. method:: ZipFile.open(name[, mode[, pwd]])

   Extract a member from the archive as a file-like object (ZipExtFile). *name* is
   the name of the file in the archive. The *mode* parameter, if included, must be
   one of the following: ``'r'`` (the default), ``'U'``, or ``'rU'``. Choosing
   ``'U'`` or ``'rU'`` will enable universal newline support in the read-only
   object. *pwd* is the password used for encrypted files. Calling :meth:`open`
   on a closed ZipFile will raise a :exc:`RuntimeError`.

   .. note::

      The file-like object is read-only and provides the following methods:
      :meth:`read`, :meth:`readline`, :meth:`readlines`, :meth:`__iter__`,
      :meth:`next`.

   .. note::

      If the ZipFile was created by passing in a file-like object as the first
      argument to the constructor, then the object returned by :meth:`.open` shares the
      ZipFile's file pointer. Under these circumstances, the object returned by
      :meth:`.open` should not be used after any additional operations are performed
      on the ZipFile object. If the ZipFile was created by passing in a string (the
      filename) as the first argument to the constructor, then :meth:`.open` will
      create a new file object that will be held by the ZipExtFile, allowing it to
      operate independently of the ZipFile.
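
The following sketch reads a member incrementally through the object
returned by :meth:`open`, using only the default ``'r'`` mode and the
:meth:`readline` and :meth:`read` methods listed in the note above. The
member name ``log.txt`` is hypothetical. Because the archive here is
created from a file-like object, the member is read completely before any
further operation on the ZipFile, in line with the second note.

```python
import io
import zipfile

buffer = io.BytesIO()
archive = zipfile.ZipFile(buffer, "w")
archive.writestr("log.txt", b"line one\nline two\n")
archive.close()

archive = zipfile.ZipFile(buffer, "r")
member = archive.open("log.txt")    # default mode 'r'
first_line = member.readline()      # read a single line
remaining = member.read()           # read whatever is left
archive.close()
```

Reading through :meth:`open` avoids loading a large member into memory in
one piece, which is what :meth:`read` on the ZipFile itself would do.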


.. method:: ZipFile.printdir()

   Print a table of contents for the archive to ``sys.stdout``.


.. method:: ZipFile.setpassword(pwd)

   Set *pwd* as the default password to extract encrypted files.


.. method:: ZipFile.read(name[, pwd])

   Return the bytes of the file in the archive. The archive must be open for read
   or append. *pwd* is the password used for encrypted files and, if specified, it
   will override the default password set with :meth:`setpassword`. Calling
   :meth:`read` on a closed ZipFile will raise a :exc:`RuntimeError`.


.. method:: ZipFile.testzip()

   Read all the files in the archive and check their CRCs and file headers.
   Return the name of the first bad file, or else return ``None``. Calling
   :meth:`testzip` on a closed ZipFile will raise a :exc:`RuntimeError`.
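
As a brief illustration of :meth:`read` and :meth:`testzip` together, the
sketch below stores a few bytes, reads them back, and then verifies the
archive. The member name ``data.bin`` is hypothetical, and the archive is
unencrypted, so no password is involved.

```python
import io
import zipfile

buffer = io.BytesIO()
archive = zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED)
archive.writestr("data.bin", b"\x00\x01\x02\x03")
archive.close()

archive = zipfile.ZipFile(buffer, "r")
payload = archive.read("data.bin")   # the whole member as bytes
first_bad = archive.testzip()        # None means every member checked out
archive.close()
```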


.. method:: ZipFile.write(filename[, arcname[, compress_type]])

   Write the file named *filename* to the archive, giving it the archive name
   *arcname* (by default, this will be the same as *filename*, but without a drive
   letter and with leading path separators removed). If given, *compress_type*
   overrides the value given for the *compression* parameter to the constructor for
   the new entry. The archive must be open with mode ``'w'`` or ``'a'``; calling
   :meth:`write` on a ZipFile created with mode ``'r'`` will raise a
   :exc:`RuntimeError`. Calling :meth:`write` on a closed ZipFile will raise a
   :exc:`RuntimeError`.

   .. note::

      There is no official file name encoding for ZIP files. If you have Unicode file
      names, you must convert them to byte strings in your desired encoding before
      passing them to :meth:`write`. WinZip interprets all file names as encoded in
      CP437, also known as DOS Latin.

   .. note::

      Archive names should be relative to the archive root, that is, they should not
      start with a path separator.

   .. note::

      If ``arcname`` (or ``filename``, if ``arcname`` is not given) contains a null
      byte, the name of the file in the archive will be truncated at the null byte.
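
A sketch of :meth:`write` with an explicit *arcname* and *compress_type*.
The paths ``report.txt`` and ``docs/report.txt`` are hypothetical. The
example falls back to :const:`ZIP_STORED` when :mod:`zlib` cannot be
imported, which is an assumption made here to keep it runnable everywhere.

```python
import os
import tempfile
import zipfile

# Prefer deflate, but fall back to stored members if zlib is unavailable.
try:
    import zlib  # noqa: F401
    method = zipfile.ZIP_DEFLATED
except ImportError:
    method = zipfile.ZIP_STORED

workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "report.txt")
with open(source_path, "w") as handle:
    handle.write("quarterly numbers\n" * 100)

archive_path = os.path.join(workdir, "reports.zip")
archive = zipfile.ZipFile(archive_path, "w")
# Store the file under a relative archive name, overriding the
# archive-wide compression method for this one entry.
archive.write(source_path, "docs/report.txt", method)
archive.close()

archive = zipfile.ZipFile(archive_path, "r")
stored_names = archive.namelist()
stored_method = archive.getinfo("docs/report.txt").compress_type
archive.close()
```

Passing *arcname* keeps the absolute scratch-directory path out of the
archive, so the member name stays relative to the archive root.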


.. method:: ZipFile.writestr(zinfo_or_arcname, bytes)

   Write the string *bytes* to the archive; *zinfo_or_arcname* is either the file
   name it will be given in the archive, or a :class:`ZipInfo` instance. If it's
   an instance, at least the filename, date, and time must be given. If it's a
   name, the date and time are set to the current date and time. The archive must
   be opened with mode ``'w'`` or ``'a'``; calling :meth:`writestr` on a ZipFile
   created with mode ``'r'`` will raise a :exc:`RuntimeError`. Calling
   :meth:`writestr` on a closed ZipFile will raise a :exc:`RuntimeError`.
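
The two forms of *zinfo_or_arcname* can be sketched as follows. The member
names and the timestamp are hypothetical values picked for the example.

```python
import io
import zipfile

buffer = io.BytesIO()
archive = zipfile.ZipFile(buffer, "w")

# Plain name: the date and time default to the current date and time.
archive.writestr("plain.txt", b"written by name")

# ZipInfo instance: the filename and date_time are supplied explicitly.
info = zipfile.ZipInfo("stamped.txt", (2007, 8, 15, 14, 28, 22))
archive.writestr(info, b"written with a ZipInfo")
archive.close()

archive = zipfile.ZipFile(buffer, "r")
stamp = archive.getinfo("stamped.txt").date_time
archive.close()
```

Supplying a :class:`ZipInfo` is the way to control a member's timestamp
when the data does not come from a file on disk.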

The following data attribute is also available:


.. attribute:: ZipFile.debug

   The level of debug output to use. This may be set from ``0`` (the default, no
   output) to ``3`` (the most output). Debugging information is written to
   ``sys.stdout``.


.. _pyzipfile-objects:

PyZipFile Objects
-----------------

The :class:`PyZipFile` constructor takes the same parameters as the
:class:`ZipFile` constructor. Instances have one method in addition to those of
:class:`ZipFile` objects.


.. method:: PyZipFile.writepy(pathname[, basename])

   Search for files :file:`\*.py` and add the corresponding file to the archive.
   The corresponding file is a :file:`\*.pyo` file if available, else a
   :file:`\*.pyc` file, compiling if necessary. If the pathname is a file, the
   filename must end with :file:`.py`, and just the (corresponding
   :file:`\*.py[co]`) file is added at the top level (no path information). If the
   pathname is a file that does not end with :file:`.py`, a :exc:`RuntimeError`
   will be raised. If it is a directory, and the directory is not a package
   directory, then all the files :file:`\*.py[co]` are added at the top level. If
   the directory is a package directory, then all :file:`\*.py[co]` are added under
   the package name as a file path, and if any subdirectories are package
   directories, all of these are added recursively. *basename* is intended for
   internal use only. The :meth:`writepy` method makes archives with file names
   like this::

      string.pyc                    # Top level name
      test/__init__.pyc             # Package directory
      test/testall.pyc              # Module test.testall
      test/bogus/__init__.pyc       # Subpackage directory
      test/bogus/myfile.pyc         # Submodule test.bogus.myfile
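
A minimal sketch of the single-file case of :meth:`writepy`. The module
``mymod.py`` is hypothetical and created in a scratch directory only for
this illustration. Whether the resulting entry ends in ``.pyc`` or ``.pyo``
depends on the interpreter, so the example prints the name rather than
assuming one.

```python
import io
import os
import tempfile
import zipfile

# Hypothetical module used only for this illustration.
workdir = tempfile.mkdtemp()
module_path = os.path.join(workdir, "mymod.py")
with open(module_path, "w") as handle:
    handle.write("ANSWER = 42\n")

buffer = io.BytesIO()
archive = zipfile.PyZipFile(buffer, "w")
archive.writepy(module_path)    # compiles mymod.py if necessary
archive.close()

archive = zipfile.ZipFile(buffer, "r")
names = archive.namelist()
archive.close()
print(names)
```

The compiled file is added at the top level with no path information, as
described above for a *pathname* that refers to a single file.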


.. _zipinfo-objects:

ZipInfo Objects
---------------

Instances of the :class:`ZipInfo` class are returned by the :meth:`getinfo` and
:meth:`infolist` methods of :class:`ZipFile` objects. Each object stores
information about a single member of the ZIP archive.

Instances have the following attributes:


.. attribute:: ZipInfo.filename

   Name of the file in the archive.


.. attribute:: ZipInfo.date_time

   The time and date of the last modification to the archive member. This is a
   tuple of six values:

   +-------+--------------------------+
   | Index | Value                    |
   +=======+==========================+
   | ``0`` | Year                     |
   +-------+--------------------------+
   | ``1`` | Month (one-based)        |
   +-------+--------------------------+
   | ``2`` | Day of month (one-based) |
   +-------+--------------------------+
   | ``3`` | Hours (zero-based)       |
   +-------+--------------------------+
   | ``4`` | Minutes (zero-based)     |
   +-------+--------------------------+
   | ``5`` | Seconds (zero-based)     |
   +-------+--------------------------+
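
Because the six fields appear in the same order as the leading arguments of
``datetime.datetime``, the tuple can be unpacked directly. The sketch below
assumes a hypothetical member name and timestamp.

```python
import datetime
import zipfile

info = zipfile.ZipInfo("example.txt", (2007, 8, 15, 14, 28, 22))

# Unpack the six fields by position, following the table above.
year, month, day, hours, minutes, seconds = info.date_time

# The same tuple maps directly onto a datetime object.
modified = datetime.datetime(*info.date_time)
```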


.. attribute:: ZipInfo.compress_type

   Type of compression for the archive member.


.. attribute:: ZipInfo.comment

   Comment for the individual archive member.


.. attribute:: ZipInfo.extra

   Expansion field data. The `PKZIP Application Note
   <http://www.pkware.com/business_and_developers/developer/appnote/>`_ contains
   some comments on the internal structure of the data contained in this string.


.. attribute:: ZipInfo.create_system

   System which created the ZIP archive.


.. attribute:: ZipInfo.create_version

   PKZIP version which created the ZIP archive.


.. attribute:: ZipInfo.extract_version

   PKZIP version needed to extract the archive.


.. attribute:: ZipInfo.reserved

   Must be zero.


.. attribute:: ZipInfo.flag_bits

   ZIP flag bits.


.. attribute:: ZipInfo.volume

   Volume number of the file header.


.. attribute:: ZipInfo.internal_attr

   Internal attributes.


.. attribute:: ZipInfo.external_attr

   External file attributes.


.. attribute:: ZipInfo.header_offset

   Byte offset to the file header.


.. attribute:: ZipInfo.CRC

   CRC-32 of the uncompressed file.


.. attribute:: ZipInfo.compress_size

   Size of the compressed data.


.. attribute:: ZipInfo.file_size

   Size of the uncompressed file.
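
As a sketch of how the size and checksum attributes relate to the member
data, the example below stores a member without compression and inspects
its :class:`ZipInfo`. The member name ``data.txt`` is hypothetical, and
``binascii.crc32`` is used here only as an independent way to compute a
CRC-32 for comparison.

```python
import binascii
import io
import zipfile

data = b"abc" * 1000

buffer = io.BytesIO()
archive = zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED)
archive.writestr("data.txt", data)
archive.close()

archive = zipfile.ZipFile(buffer, "r")
info = archive.getinfo("data.txt")
archive.close()

# The CRC attribute covers the uncompressed bytes of the member.
crc_matches = info.CRC == (binascii.crc32(data) & 0xFFFFFFFF)
print(info.file_size, info.compress_size, crc_matches)
```

With :const:`ZIP_STORED` the two sizes are equal. With
:const:`ZIP_DEFLATED`, ``compress_size`` would normally be smaller than
``file_size`` for repetitive data such as this.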