:mod:`zipfile` --- Work with ZIP archives
=========================================

.. module:: zipfile
   :synopsis: Read and write ZIP-format archive files.

.. moduleauthor:: James C. Ahlstrom <jim@interet.com>
.. sectionauthor:: James C. Ahlstrom <jim@interet.com>

**Source code:** :source:`Lib/zipfile.py`

--------------

The ZIP file format is a common archive and compression standard. This module
provides tools to create, read, write, append, and list a ZIP file. Any
advanced use of this module will require an understanding of the format, as
defined in `PKZIP Application Note`_.

This module does not currently handle multi-disk ZIP files.
It can handle ZIP files that use the ZIP64 extensions
(that is, ZIP files that are more than 4 GiB in size). It supports
decryption of encrypted files in ZIP archives, but it currently cannot
create an encrypted file. Decryption is extremely slow as it is
implemented in native Python rather than C.

The module defines the following items:

.. exception:: BadZipFile

   The error raised for bad ZIP files.

   .. versionadded:: 3.2


.. exception:: BadZipfile

   Alias of :exc:`BadZipFile`, for compatibility with older Python versions.

   .. deprecated:: 3.2


.. exception:: LargeZipFile

   The error raised when a ZIP file would require ZIP64 functionality but that
   functionality has not been enabled.


.. class:: ZipFile
   :noindex:

   The class for reading and writing ZIP files. See section
   :ref:`zipfile-objects` for constructor details.


.. class:: Path
   :noindex:

   A pathlib-compatible wrapper for zip files. See section
   :ref:`path-objects` for details.

   .. versionadded:: 3.8


.. class:: PyZipFile
   :noindex:

   Class for creating ZIP archives containing Python libraries.


.. class:: ZipInfo(filename='NoName', date_time=(1980,1,1,0,0,0))

   Class used to represent information about a member of an archive. Instances
   of this class are returned by the :meth:`.getinfo` and :meth:`.infolist`
   methods of :class:`ZipFile` objects. Most users of the :mod:`zipfile` module
   will not need to create these, but only use those created by this
   module. *filename* should be the full name of the archive member, and
   *date_time* should be a tuple containing six fields which describe the time
   of the last modification to the file; the fields are described in section
   :ref:`zipinfo-objects`.


.. function:: is_zipfile(filename)

   Returns ``True`` if *filename* is a valid ZIP file based on its magic number,
   otherwise returns ``False``. *filename* may be a file or file-like object too.

   .. versionchanged:: 3.1
      Support for file and file-like objects.

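   A minimal sketch of the file-like case, using in-memory buffers so that no
   files on disk are needed. The member name ``hello.txt`` and the variable
   names are illustrative only::

      import io
      import zipfile

      # Build a tiny archive in memory.
      buffer = io.BytesIO()
      with zipfile.ZipFile(buffer, 'w') as archive:
          archive.writestr('hello.txt', 'hello')

      # A file-like object holding a ZIP archive is recognized ...
      looks_like_zip = zipfile.is_zipfile(buffer)

      # ... while arbitrary bytes are not.
      not_a_zip = zipfile.is_zipfile(io.BytesIO(b'plain text, not a ZIP archive'))
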

.. data:: ZIP_STORED

   The numeric constant for an uncompressed archive member.


.. data:: ZIP_DEFLATED

   The numeric constant for the usual ZIP compression method. This requires the
   :mod:`zlib` module.


.. data:: ZIP_BZIP2

   The numeric constant for the BZIP2 compression method. This requires the
   :mod:`bz2` module.

   .. versionadded:: 3.3

.. data:: ZIP_LZMA

   The numeric constant for the LZMA compression method. This requires the
   :mod:`lzma` module.

   .. versionadded:: 3.3

   .. note::

      The ZIP file format specification has included support for bzip2 compression
      since 2001, and for LZMA compression since 2006. However, some tools
      (including older Python releases) do not support these compression
      methods, and may either refuse to process the ZIP file altogether,
      or fail to extract individual files.


.. seealso::

   `PKZIP Application Note`_
      Documentation on the ZIP file format by Phil Katz, the creator of the format and
      algorithms used.

   `Info-ZIP Home Page <http://www.info-zip.org/>`_
      Information about the Info-ZIP project's ZIP archive programs and development
      libraries.


.. _zipfile-objects:

ZipFile Objects
---------------


.. class:: ZipFile(file, mode='r', compression=ZIP_STORED, allowZip64=True, \
                   compresslevel=None, *, strict_timestamps=True)

   Open a ZIP file, where *file* can be a path to a file (a string), a
   file-like object or a :term:`path-like object`.

   The *mode* parameter should be ``'r'`` to read an existing
   file, ``'w'`` to truncate and write a new file, ``'a'`` to append to an
   existing file, or ``'x'`` to exclusively create and write a new file.
   If *mode* is ``'x'`` and *file* refers to an existing file,
   a :exc:`FileExistsError` will be raised.
   If *mode* is ``'a'`` and *file* refers to an existing ZIP
   file, then additional files are added to it. If *file* does not refer to a
   ZIP file, then a new ZIP archive is appended to the file. This is meant for
   adding a ZIP archive to another file (such as :file:`python.exe`). If
   *mode* is ``'a'`` and the file does not exist at all, it is created.
   If *mode* is ``'r'`` or ``'a'``, the file should be seekable.

   *compression* is the ZIP compression method to use when writing the archive,
   and should be :const:`ZIP_STORED`, :const:`ZIP_DEFLATED`,
   :const:`ZIP_BZIP2` or :const:`ZIP_LZMA`; unrecognized
   values will cause :exc:`NotImplementedError` to be raised. If
   :const:`ZIP_DEFLATED`, :const:`ZIP_BZIP2` or :const:`ZIP_LZMA` is specified
   but the corresponding module (:mod:`zlib`, :mod:`bz2` or :mod:`lzma`) is not
   available, :exc:`RuntimeError` is raised. The default is :const:`ZIP_STORED`.

   If *allowZip64* is ``True`` (the default) zipfile will create ZIP files that
   use the ZIP64 extensions when the zipfile is larger than 4 GiB. If it is
   ``False`` :mod:`zipfile` will raise an exception when the ZIP file would
   require ZIP64 extensions.

   The *compresslevel* parameter controls the compression level to use when
   writing files to the archive.
   When using :const:`ZIP_STORED` or :const:`ZIP_LZMA` it has no effect.
   When using :const:`ZIP_DEFLATED` integers ``0`` through ``9`` are accepted
   (see :func:`zlib <zlib.compressobj>` for more information).
   When using :const:`ZIP_BZIP2` integers ``1`` through ``9`` are accepted
   (see :class:`bz2 <bz2.BZ2File>` for more information).

   The *strict_timestamps* argument, when set to ``False``, allows
   zipping files older than 1980-01-01 at the cost of setting the
   timestamp to 1980-01-01.
   Similarly, files newer than 2107-12-31 have their timestamp
   set to that limit.

   If the file is created with mode ``'w'``, ``'x'`` or ``'a'`` and then
   :meth:`closed <close>` without adding any files to the archive, the appropriate
   ZIP structures for an empty archive will be written to the file.

   ZipFile is also a context manager and therefore supports the
   :keyword:`with` statement. In the example, *myzip* is closed after the
   :keyword:`!with` statement's suite is finished, even if an exception occurs::

      with ZipFile('spam.zip', 'w') as myzip:
          myzip.write('eggs.txt')

   .. versionadded:: 3.2
      Added the ability to use :class:`ZipFile` as a context manager.

   .. versionchanged:: 3.3
      Added support for :mod:`bzip2 <bz2>` and :mod:`lzma` compression.

   .. versionchanged:: 3.4
      ZIP64 extensions are enabled by default.

   .. versionchanged:: 3.5
      Added support for writing to unseekable streams.
      Added support for the ``'x'`` mode.

   .. versionchanged:: 3.6
      Previously, a plain :exc:`RuntimeError` was raised for unrecognized
      compression values.

   .. versionchanged:: 3.6.2
      The *file* parameter accepts a :term:`path-like object`.

   .. versionchanged:: 3.7
      Add the *compresslevel* parameter.

   .. versionadded:: 3.8
      The *strict_timestamps* keyword-only argument

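   As a rough sketch of how *compression* and *compresslevel* interact, the
   snippet below writes the same data into two in-memory archives, once
   stored and once deflated at the highest level. It assumes :mod:`zlib` is
   available; the helper ``archive_size``, the member name ``data.txt`` and
   the payload are made up for illustration::

      import io
      import zipfile

      payload = b'spam and eggs ' * 1000

      def archive_size(compression, compresslevel=None):
          """Return the size in bytes of an archive holding *payload*."""
          buffer = io.BytesIO()
          with zipfile.ZipFile(buffer, 'w', compression=compression,
                               compresslevel=compresslevel) as archive:
              archive.writestr('data.txt', payload)
          return len(buffer.getvalue())

      stored_size = archive_size(zipfile.ZIP_STORED)
      deflated_size = archive_size(zipfile.ZIP_DEFLATED, compresslevel=9)
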

.. method:: ZipFile.close()

   Close the archive file. You must call :meth:`close` before exiting your program
   or essential records will not be written.


.. method:: ZipFile.getinfo(name)

   Return a :class:`ZipInfo` object with information about the archive member
   *name*. Calling :meth:`getinfo` for a name not currently contained in the
   archive will raise a :exc:`KeyError`.


.. method:: ZipFile.infolist()

   Return a list containing a :class:`ZipInfo` object for each member of the
   archive. The objects are in the same order as their entries in the actual ZIP
   file on disk if an existing archive was opened.


.. method:: ZipFile.namelist()

   Return a list of archive members by name.

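   The lookup methods :meth:`namelist`, :meth:`infolist` and :meth:`getinfo`
   can be combined as in the following sketch, which uses an in-memory archive
   whose member names are invented for the example::

      import io
      import zipfile

      buffer = io.BytesIO()
      with zipfile.ZipFile(buffer, 'w') as archive:
          archive.writestr('a.txt', 'alpha')
          archive.writestr('b.txt', 'beta!')

      with zipfile.ZipFile(buffer) as archive:
          names = archive.namelist()
          # infolist() returns one ZipInfo object per archive member.
          sizes = {info.filename: info.file_size for info in archive.infolist()}
          # getinfo() raises KeyError for a name that is not in the archive.
          try:
              archive.getinfo('missing.txt')
              found_missing = True
          except KeyError:
              found_missing = False
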

.. method:: ZipFile.open(name, mode='r', pwd=None, *, force_zip64=False)

   Access a member of the archive as a binary file-like object. *name*
   can be either the name of a file within the archive or a :class:`ZipInfo`
   object. The *mode* parameter, if included, must be ``'r'`` (the default)
   or ``'w'``. *pwd* is the password used to decrypt encrypted ZIP files.

   :meth:`~ZipFile.open` is also a context manager and therefore supports the
   :keyword:`with` statement::

      with ZipFile('spam.zip') as myzip:
          with myzip.open('eggs.txt') as myfile:
              print(myfile.read())

   With *mode* ``'r'`` the file-like object
   (``ZipExtFile``) is read-only and provides the following methods:
   :meth:`~io.BufferedIOBase.read`, :meth:`~io.IOBase.readline`,
   :meth:`~io.IOBase.readlines`, :meth:`~io.IOBase.seek`,
   :meth:`~io.IOBase.tell`, :meth:`__iter__`, :meth:`~iterator.__next__`.
   These objects can operate independently of the ZipFile.

   With ``mode='w'``, a writable file handle is returned, which supports the
   :meth:`~io.BufferedIOBase.write` method. While a writable file handle is open,
   attempting to read or write other files in the ZIP file will raise a
   :exc:`ValueError`.

   When writing a file, if the file size is not known in advance but may exceed
   2 GiB, pass ``force_zip64=True`` to ensure that the header format is
   capable of supporting large files. If the file size is known in advance,
   construct a :class:`ZipInfo` object with :attr:`~ZipInfo.file_size` set, and
   use that as the *name* parameter.

   .. note::

      The :meth:`.open`, :meth:`read` and :meth:`extract` methods can take a filename
      or a :class:`ZipInfo` object. You will appreciate this when trying to read a
      ZIP file that contains members with duplicate names.

   .. versionchanged:: 3.6
      Removed support of ``mode='U'``. Use :class:`io.TextIOWrapper` for reading
      compressed text files in :term:`universal newlines` mode.

   .. versionchanged:: 3.6
      :meth:`open` can now be used to write files into the archive with the
      ``mode='w'`` option.

   .. versionchanged:: 3.6
      Calling :meth:`.open` on a closed ZipFile will raise a :exc:`ValueError`.
      Previously, a :exc:`RuntimeError` was raised.

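   A small sketch of writing a member through :meth:`~ZipFile.open` and reading
   it back as text. The archive lives in memory, the member name
   ``notes.txt`` is arbitrary, and :class:`io.TextIOWrapper` is used for
   decoding because the object returned by :meth:`~ZipFile.open` is binary::

      import io
      import zipfile

      buffer = io.BytesIO()
      with zipfile.ZipFile(buffer, 'w') as archive:
          # mode='w' returns a writable binary handle for a new member.
          with archive.open('notes.txt', mode='w') as member:
              member.write(b'first line\nsecond line\n')

      with zipfile.ZipFile(buffer) as archive:
          with archive.open('notes.txt') as member:
              # Wrap the binary member to decode it as UTF-8 text.
              text = io.TextIOWrapper(member, encoding='utf-8').read()
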

.. method:: ZipFile.extract(member, path=None, pwd=None)

   Extract a member from the archive to the current working directory; *member*
   must be its full name or a :class:`ZipInfo` object. Its file information is
   extracted as accurately as possible. *path* specifies a different directory
   to extract to. *pwd* is the password used for encrypted files.

   Returns the normalized path created (a directory or new file).

   .. note::

      If a member filename is an absolute path, a drive/UNC sharepoint and
      leading (back)slashes will be stripped, e.g.: ``///foo/bar`` becomes
      ``foo/bar`` on Unix, and ``C:\foo\bar`` becomes ``foo\bar`` on Windows.
      And all ``".."`` components in a member filename will be removed, e.g.:
      ``../../foo../../ba..r`` becomes ``foo../ba..r``. On Windows, illegal
      characters (``:``, ``<``, ``>``, ``|``, ``"``, ``?``, and ``*``)
      are replaced by underscore (``_``).

   .. versionchanged:: 3.6
      Calling :meth:`extract` on a closed ZipFile will raise a
      :exc:`ValueError`. Previously, a :exc:`RuntimeError` was raised.

   .. versionchanged:: 3.6.2
      The *path* parameter accepts a :term:`path-like object`.


.. method:: ZipFile.extractall(path=None, members=None, pwd=None)

   Extract all members from the archive to the current working directory. *path*
   specifies a different directory to extract to. *members* is optional and must
   be a subset of the list returned by :meth:`namelist`. *pwd* is the password
   used for encrypted files.

   .. warning::

      Never extract archives from untrusted sources without prior inspection.
      It is possible that files are created outside of *path*, e.g. members
      that have absolute filenames starting with ``"/"`` or filenames with two
      dots ``".."``. This module attempts to prevent that.
      See :meth:`extract` note.

   .. versionchanged:: 3.6
      Calling :meth:`extractall` on a closed ZipFile will raise a
      :exc:`ValueError`. Previously, a :exc:`RuntimeError` was raised.

   .. versionchanged:: 3.6.2
      The *path* parameter accepts a :term:`path-like object`.

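   One possible way to inspect an archive before extracting it is sketched
   below. The helper ``suspicious_members`` is hypothetical and not part of
   this module, and the check it performs (absolute paths and ``..``
   components) is only an example of such an inspection, not a complete
   safeguard::

      import io
      import os
      import tempfile
      import zipfile

      def suspicious_members(archive):
          """Return member names that are absolute or contain '..' parts."""
          flagged = []
          for name in archive.namelist():
              parts = name.replace('\\', '/').split('/')
              if name.startswith(('/', '\\')) or '..' in parts:
                  flagged.append(name)
          return flagged

      buffer = io.BytesIO()
      with zipfile.ZipFile(buffer, 'w') as archive:
          archive.writestr('docs/readme.txt', 'hello')
          archive.writestr('../escape.txt', 'oops')

      with zipfile.ZipFile(buffer) as archive:
          flagged = suspicious_members(archive)
          safe = [n for n in archive.namelist() if n not in flagged]
          with tempfile.TemporaryDirectory() as target:
              # Extract only the members that passed the inspection.
              archive.extractall(path=target, members=safe)
              extracted = os.path.exists(
                  os.path.join(target, 'docs', 'readme.txt'))
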

.. method:: ZipFile.printdir()

   Print a table of contents for the archive to ``sys.stdout``.


.. method:: ZipFile.setpassword(pwd)

   Set *pwd* as default password to extract encrypted files.


.. method:: ZipFile.read(name, pwd=None)

   Return the bytes of the file *name* in the archive. *name* is the name of the
   file in the archive, or a :class:`ZipInfo` object. The archive must be open for
   read or append. *pwd* is the password used for encrypted files and, if specified,
   it will override the default password set with :meth:`setpassword`. Calling
   :meth:`read` on a ZipFile that uses a compression method other than
   :const:`ZIP_STORED`, :const:`ZIP_DEFLATED`, :const:`ZIP_BZIP2` or
   :const:`ZIP_LZMA` will raise a :exc:`NotImplementedError`. An error will also
   be raised if the corresponding compression module is not available.

   .. versionchanged:: 3.6
      Calling :meth:`read` on a closed ZipFile will raise a :exc:`ValueError`.
      Previously, a :exc:`RuntimeError` was raised.


.. method:: ZipFile.testzip()

   Read all the files in the archive and check their CRCs and file headers.
   Return the name of the first bad file, or else return ``None``.

   .. versionchanged:: 3.6
      Calling :meth:`testzip` on a closed ZipFile will raise a
      :exc:`ValueError`. Previously, a :exc:`RuntimeError` was raised.

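   For instance, an archive that was just written should pass the check. The
   sketch below builds one in memory (the member name is arbitrary), then
   verifies it and reads a member back::

      import io
      import zipfile

      buffer = io.BytesIO()
      with zipfile.ZipFile(buffer, 'w') as archive:
          archive.writestr('eggs.txt', 'fresh')

      with zipfile.ZipFile(buffer) as archive:
          first_bad = archive.testzip()    # None when every member is intact
          content = archive.read('eggs.txt')
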

.. method:: ZipFile.write(filename, arcname=None, compress_type=None, \
                          compresslevel=None)

   Write the file named *filename* to the archive, giving it the archive name
   *arcname* (by default, this will be the same as *filename*, but without a drive
   letter and with leading path separators removed). If given, *compress_type*
   overrides the value given for the *compression* parameter to the constructor for
   the new entry. Similarly, *compresslevel* will override the constructor's
   value if given.
   The archive must be open with mode ``'w'``, ``'x'`` or ``'a'``.

   .. note::

      Archive names should be relative to the archive root, that is, they should not
      start with a path separator.

   .. note::

      If ``arcname`` (or ``filename``, if ``arcname`` is not given) contains a null
      byte, the name of the file in the archive will be truncated at the null byte.

   .. versionchanged:: 3.6
      Calling :meth:`write` on a ZipFile created with mode ``'r'`` or
      a closed ZipFile will raise a :exc:`ValueError`. Previously,
      a :exc:`RuntimeError` was raised.


.. method:: ZipFile.writestr(zinfo_or_arcname, data, compress_type=None, \
                             compresslevel=None)

   Write a file into the archive. The contents are *data*, which may be either
   a :class:`str` or a :class:`bytes` instance; if it is a :class:`str`,
   it is encoded as UTF-8 first. *zinfo_or_arcname* is either the file
   name it will be given in the archive, or a :class:`ZipInfo` instance. If it's
   an instance, at least the filename, date, and time must be given. If it's a
   name, the date and time are set to the current date and time.
   The archive must be opened with mode ``'w'``, ``'x'`` or ``'a'``.

   If given, *compress_type* overrides the value given for the *compression*
   parameter to the constructor for the new entry, or in the *zinfo_or_arcname*
   (if that is a :class:`ZipInfo` instance). Similarly, *compresslevel* will
   override the constructor's value if given.

   .. note::

      When passing a :class:`ZipInfo` instance as the *zinfo_or_arcname* parameter,
      the compression method used will be that specified in the *compress_type*
      member of the given :class:`ZipInfo` instance. By default, the
      :class:`ZipInfo` constructor sets this member to :const:`ZIP_STORED`.

   .. versionchanged:: 3.2
      Added the *compress_type* argument.

   .. versionchanged:: 3.6
      Calling :meth:`writestr` on a ZipFile created with mode ``'r'`` or
      a closed ZipFile will raise a :exc:`ValueError`. Previously,
      a :exc:`RuntimeError` was raised.

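   The two forms of *zinfo_or_arcname* can be sketched as follows. The member
   names and the timestamp are invented for the example, and
   :const:`ZIP_DEFLATED` assumes :mod:`zlib` is available::

      import io
      import zipfile

      buffer = io.BytesIO()
      with zipfile.ZipFile(buffer, 'w') as archive:
          # A plain name: the current date and time are recorded.
          archive.writestr('plain.txt', 'text is encoded as UTF-8')

          # A ZipInfo instance: the caller controls the metadata.
          info = zipfile.ZipInfo('dated.txt', date_time=(2001, 2, 3, 4, 5, 6))
          info.compress_type = zipfile.ZIP_DEFLATED
          archive.writestr(info, b'bytes are written unchanged')

      with zipfile.ZipFile(buffer) as archive:
          dated = archive.getinfo('dated.txt')
          stored_date = dated.date_time
          stored_method = dated.compress_type
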

The following data attributes are also available:

.. attribute:: ZipFile.filename

   Name of the ZIP file.

.. attribute:: ZipFile.debug

   The level of debug output to use. This may be set from ``0`` (the default, no
   output) to ``3`` (the most output). Debugging information is written to
   ``sys.stdout``.

.. attribute:: ZipFile.comment

   The comment associated with the ZIP file as a :class:`bytes` object.
   If assigning a comment to a
   :class:`ZipFile` instance created with mode ``'w'``, ``'x'`` or ``'a'``,
   it should be no longer than 65535 bytes. Comments longer than this will be
   truncated.

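   A short sketch of setting the archive comment and reading it back. The
   comment text is arbitrary and the archive is kept in memory::

      import io
      import zipfile

      buffer = io.BytesIO()
      with zipfile.ZipFile(buffer, 'w') as archive:
          archive.writestr('eggs.txt', 'fresh')
          archive.comment = b'built for illustration'   # must be bytes

      with zipfile.ZipFile(buffer) as archive:
          comment = archive.comment
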

.. _path-objects:

Path Objects
------------

.. class:: Path(root, at='')

   Construct a Path object from a ``root`` zipfile (which may be a
   :class:`ZipFile` instance or ``file`` suitable for passing to
   the :class:`ZipFile` constructor).

   ``at`` specifies the location of this Path within the zipfile,
   e.g. 'dir/file.txt', 'dir/', or ''. Defaults to the empty string,
   indicating the root.

Path objects expose the following features of :class:`pathlib.Path`
objects:

Path objects are traversable using the ``/`` operator.

.. attribute:: Path.name

   The final path component.

.. method:: Path.open(*, **)

   Invoke :meth:`ZipFile.open` on the current path. Accepts
   the same arguments as :meth:`ZipFile.open`.

.. method:: Path.iterdir()

   Enumerate the children of the current directory.

.. method:: Path.is_dir()

   Return ``True`` if the current context references a directory.

.. method:: Path.is_file()

   Return ``True`` if the current context references a file.

.. method:: Path.exists()

   Return ``True`` if the current context references a file or
   directory in the zip file.

.. method:: Path.read_text(*, **)

   Read the current file as unicode text. Positional and
   keyword arguments are passed through to
   :class:`io.TextIOWrapper` (except ``buffer``, which is
   implied by the context).

.. method:: Path.read_bytes()

   Read the current file as bytes.

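A brief sketch of traversing an archive with :class:`Path`, assuming
Python 3.8 or later. The archive is built in memory and the member names
are invented for the example::

   import io
   import zipfile

   buffer = io.BytesIO()
   with zipfile.ZipFile(buffer, 'w') as archive:
       archive.writestr('dir/file.txt', 'content')
       archive.writestr('top.txt', 'top level')

   # Wrap a ZipFile opened for reading and walk into it with "/".
   root = zipfile.Path(zipfile.ZipFile(buffer))
   member = root / 'dir' / 'file.txt'

   member_name = member.name
   member_is_file = member.is_file()
   parent_is_dir = (root / 'dir').is_dir()
   text = member.read_text(encoding='utf-8')
   children = sorted(child.name for child in root.iterdir())
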

.. _pyzipfile-objects:

PyZipFile Objects
-----------------

The :class:`PyZipFile` constructor takes the same parameters as the
:class:`ZipFile` constructor, and one additional parameter, *optimize*.

.. class:: PyZipFile(file, mode='r', compression=ZIP_STORED, allowZip64=True, \
                     optimize=-1)

   .. versionadded:: 3.2
      The *optimize* parameter.

   .. versionchanged:: 3.4
      ZIP64 extensions are enabled by default.

   Instances have one method in addition to those of :class:`ZipFile` objects:

   .. method:: PyZipFile.writepy(pathname, basename='', filterfunc=None)

      Search for files :file:`\*.py` and add the corresponding file to the
      archive.

      If the *optimize* parameter to :class:`PyZipFile` was not given or ``-1``,
      the corresponding file is a :file:`\*.pyc` file, compiling if necessary.

      If the *optimize* parameter to :class:`PyZipFile` was ``0``, ``1`` or
      ``2``, only files with that optimization level (see :func:`compile`) are
      added to the archive, compiling if necessary.

      If *pathname* is a file, the filename must end with :file:`.py`, and
      just the (corresponding :file:`\*.pyc`) file is added at the top level
      (no path information). If *pathname* is a file that does not end with
      :file:`.py`, a :exc:`RuntimeError` will be raised. If it is a directory,
      and the directory is not a package directory, then all the files
      :file:`\*.pyc` are added at the top level. If the directory is a
      package directory, then all :file:`\*.pyc` are added under the package
      name as a file path, and if any subdirectories are package directories,
      all of these are added recursively in sorted order.

      *basename* is intended for internal use only.

      *filterfunc*, if given, must be a function taking a single string
      argument. It will be passed each path (including each individual full
      file path) before it is added to the archive. If *filterfunc* returns a
      false value, the path will not be added, and if it is a directory its
      contents will be ignored. For example, if our test files are all either
      in ``test`` directories or start with the string ``test_``, we can use a
      *filterfunc* to exclude them::

          >>> import os
          >>> zf = PyZipFile('myprog.zip', 'w')  # writepy() needs a writable archive
          >>> def notests(s):
          ...     fn = os.path.basename(s)
          ...     return (not (fn == 'test' or fn.startswith('test_')))
          >>> zf.writepy('myprog', filterfunc=notests)

      The :meth:`writepy` method makes archives with file names like
      this::

         string.pyc                   # Top level name
         test/__init__.pyc            # Package directory
         test/testall.pyc             # Module test.testall
         test/bogus/__init__.pyc      # Subpackage directory
         test/bogus/myfile.pyc        # Submodule test.bogus.myfile

      .. versionadded:: 3.4
         The *filterfunc* parameter.

      .. versionchanged:: 3.6.2
         The *pathname* parameter accepts a :term:`path-like object`.

      .. versionchanged:: 3.7
         Recursion sorts directory entries.


.. _zipinfo-objects:

ZipInfo Objects
---------------

Instances of the :class:`ZipInfo` class are returned by the :meth:`.getinfo` and
:meth:`.infolist` methods of :class:`ZipFile` objects. Each object stores
information about a single member of the ZIP archive.

There is one classmethod to make a :class:`ZipInfo` instance for a filesystem
file:

.. classmethod:: ZipInfo.from_file(filename, arcname=None, *, \
                                   strict_timestamps=True)

   Construct a :class:`ZipInfo` instance for a file on the filesystem, in
   preparation for adding it to a zip file.

   *filename* should be the path to a file or directory on the filesystem.

   If *arcname* is specified, it is used as the name within the archive.
   If *arcname* is not specified, the name will be the same as *filename*, but
   with any drive letter and leading path separators removed.

   The *strict_timestamps* argument, when set to ``False``, allows zipping
   files older than 1980-01-01 at the cost of setting the timestamp to
   1980-01-01. Similarly, files newer than 2107-12-31 have their timestamp
   set to that limit.

   .. versionadded:: 3.6

   .. versionchanged:: 3.6.2
      The *filename* parameter accepts a :term:`path-like object`.

   .. versionchanged:: 3.8
      Added the *strict_timestamps* keyword-only argument.

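The classmethod described above can be exercised as follows. This is a minimal
sketch rather than a canonical recipe: the temporary file, its backdated
modification time, and the archive name ``docs/old.txt`` are assumptions made
purely for illustration.

.. code-block:: python

   import datetime
   import os
   import tempfile
   import zipfile

   with tempfile.TemporaryDirectory() as tmp:
       path = os.path.join(tmp, "old.txt")
       with open(path, "w") as f:
           f.write("hello")

       # Backdate the file to 1975, which the ZIP format cannot represent.
       old = datetime.datetime(1975, 6, 1, 12, 0, 0).timestamp()
       os.utime(path, (old, old))

       # With the default strict_timestamps=True the timestamp is rejected.
       try:
           zipfile.ZipInfo.from_file(path)
           strict_error = None
       except ValueError as exc:
           strict_error = exc

       # With strict_timestamps=False the timestamp is clamped to 1980-01-01.
       info = zipfile.ZipInfo.from_file(
           path, arcname="docs/old.txt", strict_timestamps=False)

   print(info.filename, info.date_time, info.file_size)

The resulting object can then be passed to a method such as
:meth:`ZipFile.writestr` together with the data to store.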

Instances have the following methods and attributes:

.. method:: ZipInfo.is_dir()

   Return ``True`` if this archive member is a directory.

   This uses the entry's name: directories should always end with ``/``.

   .. versionadded:: 3.6


.. attribute:: ZipInfo.filename

   Name of the file in the archive.


.. attribute:: ZipInfo.date_time

   The time and date of the last modification to the archive member. This is a
   tuple of six values:

   +-------+--------------------------+
   | Index | Value                    |
   +=======+==========================+
   | ``0`` | Year (>= 1980)           |
   +-------+--------------------------+
   | ``1`` | Month (one-based)        |
   +-------+--------------------------+
   | ``2`` | Day of month (one-based) |
   +-------+--------------------------+
   | ``3`` | Hours (zero-based)       |
   +-------+--------------------------+
   | ``4`` | Minutes (zero-based)     |
   +-------+--------------------------+
   | ``5`` | Seconds (zero-based)     |
   +-------+--------------------------+

   .. note::

      The ZIP file format does not support timestamps before 1980.

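Because the six values in :attr:`ZipInfo.date_time` are in the same order as
the leading arguments of :class:`datetime.datetime`, the tuple can be unpacked
directly. The sketch below builds a small in-memory archive (the member names
and the timestamp are made up for the example) and summarises each member,
also using :meth:`ZipInfo.is_dir`.

.. code-block:: python

   import datetime
   import io
   import zipfile

   buffer = io.BytesIO()
   with zipfile.ZipFile(buffer, "w") as archive:
       member = zipfile.ZipInfo("notes/readme.txt",
                                date_time=(2020, 5, 17, 13, 45, 30))
       archive.writestr(member, "hello")
       # A member whose name ends with "/" is treated as a directory.
       archive.writestr("notes/empty/", "")

   summary = []
   with zipfile.ZipFile(buffer) as archive:
       for info in archive.infolist():
           modified = datetime.datetime(*info.date_time)
           summary.append((info.filename, info.is_dir(), modified.isoformat()))

   for row in summary:
       print(row)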

.. attribute:: ZipInfo.compress_type

   Type of compression for the archive member.


.. attribute:: ZipInfo.comment

   Comment for the individual archive member as a :class:`bytes` object.


.. attribute:: ZipInfo.extra

   Expansion field data. The `PKZIP Application Note`_ contains
   some comments on the internal structure of the data contained in this
   :class:`bytes` object.


.. attribute:: ZipInfo.create_system

   System which created ZIP archive.


.. attribute:: ZipInfo.create_version

   PKZIP version which created ZIP archive.


.. attribute:: ZipInfo.extract_version

   PKZIP version needed to extract archive.


.. attribute:: ZipInfo.reserved

   Must be zero.


.. attribute:: ZipInfo.flag_bits

   ZIP flag bits.


.. attribute:: ZipInfo.volume

   Volume number of file header.


.. attribute:: ZipInfo.internal_attr

   Internal attributes.


.. attribute:: ZipInfo.external_attr

   External file attributes.


.. attribute:: ZipInfo.header_offset

   Byte offset to the file header.


.. attribute:: ZipInfo.CRC

   CRC-32 of the uncompressed file.


.. attribute:: ZipInfo.compress_size

   Size of the compressed data.


.. attribute:: ZipInfo.file_size

   Size of the uncompressed file.

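As an illustration of how the size and checksum attributes relate to the
stored data, the following sketch writes a repetitive payload (an arbitrary
choice for this example) with ``ZIP_DEFLATED`` and then compares
:attr:`ZipInfo.CRC` with :func:`zlib.crc32` and :attr:`ZipInfo.compress_size`
with :attr:`ZipInfo.file_size`.

.. code-block:: python

   import io
   import zipfile
   import zlib

   payload = b"spam and eggs " * 1000

   buffer = io.BytesIO()
   with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
       archive.writestr("menu.txt", payload)

   with zipfile.ZipFile(buffer) as archive:
       info = archive.getinfo("menu.txt")
       # CRC is computed over the uncompressed bytes.
       crc_matches = info.CRC == zlib.crc32(payload)
       # Fraction of the original size that the compressed data occupies.
       ratio = info.compress_size / info.file_size

   print(info.file_size, info.compress_size, crc_matches)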

.. _zipfile-commandline:
.. program:: zipfile

Command-Line Interface
----------------------

The :mod:`zipfile` module provides a simple command-line interface to interact
with ZIP archives.

If you want to create a new ZIP archive, specify its name after the :option:`-c`
option and then list the filename(s) that should be included:

.. code-block:: shell-session

    $ python -m zipfile -c monty.zip spam.txt eggs.txt

Passing a directory is also acceptable:

.. code-block:: shell-session

    $ python -m zipfile -c monty.zip life-of-brian_1979/

If you want to extract a ZIP archive into the specified directory, use
the :option:`-e` option:

.. code-block:: shell-session

    $ python -m zipfile -e monty.zip target-dir/

For a list of the files in a ZIP archive, use the :option:`-l` option:

.. code-block:: shell-session

    $ python -m zipfile -l monty.zip

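For comparison, roughly the same operations can be performed through the
Python API. This sketch is not a description of how the command-line
interface is implemented; it only shows approximate equivalents, and the
temporary working directory and file contents are assumptions made for the
example.

.. code-block:: python

   import os
   import tempfile
   import zipfile

   with tempfile.TemporaryDirectory() as workdir:
       source = os.path.join(workdir, "spam.txt")
       with open(source, "w") as f:
           f.write("spam")
       archive_path = os.path.join(workdir, "monty.zip")
       target_dir = os.path.join(workdir, "target-dir")

       # Roughly: python -m zipfile -c monty.zip spam.txt
       with zipfile.ZipFile(archive_path, "w") as archive:
           archive.write(source, arcname="spam.txt")

       with zipfile.ZipFile(archive_path) as archive:
           listed = archive.namelist()      # roughly: -l monty.zip
           first_bad = archive.testzip()    # roughly: -t monty.zip
           archive.extractall(target_dir)   # roughly: -e monty.zip target-dir/

       extracted = sorted(os.listdir(target_dir))

   print(listed, first_bad, extracted)

:meth:`ZipFile.testzip` returns ``None`` when every member passes its CRC
check, and the name of the first bad member otherwise.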

Command-line options
~~~~~~~~~~~~~~~~~~~~

.. cmdoption:: -l <zipfile>
               --list <zipfile>

   List files in a zipfile.

.. cmdoption:: -c <zipfile> <source1> ... <sourceN>
               --create <zipfile> <source1> ... <sourceN>

   Create zipfile from source files.

.. cmdoption:: -e <zipfile> <output_dir>
               --extract <zipfile> <output_dir>

   Extract zipfile into target directory.

.. cmdoption:: -t <zipfile>
               --test <zipfile>

   Test whether the zipfile is valid or not.

Decompression pitfalls
----------------------

Extraction with the zipfile module may fail because of the pitfalls listed
below.

From the file itself
~~~~~~~~~~~~~~~~~~~~

Decompression may fail because of an incorrect password, a bad CRC checksum,
an invalid ZIP format, or an unsupported compression method or decryption
scheme.
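
One way to detect some of these problems before extracting is sketched below.
The helper ``check_archive`` and its status strings are hypothetical names
introduced for this example; it relies on :exc:`BadZipFile` being raised for
data that is not a ZIP file and on :meth:`ZipFile.testzip` reporting the first
member whose CRC check fails.

.. code-block:: python

   import io
   import zipfile

   def check_archive(data):
       """Return a short status string for the ZIP archive held in *data*."""
       try:
           with zipfile.ZipFile(io.BytesIO(data)) as archive:
               bad_member = archive.testzip()
       except zipfile.BadZipFile:
           return "not a valid ZIP file"
       if bad_member is not None:
           return "corrupt member: " + bad_member
       return "ok"

   good = io.BytesIO()
   with zipfile.ZipFile(good, "w") as archive:
       archive.writestr("spam.txt", "spam " * 100)
   good_bytes = good.getvalue()

   # Flip one byte of the stored payload so that its CRC no longer matches.
   corrupted = bytearray(good_bytes)
   corrupted[good_bytes.find(b"spam spam")] ^= 0xFF

   results = [
       check_archive(good_bytes),
       check_archive(bytes(corrupted)),
       check_archive(b"not a zip"),
   ]
   print(results)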

File System limitations
~~~~~~~~~~~~~~~~~~~~~~~

Exceeding the limits of a file system can cause decompression to fail.
Such limits include the characters allowed in directory entries, the length
of a file name, the length of a pathname, the size of a single file, and the
number of files.

Resource limitations
~~~~~~~~~~~~~~~~~~~~

A lack of memory or disk space can cause decompression to fail.
For example, a decompression bomb (also known as a `ZIP bomb`_)
processed with the zipfile library can exhaust the available disk space.
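
A simple first check, sketched below, is to add up the uncompressed sizes
declared in the archive before extracting anything. The budget
``MAX_TOTAL_BYTES`` and the helper names are hypothetical, and because the
declared sizes come from the archive's own metadata this is only a heuristic,
not a complete defence.

.. code-block:: python

   import io
   import zipfile

   MAX_TOTAL_BYTES = 10 * 1024 * 1024  # hypothetical budget for this sketch

   def declared_size(archive):
       """Sum of the uncompressed sizes declared for all members."""
       return sum(info.file_size for info in archive.infolist())

   def within_budget(archive, limit=MAX_TOTAL_BYTES):
       """Return True if the declared total size does not exceed *limit*."""
       return declared_size(archive) <= limit

   # One mebibyte of zeros compresses to a tiny archive.
   buffer = io.BytesIO()
   with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
       archive.writestr("zeros.bin", b"\0" * (1024 * 1024))

   with zipfile.ZipFile(buffer) as archive:
       total = declared_size(archive)
       fits_default = within_budget(archive)
       fits_small = within_budget(archive, limit=1024)

   print(total, fits_default, fits_small)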

Interruption
~~~~~~~~~~~~

Interrupting decompression, for example by pressing control-C or killing the
decompression process, may leave the archive only partially extracted.

Default behaviors of extraction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Not knowing the default extraction behaviors
can cause unexpected decompression results.
For example, when the same archive is extracted twice,
existing files are overwritten without asking.

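The overwrite behavior can be observed with a short sketch such as the one
below. The helper ``existing_targets`` is a hypothetical name, and its path
check is deliberately naive (it simply joins member names onto the
destination), so it is meant as an illustration rather than a safe
pre-extraction check.

.. code-block:: python

   import io
   import os
   import tempfile
   import zipfile

   def existing_targets(archive, dest):
       """Return member names that already exist under *dest*."""
       return [name for name in archive.namelist()
               if os.path.exists(os.path.join(dest, name))]

   buffer = io.BytesIO()
   with zipfile.ZipFile(buffer, "w") as archive:
       archive.writestr("spam.txt", "from archive")

   with tempfile.TemporaryDirectory() as dest, zipfile.ZipFile(buffer) as archive:
       before = existing_targets(archive, dest)
       archive.extractall(dest)
       after = existing_targets(archive, dest)

       # Modify the extracted file, then extract the same archive again.
       target = os.path.join(dest, "spam.txt")
       with open(target, "w") as f:
           f.write("local edit")
       archive.extractall(dest)
       with open(target) as f:
           content = f.read()

   print(before, after, content)

After the second extraction the local edit is gone, so callers that need to
preserve existing files have to check for them before extracting.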

.. _ZIP bomb: https://en.wikipedia.org/wiki/Zip_bomb
.. _PKZIP Application Note: https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT