:mod:`zipfile` --- Work with ZIP archives
=========================================

.. module:: zipfile
   :synopsis: Read and write ZIP-format archive files.

.. moduleauthor:: James C. Ahlstrom <jim@interet.com>
.. sectionauthor:: James C. Ahlstrom <jim@interet.com>

**Source code:** :source:`Lib/zipfile.py`

--------------
13
The ZIP file format is a common archive and compression standard. This module
provides tools to create, read, write, append, and list a ZIP file. Any
advanced use of this module will require an understanding of the format, as
defined in `PKZIP Application Note`_.

This module does not currently handle multi-disk ZIP files.
It can handle ZIP files that use the ZIP64 extensions
(that is, ZIP files that are more than 4 GiB in size). It supports
decryption of encrypted files in ZIP archives, but it currently cannot
create an encrypted file. Decryption is extremely slow as it is
implemented in pure Python rather than C.
Georg Brandl116aa622007-08-15 14:28:22 +000025
Guido van Rossum77677112007-11-05 19:43:04 +000026The module defines the following items:
Georg Brandl116aa622007-08-15 14:28:22 +000027
Georg Brandl4d540882010-10-28 06:42:33 +000028.. exception:: BadZipFile
Georg Brandl116aa622007-08-15 14:28:22 +000029
Éric Araujod001ffe2011-08-19 00:44:31 +020030 The error raised for bad ZIP files.
Georg Brandl116aa622007-08-15 14:28:22 +000031
Georg Brandl4d540882010-10-28 06:42:33 +000032 .. versionadded:: 3.2
33
34
35.. exception:: BadZipfile
36
Éric Araujod001ffe2011-08-19 00:44:31 +020037 Alias of :exc:`BadZipFile`, for compatibility with older Python versions.
38
39 .. deprecated:: 3.2
Georg Brandl4d540882010-10-28 06:42:33 +000040
Georg Brandl116aa622007-08-15 14:28:22 +000041
42.. exception:: LargeZipFile
43
44 The error raised when a ZIP file would require ZIP64 functionality but that has
45 not been enabled.
46
47
48.. class:: ZipFile
Georg Brandl5e92a502010-11-12 06:20:12 +000049 :noindex:
Georg Brandl116aa622007-08-15 14:28:22 +000050
51 The class for reading and writing ZIP files. See section
52 :ref:`zipfile-objects` for constructor details.
53
54
Jason R. Coombsb2758ff2019-05-08 09:45:06 -040055.. class:: Path
56 :noindex:
57
58 A pathlib-compatible wrapper for zip files. See section
59 :ref:`path-objects` for details.
60
61 .. versionadded:: 3.8
62
63
Georg Brandl116aa622007-08-15 14:28:22 +000064.. class:: PyZipFile
Georg Brandl8334fd92010-12-04 10:26:46 +000065 :noindex:
Georg Brandl116aa622007-08-15 14:28:22 +000066
67 Class for creating ZIP archives containing Python libraries.
68
69
Georg Brandl7f01a132009-09-16 15:58:14 +000070.. class:: ZipInfo(filename='NoName', date_time=(1980,1,1,0,0,0))
Georg Brandl116aa622007-08-15 14:28:22 +000071
72 Class used to represent information about a member of an archive. Instances
Andrew Svetlovafbf90c2012-10-06 18:02:05 +030073 of this class are returned by the :meth:`.getinfo` and :meth:`.infolist`
Georg Brandl116aa622007-08-15 14:28:22 +000074 methods of :class:`ZipFile` objects. Most users of the :mod:`zipfile` module
75 will not need to create these, but only use those created by this
76 module. *filename* should be the full name of the archive member, and
77 *date_time* should be a tuple containing six fields which describe the time
78 of the last modification to the file; the fields are described in section
79 :ref:`zipinfo-objects`.
80
81
82.. function:: is_zipfile(filename)
83
84 Returns ``True`` if *filename* is a valid ZIP file based on its magic number,
Antoine Pitroudb5fe662008-12-27 15:50:40 +000085 otherwise returns ``False``. *filename* may be a file or file-like object too.
Georg Brandl116aa622007-08-15 14:28:22 +000086
Georg Brandl277a1502009-01-04 00:28:14 +000087 .. versionchanged:: 3.1
88 Support for file and file-like objects.
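
As a rough illustration of the file-like case, the sketch below builds a tiny
archive in memory and passes :class:`io.BytesIO` objects to
:func:`is_zipfile`. The member name ``hello.txt`` and the sample bytes are
made up for the example.

```python
import io
import zipfile

# Build a small archive in memory; 'hello.txt' is an illustrative name.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:
    zf.writestr('hello.txt', 'hello')

# A file-like object holding a real archive.
is_archive = zipfile.is_zipfile(buffer)

# A file-like object holding arbitrary bytes.
is_plain = zipfile.is_zipfile(io.BytesIO(b'not a zip'))

print(is_archive, is_plain)
```

The check only looks for the ZIP end-of-archive signature, so a positive
result does not guarantee that every member can be read.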
Georg Brandl116aa622007-08-15 14:28:22 +000089
Georg Brandl67b21b72010-08-17 15:07:14 +000090
Georg Brandl116aa622007-08-15 14:28:22 +000091.. data:: ZIP_STORED
92
93 The numeric constant for an uncompressed archive member.
94
95
96.. data:: ZIP_DEFLATED
97
98 The numeric constant for the usual ZIP compression method. This requires the
Andrew Svetlov5061a342012-10-06 18:10:01 +030099 :mod:`zlib` module.
Martin v. Löwisf6b16a42012-05-01 07:58:44 +0200100
101
102.. data:: ZIP_BZIP2
103
104 The numeric constant for the BZIP2 compression method. This requires the
Andrew Svetlov5061a342012-10-06 18:10:01 +0300105 :mod:`bz2` module.
Martin v. Löwisf6b16a42012-05-01 07:58:44 +0200106
107 .. versionadded:: 3.3
108
Martin v. Löwis7fb79fc2012-05-13 10:06:36 +0200109.. data:: ZIP_LZMA
110
111 The numeric constant for the LZMA compression method. This requires the
Andrew Svetlov5061a342012-10-06 18:10:01 +0300112 :mod:`lzma` module.
Martin v. Löwis7fb79fc2012-05-13 10:06:36 +0200113
114 .. versionadded:: 3.3
115
Martin v. Löwisf6b16a42012-05-01 07:58:44 +0200116 .. note::
117
118 The ZIP file format specification has included support for bzip2 compression
Martin v. Löwis7fb79fc2012-05-13 10:06:36 +0200119 since 2001, and for LZMA compression since 2006. However, some tools
120 (including older Python releases) do not support these compression
121 methods, and may either refuse to process the ZIP file altogether,
122 or fail to extract individual files.
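
The following sketch shows one way to compare the compression constants on
the same data. It assumes that the optional :mod:`zlib`, :mod:`bz2` and
:mod:`lzma` modules may be missing and skips any method whose module is
unavailable; the payload and the member name ``payload.bin`` are
illustrative only.

```python
import io
import zipfile

payload = b'spam and eggs ' * 1000   # repetitive sample data

sizes = {}
for method in (zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED,
               zipfile.ZIP_BZIP2, zipfile.ZIP_LZMA):
    buffer = io.BytesIO()
    try:
        with zipfile.ZipFile(buffer, 'w', compression=method) as zf:
            zf.writestr('payload.bin', payload)
    except RuntimeError:
        # The matching module (zlib, bz2 or lzma) is not available here.
        continue
    with zipfile.ZipFile(buffer) as zf:
        # compress_size is the number of bytes stored in the archive.
        sizes[method] = zf.getinfo('payload.bin').compress_size

for method, size in sizes.items():
    print(method, size)
```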
Georg Brandl116aa622007-08-15 14:28:22 +0000123
124
125.. seealso::
126
Georg Brandl5d941342016-02-26 19:37:12 +0100127 `PKZIP Application Note`_
Georg Brandl116aa622007-08-15 14:28:22 +0000128 Documentation on the ZIP file format by Phil Katz, the creator of the format and
129 algorithms used.
130
131 `Info-ZIP Home Page <http://www.info-zip.org/>`_
132 Information about the Info-ZIP project's ZIP archive programs and development
133 libraries.
134
135
136.. _zipfile-objects:
137
138ZipFile Objects
139---------------
140
141
Bo Baylesce237c72018-01-29 23:54:07 -0600142.. class:: ZipFile(file, mode='r', compression=ZIP_STORED, allowZip64=True, \
Marcel Plch77b112c2018-08-31 16:43:31 +0200143 compresslevel=None, *, strict_timestamps=True)
Georg Brandl116aa622007-08-15 14:28:22 +0000144
Serhiy Storchaka8606e952017-03-08 14:37:51 +0200145 Open a ZIP file, where *file* can be a path to a file (a string), a
146 file-like object or a :term:`path-like object`.
Bo Baylesce237c72018-01-29 23:54:07 -0600147
Serhiy Storchaka8606e952017-03-08 14:37:51 +0200148 The *mode* parameter should be ``'r'`` to read an existing
Senthil Kumarane5c05cc2016-01-21 21:06:47 -0800149 file, ``'w'`` to truncate and write a new file, ``'a'`` to append to an
150 existing file, or ``'x'`` to exclusively create and write a new file.
Serhiy Storchaka764fc9b2015-03-25 10:09:41 +0200151 If *mode* is ``'x'`` and *file* refers to an existing file,
152 a :exc:`FileExistsError` will be raised.
153 If *mode* is ``'a'`` and *file* refers to an existing ZIP
Ezio Melottifaa6b7f2009-12-30 12:34:59 +0000154 file, then additional files are added to it. If *file* does not refer to a
155 ZIP file, then a new ZIP archive is appended to the file. This is meant for
156 adding a ZIP archive to another file (such as :file:`python.exe`). If
Berker Peksag7927e752016-09-13 04:49:12 +0300157 *mode* is ``'a'`` and the file does not exist at all, it is created.
158 If *mode* is ``'r'`` or ``'a'``, the file should be seekable.
Bo Baylesce237c72018-01-29 23:54:07 -0600159
Ezio Melottifaa6b7f2009-12-30 12:34:59 +0000160 *compression* is the ZIP compression method to use when writing the archive,
Martin v. Löwis7fb79fc2012-05-13 10:06:36 +0200161 and should be :const:`ZIP_STORED`, :const:`ZIP_DEFLATED`,
162 :const:`ZIP_BZIP2` or :const:`ZIP_LZMA`; unrecognized
Bo Baylesce237c72018-01-29 23:54:07 -0600163 values will cause :exc:`NotImplementedError` to be raised. If
164 :const:`ZIP_DEFLATED`, :const:`ZIP_BZIP2` or :const:`ZIP_LZMA` is specified
165 but the corresponding module (:mod:`zlib`, :mod:`bz2` or :mod:`lzma`) is not
166 available, :exc:`RuntimeError` is raised. The default is :const:`ZIP_STORED`.
167
   If *allowZip64* is ``True`` (the default), :mod:`zipfile` will create ZIP
   files that use the ZIP64 extensions when the ZIP file is larger than 4 GiB.
   If it is ``False``, :mod:`zipfile` will raise an exception when the ZIP file
   would require ZIP64 extensions.
172
173 The *compresslevel* parameter controls the compression level to use when
174 writing files to the archive.
175 When using :const:`ZIP_STORED` or :const:`ZIP_LZMA` it has no effect.
176 When using :const:`ZIP_DEFLATED` integers ``0`` through ``9`` are accepted
177 (see :class:`zlib <zlib.compressobj>` for more information).
178 When using :const:`ZIP_BZIP2` integers ``1`` through ``9`` are accepted
179 (see :class:`bz2 <bz2.BZ2File>` for more information).
Georg Brandl116aa622007-08-15 14:28:22 +0000180
   The *strict_timestamps* argument, when set to ``False``, allows zipping
   files older than 1980-01-01 at the cost of setting the
   timestamp to 1980-01-01.
   Similar behavior occurs with files newer than 2107-12-31:
   the timestamp is also set to the limit.
186
Serhiy Storchaka764fc9b2015-03-25 10:09:41 +0200187 If the file is created with mode ``'w'``, ``'x'`` or ``'a'`` and then
Andrew Svetlovafbf90c2012-10-06 18:02:05 +0300188 :meth:`closed <close>` without adding any files to the archive, the appropriate
Georg Brandl268e4d42010-10-14 06:59:45 +0000189 ZIP structures for an empty archive will be written to the file.
190
Ezio Melottifaa6b7f2009-12-30 12:34:59 +0000191 ZipFile is also a context manager and therefore supports the
192 :keyword:`with` statement. In the example, *myzip* is closed after the
Serhiy Storchaka2b57c432018-12-19 08:09:46 +0200193 :keyword:`!with` statement's suite is finished---even if an exception occurs::
Georg Brandl116aa622007-08-15 14:28:22 +0000194
Ezio Melottifaa6b7f2009-12-30 12:34:59 +0000195 with ZipFile('spam.zip', 'w') as myzip:
196 myzip.write('eggs.txt')
197
198 .. versionadded:: 3.2
199 Added the ability to use :class:`ZipFile` as a context manager.
Georg Brandl116aa622007-08-15 14:28:22 +0000200
Martin v. Löwisf6b16a42012-05-01 07:58:44 +0200201 .. versionchanged:: 3.3
Andrew Svetlov5061a342012-10-06 18:10:01 +0300202 Added support for :mod:`bzip2 <bz2>` and :mod:`lzma` compression.
Martin v. Löwisf6b16a42012-05-01 07:58:44 +0200203
Serhiy Storchaka235c5e02013-11-23 15:55:38 +0200204 .. versionchanged:: 3.4
205 ZIP64 extensions are enabled by default.
206
Serhiy Storchaka77d89972015-03-23 01:09:35 +0200207 .. versionchanged:: 3.5
208 Added support for writing to unseekable streams.
Serhiy Storchaka764fc9b2015-03-25 10:09:41 +0200209 Added support for the ``'x'`` mode.
Serhiy Storchaka77d89972015-03-23 01:09:35 +0200210
Serhiy Storchakab0d497c2016-09-10 21:28:07 +0300211 .. versionchanged:: 3.6
212 Previously, a plain :exc:`RuntimeError` was raised for unrecognized
213 compression values.
214
Serhiy Storchaka8606e952017-03-08 14:37:51 +0200215 .. versionchanged:: 3.6.2
216 The *file* parameter accepts a :term:`path-like object`.
217
Bo Baylesce237c72018-01-29 23:54:07 -0600218 .. versionchanged:: 3.7
219 Add the *compresslevel* parameter.
220
Marcel Plch77b112c2018-08-31 16:43:31 +0200221 .. versionadded:: 3.8
222 The *strict_timestamps* keyword-only argument
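
A minimal sketch of the *mode* values described above, run against a
temporary directory. The file and member names are hypothetical, and the
example assumes that :mod:`zlib` is available for :const:`ZIP_DEFLATED`.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, 'example.zip')   # illustrative file name

    # 'w' creates (or truncates) the archive.
    with zipfile.ZipFile(archive, 'w', compression=zipfile.ZIP_DEFLATED,
                         compresslevel=9) as zf:
        zf.writestr('first.txt', 'first member')

    # 'a' adds members to the existing archive.
    with zipfile.ZipFile(archive, 'a') as zf:
        zf.writestr('second.txt', 'second member')

    # 'x' refuses to overwrite an existing file.
    try:
        zipfile.ZipFile(archive, 'x')
    except FileExistsError:
        exclusive_failed = True
    else:
        exclusive_failed = False

    # 'r' (the default) reads the archive back.
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()

print(names, exclusive_failed)
```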
223
Georg Brandl116aa622007-08-15 14:28:22 +0000224
225.. method:: ZipFile.close()
226
227 Close the archive file. You must call :meth:`close` before exiting your program
228 or essential records will not be written.
229
230
231.. method:: ZipFile.getinfo(name)
232
233 Return a :class:`ZipInfo` object with information about the archive member
234 *name*. Calling :meth:`getinfo` for a name not currently contained in the
235 archive will raise a :exc:`KeyError`.
236
237
238.. method:: ZipFile.infolist()
239
240 Return a list containing a :class:`ZipInfo` object for each member of the
241 archive. The objects are in the same order as their entries in the actual ZIP
242 file on disk if an existing archive was opened.
243
244
245.. method:: ZipFile.namelist()
246
247 Return a list of archive members by name.
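
The three lookup methods above can be sketched together as follows. The
archive is built in memory and the member names are invented for the
example.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:
    zf.writestr('a.txt', 'alpha')
    zf.writestr('docs/b.txt', 'beta')

with zipfile.ZipFile(buffer) as zf:
    # Member names, in archive order.
    names = zf.namelist()

    # One ZipInfo per member; file_size is the uncompressed size.
    sizes = {info.filename: info.file_size for info in zf.infolist()}

    # getinfo() raises KeyError for a name that is not in the archive.
    try:
        zf.getinfo('missing.txt')
    except KeyError:
        missing = True
    else:
        missing = False

print(names, sizes, missing)
```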
248
249
Serhiy Storchakaf47fc552016-05-15 12:27:16 +0300250.. method:: ZipFile.open(name, mode='r', pwd=None, *, force_zip64=False)
Georg Brandl116aa622007-08-15 14:28:22 +0000251
Serhiy Storchakae670be22016-06-11 19:32:44 +0300252 Access a member of the archive as a binary file-like object. *name*
253 can be either the name of a file within the archive or a :class:`ZipInfo`
254 object. The *mode* parameter, if included, must be ``'r'`` (the default)
255 or ``'w'``. *pwd* is the password used to decrypt encrypted ZIP files.
Georg Brandl116aa622007-08-15 14:28:22 +0000256
Benjamin Petersonf0f14f72015-03-12 22:41:06 -0500257 :meth:`~ZipFile.open` is also a context manager and therefore supports the
Berker Peksagce77ee92015-03-13 02:29:54 +0200258 :keyword:`with` statement::
259
260 with ZipFile('spam.zip') as myzip:
261 with myzip.open('eggs.txt') as myfile:
262 print(myfile.read())
263
Serhiy Storchakae670be22016-06-11 19:32:44 +0300264 With *mode* ``'r'`` the file-like object
Serhiy Storchaka18ee29d2016-05-13 13:52:49 +0300265 (``ZipExtFile``) is read-only and provides the following methods:
266 :meth:`~io.BufferedIOBase.read`, :meth:`~io.IOBase.readline`,
John Jolly066df4f2018-01-30 01:51:35 -0700267 :meth:`~io.IOBase.readlines`, :meth:`~io.IOBase.seek`,
268 :meth:`~io.IOBase.tell`, :meth:`__iter__`, :meth:`~iterator.__next__`.
269 These objects can operate independently of the ZipFile.
Georg Brandl116aa622007-08-15 14:28:22 +0000270
Serhiy Storchaka18ee29d2016-05-13 13:52:49 +0300271 With ``mode='w'``, a writable file handle is returned, which supports the
272 :meth:`~io.BufferedIOBase.write` method. While a writable file handle is open,
273 attempting to read or write other files in the ZIP file will raise a
Serhiy Storchakab0d497c2016-09-10 21:28:07 +0300274 :exc:`ValueError`.
Georg Brandl116aa622007-08-15 14:28:22 +0000275
Serhiy Storchaka18ee29d2016-05-13 13:52:49 +0300276 When writing a file, if the file size is not known in advance but may exceed
277 2 GiB, pass ``force_zip64=True`` to ensure that the header format is
278 capable of supporting large files. If the file size is known in advance,
279 construct a :class:`ZipInfo` object with :attr:`~ZipInfo.file_size` set, and
280 use that as the *name* parameter.
Georg Brandl116aa622007-08-15 14:28:22 +0000281
Georg Brandlb533e262008-05-25 18:19:30 +0000282 .. note::
283
Andrew Svetlovafbf90c2012-10-06 18:02:05 +0300284 The :meth:`.open`, :meth:`read` and :meth:`extract` methods can take a filename
Georg Brandlb533e262008-05-25 18:19:30 +0000285 or a :class:`ZipInfo` object. You will appreciate this when trying to read a
286 ZIP file that contains members with duplicate names.
287
Serhiy Storchakae670be22016-06-11 19:32:44 +0300288 .. versionchanged:: 3.6
289 Removed support of ``mode='U'``. Use :class:`io.TextIOWrapper` for reading
Serhiy Storchaka6787a382013-11-23 22:12:06 +0200290 compressed text files in :term:`universal newlines` mode.
Georg Brandl116aa622007-08-15 14:28:22 +0000291
Serhiy Storchaka18ee29d2016-05-13 13:52:49 +0300292 .. versionchanged:: 3.6
293 :meth:`open` can now be used to write files into the archive with the
294 ``mode='w'`` option.
295
Serhiy Storchakab0d497c2016-09-10 21:28:07 +0300296 .. versionchanged:: 3.6
297 Calling :meth:`.open` on a closed ZipFile will raise a :exc:`ValueError`.
298 Previously, a :exc:`RuntimeError` was raised.
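
As a sketch of both modes of :meth:`~ZipFile.open`, the example below writes a
member through a writable handle and then reads it back as text by wrapping
the binary handle in :class:`io.TextIOWrapper`. The member name
``notes.txt`` is illustrative, and the example assumes Python 3.6 or later
for ``mode='w'``.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:
    # mode='w' returns a writable binary handle, so it accepts bytes.
    with zf.open('notes.txt', mode='w') as member:
        member.write(b'first line\nsecond line\n')

with zipfile.ZipFile(buffer) as zf:
    with zf.open('notes.txt') as member:
        # The handle is binary; wrap it to decode text.
        text = io.TextIOWrapper(member, encoding='utf-8')
        lines = text.read().splitlines()

print(lines)
```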
299
300
Georg Brandl7f01a132009-09-16 15:58:14 +0000301.. method:: ZipFile.extract(member, path=None, pwd=None)
Christian Heimes790c8232008-01-07 21:14:23 +0000302
   Extract a member from the archive to the current working directory; *member*
   must be its full name or a :class:`ZipInfo` object. Its file information is
   extracted as accurately as possible. *path* specifies a different directory
   to extract to. *pwd* is the password used for encrypted files.
Christian Heimes790c8232008-01-07 21:14:23 +0000308
Zachary Wareae9f0fe2015-04-13 16:40:49 -0500309 Returns the normalized path created (a directory or new file).
310
Gregory P. Smithb47acbf2013-02-01 11:22:43 -0800311 .. note::
312
      If a member filename is an absolute path, a drive/UNC sharepoint and
      leading (back)slashes will be stripped, e.g.: ``///foo/bar`` becomes
      ``foo/bar`` on Unix, and ``C:\foo\bar`` becomes ``foo\bar`` on Windows.
      All ``".."`` components in a member filename will also be removed, e.g.:
      ``../../foo../../ba..r`` becomes ``foo../ba..r``. On Windows, illegal
      characters (``:``, ``<``, ``>``, ``|``, ``"``, ``?``, and ``*``) are
      replaced by an underscore (``_``).
320
Serhiy Storchakab0d497c2016-09-10 21:28:07 +0300321 .. versionchanged:: 3.6
322 Calling :meth:`extract` on a closed ZipFile will raise a
323 :exc:`ValueError`. Previously, a :exc:`RuntimeError` was raised.
324
Serhiy Storchaka8606e952017-03-08 14:37:51 +0200325 .. versionchanged:: 3.6.2
326 The *path* parameter accepts a :term:`path-like object`.
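
A short sketch of extracting a single member into a directory other than the
current one. The target is a temporary directory and the member name is made
up for the example.

```python
import io
import os
import tempfile
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:
    zf.writestr('docs/readme.txt', 'read me')

with tempfile.TemporaryDirectory() as target:
    with zipfile.ZipFile(buffer) as zf:
        # extract() returns the path of the file it created.
        extracted = zf.extract('docs/readme.txt', path=target)

    # The member's directory structure is recreated below the target.
    relative = os.path.relpath(extracted, target)
    with open(extracted, encoding='utf-8') as fh:
        content = fh.read()

print(relative, content)
```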
327
Christian Heimes790c8232008-01-07 21:14:23 +0000328
Georg Brandl7f01a132009-09-16 15:58:14 +0000329.. method:: ZipFile.extractall(path=None, members=None, pwd=None)
Christian Heimes790c8232008-01-07 21:14:23 +0000330
Georg Brandl48310cd2009-01-03 21:18:54 +0000331 Extract all members from the archive to the current working directory. *path*
Christian Heimes790c8232008-01-07 21:14:23 +0000332 specifies a different directory to extract to. *members* is optional and must
333 be a subset of the list returned by :meth:`namelist`. *pwd* is the password
334 used for encrypted files.
335
Gregory P. Smithf1319d82013-02-07 22:15:04 -0800336 .. warning::
Benjamin Petersona0dfa822009-11-13 02:25:08 +0000337
Gregory P. Smithf1319d82013-02-07 22:15:04 -0800338 Never extract archives from untrusted sources without prior inspection.
339 It is possible that files are created outside of *path*, e.g. members
340 that have absolute filenames starting with ``"/"`` or filenames with two
Gregory P. Smith1d824ec2013-02-07 22:17:21 -0800341 dots ``".."``. This module attempts to prevent that.
Gregory P. Smithb47acbf2013-02-01 11:22:43 -0800342 See :meth:`extract` note.
Benjamin Petersona0dfa822009-11-13 02:25:08 +0000343
Serhiy Storchakab0d497c2016-09-10 21:28:07 +0300344 .. versionchanged:: 3.6
345 Calling :meth:`extractall` on a closed ZipFile will raise a
346 :exc:`ValueError`. Previously, a :exc:`RuntimeError` was raised.
347
Serhiy Storchaka8606e952017-03-08 14:37:51 +0200348 .. versionchanged:: 3.6.2
349 The *path* parameter accepts a :term:`path-like object`.
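
The "prior inspection" mentioned in the warning could look roughly like the
sketch below, which filters member names before passing them to
:meth:`extractall`. The helper ``is_suspicious`` is hypothetical and not part
of :mod:`zipfile`, and its checks are not exhaustive (for example, it does
not look at drive letters).

```python
import io
import os
import tempfile
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:
    zf.writestr('data/ok.txt', 'fine')
    zf.writestr('../outside.txt', 'suspicious')

def is_suspicious(name):
    """Hypothetical check: flag absolute names and '..' components."""
    parts = name.replace('\\', '/').split('/')
    return name.startswith(('/', '\\')) or '..' in parts

with tempfile.TemporaryDirectory() as target:
    with zipfile.ZipFile(buffer) as zf:
        safe = [name for name in zf.namelist() if not is_suspicious(name)]
        zf.extractall(path=target, members=safe)

    # Collect what was written, relative to the target directory.
    extracted = sorted(
        os.path.relpath(os.path.join(root, fname), target).replace(os.sep, '/')
        for root, _dirs, files in os.walk(target)
        for fname in files
    )

print(safe, extracted)
```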
350
Christian Heimes790c8232008-01-07 21:14:23 +0000351
Georg Brandl116aa622007-08-15 14:28:22 +0000352.. method:: ZipFile.printdir()
353
354 Print a table of contents for the archive to ``sys.stdout``.
355
356
357.. method:: ZipFile.setpassword(pwd)
358
359 Set *pwd* as default password to extract encrypted files.
360
Georg Brandl116aa622007-08-15 14:28:22 +0000361
Georg Brandl7f01a132009-09-16 15:58:14 +0000362.. method:: ZipFile.read(name, pwd=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000363
Georg Brandlb533e262008-05-25 18:19:30 +0000364 Return the bytes of the file *name* in the archive. *name* is the name of the
365 file in the archive, or a :class:`ZipInfo` object. The archive must be open for
366 read or append. *pwd* is the password used for encrypted files and, if specified,
367 it will override the default password set with :meth:`setpassword`. Calling
Gregory P. Smithf2a448a2015-04-14 10:02:20 -0700368 :meth:`read` on a ZipFile that uses a compression method other than
Gregory P. Smith23a6a0d2015-04-14 10:04:30 -0700369 :const:`ZIP_STORED`, :const:`ZIP_DEFLATED`, :const:`ZIP_BZIP2` or
Gregory P. Smithf2a448a2015-04-14 10:02:20 -0700370 :const:`ZIP_LZMA` will raise a :exc:`NotImplementedError`. An error will also
371 be raised if the corresponding compression module is not available.
Georg Brandl116aa622007-08-15 14:28:22 +0000372
Serhiy Storchakab0d497c2016-09-10 21:28:07 +0300373 .. versionchanged:: 3.6
374 Calling :meth:`read` on a closed ZipFile will raise a :exc:`ValueError`.
375 Previously, a :exc:`RuntimeError` was raised.
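
A brief sketch of :meth:`read` with both argument forms, followed by the
:exc:`ValueError` that a closed archive raises on Python 3.6 and later. The
member name is illustrative.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:
    zf.writestr('greeting.txt', 'hello')

with zipfile.ZipFile(buffer) as zf:
    by_name = zf.read('greeting.txt')               # look up by name
    by_info = zf.read(zf.getinfo('greeting.txt'))   # look up by ZipInfo

# The archive is closed once the with block ends.
try:
    zf.read('greeting.txt')
except ValueError:
    closed_error = True
else:
    closed_error = False

print(by_name, by_info, closed_error)
```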
376
Georg Brandl116aa622007-08-15 14:28:22 +0000377
378.. method:: ZipFile.testzip()
379
   Read all the files in the archive and check their CRCs and file headers.
Serhiy Storchakab0d497c2016-09-10 21:28:07 +0300381 Return the name of the first bad file, or else return ``None``.
382
383 .. versionchanged:: 3.6
nsrip40bf6cf2018-10-27 10:42:56 -0400384 Calling :meth:`testzip` on a closed ZipFile will raise a
Serhiy Storchakab0d497c2016-09-10 21:28:07 +0300385 :exc:`ValueError`. Previously, a :exc:`RuntimeError` was raised.
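
To illustrate the return value, the sketch below checks an intact archive
and then a copy in which one byte of a stored member has been flipped. The
payload and the member name ``data.bin`` are invented, and the corruption
trick relies on :const:`ZIP_STORED` keeping the payload bytes verbatim in the
archive.

```python
import io
import zipfile

payload = b'important payload'

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:   # default is ZIP_STORED
    zf.writestr('data.bin', payload)

with zipfile.ZipFile(buffer) as zf:
    intact_result = zf.testzip()

# Flip one byte of the stored payload to simulate corruption.
raw = bytearray(buffer.getvalue())
raw[raw.index(payload)] ^= 0xFF

with zipfile.ZipFile(io.BytesIO(bytes(raw))) as zf:
    damaged_result = zf.testzip()

print(intact_result, damaged_result)
```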
Georg Brandl116aa622007-08-15 14:28:22 +0000386
387
Bo Baylesce237c72018-01-29 23:54:07 -0600388.. method:: ZipFile.write(filename, arcname=None, compress_type=None, \
Marcel Plch77b112c2018-08-31 16:43:31 +0200389 compresslevel=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000390
391 Write the file named *filename* to the archive, giving it the archive name
392 *arcname* (by default, this will be the same as *filename*, but without a drive
393 letter and with leading path separators removed). If given, *compress_type*
394 overrides the value given for the *compression* parameter to the constructor for
Bo Baylesce237c72018-01-29 23:54:07 -0600395 the new entry. Similarly, *compresslevel* will override the constructor if
396 given.
Serhiy Storchakab0d497c2016-09-10 21:28:07 +0300397 The archive must be open with mode ``'w'``, ``'x'`` or ``'a'``.
Georg Brandl116aa622007-08-15 14:28:22 +0000398
399 .. note::
400
Georg Brandl116aa622007-08-15 14:28:22 +0000401 Archive names should be relative to the archive root, that is, they should not
402 start with a path separator.
403
404 .. note::
405
406 If ``arcname`` (or ``filename``, if ``arcname`` is not given) contains a null
407 byte, the name of the file in the archive will be truncated at the null byte.
408
Serhiy Storchakab0d497c2016-09-10 21:28:07 +0300409 .. versionchanged:: 3.6
410 Calling :meth:`write` on a ZipFile created with mode ``'r'`` or
411 a closed ZipFile will raise a :exc:`ValueError`. Previously,
412 a :exc:`RuntimeError` was raised.
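
A small sketch of :meth:`write` with and without *arcname*. The source file
lives in a temporary directory, the names are illustrative, and
:const:`ZIP_DEFLATED` assumes that :mod:`zlib` is available.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, 'eggs.txt')
    with open(source, 'w', encoding='utf-8') as fh:
        fh.write('eggs')

    archive = os.path.join(tmp, 'spam.zip')
    with zipfile.ZipFile(archive, 'w') as zf:
        # Default archive name: the path without drive letter
        # and without leading path separators.
        zf.write(source)
        # Explicit archive name and a per-member compression method.
        zf.write(source, arcname='food/eggs.txt',
                 compress_type=zipfile.ZIP_DEFLATED)

    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()

print(names)
```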
413
414
Bo Baylesce237c72018-01-29 23:54:07 -0600415.. method:: ZipFile.writestr(zinfo_or_arcname, data, compress_type=None, \
416 compresslevel=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000417
   Write a file into the archive. The contents are *data*, which may be either
   a :class:`str` or a :class:`bytes` instance; if it is a :class:`str`,
   it is encoded as UTF-8 first. *zinfo_or_arcname* is either the file
   name it will be given in the archive, or a :class:`ZipInfo` instance. If it's
   an instance, at least the filename, date, and time must be given. If it's a
   name, the date and time are set to the current date and time.
Serhiy Storchakab0d497c2016-09-10 21:28:07 +0300424 The archive must be opened with mode ``'w'``, ``'x'`` or ``'a'``.
Georg Brandl116aa622007-08-15 14:28:22 +0000425
Ronald Oussorenee5c8852010-02-07 20:24:02 +0000426 If given, *compress_type* overrides the value given for the *compression*
427 parameter to the constructor for the new entry, or in the *zinfo_or_arcname*
Bo Baylesce237c72018-01-29 23:54:07 -0600428 (if that is a :class:`ZipInfo` instance). Similarly, *compresslevel* will
429 override the constructor if given.
Ronald Oussorenee5c8852010-02-07 20:24:02 +0000430
Christian Heimes790c8232008-01-07 21:14:23 +0000431 .. note::
432
Éric Araujo0d4bcf42010-12-26 17:53:27 +0000433 When passing a :class:`ZipInfo` instance as the *zinfo_or_arcname* parameter,
Georg Brandl48310cd2009-01-03 21:18:54 +0000434 the compression method used will be that specified in the *compress_type*
435 member of the given :class:`ZipInfo` instance. By default, the
Christian Heimes790c8232008-01-07 21:14:23 +0000436 :class:`ZipInfo` constructor sets this member to :const:`ZIP_STORED`.
437
   .. versionchanged:: 3.2
      Added the *compress_type* argument.
Ronald Oussorenee5c8852010-02-07 20:24:02 +0000440
Serhiy Storchakab0d497c2016-09-10 21:28:07 +0300441 .. versionchanged:: 3.6
442 Calling :meth:`writestr` on a ZipFile created with mode ``'r'`` or
443 a closed ZipFile will raise a :exc:`ValueError`. Previously,
444 a :exc:`RuntimeError` was raised.
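
The interaction between a plain name, a :class:`ZipInfo` instance and the
*compress_type* argument can be sketched as follows. All member names and the
timestamp are illustrative, and :const:`ZIP_DEFLATED` assumes that
:mod:`zlib` is available.

```python
import io
import zipfile

# A ZipInfo carries its own timestamp and compression method.
info = zipfile.ZipInfo('report.txt', date_time=(2020, 1, 2, 3, 4, 6))
info.compress_type = zipfile.ZIP_DEFLATED

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:          # archive default: ZIP_STORED
    zf.writestr('plain.txt', 'text is encoded as UTF-8')
    zf.writestr(info, b'raw bytes')
    zf.writestr('override.txt', 'x' * 100,
                compress_type=zipfile.ZIP_DEFLATED)

with zipfile.ZipFile(buffer) as zf:
    types = {i.filename: i.compress_type for i in zf.infolist()}
    stamp = zf.getinfo('report.txt').date_time

print(types, stamp)
```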
445
446
Martin v. Löwisb09b8442008-07-03 14:13:42 +0000447The following data attributes are also available:
Georg Brandl116aa622007-08-15 14:28:22 +0000448
Serhiy Storchaka8606e952017-03-08 14:37:51 +0200449.. attribute:: ZipFile.filename
450
451 Name of the ZIP file.
Georg Brandl116aa622007-08-15 14:28:22 +0000452
453.. attribute:: ZipFile.debug
454
455 The level of debug output to use. This may be set from ``0`` (the default, no
456 output) to ``3`` (the most output). Debugging information is written to
457 ``sys.stdout``.
458
Martin v. Löwisb09b8442008-07-03 14:13:42 +0000459.. attribute:: ZipFile.comment
460
Serhiy Storchaka4bb186d2018-11-25 09:51:14 +0200461 The comment associated with the ZIP file as a :class:`bytes` object.
462 If assigning a comment to a
Serhiy Storchaka764fc9b2015-03-25 10:09:41 +0200463 :class:`ZipFile` instance created with mode ``'w'``, ``'x'`` or ``'a'``,
Serhiy Storchaka4bb186d2018-11-25 09:51:14 +0200464 it should be no longer than 65535 bytes. Comments longer than this will be
465 truncated.
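
A minimal sketch of setting and reading back the archive comment; the
comment text and member name are illustrative.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:
    zf.writestr('a.txt', 'a')
    zf.comment = b'nightly build'   # must be bytes, not str

with zipfile.ZipFile(buffer) as zf:
    comment = zf.comment
    archive_name = zf.filename      # None for an in-memory file object

print(comment, archive_name)
```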
Georg Brandl116aa622007-08-15 14:28:22 +0000466
Georg Brandl8334fd92010-12-04 10:26:46 +0000467
Jason R. Coombsb2758ff2019-05-08 09:45:06 -0400468.. _path-objects:
469
470Path Objects
471------------
472
473.. class:: Path(root, at='')
474
475 Construct a Path object from a ``root`` zipfile (which may be a
476 :class:`ZipFile` instance or ``file`` suitable for passing to
477 the :class:`ZipFile` constructor).
478
   ``at`` specifies the location of this Path within the zipfile,
   e.g. ``'dir/file.txt'``, ``'dir/'``, or ``''``. Defaults to the empty string,
   indicating the root.
482
Path objects expose the following features of :class:`pathlib.Path`
objects:

Path objects are traversable using the ``/`` operator or ``joinpath``.
Jason R. Coombsb2758ff2019-05-08 09:45:06 -0400487
488.. attribute:: Path.name
489
490 The final path component.
491
Jason R. Coombs0aeab5c2020-02-29 10:34:11 -0600492.. method:: Path.open(mode='r', *, pwd, **)
Jason R. Coombsb2758ff2019-05-08 09:45:06 -0400493
   Invoke :meth:`ZipFile.open` on the current path.
   Allows opening for read or write, text or binary
   through supported modes: ``'r'``, ``'w'``, ``'rb'``, ``'wb'``.
   Positional and keyword arguments are passed through to
   :class:`io.TextIOWrapper` when opened as text and
   ignored otherwise.
   ``pwd`` is the ``pwd`` parameter to
   :meth:`ZipFile.open`.
502
503 .. versionchanged:: 3.9
504 Added support for text and binary modes for open. Default
505 mode is now text.
Jason R. Coombsb2758ff2019-05-08 09:45:06 -0400506
Claudiu Popa65444cf2019-11-21 22:23:13 +0100507.. method:: Path.iterdir()
Jason R. Coombsb2758ff2019-05-08 09:45:06 -0400508
509 Enumerate the children of the current directory.
510
511.. method:: Path.is_dir()
512
513 Return ``True`` if the current context references a directory.
514
515.. method:: Path.is_file()
516
517 Return ``True`` if the current context references a file.
518
519.. method:: Path.exists()
520
521 Return ``True`` if the current context references a file or
522 directory in the zip file.
523
524.. method:: Path.read_text(*, **)
525
526 Read the current file as unicode text. Positional and
527 keyword arguments are passed through to
528 :class:`io.TextIOWrapper` (except ``buffer``, which is
529 implied by the context).
530
531.. method:: Path.read_bytes()
532
533 Read the current file as bytes.
534
Jason R. Coombs928dbfc2020-12-15 21:12:54 -0500535.. method:: Path.joinpath(*other)
536
537 Return a new Path object with each of the *other* arguments
538 joined. The following are equivalent::
539
540 >>> Path(...).joinpath('child').joinpath('grandchild')
541 >>> Path(...).joinpath('child', 'grandchild')
542 >>> Path(...) / 'child' / 'grandchild'
543
544 .. versionchanged:: 3.10
545 Prior to 3.10, ``joinpath`` was undocumented and accepted
546 exactly one parameter.
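
The traversal and query methods above can be sketched together as follows,
assuming Python 3.8 or later. The archive is built in memory and every member
name is invented for the example.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:
    zf.writestr('docs/guide.txt', 'guide')
    zf.writestr('docs/api/index.txt', 'index')

with zipfile.ZipFile(buffer) as zf:
    root = zipfile.Path(zf)

    # Traverse with the / operator.
    docs = root / 'docs'
    guide = docs / 'guide.txt'

    # iterdir() yields Path objects for files and subdirectories.
    children = sorted(child.name for child in docs.iterdir())

    guide_text = guide.read_text(encoding='utf-8')
    flags = (docs.is_dir(), guide.is_file(), (root / 'missing').exists())

print(children, guide_text, flags)
```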
547
Jason R. Coombsb2758ff2019-05-08 09:45:06 -0400548
Georg Brandl116aa622007-08-15 14:28:22 +0000549.. _pyzipfile-objects:
550
551PyZipFile Objects
552-----------------
553
554The :class:`PyZipFile` constructor takes the same parameters as the
Georg Brandl8334fd92010-12-04 10:26:46 +0000555:class:`ZipFile` constructor, and one additional parameter, *optimize*.
Georg Brandl116aa622007-08-15 14:28:22 +0000556
Serhiy Storchaka235c5e02013-11-23 15:55:38 +0200557.. class:: PyZipFile(file, mode='r', compression=ZIP_STORED, allowZip64=True, \
Georg Brandl8334fd92010-12-04 10:26:46 +0000558 optimize=-1)
Georg Brandl116aa622007-08-15 14:28:22 +0000559
Georg Brandl8334fd92010-12-04 10:26:46 +0000560 .. versionadded:: 3.2
561 The *optimize* parameter.
Georg Brandl116aa622007-08-15 14:28:22 +0000562
Serhiy Storchaka235c5e02013-11-23 15:55:38 +0200563 .. versionchanged:: 3.4
564 ZIP64 extensions are enabled by default.
565
Georg Brandl8334fd92010-12-04 10:26:46 +0000566 Instances have one method in addition to those of :class:`ZipFile` objects:

   .. method:: PyZipFile.writepy(pathname, basename='', filterfunc=None)

      Search for files :file:`\*.py` and add the corresponding file to the
      archive.

      If the *optimize* parameter to :class:`PyZipFile` was not given or was
      ``-1``, the corresponding file is a :file:`\*.pyc` file, compiling if
      necessary.

      If the *optimize* parameter to :class:`PyZipFile` was ``0``, ``1`` or
      ``2``, only files with that optimization level (see :func:`compile`) are
      added to the archive, compiling if necessary.

      If *pathname* is a file, the filename must end with :file:`.py`, and
      just the (corresponding :file:`\*.pyc`) file is added at the top level
      (no path information).  If *pathname* is a file that does not end with
      :file:`.py`, a :exc:`RuntimeError` will be raised.  If it is a directory,
      and the directory is not a package directory, then all the
      :file:`\*.pyc` files are added at the top level.  If the directory is a
      package directory, then all :file:`\*.pyc` files are added under the
      package name as a file path, and if any subdirectories are package
      directories, all of these are added recursively in sorted order.

      *basename* is intended for internal use only.

      *filterfunc*, if given, must be a function taking a single string
      argument.  It will be passed each path (including each individual full
      file path) before it is added to the archive.  If *filterfunc* returns a
      false value, the path will not be added, and if it is a directory its
      contents will be ignored.  For example, if our test files are all either
      in ``test`` directories or start with the string ``test_``, we can use a
      *filterfunc* to exclude them::

         >>> import os
         >>> from zipfile import PyZipFile
         >>> zf = PyZipFile('myprog.zip', mode='w')
         >>> def notests(s):
         ...     fn = os.path.basename(s)
         ...     return (not (fn == 'test' or fn.startswith('test_')))
         ...
         >>> zf.writepy('myprog', filterfunc=notests)
         >>> zf.close()

      The :meth:`writepy` method makes archives with file names like
      this::

         string.pyc                   # Top level name
         test/__init__.pyc            # Package directory
         test/testall.pyc             # Module test.testall
         test/bogus/__init__.pyc      # Subpackage directory
         test/bogus/myfile.pyc        # Submodule test.bogus.myfile

      .. versionadded:: 3.4
         The *filterfunc* parameter.

      .. versionchanged:: 3.6.2
         The *pathname* parameter accepts a :term:`path-like object`.

      .. versionchanged:: 3.7
         Recursion sorts directory entries.


.. _zipinfo-objects:

ZipInfo Objects
---------------

Instances of the :class:`ZipInfo` class are returned by the :meth:`.getinfo` and
:meth:`.infolist` methods of :class:`ZipFile` objects.  Each object stores
information about a single member of the ZIP archive.

There is one classmethod to make a :class:`ZipInfo` instance for a filesystem
file:

.. classmethod:: ZipInfo.from_file(filename, arcname=None, *, \
                                   strict_timestamps=True)

   Construct a :class:`ZipInfo` instance for a file on the filesystem, in
   preparation for adding it to a zip file.

   *filename* should be the path to a file or directory on the filesystem.

   If *arcname* is specified, it is used as the name within the archive.
   If *arcname* is not specified, the name will be the same as *filename*, but
   with any drive letter and leading path separators removed.

   The *strict_timestamps* argument, when set to ``False``, allows files
   older than 1980-01-01 to be zipped, at the cost of setting their
   timestamp to 1980-01-01.
   Similarly, files newer than 2107-12-31 have their timestamp
   set to that upper limit.

   .. versionadded:: 3.6

   .. versionchanged:: 3.6.2
      The *filename* parameter accepts a :term:`path-like object`.

   .. versionadded:: 3.8
      The *strict_timestamps* keyword-only argument.
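
The following is a minimal sketch of :meth:`ZipInfo.from_file` with
*strict_timestamps* disabled, assuming Python 3.8 or later. The file name
``old.txt`` and the 1975 modification time are made up for illustration.

```python
import os
import tempfile
import time
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "old.txt")  # hypothetical file
    with open(path, "w") as f:
        f.write("legacy data")

    # Backdate the file to mid-1975, before the earliest ZIP timestamp.
    stamp = time.mktime((1975, 6, 1, 12, 0, 0, 0, 0, -1))
    os.utime(path, (stamp, stamp))

    # strict_timestamps=False clamps the out-of-range timestamp
    # instead of rejecting it.
    info = zipfile.ZipInfo.from_file(path, arcname="old.txt",
                                     strict_timestamps=False)

print(info.filename)   # old.txt
print(info.date_time)  # (1980, 1, 1, 0, 0, 0)
```

Passing *arcname* explicitly keeps the temporary directory's path out of the
archive member name.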


Instances have the following methods and attributes:

.. method:: ZipInfo.is_dir()

   Return ``True`` if this archive member is a directory.

   This uses the entry's name: directories should always end with ``/``.

   .. versionadded:: 3.6
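
As a small illustration of the name-based check, the member names below are
invented and the :class:`ZipInfo` objects are built directly rather than read
from an archive:

```python
import zipfile

directory = zipfile.ZipInfo("docs/")            # trailing slash: a directory
regular = zipfile.ZipInfo("docs/readme.txt")    # no trailing slash: a file

print(directory.is_dir())  # True
print(regular.is_dir())    # False
```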


.. attribute:: ZipInfo.filename

   Name of the file in the archive.


.. attribute:: ZipInfo.date_time

   The time and date of the last modification to the archive member.  This is a
   tuple of six values:

   +-------+--------------------------+
   | Index | Value                    |
   +=======+==========================+
   | ``0`` | Year (>= 1980)           |
   +-------+--------------------------+
   | ``1`` | Month (one-based)        |
   +-------+--------------------------+
   | ``2`` | Day of month (one-based) |
   +-------+--------------------------+
   | ``3`` | Hours (zero-based)       |
   +-------+--------------------------+
   | ``4`` | Minutes (zero-based)     |
   +-------+--------------------------+
   | ``5`` | Seconds (zero-based)     |
   +-------+--------------------------+

   .. note::

      The ZIP file format does not support timestamps before 1980.
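
Because the six values are in the same order as the leading arguments of
:class:`datetime.datetime`, the tuple can be unpacked directly. The sketch
below uses an in-memory archive and a made-up member name; the resulting
object is naive, since the tuple carries no time zone information.

```python
import datetime
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    # Hypothetical member with an explicit modification time.
    member = zipfile.ZipInfo("notes.txt", date_time=(2020, 1, 2, 3, 4, 6))
    zf.writestr(member, "hello")

with zipfile.ZipFile(buffer) as zf:
    info = zf.getinfo("notes.txt")
    modified = datetime.datetime(*info.date_time)

print(modified.isoformat())  # 2020-01-02T03:04:06
```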


.. attribute:: ZipInfo.compress_type

   Type of compression for the archive member.


.. attribute:: ZipInfo.comment

   Comment for the individual archive member as a :class:`bytes` object.


.. attribute:: ZipInfo.extra

   Expansion field data.  The `PKZIP Application Note`_ contains
   some comments on the internal structure of the data contained in this
   :class:`bytes` object.


.. attribute:: ZipInfo.create_system

   System which created the ZIP archive.


.. attribute:: ZipInfo.create_version

   PKZIP version which created the ZIP archive.


.. attribute:: ZipInfo.extract_version

   PKZIP version needed to extract the archive.


.. attribute:: ZipInfo.reserved

   Must be zero.


.. attribute:: ZipInfo.flag_bits

   ZIP flag bits.


.. attribute:: ZipInfo.volume

   Volume number of the file header.


.. attribute:: ZipInfo.internal_attr

   Internal attributes.


.. attribute:: ZipInfo.external_attr

   External file attributes.
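
The meaning of the external attributes depends on the system that created the
archive. The sketch below assumes the common Unix convention of storing the
file mode in the upper 16 bits; that convention, and the member name
``run.sh``, are assumptions of this example rather than something stated
above.

```python
import stat
import zipfile

info = zipfile.ZipInfo("run.sh")  # hypothetical member
# Assumption: Unix convention, st_mode lives in the upper 16 bits.
info.external_attr = (stat.S_IFREG | 0o755) << 16

mode = info.external_attr >> 16
print(stat.filemode(mode))      # -rwxr-xr-x
print(oct(stat.S_IMODE(mode)))  # 0o755
```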


.. attribute:: ZipInfo.header_offset

   Byte offset to the file header.


.. attribute:: ZipInfo.CRC

   CRC-32 of the uncompressed file.


.. attribute:: ZipInfo.compress_size

   Size of the compressed data.


.. attribute:: ZipInfo.file_size

   Size of the uncompressed file.
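
A short sketch relating the size and checksum attributes to the data that was
written. It uses an in-memory archive, an invented member name, and assumes
the :mod:`zlib` module is available for ``ZIP_DEFLATED``.

```python
import io
import zipfile
import zlib

payload = b"spam and eggs\n" * 500  # highly repetitive, compresses well

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("menu.txt", payload)

with zipfile.ZipFile(buffer) as zf:
    info = zf.getinfo("menu.txt")

print(info.file_size == len(payload))       # True
print(info.compress_size < info.file_size)  # True
print(info.CRC == zlib.crc32(payload))      # True
```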


.. _zipfile-commandline:
.. program:: zipfile

Command-Line Interface
----------------------

The :mod:`zipfile` module provides a simple command-line interface to interact
with ZIP archives.

If you want to create a new ZIP archive, specify its name after the :option:`-c`
option and then list the filename(s) that should be included:

.. code-block:: shell-session

    $ python -m zipfile -c monty.zip spam.txt eggs.txt

Passing a directory is also acceptable:

.. code-block:: shell-session

    $ python -m zipfile -c monty.zip life-of-brian_1979/

If you want to extract a ZIP archive into the specified directory, use
the :option:`-e` option:

.. code-block:: shell-session

    $ python -m zipfile -e monty.zip target-dir/

For a list of the files in a ZIP archive, use the :option:`-l` option:

.. code-block:: shell-session

    $ python -m zipfile -l monty.zip
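
For comparison, the same operations can be approximated from Python code. This
is only a rough sketch: it works in a temporary directory, and it is not
claimed to reproduce the command-line interface's exact behavior (for example
its output format or compression settings).

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    sources = []
    for name in ("spam.txt", "eggs.txt"):
        path = os.path.join(tmp, name)
        with open(path, "w") as f:
            f.write(name)
        sources.append(path)

    archive = os.path.join(tmp, "monty.zip")

    # Roughly "python -m zipfile -c monty.zip spam.txt eggs.txt"
    with zipfile.ZipFile(archive, "w") as zf:
        for path in sources:
            zf.write(path, arcname=os.path.basename(path))

    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()     # roughly "-l monty.zip"
        first_bad = zf.testzip()  # roughly "-t monty.zip"
        target = os.path.join(tmp, "target-dir")
        zf.extractall(target)     # roughly "-e monty.zip target-dir/"

    extracted = sorted(os.listdir(target))

print(names)      # ['spam.txt', 'eggs.txt']
print(first_bad)  # None
print(extracted)  # ['eggs.txt', 'spam.txt']
```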


Command-line options
~~~~~~~~~~~~~~~~~~~~

.. cmdoption:: -l <zipfile>
               --list <zipfile>

   List files in a zipfile.

.. cmdoption:: -c <zipfile> <source1> ... <sourceN>
               --create <zipfile> <source1> ... <sourceN>

   Create zipfile from source files.

.. cmdoption:: -e <zipfile> <output_dir>
               --extract <zipfile> <output_dir>

   Extract zipfile into target directory.

.. cmdoption:: -t <zipfile>
               --test <zipfile>

   Test whether the zipfile is valid or not.

Decompression pitfalls
----------------------

Extraction with the :mod:`zipfile` module may fail because of the pitfalls
listed below.

From file itself
~~~~~~~~~~~~~~~~

Decompression may fail because of an incorrect password, CRC checksum, or ZIP
format, or because of an unsupported compression method or decryption scheme.
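
One of these failures, a file that is not in ZIP format, can be caught with
:exc:`BadZipFile`. The helper name ``open_archive`` below is hypothetical, and
the sketch does not cover the other failure modes (bad password, bad CRC,
unsupported compression).

```python
import io
import zipfile

def open_archive(fileobj):
    """Return a ZipFile, or None if the data is not a usable ZIP archive."""
    try:
        return zipfile.ZipFile(fileobj)
    except zipfile.BadZipFile:
        return None

good = io.BytesIO()
with zipfile.ZipFile(good, "w") as zf:
    zf.writestr("ok.txt", "fine")

print(open_archive(io.BytesIO(b"this is not a zip file")))  # None
print(open_archive(good).namelist())                        # ['ok.txt']
```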

File System limitations
~~~~~~~~~~~~~~~~~~~~~~~

Exceeding the limits of the target file system can cause decompression to fail.
Such limits include the characters allowed in directory entries, the length of
the file name, the length of the pathname, the size of a single file, and the
number of files.

Resources limitations
~~~~~~~~~~~~~~~~~~~~~

A lack of memory or disk space can cause decompression to fail.
For example, decompression bombs (aka `ZIP bomb`_) can be used
against the zipfile library to exhaust the disk volume.
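
One possible precaution is to add up the uncompressed sizes recorded in the
archive before extracting anything. The helper names ``declared_size`` and
``check_size`` and the limits used here are invented for this sketch. The
sizes come from the archive's own headers, which a malicious archive could
misreport, so this is a first check rather than a complete defense.

```python
import io
import zipfile

def declared_size(zf):
    """Sum the uncompressed sizes recorded in the archive's headers."""
    return sum(info.file_size for info in zf.infolist())

def check_size(zf, limit):
    """Raise ValueError if the archive claims to expand beyond *limit* bytes."""
    total = declared_size(zf)
    if total > limit:
        raise ValueError(f"archive expands to {total} bytes, limit is {limit}")
    return total

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("zeros.bin", b"\0" * 1_000_000)

with zipfile.ZipFile(buffer) as zf:
    print(check_size(zf, limit=10_000_000))  # 1000000
    try:
        check_size(zf, limit=100_000)
    except ValueError as exc:
        print("refused:", exc)
```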

Interruption
~~~~~~~~~~~~

Interrupting the decompression, for example by pressing control-C or killing the
decompression process, may leave the archive only partially extracted.
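
One way to limit the damage, sketched here under assumptions, is to extract
into a scratch directory and rename it into place only once extraction has
finished. The function name ``extract_via_scratch`` is hypothetical, the
destination is assumed not to exist yet, and a process killed outright can
still leave the scratch directory behind.

```python
import io
import os
import shutil
import tempfile
import zipfile

def extract_via_scratch(archive, dest):
    """Extract into a scratch directory, then rename it to *dest*."""
    parent = os.path.dirname(os.path.abspath(dest))
    scratch = tempfile.mkdtemp(dir=parent)
    try:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(scratch)
        os.rename(scratch, dest)  # assumes dest does not exist yet
    except BaseException:
        # Also covers KeyboardInterrupt: drop the partial extraction.
        shutil.rmtree(scratch, ignore_errors=True)
        raise

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("a.txt", "data")

with tempfile.TemporaryDirectory() as tmp:
    dest = os.path.join(tmp, "out")
    extract_via_scratch(buffer, dest)
    result = sorted(os.listdir(dest))

print(result)  # ['a.txt']
```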

Default behaviors of extraction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Not knowing the default extraction behaviors
can cause unexpected decompression results.
For example, extracting the same archive twice
overwrites the existing files without asking.
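
The overwrite behavior can be observed with a small experiment. The member
name and file contents are made up, and everything happens in a temporary
directory.

```python
import io
import os
import tempfile
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("config.ini", "from archive")

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "config.ini")
    with zipfile.ZipFile(buffer) as zf:
        zf.extractall(tmp)
        # Simulate a local edit made after the first extraction.
        with open(target, "w") as f:
            f.write("edited locally")
        # The second extraction silently replaces the edited file.
        zf.extractall(tmp)
    with open(target) as f:
        content = f.read()

print(content)  # from archive
```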


.. _ZIP bomb: https://en.wikipedia.org/wiki/Zip_bomb
.. _PKZIP Application Note: https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT