:mod:`gzip` --- Support for :program:`gzip` files
=================================================

.. module:: gzip
   :synopsis: Interfaces for gzip compression and decompression using file objects.

**Source code:** :source:`Lib/gzip.py`

--------------

This module provides a simple interface to compress and decompress files just
like the GNU programs :program:`gzip` and :program:`gunzip` would.

The data compression is provided by the :mod:`zlib` module.

The :mod:`gzip` module provides the :class:`GzipFile` class, as well as the
:func:`.open`, :func:`compress` and :func:`decompress` convenience functions.
The :class:`GzipFile` class reads and writes :program:`gzip`\ -format files,
automatically compressing or decompressing the data so that it looks like an
ordinary :term:`file object`.

Note that additional file formats which can be decompressed by the
:program:`gzip` and :program:`gunzip` programs, such as those produced by
:program:`compress` and :program:`pack`, are not supported by this module.

The module defines the following items:


.. function:: open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)

   Open a gzip-compressed file in binary or text mode, returning a :term:`file
   object`.

   The *filename* argument can be an actual filename (a :class:`str` or
   :class:`bytes` object), or an existing file object to read from or write to.

   The *mode* argument can be any of ``'r'``, ``'rb'``, ``'a'``, ``'ab'``,
   ``'w'``, ``'wb'``, ``'x'`` or ``'xb'`` for binary mode, or ``'rt'``,
   ``'at'``, ``'wt'``, or ``'xt'`` for text mode. The default is ``'rb'``.

   The *compresslevel* argument is an integer from 0 to 9, as for the
   :class:`GzipFile` constructor.

   For binary mode, this function is equivalent to the :class:`GzipFile`
   constructor: ``GzipFile(filename, mode, compresslevel)``. In this case, the
   *encoding*, *errors* and *newline* arguments must not be provided.

   For text mode, a :class:`GzipFile` object is created, and wrapped in an
   :class:`io.TextIOWrapper` instance with the specified encoding, error
   handling behavior, and line ending(s).

   .. versionchanged:: 3.3
      Added support for *filename* being a file object, support for text mode,
      and the *encoding*, *errors* and *newline* arguments.

   .. versionchanged:: 3.4
      Added support for the ``'x'``, ``'xb'`` and ``'xt'`` modes.

   .. versionchanged:: 3.6
      Accepts a :term:`path-like object`.

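The difference between the text and binary modes of :func:`.open` can be
sketched as follows. This is a minimal illustration, not a recommended
pattern: the file name ``notes.txt.gz`` and the temporary directory are
assumptions made only so that the snippet is self-contained.

```python
import gzip
import os
import tempfile

# Hypothetical file name inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "notes.txt.gz")

    # 'wt' wraps the GzipFile in an io.TextIOWrapper, so str objects are written.
    with gzip.open(path, "wt", encoding="utf-8", newline="\n") as f:
        f.write("first line\nsecond line\n")

    # 'rt' decodes the data on the way back and supports line iteration.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        lines = [line.rstrip("\n") for line in f]

    # Binary mode returns the uncompressed bytes without decoding them.
    with gzip.open(path, "rb") as f:
        raw = f.read()
```

In binary mode the *encoding*, *errors* and *newline* arguments are left out,
as required above.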

.. exception:: BadGzipFile

   An exception raised for invalid gzip files. It inherits :exc:`OSError`.
   :exc:`EOFError` and :exc:`zlib.error` can also be raised for invalid gzip
   files.

   .. versionadded:: 3.8

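Because three different exception types can signal invalid input, code that
handles untrusted data may want to catch all of them. The helper below is a
sketch under that assumption; ``safe_decompress`` is a hypothetical name and
is not part of the :mod:`gzip` module.

```python
import gzip
import zlib


def safe_decompress(data):
    """Return the decompressed bytes, or None if *data* is not valid gzip.

    Hypothetical helper, shown only to illustrate the exception types.
    """
    try:
        return gzip.decompress(data)
    except (gzip.BadGzipFile, EOFError, zlib.error):
        return None


good = gzip.compress(b"payload")

restored = safe_decompress(good)                              # valid stream
not_gzip = safe_decompress(b"plain bytes, no gzip header")    # wrong magic number
truncated = safe_decompress(good[:-4])                        # trailer cut short
```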

.. class:: GzipFile(filename=None, mode=None, compresslevel=9, fileobj=None, mtime=None)

   Constructor for the :class:`GzipFile` class, which simulates most of the
   methods of a :term:`file object`, with the exception of the :meth:`truncate`
   method. At least one of *fileobj* and *filename* must be given a non-trivial
   value.

   The new class instance is based on *fileobj*, which can be a regular file, an
   :class:`io.BytesIO` object, or any other object which simulates a file. It
   defaults to ``None``, in which case *filename* is opened to provide a file
   object.

   When *fileobj* is not ``None``, the *filename* argument is used only for
   inclusion in the :program:`gzip` file header, which may include the original
   filename of the uncompressed file. It defaults to the filename of *fileobj*, if
   discernible; otherwise, it defaults to the empty string, and in this case the
   original filename is not included in the header.

   The *mode* argument can be any of ``'r'``, ``'rb'``, ``'a'``, ``'ab'``, ``'w'``,
   ``'wb'``, ``'x'``, or ``'xb'``, depending on whether the file will be read or
   written. The default is the mode of *fileobj* if discernible; otherwise, the
   default is ``'rb'``.

   Note that the file is always opened in binary mode. To open a compressed file
   in text mode, use :func:`.open` (or wrap your :class:`GzipFile` with an
   :class:`io.TextIOWrapper`).

   The *compresslevel* argument is an integer from ``0`` to ``9`` controlling
   the level of compression; ``1`` is fastest and produces the least
   compression, and ``9`` is slowest and produces the most compression. ``0``
   is no compression. The default is ``9``.

   The *mtime* argument is an optional numeric timestamp to be written to
   the last modification time field in the stream when compressing. It
   should only be provided in compression mode. If omitted or ``None``, the
   current time is used. See the :attr:`mtime` attribute for more details.

   Calling a :class:`GzipFile` object's :meth:`close` method does not close
   *fileobj*, since you might wish to append more material after the compressed
   data. This also allows you to pass an :class:`io.BytesIO` object opened for
   writing as *fileobj*, and retrieve the resulting memory buffer using the
   :class:`io.BytesIO` object's :meth:`~io.BytesIO.getvalue` method.

   :class:`GzipFile` supports the :class:`io.BufferedIOBase` interface,
   including iteration and the :keyword:`with` statement. Only the
   :meth:`truncate` method isn't implemented.

   :class:`GzipFile` also provides the following method and attribute:

   .. method:: peek(n)

      Read *n* uncompressed bytes without advancing the file position.
      At most one read on the compressed stream is done to satisfy
      the call. The number of bytes returned may be more or less than
      requested.

      .. note:: While calling :meth:`peek` does not change the file position of
         the :class:`GzipFile`, it may change the position of the underlying
         file object (e.g. if the :class:`GzipFile` was constructed with the
         *fileobj* parameter).

      .. versionadded:: 3.2

   .. attribute:: mtime

      When decompressing, the value of the last modification time field in
      the most recently read header may be read from this attribute, as an
      integer. The initial value before reading any headers is ``None``.

      All :program:`gzip` compressed streams are required to contain this
      timestamp field. Some programs, such as :program:`gunzip`\ , make use
      of the timestamp. The format is the same as the return value of
      :func:`time.time` and the :attr:`~os.stat_result.st_mtime` attribute of
      the object returned by :func:`os.stat`.

   .. versionchanged:: 3.1
      Support for the :keyword:`with` statement was added, along with the
      *mtime* constructor argument and :attr:`mtime` attribute.

   .. versionchanged:: 3.2
      Support for zero-padded and unseekable files was added.

   .. versionchanged:: 3.3
      The :meth:`io.BufferedIOBase.read1` method is now implemented.

   .. versionchanged:: 3.4
      Added support for the ``'x'`` and ``'xb'`` modes.

   .. versionchanged:: 3.5
      Added support for writing arbitrary
      :term:`bytes-like objects <bytes-like object>`.
      The :meth:`~io.BufferedIOBase.read` method now accepts an argument of
      ``None``.

   .. versionchanged:: 3.6
      Accepts a :term:`path-like object`.

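Several of the :class:`GzipFile` behaviours described above can be seen
together in a small in-memory round trip: the *fileobj* staying open after
:meth:`close`, the *mtime* argument and attribute, and :meth:`peek`. The
sketch below assumes an arbitrary payload and an arbitrary fixed timestamp of
``0``.

```python
import gzip
import io

payload = b"in-memory payload"

# Compress into a BytesIO buffer; mtime=0 is an arbitrary fixed timestamp.
buffer = io.BytesIO()
with gzip.GzipFile(fileobj=buffer, mode="wb", mtime=0) as gz:
    gz.write(payload)

# Closing the GzipFile leaves the BytesIO open, so getvalue() still works.
compressed = buffer.getvalue()

with gzip.GzipFile(fileobj=io.BytesIO(compressed), mode="rb") as gz:
    mtime_before = gz.mtime   # no header has been read yet
    head = gz.peek(4)         # may return more or fewer than 4 bytes
    restored = gz.read()      # peek() did not advance the file position
    mtime_after = gz.mtime    # timestamp taken from the header just read
```

Since :meth:`peek` may return more or fewer bytes than requested, callers that
need an exact prefix should slice the result, for example ``head[:4]``.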

.. function:: compress(data, compresslevel=9, *, mtime=None)

   Compress the *data*, returning a :class:`bytes` object containing
   the compressed data. *compresslevel* and *mtime* have the same meaning as in
   the :class:`GzipFile` constructor above.

   .. versionadded:: 3.2

   .. versionchanged:: 3.8
      Added the *mtime* parameter for reproducible output.

.. function:: decompress(data)

   Decompress the *data*, returning a :class:`bytes` object containing the
   uncompressed data.

   .. versionadded:: 3.2

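The reproducible-output use of *mtime* can be illustrated with a short round
trip through :func:`compress` and :func:`decompress`. This is a sketch that
assumes Python 3.8 or later (where *mtime* is accepted); the timestamp ``0``
and the compression level ``6`` are arbitrary choices.

```python
import gzip

data = b"Lots of content here" * 100

# A fixed mtime keeps the header identical from one call to the next.
first = gzip.compress(data, compresslevel=6, mtime=0)
second = gzip.compress(data, compresslevel=6, mtime=0)

# decompress() reverses compress() and returns the original bytes.
restored = gzip.decompress(first)
```

Without a fixed *mtime*, the header carries the current time, so two calls
made at different times can produce different bytes for the same input.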

.. _gzip-usage-examples:

Examples of usage
-----------------

Example of how to read a compressed file::

   import gzip
   with gzip.open('/home/joe/file.txt.gz', 'rb') as f:
       file_content = f.read()

Example of how to create a compressed GZIP file::

   import gzip
   content = b"Lots of content here"
   with gzip.open('/home/joe/file.txt.gz', 'wb') as f:
       f.write(content)

Example of how to GZIP compress an existing file::

   import gzip
   import shutil
   with open('/home/joe/file.txt', 'rb') as f_in:
       with gzip.open('/home/joe/file.txt.gz', 'wb') as f_out:
           shutil.copyfileobj(f_in, f_out)

Example of how to GZIP compress a binary string::

   import gzip
   s_in = b"Lots of content here"
   s_out = gzip.compress(s_in)

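The reverse of compressing an existing file, decompressing a GZIP file into a
regular file, can be sketched the same way. The paths below are hypothetical
and live in a temporary directory only so that the snippet can run anywhere.

```python
import gzip
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical paths; substitute real ones as needed.
    gz_path = os.path.join(tmp_dir, "file.txt.gz")
    out_path = os.path.join(tmp_dir, "file.txt")

    # Create a small compressed file to work with.
    with gzip.open(gz_path, "wb") as f:
        f.write(b"Lots of content here")

    # Stream the decompressed bytes into a regular file.
    with gzip.open(gz_path, "rb") as f_in:
        with open(out_path, "wb") as f_out:
            shutil.copyfileobj(f_in, f_out)

    with open(out_path, "rb") as f:
        restored = f.read()
```

Copying with :func:`shutil.copyfileobj` avoids holding the whole uncompressed
file in memory at once.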

.. seealso::

   Module :mod:`zlib`
      The basic data compression module needed to support the :program:`gzip` file
      format.


.. program:: gzip

Command Line Interface
----------------------

The :mod:`gzip` module provides a simple command line interface to compress or
decompress files.

Once executed, the :mod:`gzip` module keeps the input file(s).

.. versionchanged:: 3.8

   Added a new command line interface with a usage message.
   By default, the CLI uses compression level 6.

Command line options
^^^^^^^^^^^^^^^^^^^^

.. cmdoption:: file

   If *file* is not specified, read from :attr:`sys.stdin`.

.. cmdoption:: --fast

   Indicates the fastest compression method (less compression).

.. cmdoption:: --best

   Indicates the slowest compression method (best compression).

.. cmdoption:: -d, --decompress

   Decompress the given file.

.. cmdoption:: -h, --help

   Show the help message.

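The command line interface can also be driven from a script. The sketch below
runs the equivalent of ``python -m gzip --best report.txt`` through
:mod:`subprocess`. The file name ``report.txt`` is hypothetical, and the
``.gz`` suffix on the output file reflects the behaviour of the CPython
implementation rather than anything stated above, so treat it as an
assumption.

```python
import gzip
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    src = os.path.join(tmp_dir, "report.txt")  # hypothetical file name
    with open(src, "wb") as f:
        f.write(b"Lots of content here\n" * 50)

    # Equivalent to running: python -m gzip --best report.txt
    subprocess.run([sys.executable, "-m", "gzip", "--best", src], check=True)

    # The input file is kept next to the compressed output.
    input_kept = os.path.exists(src)

    # Assumption: the output is written to the input name plus ".gz".
    with gzip.open(src + ".gz", "rb") as f:
        restored = f.read()
```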