:mod:`gzip` --- Support for :program:`gzip` files
=================================================

.. module:: gzip
   :synopsis: Interfaces for gzip compression and decompression using file objects.

**Source code:** :source:`Lib/gzip.py`

--------------

This module provides a simple interface to compress and decompress files just
like the GNU programs :program:`gzip` and :program:`gunzip` would.

The data compression is provided by the :mod:`zlib` module.

The :mod:`gzip` module provides the :class:`GzipFile` class, as well as the
:func:`.open`, :func:`compress` and :func:`decompress` convenience functions.
The :class:`GzipFile` class reads and writes :program:`gzip`\ -format files,
automatically compressing or decompressing the data so that it looks like an
ordinary :term:`file object`.

Note that additional file formats which can be decompressed by the
:program:`gzip` and :program:`gunzip` programs, such as those produced by
:program:`compress` and :program:`pack`, are not supported by this module.

The module defines the following items:


.. function:: open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)

   Open a gzip-compressed file in binary or text mode, returning a :term:`file
   object`.

   The *filename* argument can be an actual filename (a :class:`str` or
   :class:`bytes` object), or an existing file object to read from or write to.

   The *mode* argument can be any of ``'r'``, ``'rb'``, ``'a'``, ``'ab'``,
   ``'w'``, ``'wb'``, ``'x'`` or ``'xb'`` for binary mode, or ``'rt'``,
   ``'at'``, ``'wt'``, or ``'xt'`` for text mode. The default is ``'rb'``.

   The *compresslevel* argument is an integer from 0 to 9, as for the
   :class:`GzipFile` constructor.

   For binary mode, this function is equivalent to the :class:`GzipFile`
   constructor: ``GzipFile(filename, mode, compresslevel)``. In this case, the
   *encoding*, *errors* and *newline* arguments must not be provided.

   For text mode, a :class:`GzipFile` object is created, and wrapped in an
   :class:`io.TextIOWrapper` instance with the specified encoding, error
   handling behavior, and line ending(s).

   .. versionchanged:: 3.3
      Added support for *filename* being a file object, support for text mode,
      and the *encoding*, *errors* and *newline* arguments.

   .. versionchanged:: 3.4
      Added support for the ``'x'``, ``'xb'`` and ``'xt'`` modes.


.. class:: GzipFile(filename=None, mode=None, compresslevel=9, fileobj=None, mtime=None)

   Constructor for the :class:`GzipFile` class, which simulates most of the
   methods of a :term:`file object`, with the exception of the :meth:`truncate`
   method. At least one of *fileobj* and *filename* must be given a non-trivial
   value.

   The new class instance is based on *fileobj*, which can be a regular file, a
   :class:`io.BytesIO` object, or any other object which simulates a file. It
   defaults to ``None``, in which case *filename* is opened to provide a file
   object.

   When *fileobj* is not ``None``, the *filename* argument is used only for
   inclusion in the :program:`gzip` file header, which may include the original
   filename of the uncompressed file. It defaults to the filename of *fileobj*, if
   discernible; otherwise, it defaults to the empty string, and in this case the
   original filename is not included in the header.

   The *mode* argument can be any of ``'r'``, ``'rb'``, ``'a'``, ``'ab'``, ``'w'``,
   ``'wb'``, ``'x'``, or ``'xb'``, depending on whether the file will be read or
   written. The default is the mode of *fileobj* if discernible; otherwise, the
   default is ``'rb'``.

   Note that the file is always opened in binary mode. To open a compressed file
   in text mode, use :func:`.open` (or wrap your :class:`GzipFile` with an
   :class:`io.TextIOWrapper`).

   The *compresslevel* argument is an integer from ``0`` to ``9`` controlling
   the level of compression; ``1`` is fastest and produces the least
   compression, and ``9`` is slowest and produces the most compression. ``0``
   is no compression. The default is ``9``.

   The *mtime* argument is an optional numeric timestamp to be written to
   the last modification time field in the stream when compressing. It
   should only be provided in compression mode. If omitted or ``None``, the
   current time is used. See the :attr:`mtime` attribute for more details.

   Calling a :class:`GzipFile` object's :meth:`close` method does not close
   *fileobj*, since you might wish to append more material after the compressed
   data. This also allows you to pass a :class:`io.BytesIO` object opened for
   writing as *fileobj*, and retrieve the resulting memory buffer using the
   :class:`io.BytesIO` object's :meth:`~io.BytesIO.getvalue` method.

   :class:`GzipFile` supports the :class:`io.BufferedIOBase` interface,
   including iteration and the :keyword:`with` statement. Only the
   :meth:`truncate` method isn't implemented.

   :class:`GzipFile` also provides the following method and attribute:

   .. method:: peek(n)

      Read *n* uncompressed bytes without advancing the file position.
      At most one single read on the compressed stream is done to satisfy
      the call. The number of bytes returned may be more or less than
      requested.

      .. note:: While calling :meth:`peek` does not change the file position of
         the :class:`GzipFile`, it may change the position of the underlying
         file object (e.g. if the :class:`GzipFile` was constructed with the
         *fileobj* parameter).

      .. versionadded:: 3.2

   .. attribute:: mtime

      When decompressing, the value of the last modification time field in
      the most recently read header may be read from this attribute, as an
      integer. The initial value before reading any headers is ``None``.

      All :program:`gzip` compressed streams are required to contain this
      timestamp field. Some programs, such as :program:`gunzip`\ , make use
      of the timestamp. The format is the same as the return value of
      :func:`time.time` and the :attr:`~os.stat_result.st_mtime` attribute of
      the object returned by :func:`os.stat`.

   .. versionchanged:: 3.1
      Support for the :keyword:`with` statement was added, along with the
      *mtime* constructor argument and :attr:`mtime` attribute.

   .. versionchanged:: 3.2
      Support for zero-padded and unseekable files was added.

   .. versionchanged:: 3.3
      The :meth:`io.BufferedIOBase.read1` method is now implemented.

   .. versionchanged:: 3.4
      Added support for the ``'x'`` and ``'xb'`` modes.

   .. versionchanged:: 3.5
      Added support for writing arbitrary
      :term:`bytes-like objects <bytes-like object>`.
      The :meth:`~io.BufferedIOBase.read` method now accepts an argument of
      ``None``.


.. function:: compress(data, compresslevel=9)

   Compress the *data*, returning a :class:`bytes` object containing
   the compressed data. *compresslevel* has the same meaning as in
   the :class:`GzipFile` constructor above.

   .. versionadded:: 3.2

.. function:: decompress(data)

   Decompress the *data*, returning a :class:`bytes` object containing the
   uncompressed data.

   .. versionadded:: 3.2


.. _gzip-usage-examples:

Examples of usage
-----------------

Example of how to read a compressed file::

   import gzip
   with gzip.open('/home/joe/file.txt.gz', 'rb') as f:
       file_content = f.read()

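A minimal sketch of how :meth:`~GzipFile.peek` and the :attr:`~GzipFile.mtime`
attribute behave while reading. The stream is built in memory purely for
illustration, and the timestamp ``1000000`` is an arbitrary value chosen for
this example::

   import gzip
   import io

   # Build a small gzip stream in memory with a known (arbitrary) timestamp.
   raw = io.BytesIO()
   with gzip.GzipFile(fileobj=raw, mode='wb', mtime=1000000) as f:
       f.write(b"Lots of content here")
   raw.seek(0)

   with gzip.GzipFile(fileobj=raw, mode='rb') as f:
       before = f.mtime          # no header has been read yet
       head = f.peek(4)          # may return more or fewer than 4 bytes
       file_content = f.read()   # peek() did not advance the file position
       timestamp = f.mtime       # value taken from the header just read
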
Example of how to create a compressed GZIP file::

   import gzip
   content = b"Lots of content here"
   with gzip.open('/home/joe/file.txt.gz', 'wb') as f:
       f.write(content)

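A minimal sketch of writing and reading text through :func:`.open`, which
wraps the :class:`GzipFile` in an :class:`io.TextIOWrapper` when a text mode
is given. The temporary directory and the file name ``example.txt.gz`` are
placeholders used only for this example::

   import gzip
   import os
   import tempfile

   with tempfile.TemporaryDirectory() as tmpdir:
       # Placeholder path; any writable location would do.
       path = os.path.join(tmpdir, 'example.txt.gz')

       # 'wt' selects text mode, so str objects are encoded and compressed.
       with gzip.open(path, 'wt', encoding='utf-8') as f:
           f.write('first line\nsecond line\n')

       # 'rt' decompresses and decodes the data back into str.
       with gzip.open(path, 'rt', encoding='utf-8') as f:
           lines = f.readlines()
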
Example of how to GZIP compress an existing file::

   import gzip
   import shutil
   with open('/home/joe/file.txt', 'rb') as f_in:
       with gzip.open('/home/joe/file.txt.gz', 'wb') as f_out:
           shutil.copyfileobj(f_in, f_out)

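A minimal sketch of compressing into memory by passing an :class:`io.BytesIO`
object as *fileobj*. It relies on the behavior described for the
:class:`GzipFile` constructor, namely that closing the :class:`GzipFile` leaves
*fileobj* open; the variable names are illustrative::

   import gzip
   import io

   buf = io.BytesIO()
   with gzip.GzipFile(fileobj=buf, mode='wb') as f:
       f.write(b"Lots of content here")

   # The GzipFile is closed here, but buf is not, so the compressed
   # bytes can still be retrieved (or more data appended after them).
   still_open = not buf.closed
   compressed = buf.getvalue()
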
Example of how to GZIP compress a binary string::

   import gzip
   s_in = b"Lots of content here"
   s_out = gzip.compress(s_in)

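A minimal sketch of reversing the binary string example with
:func:`decompress`; the ``compresslevel=1`` value is an arbitrary choice made
for illustration::

   import gzip
   s_in = b"Lots of content here"
   # Level 1 favors speed over compression ratio.
   s_out = gzip.compress(s_in, compresslevel=1)
   # decompress() returns the original bytes.
   s_back = gzip.decompress(s_out)
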
.. seealso::

   Module :mod:`zlib`
      The basic data compression module needed to support the :program:`gzip` file
      format.