Blame - Doc/library/zlib.rst - platform/external/python/cpython3

Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1	:mod:`zlib` --- Compression compatible with :program:`gzip`
				2	===========================================================
				3
				4	.. module:: zlib
Georg Brandl	7f01a13	2009-09-16 15:58:14 +0000	[diff] [blame]	5	:synopsis: Low-level interface to compression and decompression routines
				6	compatible with gzip.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	7
Terry Jan Reedy	fa089b9	2016-06-11 15:02:54 -0400	[diff] [blame]	8	--------------
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	9
				10	For applications that require data compression, the functions in this module
				11	allow compression and decompression, using the zlib library. The zlib library
				12	has its own home page at http://www.zlib.net. There are known
				13	incompatibilities between the Python module and versions of the zlib library
				14	earlier than 1.1.3; 1.1.3 has a security vulnerability, so we recommend using
				15	1.1.4 or later.
				16
				17	zlib's functions have many options and often need to be used in a particular
				18	order. This documentation doesn't attempt to cover all of the permutations;
				19	consult the zlib manual at http://www.zlib.net/manual.html for authoritative
				20	information.
				21
Éric Araujo	f2fbb9c	2012-01-16 16:55:55 +0100	[diff] [blame]	22	For reading and writing ``.gz`` files see the :mod:`gzip` module.
Guido van Rossum	7767711	2007-11-05 19:43:04 +0000	[diff] [blame]	23
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	24	The available exception and functions in this module are:
				25
				26
				27	.. exception:: error
				28
				29	Exception raised on compression and decompression errors.
				30
				31
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	32	.. function:: adler32(data[, value])
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	33
Serhiy Storchaka	d65c949	2015-11-02 14:10:23 +0200	[diff] [blame]	34	Computes an Adler-32 checksum of data. (An Adler-32 checksum is almost as
Martin Panter	b82032f	2015-12-11 05:19:29 +0000	[diff] [blame]	35	reliable as a CRC32 but can be computed much more quickly.) The result
				36	is an unsigned 32-bit integer. If value is present, it is used as
				37	the starting value of the checksum; otherwise, a default value of 1
				38	is used. Passing in value allows computing a running checksum over the
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	39	concatenation of several inputs. The algorithm is not cryptographically
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	40	strong, and should not be used for authentication or digital signatures. Since
				41	the algorithm is designed for use as a checksum algorithm, it is not suitable
				42	for use as a general hash algorithm.
				43
Martin Panter	b82032f	2015-12-11 05:19:29 +0000	[diff] [blame]	44	.. versionchanged:: 3.0
				45	Always returns an unsigned value.
				46	To generate the same numeric value across all Python versions and
				47	platforms, use ``adler32(data) & 0xffffffff``.
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	48
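As a minimal sketch of the running-checksum behaviour described above, the following feeds the result of one call in as the starting value of the next. The variable names are illustrative only.

```python
import zlib

part1 = b"hello, "
part2 = b"world"

# Checksum of the whole input computed in a single call.
whole = zlib.adler32(part1 + part2)

# Running checksum: the result of the first call becomes the
# starting value of the second call.
running = zlib.adler32(part1)
running = zlib.adler32(part2, running)

assert running == whole
# With no input, the result is the default starting value of 1.
assert zlib.adler32(b"") == 1
```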
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	49
Martin Panter	1fe0d13	2016-02-10 10:06:36 +0000	[diff] [blame]	50	.. function:: compress(data, level=-1)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	51
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	52	Compresses the bytes in data, returning a bytes object containing compressed data.
Martin Panter	1fe0d13	2016-02-10 10:06:36 +0000	[diff] [blame]	53	level is an integer from ``0`` to ``9`` or ``-1`` controlling the level of compression;
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	54	``1`` is fastest and produces the least compression, ``9`` is slowest and
Martin Panter	1fe0d13	2016-02-10 10:06:36 +0000	[diff] [blame]	55	produces the most. ``0`` is no compression. The default value is ``-1``
				56	(Z_DEFAULT_COMPRESSION). Z_DEFAULT_COMPRESSION represents a default
				57	compromise between speed and compression (currently equivalent to level 6).
Nadeem Vawda	6ff262e	2012-11-11 14:14:47 +0100	[diff] [blame]	58	Raises the :exc:`error` exception if any error occurs.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	59
Martin Panter	1fe0d13	2016-02-10 10:06:36 +0000	[diff] [blame]	60	.. versionchanged:: 3.6
Serhiy Storchaka	2d8f945	2016-06-25 22:47:04 +0300	[diff] [blame]	61	level can now be used as a keyword parameter.
Martin Panter	1fe0d13	2016-02-10 10:06:36 +0000	[diff] [blame]	62
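The effect of level can be illustrated with a short round trip. This is only a sketch; the sample data is made up, and the exact compressed sizes depend on the zlib version in use, so only very loose size relations are checked.

```python
import zlib

data = b"spam and eggs " * 1000

fast = zlib.compress(data, 1)    # fastest, least compression
best = zlib.compress(data, 9)    # slowest, most compression
stored = zlib.compress(data, 0)  # no compression at all

# Every level produces a stream that decompresses to the same bytes.
for blob in (fast, best, stored):
    assert zlib.decompress(blob) == data

# Level 0 only wraps the data, so the output is larger than the input.
assert len(stored) > len(data)
# Highly repetitive input shrinks considerably at level 9.
assert len(best) < len(data)
```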
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	63
Martin Panter	bf19d16	2015-09-09 01:01:13 +0000	[diff] [blame]	64	.. function:: compressobj(level=-1, method=DEFLATED, wbits=15, memLevel=8, strategy=Z_DEFAULT_STRATEGY[, zdict])
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	65
				66	Returns a compression object, to be used for compressing data streams that won't
Nadeem Vawda	fd8a838	2012-06-21 02:13:12 +0200	[diff] [blame]	67	fit into memory at once.
				68
Martin Panter	567d513	2016-02-03 07:06:33 +0000	[diff] [blame]	69	level is the compression level -- an integer from ``0`` to ``9`` or ``-1``.
				70	A value of ``1`` is fastest and produces the least compression, while a value of
Nadeem Vawda	6ff262e	2012-11-11 14:14:47 +0100	[diff] [blame]	71	``9`` is slowest and produces the most. ``0`` is no compression. The default
Martin Panter	567d513	2016-02-03 07:06:33 +0000	[diff] [blame]	72	value is ``-1`` (Z_DEFAULT_COMPRESSION). Z_DEFAULT_COMPRESSION represents a default
				73	compromise between speed and compression (currently equivalent to level 6).
Nadeem Vawda	2180c97	2012-06-22 01:40:49 +0200	[diff] [blame]	74
				75	method is the compression algorithm. Currently, the only supported value is
				76	``DEFLATED``.
				77
Martin Panter	0fdf41d	2016-05-27 07:32:11 +0000	[diff] [blame]	78	The wbits argument controls the size of the history buffer (or the
				79	"window size") used when compressing data, and whether a header and
				80	trailer are included in the output. It can take several ranges of values:
				81
				82	* +9 to +15: The base-two logarithm of the window size, which
				83	therefore ranges between 512 and 32768. Larger values produce
				84	better compression at the expense of greater memory usage. The
				85	resulting output will include a zlib-specific header and trailer.
				86
				87	* -9 to -15: Uses the absolute value of wbits as the
				88	window size logarithm, while producing a raw output stream with no
				89	header or trailing checksum.
				90
				91	* +25 to +31 = 16 + (9 to 15): Uses the low 4 bits of the value as the
				92	window size logarithm, while including a basic :program:`gzip` header
				93	and trailing checksum in the output.
Nadeem Vawda	2180c97	2012-06-22 01:40:49 +0200	[diff] [blame]	94
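The three wbits ranges listed above can be compared with a small sketch. The helper ``deflate`` is hypothetical and exists only for this illustration; the payload is arbitrary.

```python
import zlib

data = b"example payload " * 64

def deflate(payload, wbits):
    """Compress payload in one go with the given wbits value (illustrative helper)."""
    c = zlib.compressobj(wbits=wbits)
    return c.compress(payload) + c.flush()

zlib_stream = deflate(data, 15)   # zlib header and trailer
raw_stream = deflate(data, -15)   # raw deflate, no header or checksum
gzip_stream = deflate(data, 31)   # 16 + 15: basic gzip header and trailer

# The first byte of a zlib stream with a 32 KiB window is 0x78.
assert zlib_stream[0] == 0x78
# A gzip stream starts with the two magic bytes 0x1f 0x8b.
assert gzip_stream[:2] == b"\x1f\x8b"

# Decompression needs a matching wbits value for each format.
assert zlib.decompress(zlib_stream, 15) == data
assert zlib.decompress(raw_stream, -15) == data
assert zlib.decompress(gzip_stream, 31) == data
```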
Martin Panter	bf19d16	2015-09-09 01:01:13 +0000	[diff] [blame]	95	The memLevel argument controls the amount of memory used for the
				96	internal compression state. Valid values range from ``1`` to ``9``.
				97	Higher values use more memory, but are faster and produce smaller output.
Nadeem Vawda	2180c97	2012-06-22 01:40:49 +0200	[diff] [blame]	98
				99	strategy is used to tune the compression algorithm. Possible values are
				100	``Z_DEFAULT_STRATEGY``, ``Z_FILTERED``, and ``Z_HUFFMAN_ONLY``.
Nadeem Vawda	fd8a838	2012-06-21 02:13:12 +0200	[diff] [blame]	101
				102	zdict is a predefined compression dictionary. This is a sequence of bytes
				103	(such as a :class:`bytes` object) containing subsequences that are expected
				104	to occur frequently in the data that is to be compressed. Those subsequences
				105	that are expected to be most common should come at the end of the dictionary.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	106
Georg Brandl	9aae9e5	2012-06-26 08:51:17 +0200	[diff] [blame]	107	.. versionchanged:: 3.3
Georg Brandl	9ff06dc	2013-10-17 19:51:34 +0200	[diff] [blame]	108	Added the zdict parameter and keyword argument support.
Georg Brandl	9aae9e5	2012-06-26 08:51:17 +0200	[diff] [blame]	109
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	110
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	111	.. function:: crc32(data[, value])
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	112
				113	.. index::
				114	single: Cyclic Redundancy Check
				115	single: checksum; Cyclic Redundancy Check
				116
Martin Panter	b82032f	2015-12-11 05:19:29 +0000	[diff] [blame]	117	Computes a CRC (Cyclic Redundancy Check) checksum of data. The
				118	result is an unsigned 32-bit integer. If value is present, it is used
				119	as the starting value of the checksum; otherwise, a default value of 0
				120	is used. Passing in value allows computing a running checksum over the
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	121	concatenation of several inputs. The algorithm is not cryptographically
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	122	strong, and should not be used for authentication or digital signatures. Since
				123	the algorithm is designed for use as a checksum algorithm, it is not suitable
				124	for use as a general hash algorithm.
				125
Martin Panter	b82032f	2015-12-11 05:19:29 +0000	[diff] [blame]	126	.. versionchanged:: 3.0
				127	Always returns an unsigned value.
Georg Brandl	9aae9e5	2012-06-26 08:51:17 +0200	[diff] [blame]	128	To generate the same numeric value across all Python versions and
Martin Panter	b82032f	2015-12-11 05:19:29 +0000	[diff] [blame]	129	platforms, use ``crc32(data) & 0xffffffff``.
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	130
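One possible use of the value argument is checksumming a stream chunk by chunk without holding it all in memory. The helper ``crc32_of_stream`` below is a hypothetical name chosen for this sketch.

```python
import io
import zlib

def crc32_of_stream(stream, chunk_size=4096):
    """Return the CRC-32 of everything readable from stream (illustrative helper)."""
    checksum = 0  # the default starting value
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Carry the running checksum into the next call.
        checksum = zlib.crc32(chunk, checksum)
    return checksum

payload = b"0123456789" * 1000
chunked = crc32_of_stream(io.BytesIO(payload), chunk_size=333)

# The chunked result matches a single call over the whole payload.
assert chunked == zlib.crc32(payload)
```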
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	131
Serhiy Storchaka	15f3228	2016-08-15 10:06:16 +0300	[diff] [blame]	132	.. function:: decompress(data, wbits=MAX_WBITS, bufsize=DEF_BUF_SIZE)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	133
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	134	Decompresses the bytes in data, returning a bytes object containing the
Martin Panter	0fdf41d	2016-05-27 07:32:11 +0000	[diff] [blame]	135	uncompressed data. The wbits parameter depends on
				136	the format of data, and is discussed further below.
Benjamin Peterson	2614cda	2010-03-21 22:36:19 +0000	[diff] [blame]	137	If bufsize is given, it is used as the initial size of the output
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	138	buffer. Raises the :exc:`error` exception if any error occurs.
				139
Martin Panter	0fdf41d	2016-05-27 07:32:11 +0000	[diff] [blame]	140	.. _decompress-wbits:
				141
				142	The wbits parameter controls the size of the history buffer
				143	(or "window size"), and what header and trailer format is expected.
				144	It is similar to the parameter for :func:`compressobj`, but accepts
				145	more ranges of values:
				146
				147	* +8 to +15: The base-two logarithm of the window size. The input
				148	must include a zlib header and trailer.
				149
				150	* 0: Automatically determine the window size from the zlib header.
Martin Panter	c618ae8	2016-05-27 11:20:21 +0000	[diff] [blame]	151	Only supported since zlib 1.2.3.5.
Martin Panter	0fdf41d	2016-05-27 07:32:11 +0000	[diff] [blame]	152
				153	* -8 to -15: Uses the absolute value of wbits as the window size
				154	logarithm. The input must be a raw stream with no header or trailer.
				155
				156	* +24 to +31 = 16 + (8 to 15): Uses the low 4 bits of the value as
				157	the window size logarithm. The input must include a gzip header and
				158	trailer.
				159
				160	* +40 to +47 = 32 + (8 to 15): Uses the low 4 bits of the value as
				161	the window size logarithm, and automatically accepts either
				162	the zlib or gzip format.
				163
				164	When decompressing a stream, the window size must not be smaller
Benjamin Peterson	2614cda	2010-03-21 22:36:19 +0000	[diff] [blame]	165	than the size originally used to compress the stream; using a too-small
Martin Panter	0fdf41d	2016-05-27 07:32:11 +0000	[diff] [blame]	166	value may result in an :exc:`error` exception. The default wbits value
Serhiy Storchaka	15f3228	2016-08-15 10:06:16 +0300	[diff] [blame]	167	corresponds to the largest window size and requires a zlib header and
				168	trailer to be included.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	169
				170	bufsize is the initial size of the buffer used to hold decompressed data. If
				171	more space is required, the buffer size will be increased as needed, so you
				172	don't have to get this value exactly right; tuning it will only save a few calls
Serhiy Storchaka	15f3228	2016-08-15 10:06:16 +0300	[diff] [blame]	173	to :c:func:`malloc`.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	174
Serhiy Storchaka	15f3228	2016-08-15 10:06:16 +0300	[diff] [blame]	175	.. versionchanged:: 3.6
				176	wbits and bufsize can be used as keyword arguments.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	177
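As a sketch of the automatic format detection described above, a wbits value of ``32 + zlib.MAX_WBITS`` (47) accepts both zlib and gzip input, while the default rejects a gzip stream. The use of ``gzip.compress`` here is just a convenient way to obtain gzip-format test data.

```python
import gzip
import zlib

data = b"same payload, two containers"

zlib_stream = zlib.compress(data)
gzip_stream = gzip.compress(data)

# 32 + 15: detect a zlib or gzip header automatically.
auto = 32 + zlib.MAX_WBITS
results = [zlib.decompress(stream, auto) for stream in (zlib_stream, gzip_stream)]
assert results == [data, data]

# The default wbits expects a zlib header, so gzip input is rejected.
try:
    zlib.decompress(gzip_stream)
except zlib.error:
    gzip_rejected = True
else:
    gzip_rejected = False
assert gzip_rejected
```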
Georg Brandl	9aae9e5	2012-06-26 08:51:17 +0200	[diff] [blame]	178	.. function:: decompressobj(wbits=15[, zdict])
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	179
				180	Returns a decompression object, to be used for decompressing data streams that
Nadeem Vawda	fd8a838	2012-06-21 02:13:12 +0200	[diff] [blame]	181	won't fit into memory at once.
				182
Martin Panter	0fdf41d	2016-05-27 07:32:11 +0000	[diff] [blame]	183	The wbits parameter controls the size of the history buffer (or the
				184	"window size"), and what header and trailer format is expected. It has
				185	the same meaning as `described for decompress() <#decompress-wbits>`__.
Nadeem Vawda	fd8a838	2012-06-21 02:13:12 +0200	[diff] [blame]	186
				187	The zdict parameter specifies a predefined compression dictionary. If
				188	provided, this must be the same dictionary as was used by the compressor that
				189	produced the data that is to be decompressed.
				190
Georg Brandl	9aae9e5	2012-06-26 08:51:17 +0200	[diff] [blame]	191	.. note::
				192
				193	If zdict is a mutable object (such as a :class:`bytearray`), you must not
				194	modify its contents between the call to :func:`decompressobj` and the first
				195	call to the decompressor's ``decompress()`` method.
				196
				197	.. versionchanged:: 3.3
				198	Added the zdict parameter.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	199
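A minimal sketch of how a predefined dictionary might be shared between :func:`compressobj` and :func:`decompressobj` follows. The dictionary contents and the message are invented for illustration; in practice the dictionary would be built from real sample data.

```python
import zlib

# Hypothetical dictionary: fragments expected to occur often,
# with the most common ones placed at the end.
zdict = b'"status": "ok""user_id": '
message = b'{"user_id": 42, "status": "ok"}'

c = zlib.compressobj(zdict=zdict)
packed = c.compress(message) + c.flush()

# The decompressor must be given the very same dictionary.
d = zlib.decompressobj(zdict=zdict)
unpacked = d.decompress(packed) + d.flush()
assert unpacked == message

# Without the dictionary the stream cannot be decoded.
try:
    zlib.decompress(packed)
except zlib.error:
    needs_dictionary = True
else:
    needs_dictionary = False
assert needs_dictionary
```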
Nadeem Vawda	64d25dd	2011-09-12 00:04:13 +0200	[diff] [blame]	200
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	201	Compression objects support the following methods:
				202
				203
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	204	.. method:: Compress.compress(data)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	205
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	206	Compress data, returning a bytes object containing compressed data for at least
				207	part of the data in data. This data should be concatenated to the output
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	208	produced by any preceding calls to the :meth:`compress` method. Some input may
				209	be kept in internal buffers for later processing.
				210
				211
				212	.. method:: Compress.flush([mode])
				213
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	214	All pending input is processed, and a bytes object containing the remaining compressed
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	215	output is returned. mode can be selected from the constants
				216	:const:`Z_SYNC_FLUSH`, :const:`Z_FULL_FLUSH`, or :const:`Z_FINISH`,
				217	defaulting to :const:`Z_FINISH`. :const:`Z_SYNC_FLUSH` and
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	218	:const:`Z_FULL_FLUSH` allow compressing further bytestrings of data, while
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	219	:const:`Z_FINISH` finishes the compressed stream and prevents compressing any
				220	more data. After calling :meth:`flush` with mode set to :const:`Z_FINISH`,
				221	the :meth:`compress` method cannot be called again; the only realistic action is
				222	to delete the object.
				223
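The following sketch shows the two methods above working together. The generator ``compress_chunks`` is a hypothetical helper; the second part illustrates that :const:`Z_SYNC_FLUSH` makes everything compressed so far available to a decompressor while leaving the stream open.

```python
import zlib

def compress_chunks(chunks, level=6):
    """Yield compressed pieces for an iterable of bytes chunks (illustrative helper)."""
    c = zlib.compressobj(level)
    for chunk in chunks:
        out = c.compress(chunk)
        if out:  # compress() may buffer input and return nothing yet
            yield out
    yield c.flush()  # Z_FINISH by default: ends the stream

chunks = [b"line %d\n" % i for i in range(1000)]
compressed = b"".join(compress_chunks(chunks))
assert zlib.decompress(compressed) == b"".join(chunks)

# Z_SYNC_FLUSH: each message can be decoded as soon as it arrives,
# and the same compressor keeps accepting data afterwards.
c = zlib.compressobj()
d = zlib.decompressobj()
received = []
for message in (b"first message", b"second message"):
    wire = c.compress(message) + c.flush(zlib.Z_SYNC_FLUSH)
    received.append(d.decompress(wire))
assert received == [b"first message", b"second message"]
```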
				224
				225	.. method:: Compress.copy()
				226
				227	Returns a copy of the compression object. This can be used to efficiently
				228	compress a set of data that share a common initial prefix.
				229
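One way to use :meth:`copy` for data sharing a common prefix is sketched below. The prefix and suffixes are made-up sample data.

```python
import zlib

prefix = b"HEADER: boilerplate shared by every record\n" * 20
suffixes = [b"record A\n", b"record B\n"]

# Compress the shared prefix only once.
base = zlib.compressobj()
head = base.compress(prefix)

results = []
for suffix in suffixes:
    # Each copy continues from the state reached after the prefix.
    c = base.copy()
    results.append(head + c.compress(suffix) + c.flush())

# Every result is a complete, independent zlib stream.
for suffix, blob in zip(suffixes, results):
    assert zlib.decompress(blob) == prefix + suffix
```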
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	230
Nadeem Vawda	1c38546	2011-08-13 15:22:40 +0200	[diff] [blame]	231	Decompression objects support the following methods and attributes:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	232
				233
				234	.. attribute:: Decompress.unused_data
				235
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	236	A bytes object which contains any bytes past the end of the compressed data. That is,
Serhiy Storchaka	5e028ae	2014-02-06 21:10:41 +0200	[diff] [blame]	237	this remains ``b""`` until the last byte that contains compression data is
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	238	available. If the whole bytestring turned out to contain compressed data, this is
				239	``b""``, an empty bytes object.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	240
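As an illustration, ``unused_data`` can be used to walk through several zlib streams stored back to back. The generator ``split_streams`` is a hypothetical helper written for this sketch and assumes the input consists only of complete streams.

```python
import zlib

def split_streams(blob):
    """Yield the decompressed content of each concatenated zlib stream (illustrative helper)."""
    while blob:
        d = zlib.decompressobj()
        yield d.decompress(blob)
        # Whatever follows the end of this stream starts the next one.
        blob = d.unused_data

payloads = [b"one", b"two", b"three"]
blob = b"".join(zlib.compress(p) for p in payloads)
recovered = list(split_streams(blob))
assert recovered == payloads

# Trailing bytes that are not compressed data end up in unused_data.
d = zlib.decompressobj()
body = d.decompress(zlib.compress(b"payload") + b"TRAILING")
assert body == b"payload"
assert d.unused_data == b"TRAILING"
```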
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	241
				242	.. attribute:: Decompress.unconsumed_tail
				243
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	244	A bytes object that contains any data that was not consumed by the last
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	245	:meth:`decompress` call because it exceeded the limit for the uncompressed data
				246	buffer. This data has not yet been seen by the zlib machinery, so you must feed
				247	it (possibly with further data concatenated to it) back to a subsequent
				248	:meth:`decompress` method call in order to get correct output.
				249
				250
Nadeem Vawda	1c38546	2011-08-13 15:22:40 +0200	[diff] [blame]	251	.. attribute:: Decompress.eof
				252
				253	A boolean indicating whether the end of the compressed data stream has been
				254	reached.
				255
				256	This makes it possible to distinguish between a properly-formed compressed
				257	stream, and an incomplete or truncated one.
				258
				259	.. versionadded:: 3.3
				260
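A short sketch of telling a complete stream from a truncated one with ``eof`` follows. The truncation point is arbitrary.

```python
import zlib

complete = zlib.compress(b"x" * 10000)
truncated = complete[:-5]  # drop the trailer and a little more

d_complete = zlib.decompressobj()
d_complete.decompress(complete)
assert d_complete.eof

# The decompression object does not raise on incomplete input,
# so eof is the way to notice that the stream never finished.
d_truncated = zlib.decompressobj()
d_truncated.decompress(truncated)
assert not d_truncated.eof
```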
				261
Serhiy Storchaka	15f3228	2016-08-15 10:06:16 +0300	[diff] [blame]	262	.. method:: Decompress.decompress(data, max_length=0)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	263
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	264	Decompress data, returning a bytes object containing the uncompressed data
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	265	corresponding to at least part of the data in data. This data should be
				266	concatenated to the output produced by any preceding calls to the
				267	:meth:`decompress` method. Some of the input data may be preserved in internal
				268	buffers for later processing.
				269
Martin Panter	38fe4dc	2015-11-18 00:59:17 +0000	[diff] [blame]	270	If the optional parameter max_length is non-zero then the return value will be
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	271	no longer than max_length. This may mean that not all of the compressed input
				272	can be processed, and any unconsumed data will be stored in the attribute
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	273	:attr:`unconsumed_tail`. This bytestring must be passed to a subsequent call to
Serhiy Storchaka	15f3228	2016-08-15 10:06:16 +0300	[diff] [blame]	274	:meth:`decompress` if decompression is to continue. If max_length is zero
				275	then the whole input is decompressed, and :attr:`unconsumed_tail` is empty.
				276
				277	.. versionchanged:: 3.6
				278	max_length can be used as a keyword argument.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	279
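The interplay of max_length and :attr:`unconsumed_tail` can be sketched as a loop that never holds more than a bounded amount of output at once. ``iter_decompressed`` is a hypothetical helper, and raising :exc:`ValueError` for truncated input is a choice made for this example only.

```python
import zlib

def iter_decompressed(compressed, max_chunk=4096):
    """Yield decompressed pieces of at most max_chunk bytes (illustrative helper)."""
    d = zlib.decompressobj()
    pending = compressed
    while not d.eof:
        out = d.decompress(pending, max_chunk)
        # Input that did not fit within the output limit must be fed back in.
        pending = d.unconsumed_tail
        if out:
            yield out
        elif not pending and not d.eof:
            # No output, no input left, and no end of stream: data is missing.
            raise ValueError("compressed stream is truncated")

data = b"a" * 100000
pieces = list(iter_decompressed(zlib.compress(data), max_chunk=4096))

assert all(len(piece) <= 4096 for piece in pieces)
assert b"".join(pieces) == data
```

Bounding the output this way is useful when the compressed input comes from an untrusted source, since a small input can expand to a very large output.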
				280
				281	.. method:: Decompress.flush([length])
				282
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	283	All pending input is processed, and a bytes object containing the remaining
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	284	uncompressed output is returned. After calling :meth:`flush`, the
				285	:meth:`decompress` method cannot be called again; the only realistic action is
				286	to delete the object.
				287
				288	The optional parameter length sets the initial size of the output buffer.
				289
				290
				291	.. method:: Decompress.copy()
				292
				293	Returns a copy of the decompression object. This can be used to save the state
				294	of the decompressor midway through the data stream in order to speed up random
				295	seeks into the stream at a future point.
				296
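As a sketch of the checkpointing idea, a copy taken midway can resume decompression without reprocessing the earlier input. The split point and the sample data are arbitrary.

```python
import zlib

data = b"".join(b"row %d\n" % i for i in range(2000))
compressed = zlib.compress(data)
half = len(compressed) // 2

d = zlib.decompressobj()
first = d.decompress(compressed[:half])
checkpoint = d.copy()  # snapshot of the state after the first half

rest = d.decompress(compressed[half:]) + d.flush()
assert first + rest == data

# Resuming from the snapshot yields the same remaining output
# without feeding the first half again.
replay = checkpoint.decompress(compressed[half:]) + checkpoint.flush()
assert replay == rest
```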
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	297
Nadeem Vawda	64d25dd	2011-09-12 00:04:13 +0200	[diff] [blame]	298	Information about the version of the zlib library in use is available through
				299	the following constants:
				300
				301
				302	.. data:: ZLIB_VERSION
				303
				304	The version string of the zlib library that was used for building the module.
				305	This may be different from the zlib library actually used at runtime, which
				306	is available as :const:`ZLIB_RUNTIME_VERSION`.
				307
Nadeem Vawda	64d25dd	2011-09-12 00:04:13 +0200	[diff] [blame]	308
				309	.. data:: ZLIB_RUNTIME_VERSION
				310
				311	The version string of the zlib library actually loaded by the interpreter.
				312
				313	.. versionadded:: 3.3
				314
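A small sketch for reporting both version strings, for example in a bug report, might look like this. The dictionary name ``versions`` is illustrative.

```python
import zlib

versions = {
    "built against": zlib.ZLIB_VERSION,
    "running with": zlib.ZLIB_RUNTIME_VERSION,
}

for label, version in versions.items():
    print(label + ":", version)

if zlib.ZLIB_VERSION != zlib.ZLIB_RUNTIME_VERSION:
    # The interpreter loaded a different zlib than it was compiled with.
    print("note: build-time and runtime zlib versions differ")
```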
				315
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	316	.. seealso::
				317
				318	Module :mod:`gzip`
				319	Reading and writing :program:`gzip`\ -format files.
				320
				321	http://www.zlib.net
				322	The zlib library home page.
				323
				324	http://www.zlib.net/manual.html
				325	The zlib manual explains the semantics and usage of the library's many
				326	functions.
				327