Blame - Doc/library/zlib.rst - platform/external/python/cpython3

blob: 897d919e310810f6ca5829f14668764e37081eea

Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1	:mod:`zlib` --- Compression compatible with :program:`gzip`
				2	===========================================================
				3
				4	.. module:: zlib
Georg Brandl	7f01a13	2009-09-16 15:58:14 +0000	[diff] [blame]	5	:synopsis: Low-level interface to compression and decompression routines
				6	compatible with gzip.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	7
				8
				9	For applications that require data compression, the functions in this module
				10	allow compression and decompression, using the zlib library. The zlib library
				11	has its own home page at http://www.zlib.net. There are known
				12	incompatibilities between the Python module and versions of the zlib library
				13	earlier than 1.1.3; 1.1.3 has a security vulnerability, so we recommend using
				14	1.1.4 or later.
				15
				16	zlib's functions have many options and often need to be used in a particular
				17	order. This documentation doesn't attempt to cover all of the permutations;
				18	consult the zlib manual at http://www.zlib.net/manual.html for authoritative
				19	information.
				20
Éric Araujo	f2fbb9c	2012-01-16 16:55:55 +0100	[diff] [blame^]	21	For reading and writing ``.gz`` files see the :mod:`gzip` module.
Guido van Rossum	7767711	2007-11-05 19:43:04 +0000	[diff] [blame]	22
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	23	The available exception and functions in this module are:
				24
				25
				26	.. exception:: error
				27
				28	Exception raised on compression and decompression errors.
				29
				30
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	31	.. function:: adler32(data[, value])
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	32
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	33	Computes an Adler-32 checksum of data. (An Adler-32 checksum is almost as
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	34	reliable as a CRC32 but can be computed much more quickly.) If value is
				35	present, it is used as the starting value of the checksum; otherwise, a fixed
				36	default value is used. This allows computing a running checksum over the
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	37	concatenation of several inputs. The algorithm is not cryptographically
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	38	strong, and should not be used for authentication or digital signatures. Since
				39	the algorithm is designed for use as a checksum algorithm, it is not suitable
				40	for use as a general hash algorithm.
				41
Gregory P. Smith	ab0d8a1	2008-03-17 20:24:09 +0000	[diff] [blame]	42	Always returns an unsigned 32-bit integer.
				43
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	44	.. note::
				45	To generate the same numeric value across all Python versions and
				46	platforms use adler32(data) & 0xffffffff. If you are only using
				47	the checksum in packed binary format this is not necessary as the
Gregory P. Smith	fa6cf39	2009-02-01 00:30:50 +0000	[diff] [blame]	48	return value is the correct 32-bit binary representation
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	49	regardless of sign.
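
A minimal sketch of the running-checksum behaviour described above. The chunk contents are made up for illustration; the only calls used are ``zlib.adler32(data)`` and ``zlib.adler32(data, value)``.

```python
import zlib

# Hypothetical input, split into pieces as it might arrive from a stream.
chunks = [b"hello, ", b"world"]

# Running checksum: start from the checksum of empty input, then pass the
# previous result as the starting value for each following chunk.
running = zlib.adler32(b"")
for chunk in chunks:
    running = zlib.adler32(chunk, running)

# One-shot checksum over the concatenation of the same pieces.
one_shot = zlib.adler32(b"".join(chunks))

# Masking keeps the numeric value the same across Python versions.
running &= 0xffffffff
one_shot &= 0xffffffff
print(running == one_shot)
```

Both approaches should give the same value, which is what makes the two-argument form usable for data that does not fit into memory at once.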
				50
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	51
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	52	.. function:: compress(data[, level])
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	53
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	54	Compresses the bytes in data, returning a bytes object containing compressed data.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	55	level is an integer from ``1`` to ``9`` controlling the level of compression;
				56	``1`` is fastest and produces the least compression, ``9`` is slowest and
				57	produces the most. The default value is ``6``. Raises the :exc:`error`
				58	exception if any error occurs.
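
As an illustration of the level argument and the :exc:`error` exception, here is a small round-trip sketch. The sample data and variable names are assumptions chosen for the example.

```python
import zlib

# Hypothetical, highly repetitive input so that compression has something to do.
original = b"spam and eggs " * 100

fast = zlib.compress(original, 1)   # fastest, least compression
best = zlib.compress(original, 9)   # slowest, most compression

# Either level round-trips through decompress().
assert zlib.decompress(fast) == original
assert zlib.decompress(best) == original
print(len(original), len(fast), len(best))

# Input that is not a valid compressed stream raises zlib.error.
try:
    zlib.decompress(b"not a zlib stream")
    failed = False
except zlib.error as exc:
    failed = True
    print("decompression failed:", exc)
```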
				59
				60
				61	.. function:: compressobj([level])
				62
				63	Returns a compression object, to be used for compressing data streams that won't
				64	fit into memory at once. level is an integer from ``1`` to ``9`` controlling
				65	the level of compression; ``1`` is fastest and produces the least compression,
				66	``9`` is slowest and produces the most. The default value is ``6``.
				67
				68
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	69	.. function:: crc32(data[, value])
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	70
				71	.. index::
				72	single: Cyclic Redundancy Check
				73	single: checksum; Cyclic Redundancy Check
				74
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	75	Computes a CRC (Cyclic Redundancy Check) checksum of data. If value is
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	76	present, it is used as the starting value of the checksum; otherwise, a fixed
				77	default value is used. This allows computing a running checksum over the
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	78	concatenation of several inputs. The algorithm is not cryptographically
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	79	strong, and should not be used for authentication or digital signatures. Since
				80	the algorithm is designed for use as a checksum algorithm, it is not suitable
				81	for use as a general hash algorithm.
				82
Gregory P. Smith	ab0d8a1	2008-03-17 20:24:09 +0000	[diff] [blame]	83	Always returns an unsigned 32-bit integer.
				84
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	85	.. note::
				86	To generate the same numeric value across all Python versions and
				87	platforms use crc32(data) & 0xffffffff. If you are only using
				88	the checksum in packed binary format this is not necessary as the
Gregory P. Smith	fa6cf39	2009-02-01 00:30:50 +0000	[diff] [blame]	89	return value is the correct 32-bit binary representation
Benjamin Peterson	058e31e	2009-01-16 03:54:08 +0000	[diff] [blame]	90	regardless of sign.
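
The following sketch computes a CRC over several chunks and then stores it in packed binary form, as mentioned in the note. The helper name ``crc32_of_chunks`` and the little-endian ``"<I"`` layout are assumptions made for this example, not part of the module.

```python
import struct
import zlib

def crc32_of_chunks(chunks):
    """Return the CRC-32 of the concatenation of chunks, computed incrementally."""
    crc = 0  # starting value for an empty input
    for chunk in chunks:
        crc = zlib.crc32(chunk, crc)
    # Mask so the result is the same unsigned value on every Python version.
    return crc & 0xffffffff

crc = crc32_of_chunks([b"1234", b"56789"])

# Same value as a single call over the joined data.
assert crc == zlib.crc32(b"123456789") & 0xffffffff

# Pack as an unsigned 32-bit little-endian field (an assumed file layout).
packed = struct.pack("<I", crc)
print(packed.hex())
```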
				91
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	92
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	93	.. function:: decompress(data[, wbits[, bufsize]])
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	94
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	95	Decompresses the bytes in data, returning a bytes object containing the
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	96	uncompressed data. The wbits parameter controls the size of the window
Benjamin Peterson	2614cda	2010-03-21 22:36:19 +0000	[diff] [blame]	97	buffer, and is discussed further below.
				98	If bufsize is given, it is used as the initial size of the output
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	99	buffer. Raises the :exc:`error` exception if any error occurs.
				100
				101	The absolute value of wbits is the base two logarithm of the size of the
				102	history buffer (the "window size") used when compressing data. Its absolute
				103	value should be between 8 and 15 for the most recent versions of the zlib
				104	library, larger values resulting in better compression at the expense of greater
Benjamin Peterson	2614cda	2010-03-21 22:36:19 +0000	[diff] [blame]	105	memory usage. When decompressing a stream, wbits must not be smaller
				106	than the size originally used to compress the stream; using a too-small
				107	value will result in an exception. The default value is therefore the
				108	highest value, 15. When wbits is negative, the standard
Jesus Cea	fb7b668	2010-05-03 16:14:58 +0000	[diff] [blame]	109	zlib header is suppressed, so the data is treated as a raw deflate stream.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	110
				111	bufsize is the initial size of the buffer used to hold decompressed data. If
				112	more space is required, the buffer size will be increased as needed, so you
				113	don't have to get this value exactly right; tuning it will only save a few calls
Georg Brandl	60203b4	2010-10-06 10:11:56 +0000	[diff] [blame]	114	to :c:func:`malloc`. The default size is 16384.
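
A hedged sketch of how a negative wbits value and an explicit bufsize might be used together. It assumes that ``compressobj()`` also accepts method and wbits arguments after level, which this section does not describe; the sample data is invented.

```python
import zlib

original = b"window size example " * 50

# Assumption: compressobj(level, method, wbits) is available. A negative
# wbits asks for a stream without the usual header and trailing checksum.
compressor = zlib.compressobj(6, zlib.DEFLATED, -15)
raw = compressor.compress(original) + compressor.flush()

# The same negative wbits must be given when decompressing such a stream.
# 4096 is only the initial output buffer size; it grows if needed.
restored = zlib.decompress(raw, -15, 4096)
print(restored == original)
```

Since the buffer grows on demand, the bufsize value here is a tuning hint rather than a limit.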
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	115
				116
				117	.. function:: decompressobj([wbits])
				118
				119	Returns a decompression object, to be used for decompressing data streams that
				120	won't fit into memory at once. The wbits parameter controls the size of the
				121	window buffer.
				122
				123	Compression objects support the following methods:
				124
				125
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	126	.. method:: Compress.compress(data)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	127
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	128	Compress data, returning a bytes object containing compressed data for at least
				129	part of the data in data. This data should be concatenated to the output
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	130	produced by any preceding calls to the :meth:`compress` method. Some input may
				131	be kept in internal buffers for later processing.
				132
				133
				134	.. method:: Compress.flush([mode])
				135
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	136	All pending input is processed, and a bytes object containing the remaining compressed
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	137	output is returned. mode can be selected from the constants
				138	:const:`Z_SYNC_FLUSH`, :const:`Z_FULL_FLUSH`, or :const:`Z_FINISH`,
				139	defaulting to :const:`Z_FINISH`. :const:`Z_SYNC_FLUSH` and
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	140	:const:`Z_FULL_FLUSH` allow compressing further bytestrings of data, while
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	141	:const:`Z_FINISH` finishes the compressed stream and prevents compressing any
				142	more data. After calling :meth:`flush` with mode set to :const:`Z_FINISH`,
				143	the :meth:`compress` method cannot be called again; the only realistic action is
				144	to delete the object.
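
The interplay of :meth:`compress` and :meth:`flush` can be sketched as follows. The generator name ``compress_chunks`` and the sample chunks are assumptions for illustration only.

```python
import zlib

def compress_chunks(chunks, level=6):
    """Yield compressed pieces for an iterable of bytes chunks."""
    compressor = zlib.compressobj(level)
    for chunk in chunks:
        piece = compressor.compress(chunk)
        if piece:  # may be empty while input sits in internal buffers
            yield piece
    # The default mode, Z_FINISH, ends the stream; the object is done after this.
    yield compressor.flush()

chunks = [b"a" * 1000, b"b" * 1000, b"c" * 1000]
compressed = b"".join(compress_chunks(chunks))
assert zlib.decompress(compressed) == b"".join(chunks)

# Z_SYNC_FLUSH emits everything seen so far but leaves the stream open.
c = zlib.compressobj()
part = c.compress(b"first") + c.flush(zlib.Z_SYNC_FLUSH)
d = zlib.decompressobj()
seen_so_far = d.decompress(part)
rest = c.compress(b" second") + c.flush()
total = seen_so_far + d.decompress(rest)
print(total)
```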
				145
				146
				147	.. method:: Compress.copy()
				148
				149	Returns a copy of the compression object. This can be used to efficiently
				150	compress a set of data that share a common initial prefix.
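
One possible use of :meth:`copy` for data sharing a common prefix is sketched below. The prefix and body values are invented for the example.

```python
import zlib

# Hypothetical shared prefix and two different continuations.
prefix = b"<html><head><title>shared header</title></head>" * 20
bodies = [b"page one", b"page two"]

# Compress the prefix once.
base = zlib.compressobj()
head = base.compress(prefix)

results = []
for body in bodies:
    # Each copy continues from the state reached after the prefix.
    c = base.copy()
    results.append(head + c.compress(body) + c.flush())

for body, result in zip(bodies, results):
    assert zlib.decompress(result) == prefix + body
```

The output already produced for the prefix (``head``) has to be prepended to each result, because the copy only carries the internal state and not the bytes that were already returned.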
				151
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	152
				153	Decompression objects support the following methods, and two attributes:
				154
				155
				156	.. attribute:: Decompress.unused_data
				157
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	158	A bytes object which contains any bytes past the end of the compressed data. That is,
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	159	this remains ``b""`` until the last byte that contains compression data is
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	160	available. If the whole bytestring turned out to contain compressed data, this is
				161	``b""``, an empty bytes object.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	162
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	163	The only way to determine where a bytestring of compressed data ends is by actually
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	164	decompressing it. This means that when compressed data is contained as part of a
				165	larger file, you can only find the end of it by reading data and feeding it
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	166	followed by some non-empty bytestring into a decompression object's
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	167	:meth:`decompress` method until the :attr:`unused_data` attribute is no longer
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	168	empty.
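
A small sketch of locating the end of compressed data embedded in a larger byte string. The ``TRAILER`` bytes stand in for whatever follows the compressed data in a real file.

```python
import zlib

payload = zlib.compress(b"embedded data")
trailer = b"TRAILER"            # hypothetical bytes following the stream
blob = payload + trailer

d = zlib.decompressobj()
out = d.decompress(blob)

# Everything past the end of the compressed stream lands in unused_data.
assert out == b"embedded data"
assert d.unused_data == trailer

# The length of the compressed part follows from what was left over.
compressed_length = len(blob) - len(d.unused_data)
print(compressed_length == len(payload))
```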
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	169
				170
				171	.. attribute:: Decompress.unconsumed_tail
				172
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	173	A bytes object that contains any data that was not consumed by the last
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	174	:meth:`decompress` call because it exceeded the limit for the uncompressed data
				175	buffer. This data has not yet been seen by the zlib machinery, so you must feed
				176	it (possibly with further data concatenated to it) back to a subsequent
				177	:meth:`decompress` method call in order to get correct output.
				178
				179
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	180	.. method:: Decompress.decompress(data[, max_length])
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	181
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	182	Decompress data, returning a bytes object containing the uncompressed data
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	183	corresponding to at least part of the data in data. This data should be
				184	concatenated to the output produced by any preceding calls to the
				185	:meth:`decompress` method. Some of the input data may be preserved in internal
				186	buffers for later processing.
				187
				188	If the optional parameter max_length is supplied then the return value will be
				189	no longer than max_length. This may mean that not all of the compressed input
				190	can be processed; and unconsumed data will be stored in the attribute
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	191	:attr:`unconsumed_tail`. This bytestring must be passed to a subsequent call to
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	192	:meth:`decompress` if decompression is to continue. If max_length is not
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	193	supplied then the whole input is decompressed, and :attr:`unconsumed_tail` is
				194	empty.
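
The max_length and :attr:`unconsumed_tail` protocol described above can be sketched as a loop. The function name ``decompress_limited`` and the 1024-byte limit are assumptions chosen for illustration.

```python
import zlib

def decompress_limited(data, max_length=1024):
    """Yield decompressed pieces of data, limiting each decompress() call."""
    d = zlib.decompressobj()
    buf = data
    while buf:
        piece = d.decompress(buf, max_length)
        if piece:
            yield piece
        # Input that was not consumed must be fed back on the next call.
        buf = d.unconsumed_tail
    # Collect any output still pending once all input has been consumed.
    tail = d.flush()
    if tail:
        yield tail

original = b"x" * 100000
pieces = list(decompress_limited(zlib.compress(original)))
assert b"".join(pieces) == original
print(len(pieces))
```

Bounding each call this way keeps memory use predictable when a small compressed input expands to a very large output.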
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	195
				196
				197	.. method:: Decompress.flush([length])
				198
Georg Brandl	4ad934f	2011-01-08 21:04:25 +0000	[diff] [blame]	199	All pending input is processed, and a bytes object containing the remaining
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	200	uncompressed output is returned. After calling :meth:`flush`, the
				201	:meth:`decompress` method cannot be called again; the only realistic action is
				202	to delete the object.
				203
				204	The optional parameter length sets the initial size of the output buffer.
				205
				206
				207	.. method:: Decompress.copy()
				208
				209	Returns a copy of the decompression object. This can be used to save the state
				210	of the decompressor midway through the data stream in order to speed up random
				211	seeks into the stream at a future point.
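
A brief sketch of saving decompressor state with :meth:`copy`. Splitting the compressed data at its midpoint is an arbitrary choice made for the example.

```python
import zlib

original = bytes(range(256)) * 64
compressed = zlib.compress(original)
half = len(compressed) // 2

d = zlib.decompressobj()
first = d.decompress(compressed[:half])

# Save the state reached so far.
snapshot = d.copy()

# Continuing from the original object and from the copy gives the same output.
rest_a = d.decompress(compressed[half:]) + d.flush()
rest_b = snapshot.decompress(compressed[half:]) + snapshot.flush()

assert rest_a == rest_b
assert first + rest_a == original
```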
				212
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	213
				214	.. seealso::
				215
				216	Module :mod:`gzip`
				217	Reading and writing :program:`gzip`\ -format files.
				218
				219	http://www.zlib.net
				220	The zlib library home page.
				221
				222	http://www.zlib.net/manual.html
				223	The zlib manual explains the semantics and usage of the library's many
				224	functions.
				225