Blame - Doc/library/zlib.rst - platform/external/python/cpython2

:mod:`zlib` --- Compression compatible with :program:`gzip`
===========================================================

.. module:: zlib
   :synopsis: Low-level interface to compression and decompression routines compatible with
              gzip.


For applications that require data compression, the functions in this module
allow compression and decompression, using the zlib library. The zlib library
has its own home page at http://www.zlib.net. There are known
incompatibilities between the Python module and versions of the zlib library
earlier than 1.1.3; 1.1.3 has a security vulnerability, so we recommend using
1.1.4 or later.

zlib's functions have many options and often need to be used in a particular
order. This documentation doesn't attempt to cover all of the permutations;
consult the zlib manual at http://www.zlib.net/manual.html for authoritative
information.

For reading and writing ``.gz`` files see the :mod:`gzip` module.

The available exception and functions in this module are:


.. exception:: error

   Exception raised on compression and decompression errors.


.. function:: adler32(data[, value])

   Computes an Adler-32 checksum of *data*. (An Adler-32 checksum is almost as
   reliable as a CRC32 but can be computed much more quickly.) If *value* is
   present, it is used as the starting value of the checksum; otherwise, a fixed
   default value is used. This allows computing a running checksum over the
   concatenation of several inputs. The algorithm is not cryptographically
   strong, and should not be used for authentication or digital signatures. Since
   the algorithm is designed for use as a checksum algorithm, it is not suitable
   for use as a general hash algorithm.

   This function always returns an integer object.

.. note::
   To generate the same numeric value across all Python versions and
   platforms use ``adler32(data) & 0xffffffff``. If you are only using
   the checksum in packed binary format this is not necessary as the
   return value is the correct 32-bit binary representation
   regardless of sign.

.. versionchanged:: 2.6
   The return value is in the range [-2**31, 2**31-1]
   regardless of platform. In older versions the value is
   signed on some platforms and unsigned on others.

.. versionchanged:: 3.0
   The return value is unsigned and in the range [0, 2**32-1]
   regardless of platform.

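As a minimal sketch of the running-checksum behaviour described above, the
following feeds two chunks through ``adler32`` and compares the result with a
single call over the joined data. The chunk contents and variable names are
illustrative only, and the snippet is written so that it should run on both
Python 2.7 and Python 3.

```python
import zlib

chunks = [b"Wiki", b"pedia"]

# Start from the checksum of empty input, then pass each result back in
# as the starting value for the next chunk.
running = zlib.adler32(b"")
for chunk in chunks:
    running = zlib.adler32(chunk, running)

# Checksum of the concatenated data, computed in one call.
one_shot = zlib.adler32(b"".join(chunks))

# Mask both values so they are the same unsigned number on every version.
running &= 0xffffffff
one_shot &= 0xffffffff
```

After the mask, ``running`` and ``one_shot`` hold the same value, so the
chunked form can replace the one-shot form when the data arrives in pieces.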

.. function:: compress(string[, level])

   Compresses the data in *string*, returning a string containing compressed data.
   *level* is an integer from ``0`` to ``9`` controlling the level of compression;
   ``1`` is fastest and produces the least compression, ``9`` is slowest and
   produces the most. ``0`` is no compression. The default value is ``6``.
   Raises the :exc:`error` exception if any error occurs.

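The effect of *level* can be seen with a small experiment. This is only a
sketch: the sample data is made up and deliberately repetitive, and real-world
sizes depend heavily on the input.

```python
import zlib

data = b"spam and eggs, " * 1000

stored = zlib.compress(data, 0)    # level 0: no compression
fast = zlib.compress(data, 1)      # level 1: fastest
default = zlib.compress(data)      # level omitted: default level
best = zlib.compress(data, 9)      # level 9: slowest, most compression

# Collect the output sizes for comparison.
sizes = [len(stored), len(fast), len(default), len(best)]

# Every level produces a stream that decompress() can read back.
restored = zlib.decompress(best)
```

At level ``0`` the output is slightly larger than the input, because the data
is stored uncompressed with framing added around it.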

.. function:: compressobj([level[, method[, wbits[, memlevel[, strategy]]]]])

   Returns a compression object, to be used for compressing data streams that won't
   fit into memory at once. *level* is an integer from ``0`` to ``9`` or ``-1``,
   controlling the level of compression; ``1`` is fastest and produces the least
   compression, ``9`` is slowest and produces the most. ``0`` is no compression.
   The default value is ``-1`` (Z_DEFAULT_COMPRESSION). Z_DEFAULT_COMPRESSION
   represents a default compromise between speed and compression (currently
   equivalent to level 6).

   *method* is the compression algorithm. Currently, the only supported value is
   ``DEFLATED``.

   The *wbits* argument controls the size of the history buffer (or the
   "window size") used when compressing data, and whether a header and
   trailer are included in the output. It can take several ranges of values.
   The default is 15.

   * +9 to +15: The base-two logarithm of the window size, which
     therefore ranges between 512 and 32768. Larger values produce
     better compression at the expense of greater memory usage. The
     resulting output will include a zlib-specific header and trailer.

   * −9 to −15: Uses the absolute value of *wbits* as the
     window size logarithm, while producing a raw output stream with no
     header or trailing checksum.

   * +25 to +31 = 16 + (9 to 15): Uses the low 4 bits of the value as the
     window size logarithm, while including a basic :program:`gzip` header
     and trailing checksum in the output.

   *memlevel* controls the amount of memory used for internal compression state.
   Valid values range from ``1`` to ``9``. Higher values use more memory,
   but are faster and produce smaller output. The default is 8.

   *strategy* is used to tune the compression algorithm. Possible values are
   ``Z_DEFAULT_STRATEGY``, ``Z_FILTERED``, and ``Z_HUFFMAN_ONLY``. The default
   is ``Z_DEFAULT_STRATEGY``.

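The three *wbits* ranges listed above can be compared side by side. The
following is a sketch under a few assumptions: ``compress_with`` is a
hypothetical helper defined only for this example, the payload is made up, and
the arguments are passed positionally in the order given in the signature.

```python
import zlib

payload = b"the quick brown fox jumps over the lazy dog\n" * 200

def compress_with(wbits):
    # Hypothetical helper: positional arguments are level, method, wbits.
    compressor = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return compressor.compress(payload) + compressor.flush()

zlib_stream = compress_with(15)     # zlib header and trailer
raw_stream = compress_with(-15)     # raw stream, no header or checksum
gzip_stream = compress_with(31)     # 16 + 15: gzip header and trailer

# A gzip stream begins with the two-byte magic number 0x1f 0x8b.
starts_like_gzip = gzip_stream[:2] == b"\x1f\x8b"
```

The raw stream is the shortest of the three because it carries no framing,
which also means the reader has to be told separately that the data is raw.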

.. function:: crc32(data[, value])

   .. index::
      single: Cyclic Redundancy Check
      single: checksum; Cyclic Redundancy Check

   Computes a CRC (Cyclic Redundancy Check) checksum of *data*. If *value* is
   present, it is used as the starting value of the checksum; otherwise, a fixed
   default value is used. This allows computing a running checksum over the
   concatenation of several inputs. The algorithm is not cryptographically
   strong, and should not be used for authentication or digital signatures. Since
   the algorithm is designed for use as a checksum algorithm, it is not suitable
   for use as a general hash algorithm.

   This function always returns an integer object.

.. note::
   To generate the same numeric value across all Python versions and
   platforms use ``crc32(data) & 0xffffffff``. If you are only using
   the checksum in packed binary format this is not necessary as the
   return value is the correct 32-bit binary representation
   regardless of sign.

.. versionchanged:: 2.6
   The return value is in the range [-2**31, 2**31-1]
   regardless of platform. In older versions the value would be
   signed on some platforms and unsigned on others.

.. versionchanged:: 3.0
   The return value is unsigned and in the range [0, 2**32-1]
   regardless of platform.

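A common use of the *value* argument is checksumming data that is read in
pieces. The sketch below uses an in-memory ``io.BytesIO`` object as a stand-in
for a file opened in binary mode; the 4-byte read size is arbitrary and chosen
only to force several iterations.

```python
import io
import zlib

source = io.BytesIO(b"hello world")   # stands in for a real binary file

checksum = 0
while True:
    chunk = source.read(4)
    if not chunk:
        break
    # Carry the previous result forward as the starting value.
    checksum = zlib.crc32(chunk, checksum)

# Mask to get the same unsigned value on every Python version.
checksum &= 0xffffffff
one_shot = zlib.crc32(b"hello world") & 0xffffffff
```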

.. function:: decompress(string[, wbits[, bufsize]])

   Decompresses the data in *string*, returning a string containing the
   uncompressed data. The *wbits* parameter depends on
   the format of *string*, and is discussed further below.
   If *bufsize* is given, it is used as the initial size of the output
   buffer. Raises the :exc:`error` exception if any error occurs.

   .. _decompress-wbits:

   The *wbits* parameter controls the size of the history buffer
   (or "window size"), and what header and trailer format is expected.
   It is similar to the parameter for :func:`compressobj`, but accepts
   more ranges of values:

   * +8 to +15: The base-two logarithm of the window size. The input
     must include a zlib header and trailer.

   * 0: Automatically determine the window size from the zlib header.
     Only supported since zlib 1.2.3.5.

   * −8 to −15: Uses the absolute value of *wbits* as the window size
     logarithm. The input must be a raw stream with no header or trailer.

   * +24 to +31 = 16 + (8 to 15): Uses the low 4 bits of the value as
     the window size logarithm. The input must include a gzip header and
     trailer.

   * +40 to +47 = 32 + (8 to 15): Uses the low 4 bits of the value as
     the window size logarithm, and automatically accepts either
     the zlib or gzip format.

   When decompressing a stream, the window size must not be smaller
   than the size originally used to compress the stream; using a too-small
   value may result in an :exc:`error` exception. The default *wbits* value
   is 15, which corresponds to the largest window size and requires a zlib
   header and trailer to be included.

   *bufsize* is the initial size of the buffer used to hold decompressed data. If
   more space is required, the buffer size will be increased as needed, so you
   don't have to get this value exactly right; tuning it will only save a few calls
   to :c:func:`malloc`. The default size is 16384.

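To illustrate how *wbits* must match the format of the input, the sketch below
builds one stream in each format and reads it back. The helper ``make_stream``
and the sample text are hypothetical names introduced for this example, and it
assumes the underlying zlib library is recent enough to support the gzip
ranges.

```python
import zlib

original = b"example payload " * 64

def make_stream(wbits):
    # Hypothetical helper producing a stream in the format selected by wbits.
    compressor = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return compressor.compress(original) + compressor.flush()

zlib_stream = make_stream(15)     # zlib header and trailer
raw_stream = make_stream(-15)     # raw stream
gzip_stream = make_stream(31)     # gzip header and trailer

from_zlib = zlib.decompress(zlib_stream)         # default wbits of 15
from_raw = zlib.decompress(raw_stream, -15)      # raw input needs a negative value
from_gzip = zlib.decompress(gzip_stream, 31)     # 16 + 15 expects gzip

# 32 + 15 accepts either the zlib or the gzip format.
auto_zlib = zlib.decompress(zlib_stream, 47)
auto_gzip = zlib.decompress(gzip_stream, 47)

# The default setting rejects a raw stream, since it expects a zlib header.
try:
    zlib.decompress(raw_stream)
    raw_rejected = False
except zlib.error:
    raw_rejected = True
```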

.. function:: decompressobj([wbits])

   Returns a decompression object, to be used for decompressing data streams that
   won't fit into memory at once.

   The *wbits* parameter controls the size of the history buffer (or the
   "window size"), and what header and trailer format is expected. It has
   the same meaning as :ref:`described for decompress() <decompress-wbits>`.

Compression objects support the following methods:


.. method:: Compress.compress(string)

   Compress *string*, returning a string containing compressed data for at least
   part of the data in *string*. This data should be concatenated to the output
   produced by any preceding calls to the :meth:`compress` method. Some input may
   be kept in internal buffers for later processing.


.. method:: Compress.flush([mode])

   All pending input is processed, and a string containing the remaining compressed
   output is returned. *mode* can be selected from the constants
   :const:`Z_SYNC_FLUSH`, :const:`Z_FULL_FLUSH`, or :const:`Z_FINISH`,
   defaulting to :const:`Z_FINISH`. :const:`Z_SYNC_FLUSH` and
   :const:`Z_FULL_FLUSH` allow compressing further strings of data, while
   :const:`Z_FINISH` finishes the compressed stream and prevents compressing any
   more data. After calling :meth:`flush` with *mode* set to :const:`Z_FINISH`,
   the :meth:`compress` method cannot be called again; the only realistic action is
   to delete the object.

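The difference between the flush modes is easiest to see with a compressor and
a decompressor working in step. This is a minimal sketch with made-up message
contents; it relies on the zlib behaviour that a sync flush emits everything
compressed so far, so the receiving side can decode it immediately.

```python
import zlib

compressor = zlib.compressobj()
decompressor = zlib.decompressobj()

# Z_SYNC_FLUSH pushes out all pending output but keeps the stream open.
first = compressor.compress(b"first message")
first += compressor.flush(zlib.Z_SYNC_FLUSH)
seen_first = decompressor.decompress(first)

# The compressor can still accept data after a sync flush.
second = compressor.compress(b" second message")
second += compressor.flush()          # default mode Z_FINISH ends the stream
seen_second = decompressor.decompress(second) + decompressor.flush()
```

Frequent sync flushes tend to reduce the compression ratio, so they are best
kept for points where the receiver really needs the data so far.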

.. method:: Compress.copy()

   Returns a copy of the compression object. This can be used to efficiently
   compress a set of data that share a common initial prefix.

   .. versionadded:: 2.5

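The shared-prefix use of :meth:`Compress.copy` might look like the sketch
below. The prefix and body values are invented for illustration; the idea is
that the prefix is compressed once and each copy continues from that state.

```python
import zlib

prefix = b"Content-Type: text/plain\r\n\r\n" * 20
bodies = [b"first body", b"second body"]

# Compress the common prefix once.
base = zlib.compressobj()
head = base.compress(prefix)

streams = []
for body in bodies:
    # Each copy starts from the state reached after the prefix.
    branch = base.copy()
    streams.append(head + branch.compress(body) + branch.flush())

# Every resulting stream is complete and decodes to prefix + body.
decoded = [zlib.decompress(stream) for stream in streams]
```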

Decompression objects support the following methods, and two attributes:


.. attribute:: Decompress.unused_data

   A string which contains any bytes past the end of the compressed data. That is,
   this remains ``""`` until the last byte that contains compression data is
   available. If the whole string turned out to contain compressed data, this is
   ``""``, the empty string.

   The only way to determine where a string of compressed data ends is by actually
   decompressing it. This means that when compressed data is contained as part of a
   larger file, you can only find the end of it by reading data and feeding it
   followed by some non-empty string into a decompression object's
   :meth:`decompress` method until the :attr:`unused_data` attribute is no longer
   the empty string.

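The procedure just described, feeding data until :attr:`unused_data` becomes
non-empty, can be sketched as follows. The record, the trailing bytes and the
8-byte chunk size are all arbitrary choices made for the example.

```python
import zlib

record = b"embedded record"
trailer = b"TRAILING BYTES"
blob = zlib.compress(record) + trailer   # compressed data inside a larger blob

decompressor = zlib.decompressobj()
pieces = []
rest = b""
for start in range(0, len(blob), 8):
    pieces.append(decompressor.decompress(blob[start:start + 8]))
    if decompressor.unused_data:
        # The compressed stream has ended; everything from here on is
        # unrelated data, including the chunks not yet fed in.
        rest = decompressor.unused_data + blob[start + 8:]
        break

recovered = b"".join(pieces)
```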

.. attribute:: Decompress.unconsumed_tail

   A string that contains any data that was not consumed by the last
   :meth:`decompress` call because it exceeded the limit for the uncompressed data
   buffer. This data has not yet been seen by the zlib machinery, so you must feed
   it (possibly with further data concatenated to it) back to a subsequent
   :meth:`decompress` method call in order to get correct output.


.. method:: Decompress.decompress(string[, max_length])

   Decompress *string*, returning a string containing the uncompressed data
   corresponding to at least part of the data in *string*. This data should be
   concatenated to the output produced by any preceding calls to the
   :meth:`decompress` method. Some of the input data may be preserved in internal
   buffers for later processing.

   If the optional parameter *max_length* is non-zero then the return value will be
   no longer than *max_length*. This may mean that not all of the compressed input
   can be processed, and unconsumed data will be stored in the attribute
   :attr:`unconsumed_tail`. This string must be passed to a subsequent call to
   :meth:`decompress` if decompression is to continue. If *max_length* is not
   supplied then the whole input is decompressed, and :attr:`unconsumed_tail` is an
   empty string.

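One way to use *max_length* together with :attr:`unconsumed_tail` is a loop
that caps each returned piece. This is a sketch only: the 4096-byte limit and
the sample data are arbitrary, and the final :meth:`flush` call is there to
collect any output still held inside the decompression object.

```python
import zlib

original = b"A" * 100000
compressed = zlib.compress(original)

decompressor = zlib.decompressobj()
limit = 4096
pieces = []

pending = compressed
while pending:
    # Each call returns at most `limit` bytes of uncompressed data.
    pieces.append(decompressor.decompress(pending, limit))
    # Input that was not processed must be passed in again.
    pending = decompressor.unconsumed_tail

# Pick up anything still buffered once all input has been consumed.
pieces.append(decompressor.flush())
restored = b"".join(pieces)
largest_piece = max(len(piece) for piece in pieces[:-1])
```

Bounding the piece size this way keeps memory use predictable when a small
input may expand into a very large output.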

.. method:: Decompress.flush([length])

   All pending input is processed, and a string containing the remaining
   uncompressed output is returned. After calling :meth:`flush`, the
   :meth:`decompress` method cannot be called again; the only realistic action is
   to delete the object.

   The optional parameter *length* sets the initial size of the output buffer.


.. method:: Decompress.copy()

   Returns a copy of the decompression object. This can be used to save the state
   of the decompressor midway through the data stream in order to speed up random
   seeks into the stream at a future point.

   .. versionadded:: 2.5

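Saving decompressor state with :meth:`Decompress.copy` could be sketched like
this. The data and the split point are arbitrary; a real application would
presumably keep several checkpoints at known offsets in the stream.

```python
import zlib

original = b"0123456789" * 1000
compressed = zlib.compress(original)
middle = len(compressed) // 2

decompressor = zlib.decompressobj()
first_part = decompressor.decompress(compressed[:middle])

# Save the state reached midway through the stream.
checkpoint = decompressor.copy()

# Continue with the original object...
rest_a = decompressor.decompress(compressed[middle:]) + decompressor.flush()

# ...and later resume from the checkpoint without re-reading the first half.
rest_b = checkpoint.decompress(compressed[middle:]) + checkpoint.flush()
```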

.. seealso::

   Module :mod:`gzip`
      Reading and writing :program:`gzip`\ -format files.

   http://www.zlib.net
      The zlib library home page.

   http://www.zlib.net/manual.html
      The zlib manual explains the semantics and usage of the library's many
      functions.