:mod:`zlib` --- Compression compatible with :program:`gzip`
===========================================================

.. module:: zlib
   :synopsis: Low-level interface to compression and decompression routines compatible with gzip.


For applications that require data compression, the functions in this module
allow compression and decompression, using the zlib library. The zlib library
has its own home page at http://www.zlib.net. There are known
incompatibilities between the Python module and versions of the zlib library
earlier than 1.1.3; 1.1.3 has a security vulnerability, so we recommend using
1.1.4 or later.

zlib's functions have many options and often need to be used in a particular
order. This documentation doesn't attempt to cover all of the permutations;
consult the zlib manual at http://www.zlib.net/manual.html for authoritative
information.

For reading and writing ``.gz`` files see the :mod:`gzip` module. For
other archive formats, see the :mod:`bz2`, :mod:`zipfile`, and
:mod:`tarfile` modules.

The available exception and functions in this module are:


.. exception:: error

   Exception raised on compression and decompression errors.


.. function:: adler32(data[, value])

   Computes an Adler-32 checksum of *data*. (An Adler-32 checksum is almost as
   reliable as a CRC32 but can be computed much more quickly.) If *value* is
   present, it is used as the starting value of the checksum; otherwise, a fixed
   default value is used. This allows computing a running checksum over the
   concatenation of several inputs. The algorithm is not cryptographically
   strong, and should not be used for authentication or digital signatures. Since
   the algorithm is designed for use as a checksum algorithm, it is not suitable
   for use as a general hash algorithm.

   This function always returns an integer object.

   .. note::
      To generate the same numeric value across all Python versions and
      platforms, use ``adler32(data) & 0xffffffff``. If you are only using
      the checksum in packed binary format, this is not necessary, as the
      return value is the correct 32-bit binary representation
      regardless of sign.

   .. versionchanged:: 2.6
      The return value is in the range [-2**31, 2**31-1]
      regardless of platform. In older versions the value is
      signed on some platforms and unsigned on others.

   .. versionchanged:: 3.0
      The return value is unsigned and in the range [0, 2**32-1]
      regardless of platform.

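The running-checksum behaviour described above can be sketched as follows. This is a minimal illustration, not part of the reference text: the input values are made up, and byte-string literals are used so that the snippet should run on Python 2.6 and later as well as Python 3.

```python
import zlib

part1 = b"hello, "
part2 = b"world"

# Checksum of the concatenation, computed in a single call and masked
# so that the value is unsigned on every Python version.
whole = zlib.adler32(part1 + part2) & 0xffffffff

# The same checksum computed incrementally: the result for part1 is
# passed as the starting value when checksumming part2.
running = zlib.adler32(part1)
running = zlib.adler32(part2, running) & 0xffffffff

assert running == whole
```

Masking only the final result is enough here; the intermediate value is simply handed back to ``adler32`` unchanged.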

.. function:: compress(string[, level])

   Compresses the data in *string*, returning a string containing compressed data.
   *level* is an integer from ``1`` to ``9`` controlling the level of compression;
   ``1`` is fastest and produces the least compression, ``9`` is slowest and
   produces the most. The default value is ``6``. Raises the :exc:`error`
   exception if any error occurs.

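As a rough illustration of one-shot compression, the sketch below compresses the same made-up payload at several levels, checks that each result round-trips through ``decompress``, and shows :exc:`error` being raised for input that is not a compressed stream. The variable names are illustrative only.

```python
import zlib

original = b"spam and eggs " * 100

fast = zlib.compress(original, 1)      # fastest, least compression
small = zlib.compress(original, 9)     # slowest, most compression
default = zlib.compress(original)      # default level

# Every level produces a stream that decompresses to the same data.
for blob in (fast, small, default):
    assert zlib.decompress(blob) == original

# Input that is not a valid compressed stream raises zlib.error.
raised = False
try:
    zlib.decompress(b"this is not compressed data")
except zlib.error:
    raised = True
```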

.. function:: compressobj([level])

   Returns a compression object, to be used for compressing data streams that won't
   fit into memory at once. *level* is an integer from ``1`` to ``9`` controlling
   the level of compression; ``1`` is fastest and produces the least compression,
   ``9`` is slowest and produces the most. The default value is ``6``.


.. function:: crc32(data[, value])

   .. index::
      single: Cyclic Redundancy Check
      single: checksum; Cyclic Redundancy Check

   Computes a CRC (Cyclic Redundancy Check) checksum of *data*. If *value* is
   present, it is used as the starting value of the checksum; otherwise, a fixed
   default value is used. This allows computing a running checksum over the
   concatenation of several inputs. The algorithm is not cryptographically
   strong, and should not be used for authentication or digital signatures. Since
   the algorithm is designed for use as a checksum algorithm, it is not suitable
   for use as a general hash algorithm.

   This function always returns an integer object.

   .. note::
      To generate the same numeric value across all Python versions and
      platforms, use ``crc32(data) & 0xffffffff``. If you are only using
      the checksum in packed binary format, this is not necessary, as the
      return value is the correct 32-bit binary representation
      regardless of sign.

   .. versionchanged:: 2.6
      The return value is in the range [-2**31, 2**31-1]
      regardless of platform. In older versions the value would be
      signed on some platforms and unsigned on others.

   .. versionchanged:: 3.0
      The return value is unsigned and in the range [0, 2**32-1]
      regardless of platform.

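One way to checksum data that arrives in pieces is to thread the running value through successive ``crc32`` calls. The helper below, ``crc32_of_chunks``, is a hypothetical name used for illustration and is not part of the module; it assumes that starting from ``0`` matches the default starting value.

```python
import zlib

def crc32_of_chunks(chunks):
    """Return the unsigned CRC-32 of the concatenation of all chunks."""
    value = 0  # assumed to match the default starting value
    for chunk in chunks:
        # Feed the previous result back in as the starting value.
        value = zlib.crc32(chunk, value)
    # Mask once at the end for a version-independent unsigned result.
    return value & 0xffffffff

pieces = [b"hello ", b"world"]
checksum = crc32_of_chunks(pieces)
```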

.. function:: decompress(string[, wbits[, bufsize]])

   Decompresses the data in *string*, returning a string containing the
   uncompressed data. The *wbits* parameter controls the size of the window
   buffer, and is discussed further below.
   If *bufsize* is given, it is used as the initial size of the output
   buffer. Raises the :exc:`error` exception if any error occurs.

   The absolute value of *wbits* is the base two logarithm of the size of the
   history buffer (the "window size") used when compressing data. Its absolute
   value should be between 8 and 15 for the most recent versions of the zlib
   library, larger values resulting in better compression at the expense of greater
   memory usage. When decompressing a stream, *wbits* must not be smaller
   than the size originally used to compress the stream; using a too-small
   value will result in an exception. The default value is therefore the
   highest value, 15. When *wbits* is negative, the standard
   :program:`gzip` header is suppressed.

   *bufsize* is the initial size of the buffer used to hold decompressed data. If
   more space is required, the buffer size will be increased as needed, so you
   don't have to get this value exactly right; tuning it will only save a few calls
   to :cfunc:`malloc`. The default size is 16384.

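The effect of a negative *wbits* and of a small *bufsize* can be sketched as below. The sketch relies on an assumption taken from the zlib stream format (RFC 1950) rather than from this page: output of ``compress`` without a preset dictionary is a 2-byte header, the deflate data, and a 4-byte Adler-32 trailer, so slicing those off leaves a headerless stream.

```python
import zlib

data = b"raw deflate example " * 20
wrapped = zlib.compress(data)

# Assumption (RFC 1950 layout): drop the 2-byte header and the 4-byte
# checksum trailer to obtain a headerless deflate stream.
raw = wrapped[2:-4]

# Default wbits of 15 expects the header to be present.
assert zlib.decompress(wrapped) == data

# A negative wbits tells the library that no header is present.
assert zlib.decompress(raw, -15) == data

# A deliberately tiny initial buffer only costs extra reallocations.
assert zlib.decompress(wrapped, 15, 1) == data
```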

.. function:: decompressobj([wbits])

   Returns a decompression object, to be used for decompressing data streams that
   won't fit into memory at once. The *wbits* parameter controls the size of the
   window buffer.

Compression objects support the following methods:


.. method:: Compress.compress(string)

   Compress *string*, returning a string containing compressed data for at least
   part of the data in *string*. This data should be concatenated to the output
   produced by any preceding calls to the :meth:`compress` method. Some input may
   be kept in internal buffers for later processing.


.. method:: Compress.flush([mode])

   All pending input is processed, and a string containing the remaining compressed
   output is returned. *mode* can be selected from the constants
   :const:`Z_SYNC_FLUSH`, :const:`Z_FULL_FLUSH`, or :const:`Z_FINISH`,
   defaulting to :const:`Z_FINISH`. :const:`Z_SYNC_FLUSH` and
   :const:`Z_FULL_FLUSH` allow compressing further strings of data, while
   :const:`Z_FINISH` finishes the compressed stream and prevents compressing any
   more data. After calling :meth:`flush` with *mode* set to :const:`Z_FINISH`,
   the :meth:`compress` method cannot be called again; the only realistic action is
   to delete the object.

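The interplay of :meth:`compress` and :meth:`flush` might look like the sketch below. ``compress_chunks`` is a hypothetical helper, not part of the module, and the messages are made up. The second half shows that compression can continue after a :const:`Z_SYNC_FLUSH`, and that everything flushed so far is already decodable.

```python
import zlib

def compress_chunks(chunks, level=6):
    """Compress an iterable of byte strings into one complete stream."""
    compressor = zlib.compressobj(level)
    pieces = [compressor.compress(chunk) for chunk in chunks]
    # The default mode, Z_FINISH, terminates the stream.
    pieces.append(compressor.flush())
    return b"".join(pieces)

stream = compress_chunks([b"a" * 1000, b"b" * 1000])

# Z_SYNC_FLUSH pushes out pending data but leaves the stream open.
compressor = zlib.compressobj()
first = compressor.compress(b"first message") + compressor.flush(zlib.Z_SYNC_FLUSH)
second = compressor.compress(b" second message") + compressor.flush()

decompressor = zlib.decompressobj()
first_out = decompressor.decompress(first)
second_out = decompressor.decompress(second)
```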

.. method:: Compress.copy()

   Returns a copy of the compression object. This can be used to efficiently
   compress a set of data that share a common initial prefix.

   .. versionadded:: 2.5

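A possible use of :meth:`copy` for data sharing a common prefix is sketched here. The prefix and record contents are invented for the example; the idea is that the prefix is fed to the compressor once, and each record continues from a copy of that state.

```python
import zlib

prefix = b"HEADER: common to every record\n" * 10

# Feed the shared prefix to the compressor once.
base = zlib.compressobj()
head = base.compress(prefix)

outputs = []
for body in (b"record one", b"record two"):
    # Each copy continues from the state reached after the prefix.
    branch = base.copy()
    outputs.append(head + branch.compress(body) + branch.flush())
```

Each entry in ``outputs`` is a complete stream in its own right, because the output already produced for the prefix is prepended to what the copy emits.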

Decompression objects support the following methods, and two attributes:


.. attribute:: Decompress.unused_data

   A string which contains any bytes past the end of the compressed data. That is,
   this remains ``""`` until the last byte that contains compression data is
   available. If the whole string turned out to contain compressed data, this is
   ``""``, the empty string.

   The only way to determine where a string of compressed data ends is by actually
   decompressing it. This means that when compressed data is contained as part of a
   larger file, you can only find the end of it by reading data and feeding it
   followed by some non-empty string into a decompression object's
   :meth:`decompress` method until the :attr:`unused_data` attribute is no longer
   the empty string.

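Locating the end of a compressed region embedded in a larger blob could be sketched as follows. The payload and trailing bytes are invented for the example.

```python
import zlib

payload = zlib.compress(b"embedded payload")
blob = payload + b"TRAILING BYTES"

decompressor = zlib.decompressobj()
output = decompressor.decompress(blob)

# Anything after the end of the compressed stream is left untouched
# in unused_data, which marks where the compressed region stopped.
trailing = decompressor.unused_data
compressed_length = len(blob) - len(trailing)
```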

.. attribute:: Decompress.unconsumed_tail

   A string that contains any data that was not consumed by the last
   :meth:`decompress` call because it exceeded the limit for the uncompressed data
   buffer. This data has not yet been seen by the zlib machinery, so you must feed
   it (possibly with further data concatenated to it) back to a subsequent
   :meth:`decompress` method call in order to get correct output.


.. method:: Decompress.decompress(string[, max_length])

   Decompress *string*, returning a string containing the uncompressed data
   corresponding to at least part of the data in *string*. This data should be
   concatenated to the output produced by any preceding calls to the
   :meth:`decompress` method. Some of the input data may be preserved in internal
   buffers for later processing.

   If the optional parameter *max_length* is supplied then the return value will be
   no longer than *max_length*. This may mean that not all of the compressed input
   can be processed, and unconsumed data will be stored in the attribute
   :attr:`unconsumed_tail`. This string must be passed to a subsequent call to
   :meth:`decompress` if decompression is to continue. If *max_length* is not
   supplied then the whole input is decompressed, and :attr:`unconsumed_tail` is an
   empty string.

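The *max_length* and :attr:`unconsumed_tail` protocol described above can be sketched as a loop. ``decompress_limited`` is a hypothetical helper written for this illustration, and it assumes *max_length* is a positive integer; a real caller would normally process each piece as it arrives instead of collecting them all.

```python
import zlib

def decompress_limited(data, max_length=1024):
    """Decompress data while never receiving more than max_length bytes per call."""
    decompressor = zlib.decompressobj()
    pieces = []
    pending = data
    while pending:
        piece = decompressor.decompress(pending, max_length)
        pieces.append(piece)
        # Input that could not be processed yet must be fed back in.
        pending = decompressor.unconsumed_tail
    # Collect any output still held by the decompressor.
    pieces.append(decompressor.flush())
    return b"".join(pieces)

original = b"x" * 10000
restored = decompress_limited(zlib.compress(original), 256)
```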

.. method:: Decompress.flush([length])

   All pending input is processed, and a string containing the remaining
   uncompressed output is returned. After calling :meth:`flush`, the
   :meth:`decompress` method cannot be called again; the only realistic action is
   to delete the object.

   The optional parameter *length* sets the initial size of the output buffer.

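Decompressing a stream that arrives in small pieces, finishing with :meth:`flush`, might look like this. ``decompress_chunks`` is a hypothetical helper, and the 16-byte chunk size is arbitrary.

```python
import zlib

def decompress_chunks(chunks):
    """Decompress an iterable of compressed byte strings piece by piece."""
    decompressor = zlib.decompressobj()
    pieces = [decompressor.decompress(chunk) for chunk in chunks]
    # Pick up whatever output is still pending once the input is exhausted.
    pieces.append(decompressor.flush())
    return b"".join(pieces)

original = b"streamed data " * 500
compressed = zlib.compress(original)

# Simulate a stream by slicing the compressed data into 16-byte chunks.
chunks = [compressed[i:i + 16] for i in range(0, len(compressed), 16)]
restored = decompress_chunks(chunks)
```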

.. method:: Decompress.copy()

   Returns a copy of the decompression object. This can be used to save the state
   of the decompressor midway through the data stream in order to speed up random
   seeks into the stream at a future point.

   .. versionadded:: 2.5

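Saving decompressor state midway through a stream could be sketched as below. The split point (half of the compressed data) is arbitrary and chosen only for illustration.

```python
import zlib

data = b"0123456789" * 1000
compressed = zlib.compress(data)
half = len(compressed) // 2

decompressor = zlib.decompressobj()
first_part = decompressor.decompress(compressed[:half])

# Snapshot the state reached after the first half of the input.
snapshot = decompressor.copy()

# Both objects can now continue independently from the same position.
rest_a = decompressor.decompress(compressed[half:]) + decompressor.flush()
rest_b = snapshot.decompress(compressed[half:]) + snapshot.flush()
```

Resuming from ``snapshot`` avoids re-reading the first half of the compressed input, which is the saving the method is meant to provide.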

.. seealso::

   Module :mod:`gzip`
      Reading and writing :program:`gzip`\ -format files.

   http://www.zlib.net
      The zlib library home page.

   http://www.zlib.net/manual.html
      The zlib manual explains the semantics and usage of the library's many
      functions.
