:mod:`zlib` --- Compression compatible with :program:`gzip`
===========================================================

.. module:: zlib
   :synopsis: Low-level interface to compression and decompression routines
              compatible with gzip.


For applications that require data compression, the functions in this module
allow compression and decompression, using the zlib library. The zlib library
has its own home page at http://www.zlib.net. There are known
incompatibilities between the Python module and versions of the zlib library
earlier than 1.1.3; 1.1.3 has a security vulnerability, so we recommend using
1.1.4 or later.

zlib's functions have many options and often need to be used in a particular
order. This documentation doesn't attempt to cover all of the permutations;
consult the zlib manual at http://www.zlib.net/manual.html for authoritative
information.

For reading and writing ``.gz`` files see the :mod:`gzip` module. For
other archive formats, see the :mod:`bz2`, :mod:`zipfile`, and
:mod:`tarfile` modules.

The available exception and functions in this module are:


.. exception:: error

   Exception raised on compression and decompression errors.


.. function:: adler32(data[, value])

   Computes an Adler-32 checksum of *data*. (An Adler-32 checksum is almost as
   reliable as a CRC32 but can be computed much more quickly.) If *value* is
   present, it is used as the starting value of the checksum; otherwise, a fixed
   default value is used. This allows computing a running checksum over the
   concatenation of several inputs. The algorithm is not cryptographically
   strong, and should not be used for authentication or digital signatures. Since
   the algorithm is designed for use as a checksum algorithm, it is not suitable
   for use as a general hash algorithm.

   Always returns an unsigned 32-bit integer.

   .. note::
      To generate the same numeric value across all Python versions and
      platforms, use ``adler32(data) & 0xffffffff``. If you are only using
      the checksum in packed binary format, this is not necessary, as the
      return value is the correct 32-bit binary representation
      regardless of sign.

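The running-checksum behaviour described for ``adler32`` can be sketched as
follows. This is a minimal illustration, assuming Python 3 (where the data are
``bytes`` objects); the chunk contents are made up for the example.

```python
import zlib

chunks = [b"hello, ", b"world"]  # hypothetical data split into pieces

# Start with the checksum of the first chunk, then pass each result back
# in as the starting value for the next chunk.
running = zlib.adler32(chunks[0])
for chunk in chunks[1:]:
    running = zlib.adler32(chunk, running)

# The running value matches the checksum of the concatenated input.
whole = zlib.adler32(b"".join(chunks))

# Masking keeps the value in the unsigned 32-bit range on every version.
portable = running & 0xffffffff
```

Feeding the previous result back in avoids having to hold the whole input in
memory just to checksum it.
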

.. function:: compress(string[, level])

   Compresses the data in *string*, returning a string containing the compressed
   data. *level* is an integer from ``1`` to ``9`` controlling the level of
   compression; ``1`` is fastest and produces the least compression, ``9`` is
   slowest and produces the most. The default value is ``6``. Raises the
   :exc:`error` exception if any error occurs.

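A short sketch of one-shot compression at different levels, assuming Python 3
``bytes`` input. The sample payload is invented, and the actual sizes depend on
the data and on the zlib version in use.

```python
import zlib

data = b"spam and eggs " * 1000  # hypothetical, highly repetitive input

fast = zlib.compress(data, 1)     # fastest, least compression
default = zlib.compress(data)     # default level
best = zlib.compress(data, 9)     # slowest, most compression

# Every level produces a stream that decompresses to the original data.
restored = [zlib.decompress(blob) for blob in (fast, default, best)]
sizes = {"fast": len(fast), "default": len(default), "best": len(best)}
```

Comparing the entries of ``sizes`` on representative data is usually the
easiest way to decide whether a higher level is worth the extra time.
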

.. function:: compressobj([level])

   Returns a compression object, to be used for compressing data streams that won't
   fit into memory at once. *level* is an integer from ``1`` to ``9`` controlling
   the level of compression; ``1`` is fastest and produces the least compression,
   ``9`` is slowest and produces the most. The default value is ``6``.


.. function:: crc32(data[, value])

   .. index::
      single: Cyclic Redundancy Check
      single: checksum; Cyclic Redundancy Check

   Computes a CRC (Cyclic Redundancy Check) checksum of *data*. If *value* is
   present, it is used as the starting value of the checksum; otherwise, a fixed
   default value is used. This allows computing a running checksum over the
   concatenation of several inputs. The algorithm is not cryptographically
   strong, and should not be used for authentication or digital signatures. Since
   the algorithm is designed for use as a checksum algorithm, it is not suitable
   for use as a general hash algorithm.

   Always returns an unsigned 32-bit integer.

   .. note::
      To generate the same numeric value across all Python versions and
      platforms, use ``crc32(data) & 0xffffffff``. If you are only using
      the checksum in packed binary format, this is not necessary, as the
      return value is the correct 32-bit binary representation
      regardless of sign.

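As a sketch of the running checksum applied to a file-like object, the helper
below reads fixed-size blocks and threads the CRC through each call. The name
``crc32_of_stream`` and the ``chunk_size`` parameter are illustrative, not part
of the module.

```python
import io
import zlib

def crc32_of_stream(stream, chunk_size=4096):
    """Return the CRC-32 of everything readable from a binary stream."""
    # Checksum of no data gives the module's default starting value.
    checksum = zlib.crc32(b"")
    while True:
        block = stream.read(chunk_size)
        if not block:
            break
        checksum = zlib.crc32(block, checksum)
    return checksum & 0xffffffff

result = crc32_of_stream(io.BytesIO(b"123456789"), chunk_size=4)
```

The same helper works on a file opened in binary mode, since it only relies
on ``read()``.
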

.. function:: decompress(string[, wbits[, bufsize]])

   Decompresses the data in *string*, returning a string containing the
   uncompressed data. The *wbits* parameter controls the size of the window
   buffer, and is discussed further below.
   If *bufsize* is given, it is used as the initial size of the output
   buffer. Raises the :exc:`error` exception if any error occurs.

   The absolute value of *wbits* is the base two logarithm of the size of the
   history buffer (the "window size") used when compressing data. Its absolute
   value should be between 8 and 15 for the most recent versions of the zlib
   library, larger values resulting in better compression at the expense of greater
   memory usage. When decompressing a stream, *wbits* must not be smaller
   than the size originally used to compress the stream; using a too-small
   value will result in an exception. The default value is therefore the
   highest value, 15. When *wbits* is negative, the standard zlib header is
   suppressed and the data is treated as a raw stream; this is an undocumented
   feature of the zlib library, used for compatibility with :program:`unzip`'s
   compression file format.

   *bufsize* is the initial size of the buffer used to hold decompressed data. If
   more space is required, the buffer size will be increased as needed, so you
   don't have to get this value exactly right; tuning it will only save a few calls
   to :cfunc:`malloc`. The default size is 16384.

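The effect of a negative *wbits* can be seen by stripping the wrapper from a
stream produced by ``compress`` and decompressing the remainder as raw data.
This sketch assumes that no preset dictionary is in use, so the zlib header is
2 bytes long and the trailer is a 4-byte big-endian Adler-32; the sample data
is invented.

```python
import struct
import zlib

data = b"window bits example " * 100  # hypothetical payload
stream = zlib.compress(data)

# With the default wbits of 15, the header and trailer are expected.
plain = zlib.decompress(stream)

# Assumption: 2-byte header, 4-byte trailer (no preset dictionary).
raw = stream[2:-4]
from_raw = zlib.decompress(raw, -15)

# The trailer that was cut off is the Adler-32 of the uncompressed data.
(trailer,) = struct.unpack(">I", stream[-4:])

# bufsize only sets the initial output buffer size; the result is the same.
sized = zlib.decompress(stream, 15, len(data))
```

Passing the raw bytes with a positive *wbits* would raise :exc:`error`,
because the decompressor would then look for a header that is no longer there.
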

.. function:: decompressobj([wbits])

   Returns a decompression object, to be used for decompressing data streams that
   won't fit into memory at once. The *wbits* parameter controls the size of the
   window buffer.

Compression objects support the following methods:


.. method:: Compress.compress(string)

   Compress *string*, returning a string containing compressed data for at least
   part of the data in *string*. This data should be concatenated to the output
   produced by any preceding calls to the :meth:`compress` method. Some input may
   be kept in internal buffers for later processing.


.. method:: Compress.flush([mode])

   All pending input is processed, and a string containing the remaining compressed
   output is returned. *mode* can be selected from the constants
   :const:`Z_SYNC_FLUSH`, :const:`Z_FULL_FLUSH`, or :const:`Z_FINISH`,
   defaulting to :const:`Z_FINISH`. :const:`Z_SYNC_FLUSH` and
   :const:`Z_FULL_FLUSH` allow compressing further strings of data, while
   :const:`Z_FINISH` finishes the compressed stream and prevents compressing any
   more data. After calling :meth:`flush` with *mode* set to :const:`Z_FINISH`,
   the :meth:`compress` method cannot be called again; the only realistic action is
   to delete the object.

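The difference between the flush modes can be sketched with a compressor and
decompressor working in step, as one might do over a network connection. The
message contents are hypothetical.

```python
import zlib

compressor = zlib.compressobj()
decompressor = zlib.decompressobj()

# Z_SYNC_FLUSH pushes out everything compressed so far, so the receiver
# can decode the first message before any more data is sent.
first = compressor.compress(b"first message")
first += compressor.flush(zlib.Z_SYNC_FLUSH)
got_first = decompressor.decompress(first)

# The compressor is still usable after a sync flush.
second = compressor.compress(b"second message")
second += compressor.flush(zlib.Z_SYNC_FLUSH)
got_second = decompressor.decompress(second)

# The default mode, Z_FINISH, ends the stream; no new data follows.
tail = compressor.flush()
rest = decompressor.decompress(tail)
```

Frequent sync flushes cost some compression ratio, so they are best reserved
for points where the receiver really needs the data so far.
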

.. method:: Compress.copy()

   Returns a copy of the compression object. This can be used to efficiently
   compress a set of data that share a common initial prefix.

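One way to use ``copy()`` for data sharing a prefix is sketched below: the
prefix is compressed once, and each variant continues from a copy of that
state. The names ``prefix``, ``head`` and ``finish`` are illustrative only.

```python
import zlib

prefix = b"HEADER: common preamble\n" * 50  # hypothetical shared prefix

base = zlib.compressobj()
head = base.compress(prefix)  # output already produced for the prefix

def finish(suffix):
    """Complete a stream for prefix + suffix without recompressing prefix."""
    branch = base.copy()  # carries over the state buffered inside base
    return head + branch.compress(suffix) + branch.flush()

stream_a = finish(b"payload A")
stream_b = finish(b"payload B")
```

``base`` itself is never flushed here, so it can keep serving as the starting
point for further variants.
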

Decompression objects support the following methods, and two attributes:


.. attribute:: Decompress.unused_data

   A string which contains any bytes past the end of the compressed data. That is,
   this remains ``""`` until the last byte that contains compression data is
   available. If the whole string turned out to contain compressed data, this is
   ``""``, the empty string.

   The only way to determine where a string of compressed data ends is by actually
   decompressing it. This means that when compressed data is part of a
   larger file, you can only find the end of it by reading data and feeding it
   followed by some non-empty string into a decompression object's
   :meth:`decompress` method until the :attr:`unused_data` attribute is no longer
   the empty string.

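Locating the end of compressed data embedded in a larger blob might be
sketched as follows. The helper name ``split_compressed_prefix`` is
hypothetical, and the loop relies on the ``eof`` attribute of decompression
objects, which is an assumption about the Python version (it exists in
Python 3.3 and later and is not described in this document).

```python
import zlib

def split_compressed_prefix(blob, chunk_size=16):
    """Return (decompressed data, bytes that follow the compressed stream)."""
    decompressor = zlib.decompressobj()
    pieces = []
    pos = 0
    # Feed small chunks until zlib reports the end of the stream.
    while pos < len(blob) and not decompressor.eof:
        chunk = blob[pos:pos + chunk_size]
        pos += len(chunk)
        pieces.append(decompressor.decompress(chunk))
    # unused_data holds the part of the last chunk beyond the stream end;
    # anything not yet fed follows it.
    leftover = decompressor.unused_data + blob[pos:]
    return b"".join(pieces), leftover

blob = zlib.compress(b"payload" * 10) + b"TRAILER"
payload, trailer = split_compressed_prefix(blob)
```
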

.. attribute:: Decompress.unconsumed_tail

   A string that contains any data that was not consumed by the last
   :meth:`decompress` call because it exceeded the limit for the uncompressed data
   buffer. This data has not yet been seen by the zlib machinery, so you must feed
   it (possibly with further data concatenated to it) back to a subsequent
   :meth:`decompress` method call in order to get correct output.


.. method:: Decompress.decompress(string[, max_length])

   Decompress *string*, returning a string containing the uncompressed data
   corresponding to at least part of the data in *string*. This data should be
   concatenated to the output produced by any preceding calls to the
   :meth:`decompress` method. Some of the input data may be preserved in internal
   buffers for later processing.

   If the optional parameter *max_length* is supplied then the return value will be
   no longer than *max_length*. This may mean that not all of the compressed input
   can be processed, and unconsumed data will be stored in the attribute
   :attr:`unconsumed_tail`. This string must be passed to a subsequent call to
   :meth:`decompress` if decompression is to continue. If *max_length* is not
   supplied then the whole input is decompressed, and :attr:`unconsumed_tail` is an
   empty string.

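The interplay of *max_length* and :attr:`unconsumed_tail` can be sketched with
a generator that never asks for more than a fixed amount of output per call.
The helper name ``decompress_bounded`` and the sample data are hypothetical.

```python
import zlib

def decompress_bounded(data, max_length=1024):
    """Yield decompressed pieces, each decompress() call capped at max_length."""
    decompressor = zlib.decompressobj()
    pending = data
    while pending:
        yield decompressor.decompress(pending, max_length)
        # Input that was not consumed must be fed back in on the next call.
        pending = decompressor.unconsumed_tail
    # flush() returns any remaining output; it is not capped by max_length.
    tail = decompressor.flush()
    if tail:
        yield tail

original = b"x" * 100000
pieces = list(decompress_bounded(zlib.compress(original), 1024))
```

Capping the output per call is a common defence against small inputs that
expand to very large outputs.
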

.. method:: Decompress.flush([length])

   All pending input is processed, and a string containing the remaining
   uncompressed output is returned. After calling :meth:`flush`, the
   :meth:`decompress` method cannot be called again; the only realistic action is
   to delete the object.

   The optional parameter *length* sets the initial size of the output buffer.


.. method:: Decompress.copy()

   Returns a copy of the decompression object. This can be used to save the state
   of the decompressor midway through the data stream in order to speed up random
   seeks into the stream at a future point.

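Saving a checkpoint with ``copy()`` might look like the sketch below. The
split point and data are arbitrary choices for illustration; a real seek index
would record several checkpoints along with their input offsets.

```python
import zlib

original = b"A" * 1000 + b"B" * 1000  # hypothetical stream contents
stream = zlib.compress(original)
half = len(stream) // 2

decompressor = zlib.decompressobj()
first_part = decompressor.decompress(stream[:half])

# Snapshot the state after the first half of the compressed input.
checkpoint = decompressor.copy()

# Both objects can now continue independently from the same position.
rest = decompressor.decompress(stream[half:])
rest_again = checkpoint.decompress(stream[half:])
```

Resuming from ``checkpoint`` avoids decompressing the first half of the input
a second time.
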

.. seealso::

   Module :mod:`gzip`
      Reading and writing :program:`gzip`\ -format files.

   http://www.zlib.net
      The zlib library home page.

   http://www.zlib.net/manual.html
      The zlib manual explains the semantics and usage of the library's many
      functions.
