\section{\module{zlib} ---
         Compression compatible with \program{gzip}}

\declaremodule{builtin}{zlib}
\modulesynopsis{Low-level interface to compression and decompression
                routines compatible with \program{gzip}.}


For applications that require data compression, the functions in this
module allow compression and decompression, using the zlib library.
The zlib library has its own home page at \url{http://www.zlib.net}.
There are known incompatibilities between the Python module and
versions of the zlib library earlier than 1.1.3; 1.1.3 has a security
vulnerability, so we recommend using 1.1.4 or later.

zlib's functions have many options and often need to be used in a
particular order.  This documentation doesn't attempt to cover all of
the permutations; consult the zlib manual at
\url{http://www.zlib.net/manual.html} for authoritative information.

The available exception and functions in this module are:

\begin{excdesc}{error}
  Exception raised on compression and decompression errors.
\end{excdesc}
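As a minimal sketch of handling \exception{error}, the following wraps
\function{decompress()} so that invalid input yields \code{None} instead
of an exception.  It assumes a current Python 3 interpreter, where the
string arguments are bytes objects; the helper name
\code{try_decompress} is hypothetical and not part of the module.

```python
import zlib

def try_decompress(data):
    """Return the decompressed bytes, or None if data is not valid zlib data.

    try_decompress is an illustrative helper, not a zlib function.
    """
    try:
        return zlib.decompress(data)
    except zlib.error:
        # Raised for corrupt, truncated, or non-zlib input.
        return None

good = zlib.compress(b"example payload")
print(try_decompress(good))
print(try_decompress(b"not compressed"))
```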

\begin{funcdesc}{adler32}{string\optional{, value}}
  Computes an Adler-32 checksum of \var{string}.  (An Adler-32
  checksum is almost as reliable as a CRC32 but can be computed much
  more quickly.)  If \var{value} is present, it is used as the
  starting value of the checksum; otherwise, a fixed default value is
  used.  This allows computing a running checksum over the
  concatenation of several input strings.  The algorithm is not
  cryptographically strong, and should not be used for
  authentication or digital signatures.  Since the algorithm is
  designed for use as a checksum algorithm, it is not suitable for
  use as a general hash algorithm.
\end{funcdesc}
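The running-checksum use of \var{value} can be sketched as follows,
assuming Python 3 bytes input.  The chunk contents are made up for
illustration; the point is only that feeding each result back in as
\var{value} gives the same checksum as one call over the concatenation.

```python
import zlib

# Illustrative input, split into pieces as it might arrive from a stream.
chunks = [b"first part, ", b"second part, ", b"third part"]

# The first call uses the fixed default starting value.
running = zlib.adler32(chunks[0])
# Each later call continues from the previous checksum.
for chunk in chunks[1:]:
    running = zlib.adler32(chunk, running)

# One call over the concatenated data for comparison.
whole = zlib.adler32(b"".join(chunks))
print(running == whole)
```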

\begin{funcdesc}{compress}{string\optional{, level}}
  Compresses the data in \var{string}, returning a string containing
  compressed data.  \var{level} is an integer from \code{1} to
  \code{9} controlling the level of compression; \code{1} is fastest
  and produces the least compression, \code{9} is slowest and produces
  the most.  The default value is \code{6}.  Raises the
  \exception{error} exception if any error occurs.
\end{funcdesc}
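A short sketch of the effect of \var{level}, assuming Python 3 bytes
input and a deliberately repetitive sample string.  The sizes printed
depend on the zlib version in use, so none are shown here.

```python
import zlib

# Highly repetitive sample data, chosen so that it compresses well.
data = b"spam and eggs, " * 1000

fast = zlib.compress(data, 1)     # fastest, least compression
default = zlib.compress(data)     # default level
best = zlib.compress(data, 9)     # slowest, most compression

for name, blob in (("level 1", fast), ("default", default), ("level 9", best)):
    print(name, len(blob), "bytes from", len(data))

# Every level round-trips to the same original data.
restored = [zlib.decompress(blob) for blob in (fast, default, best)]
```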

\begin{funcdesc}{compressobj}{\optional{level}}
  Returns a compression object, to be used for compressing data streams
  that won't fit into memory at once.  \var{level} is an integer from
  \code{1} to \code{9} controlling the level of compression; \code{1} is
  fastest and produces the least compression, \code{9} is slowest and
  produces the most.  The default value is \code{6}.
\end{funcdesc}

\begin{funcdesc}{crc32}{string\optional{, value}}
  Computes a CRC (Cyclic Redundancy Check)%
  \index{Cyclic Redundancy Check}
  \index{checksum!Cyclic Redundancy Check}
  checksum of \var{string}.  If
  \var{value} is present, it is used as the starting value of the
  checksum; otherwise, a fixed default value is used.  This allows
  computing a running checksum over the concatenation of several
  input strings.  The algorithm is not cryptographically strong, and
  should not be used for authentication or digital signatures.  Since
  the algorithm is designed for use as a checksum algorithm, it is not
  suitable for use as a general hash algorithm.
\end{funcdesc}
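One way to apply the running checksum to data that is read in blocks is
sketched below, assuming Python 3.  The helper \code{stream_crc32} and
its \code{blocksize} parameter are hypothetical names used only for this
example, and an in-memory \class{io.BytesIO} object stands in for a
real file.

```python
import io
import zlib

def stream_crc32(fileobj, blocksize=4096):
    """Return the CRC-32 of everything readable from fileobj.

    stream_crc32 is an illustrative helper, not part of zlib.
    """
    crc = zlib.crc32(b"")              # default starting value
    while True:
        block = fileobj.read(blocksize)
        if not block:                  # end of input
            break
        crc = zlib.crc32(block, crc)   # continue from the previous value
    return crc

payload = b"0123456789" * 5000
streamed = stream_crc32(io.BytesIO(payload))
print(streamed == zlib.crc32(payload))
```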
71
Fred Drake38e5d272000-04-03 20:13:55 +000072\begin{funcdesc}{decompress}{string\optional{, wbits\optional{, bufsize}}}
Fred Drake59160701998-06-19 21:18:28 +000073 Decompresses the data in \var{string}, returning a string containing
74 the uncompressed data. The \var{wbits} parameter controls the size of
Fred Drake38e5d272000-04-03 20:13:55 +000075 the window buffer. If \var{bufsize} is given, it is used as the
Fred Drake59160701998-06-19 21:18:28 +000076 initial size of the output buffer. Raises the \exception{error}
77 exception if any error occurs.
Fred Drake38e5d272000-04-03 20:13:55 +000078
79The absolute value of \var{wbits} is the base two logarithm of the
80size of the history buffer (the ``window size'') used when compressing
81data. Its absolute value should be between 8 and 15 for the most
82recent versions of the zlib library, larger values resulting in better
83compression at the expense of greater memory usage. The default value
84is 15. When \var{wbits} is negative, the standard
85\program{gzip} header is suppressed; this is an undocumented feature
86of the zlib library, used for compatibility with \program{unzip}'s
87compression file format.
88
89\var{bufsize} is the initial size of the buffer used to hold
90decompressed data. If more space is required, the buffer size will be
91increased as needed, so you don't have to get this value exactly
92right; tuning it will only save a few calls to \cfunction{malloc()}. The
93default size is 16384.
94
Guido van Rossum04bc9d61997-04-30 18:12:27 +000095\end{funcdesc}
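The following sketch shows a negative \var{wbits} together with an
explicit \var{bufsize}.  It relies on an assumption that goes beyond
this section: in current Python 3 releases \function{compressobj()}
also accepts a compression method and a \var{wbits} value after
\var{level}, which is used here to produce a headerless stream to
decompress.

```python
import zlib

data = b"headerless stream " * 200

# Assumption: compressobj() takes (level, method, wbits) positionally,
# as in current Python 3 releases; -15 requests a stream with no header.
compressor = zlib.compressobj(6, zlib.DEFLATED, -15)
raw = compressor.compress(data) + compressor.flush()

# The same negative wbits must be given when decompressing.  The third
# argument is only the initial output buffer size; it grows as needed.
restored = zlib.decompress(raw, -15, 1024)
print(restored == data)
```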

\begin{funcdesc}{decompressobj}{\optional{wbits}}
  Returns a decompression object, to be used for decompressing data
  streams that won't fit into memory at once.  The \var{wbits}
  parameter controls the size of the window buffer.
\end{funcdesc}

Compression objects support the following methods:

\begin{methoddesc}[Compress]{compress}{string}
Compress \var{string}, returning a string containing compressed data
for at least part of the data in \var{string}.  This data should be
concatenated to the output produced by any preceding calls to the
\method{compress()} method.  Some input may be kept in internal buffers
for later processing.
\end{methoddesc}

\begin{methoddesc}[Compress]{flush}{\optional{mode}}
All pending input is processed, and a string containing the remaining
compressed output is returned.  \var{mode} can be selected from the
constants \constant{Z_SYNC_FLUSH},  \constant{Z_FULL_FLUSH},  or
\constant{Z_FINISH}, defaulting to \constant{Z_FINISH}.  \constant{Z_SYNC_FLUSH} and
\constant{Z_FULL_FLUSH} allow compressing further strings of data, while
\constant{Z_FINISH} finishes the compressed stream and
prevents compressing any more data.  After calling
\method{flush()} with \var{mode} set to \constant{Z_FINISH}, the
\method{compress()} method cannot be called again; the only realistic
action is to delete the object.
\end{methoddesc}
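The two compression-object methods can be combined as in the sketch
below, which assumes Python 3 bytes input.  The generator name
\code{compress_chunks} and the sample data are hypothetical.  The second
half illustrates \constant{Z_SYNC_FLUSH}: each flushed piece can be
decoded on arrival while the compressor stays usable.

```python
import zlib

def compress_chunks(chunks, level=6):
    """Yield compressed pieces for an iterable of bytes chunks.

    compress_chunks is an illustrative helper, not part of zlib.
    """
    compressor = zlib.compressobj(level)
    for chunk in chunks:
        piece = compressor.compress(chunk)
        if piece:              # may be empty while input is buffered
            yield piece
    yield compressor.flush()   # Z_FINISH by default: ends the stream

chunks = [("line %d\n" % i).encode("ascii") for i in range(10000)]
compressed = b"".join(compress_chunks(chunks))

# Z_SYNC_FLUSH emits everything compressed so far without ending the
# stream, so a receiver can decode each message as it arrives.
sender = zlib.compressobj()
receiver = zlib.decompressobj()
first = sender.compress(b"hello ") + sender.flush(zlib.Z_SYNC_FLUSH)
second = sender.compress(b"world") + sender.flush(zlib.Z_SYNC_FLUSH)
messages = [receiver.decompress(first), receiver.decompress(second)]
```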

Decompression objects support the following methods, and two attributes:

\begin{memberdesc}{unused_data}
A string which contains any bytes past the end of the compressed data.
That is, this remains \code{""} until the last byte that contains
compressed data is available.  If the whole string turned out to
contain compressed data, this is \code{""}, the empty string.

The only way to determine where a string of compressed data ends is by
actually decompressing it.  This means that when compressed data is
contained in part of a larger file, you can only find the end of it by
reading data and feeding it, followed by some non-empty string, into a
decompression object's \method{decompress()} method until the
\member{unused_data} attribute is no longer the empty string.
\end{memberdesc}
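A sketch of locating the end of compressed data with
\member{unused_data}, assuming Python 3 bytes input.  The trailer
contents and the 8-byte read size are arbitrary choices for the
example.

```python
import zlib

compressed = zlib.compress(b"embedded payload")
trailer = b"TRAILING-BYTES"          # unrelated data following the stream
blob = compressed + trailer

d = zlib.decompressobj()
pieces = []
fed = 0
# Feed small blocks until bytes past the end of the stream show up.
for start in range(0, len(blob), 8):
    block = blob[start:start + 8]
    pieces.append(d.decompress(block))
    fed += len(block)
    if d.unused_data:
        break

payload = b"".join(pieces)
# Everything fed so far, minus the surplus, belongs to the stream.
end_offset = fed - len(d.unused_data)
print(payload, end_offset)
```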

\begin{memberdesc}{unconsumed_tail}
A string that contains any data that was not consumed by the last
\method{decompress()} call because it exceeded the limit for the
uncompressed data buffer.  This data has not yet been seen by the zlib
machinery, so you must feed it (possibly with further data
concatenated to it) back to a subsequent \method{decompress()} method
call in order to get correct output.
\end{memberdesc}


\begin{methoddesc}[Decompress]{decompress}{string\optional{, max_length}}
Decompress \var{string}, returning a string containing the
uncompressed data corresponding to at least part of the data in
\var{string}.  This data should be concatenated to the output produced
by any preceding calls to the
\method{decompress()} method.  Some of the input data may be preserved
in internal buffers for later processing.

If the optional parameter \var{max_length} is supplied then the return value
will be no longer than \var{max_length}.  This may mean that not all of the
compressed input can be processed, and any unconsumed data will be stored
in the attribute \member{unconsumed_tail}.  This string must be passed
to a subsequent call to \method{decompress()} if decompression is to
continue.  If \var{max_length} is not supplied then the whole input is
decompressed, and \member{unconsumed_tail} is an empty string.
\end{methoddesc}
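The interplay of \var{max_length} and \member{unconsumed_tail} can be
sketched as a loop, assuming Python 3 bytes input.  The generator name
\code{bounded_decompress} is hypothetical.  Each piece returned by
\method{decompress()} is at most \var{max_length} long; the final
\method{flush()} call returns whatever output is still pending.

```python
import zlib

def bounded_decompress(data, max_length=1024):
    """Yield decompressed pieces, limiting each decompress() call.

    bounded_decompress is an illustrative helper, not part of zlib.
    """
    d = zlib.decompressobj()
    pending = data
    while pending:
        piece = d.decompress(pending, max_length)
        yield piece
        # Input that was not consumed must be fed back in.
        pending = d.unconsumed_tail
    tail = d.flush()           # any remaining uncompressed output
    if tail:
        yield tail

original = b"abcdefghij" * 10000
pieces = list(bounded_decompress(zlib.compress(original), 1024))
print(len(pieces), "pieces")
```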

\begin{methoddesc}[Decompress]{flush}{}
All pending input is processed, and a string containing the remaining
uncompressed output is returned.  After calling \method{flush()}, the
\method{decompress()} method cannot be called again; the only realistic
action is to delete the object.
\end{methoddesc}

\begin{seealso}
  \seemodule{gzip}{Reading and writing \program{gzip}-format files.}
  \seeurl{http://www.zlib.net}{The zlib library home page.}
  \seeurl{http://www.zlib.net/manual.html}{The zlib manual explains
    the semantics and usage of the library's many functions.}
\end{seealso}