\section{\module{zlib} ---
         Compression compatible with \program{gzip}}

\declaremodule{builtin}{zlib}
\modulesynopsis{Low-level interface to compression and decompression
                routines compatible with \program{gzip}.}


For applications that require data compression, the functions in this
module allow compression and decompression, using the zlib library.
The zlib library has its own home page at
\url{http://www.gzip.org/zlib/}.  Version 1.1.3 is the
most recent version as of September 2000; use a later version if one
is available.  There are known incompatibilities between the Python
module and earlier versions of the zlib library.

The available exception and functions in this module are:

\begin{excdesc}{error}
  Exception raised on compression and decompression errors.
\end{excdesc}


\begin{funcdesc}{adler32}{string\optional{, value}}
  Computes an Adler-32 checksum of \var{string}.  (An Adler-32
  checksum is almost as reliable as a CRC32 but can be computed much
  more quickly.)  If \var{value} is present, it is used as the
  starting value of the checksum; otherwise, a fixed default value is
  used.  This allows computing a running checksum over the
  concatenation of several input strings.  The algorithm is not
  cryptographically strong, and should not be used for
  authentication or digital signatures.  Since the algorithm is
  designed for use as a checksum algorithm, it is not suitable for
  use as a general hash algorithm.
\end{funcdesc}

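The running-checksum use of \var{value} can be sketched as follows.
This is a minimal illustration, not part of the module: the input
pieces are made up, and the code assumes a current Python 3
interpreter, where the data are bytes objects rather than strings.

```python
import zlib

# Hypothetical input, split into arbitrary pieces.
chunks = [b"spam ", b"and ", b"eggs"]

# One-shot checksum over the concatenated data.
whole = zlib.adler32(b"".join(chunks))

# Running checksum: start from the checksum of empty input (the fixed
# default starting value) and pass each result back in as `value`.
running = zlib.adler32(b"")
for chunk in chunks:
    running = zlib.adler32(chunk, running)

# Both approaches describe the same byte sequence.
print(running == whole)
```

Feeding the previous result back in avoids having to hold the whole
concatenation in memory.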
\begin{funcdesc}{compress}{string\optional{, level}}
  Compresses the data in \var{string}, returning a string containing
  compressed data.  \var{level} is an integer from \code{1} to
  \code{9} controlling the level of compression; \code{1} is fastest
  and produces the least compression, \code{9} is slowest and produces
  the most.  The default value is \code{6}.  Raises the
  \exception{error} exception if any error occurs.
\end{funcdesc}

\begin{funcdesc}{compressobj}{\optional{level}}
  Returns a compression object, to be used for compressing data streams
  that won't fit into memory at once.  \var{level} is an integer from
  \code{1} to \code{9} controlling the level of compression; \code{1} is
  fastest and produces the least compression, \code{9} is slowest and
  produces the most.  The default value is \code{6}.
\end{funcdesc}

\begin{funcdesc}{crc32}{string\optional{, value}}
  Computes a CRC (Cyclic Redundancy Check)%
  \index{Cyclic Redundancy Check}
  \index{checksum!Cyclic Redundancy Check}
  checksum of \var{string}.  If
  \var{value} is present, it is used as the starting value of the
  checksum; otherwise, a fixed default value is used.  This allows
  computing a running checksum over the concatenation of several
  input strings.  The algorithm is not cryptographically strong, and
  should not be used for authentication or digital signatures.  Since
  the algorithm is designed for use as a checksum algorithm, it is not
  suitable for use as a general hash algorithm.
\end{funcdesc}

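One use of the \var{value} argument is checksumming data that is read
in blocks.  The sketch below assumes a current Python 3 interpreter
(bytes objects); the helper \code{stream_crc32()} and its
\code{blocksize} parameter are hypothetical names chosen for this
illustration, and an in-memory \code{io.BytesIO} object stands in for
a real file.

```python
import io
import zlib

def stream_crc32(fileobj, blocksize=4096):
    """Return the CRC-32 of everything readable from fileobj.

    Hypothetical helper: reads fixed-size blocks and carries the
    running checksum from one crc32() call to the next.
    """
    crc = zlib.crc32(b"")          # the fixed default starting value
    while True:
        block = fileobj.read(blocksize)
        if not block:
            break
        crc = zlib.crc32(block, crc)
    # Mask to an unsigned 32-bit value; some older interpreters
    # return a signed integer here.
    return crc & 0xFFFFFFFF

data = b"The quick brown fox jumps over the lazy dog"
result = stream_crc32(io.BytesIO(data), blocksize=8)

# The block-wise result matches the one-shot checksum.
print(result == (zlib.crc32(data) & 0xFFFFFFFF))
```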
\begin{funcdesc}{decompress}{string\optional{, wbits\optional{, bufsize}}}
  Decompresses the data in \var{string}, returning a string containing
  the uncompressed data.  The \var{wbits} parameter controls the size of
  the window buffer.  If \var{bufsize} is given, it is used as the
  initial size of the output buffer.  Raises the \exception{error}
  exception if any error occurs.

The absolute value of \var{wbits} is the base two logarithm of the
size of the history buffer (the ``window size'') used when compressing
data.  Its absolute value should be between 8 and 15 for the most
recent versions of the zlib library, larger values resulting in better
compression at the expense of greater memory usage.  The default value
is 15.  When \var{wbits} is negative, the standard
zlib header is suppressed; this is an undocumented feature
of the zlib library, used for compatibility with \program{unzip}'s
compression file format.

\var{bufsize} is the initial size of the buffer used to hold
decompressed data.  If more space is required, the buffer size will be
increased as needed, so you don't have to get this value exactly
right; tuning it will only save a few calls to \cfunction{malloc()}.  The
default size is 16384.

\end{funcdesc}

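A round trip through \function{compress()} and \function{decompress()}
might look like the sketch below.  It assumes a current Python 3
interpreter (bytes objects), and the input is a made-up, highly
repetitive string chosen so that it compresses well.

```python
import zlib

original = b"hello, zlib! " * 1000    # hypothetical input

fast = zlib.compress(original, 1)     # fastest, least compression
best = zlib.compress(original, 9)     # slowest, most compression

# Either result decompresses back to the same data.
restored = zlib.decompress(best)
print(restored == original, len(original), len(fast), len(best))

# Corrupt or truncated input raises the module's error exception.
failure = None
try:
    zlib.decompress(best[:10])
except zlib.error as exc:
    failure = exc
print("truncated input rejected:", failure is not None)
```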
\begin{funcdesc}{decompressobj}{\optional{wbits}}
  Returns a decompression object, to be used for decompressing data
  streams that won't fit into memory at once.  The \var{wbits}
  parameter controls the size of the window buffer.
\end{funcdesc}

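The effect of a negative \var{wbits} can be illustrated with a
decompression object.  This sketch assumes a current Python 3
interpreter and relies on the zlib stream layout from RFC 1950 (a
two-byte header, the raw deflate data, and a four-byte Adler-32
trailer, with no preset dictionary); slicing the stream by hand is
done here only to manufacture headerless input for the example.

```python
import zlib

payload = b"raw deflate example " * 50    # hypothetical input
stream = zlib.compress(payload)

# Assumption (RFC 1950, no preset dictionary): strip the 2-byte
# header and the 4-byte Adler-32 trailer to leave raw deflate data.
raw = stream[2:-4]

# A negative wbits tells the library not to expect the header.
d = zlib.decompressobj(-15)
restored = d.decompress(raw) + d.flush()
print(restored == payload)
```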
Compression objects support the following methods:

\begin{methoddesc}[Compress]{compress}{string}
Compress \var{string}, returning a string containing compressed data
for at least part of the data in \var{string}.  This data should be
concatenated to the output produced by any preceding calls to the
\method{compress()} method.  Some input may be kept in internal buffers
for later processing.
\end{methoddesc}

\begin{methoddesc}[Compress]{flush}{\optional{mode}}
All pending input is processed, and a string containing the remaining
compressed output is returned.  \var{mode} can be selected from the
constants \constant{Z_SYNC_FLUSH}, \constant{Z_FULL_FLUSH}, or
\constant{Z_FINISH}, defaulting to \constant{Z_FINISH}.  \constant{Z_SYNC_FLUSH} and
\constant{Z_FULL_FLUSH} allow compressing further strings of data and
are used to allow partial error recovery on decompression, while
\constant{Z_FINISH} finishes the compressed stream and
prevents compressing any more data.  After calling
\method{flush()} with \var{mode} set to \constant{Z_FINISH}, the
\method{compress()} method cannot be called again; the only realistic
action is to delete the object.
\end{methoddesc}

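The two methods of a compression object are typically used together
as sketched below.  The helper \code{compress_chunks()} and the sample
data are hypothetical, and the code assumes a current Python 3
interpreter (bytes objects).

```python
import zlib

def compress_chunks(chunks, level=6):
    """Compress an iterable of byte strings into one zlib stream."""
    c = zlib.compressobj(level)
    pieces = []
    for chunk in chunks:
        # May return b"" while input is held in internal buffers.
        pieces.append(c.compress(chunk))
    # Z_FINISH (the default) ends the stream; c cannot be reused.
    pieces.append(c.flush())
    return b"".join(pieces)

chunks = [b"first block ", b"second block ", b"third block"]
stream = compress_chunks(chunks)
print(zlib.decompress(stream) == b"".join(chunks))

# Z_SYNC_FLUSH pushes out everything compressed so far but leaves
# the stream open, so more data can still be compressed afterwards.
c = zlib.compressobj()
partial = c.compress(b"first block ") + c.flush(zlib.Z_SYNC_FLUSH)
recovered = zlib.decompressobj().decompress(partial)
rest = c.compress(b"second block") + c.flush()
print(recovered)
```

Flushing with \constant{Z_SYNC_FLUSH} usually costs some compression,
so it is best reserved for points where the receiver needs the data
so far.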
Decompression objects support the following methods, and two attributes:

\begin{memberdesc}{unused_data}
A string which contains any bytes past the end of the compressed data.
That is, this remains \code{""} until the last byte that contains
compression data is available.  If the whole string turned out to
contain compressed data, this is \code{""}, the empty string.

The only way to determine where a string of compressed data ends is by
actually decompressing it.  This means that when compressed data is
part of a larger file, you can only find the end of it by
reading data and feeding it, followed by some non-empty string, into a
decompression object's \method{decompress()} method until the
\member{unused_data} attribute is no longer the empty string.
\end{memberdesc}

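Locating the end of embedded compressed data with
\member{unused_data} can be sketched as follows.  The blob, the
trailer and the block size are invented for the illustration, and the
code assumes a current Python 3 interpreter (bytes objects).

```python
import zlib

compressed = zlib.compress(b"embedded payload")
trailer = b"TRAILING BYTES"          # unrelated data after the stream
blob = compressed + trailer          # hypothetical "larger file"

d = zlib.decompressobj()
output = []
fed = 0                              # number of bytes handed to d so far
step = 8                             # arbitrary block size

# Keep feeding blocks until something lands in unused_data.
while fed < len(blob) and not d.unused_data:
    piece = blob[fed:fed + step]
    output.append(d.decompress(piece))
    fed += len(piece)

payload = b"".join(output)
# Everything fed beyond the end of the stream is in unused_data.
end = fed - len(d.unused_data)
print(payload, end == len(compressed))
```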
\begin{memberdesc}{unconsumed_tail}
A string that contains any data that was not consumed by the last
\method{decompress()} call because it exceeded the limit for the
uncompressed data buffer.  This data has not yet been seen by the zlib
machinery, so you must feed it (possibly with further data
concatenated to it) back to a subsequent \method{decompress()} method
call in order to get correct output.
\end{memberdesc}


\begin{methoddesc}[Decompress]{decompress}{string\optional{, max_length}}
Decompress \var{string}, returning a string containing the
uncompressed data corresponding to at least part of the data in
\var{string}.  This data should be concatenated to the output produced
by any preceding calls to the
\method{decompress()} method.  Some of the input data may be preserved
in internal buffers for later processing.

If the optional parameter \var{max_length} is supplied then the return value
will be no longer than \var{max_length}.  This may mean that not all of the
compressed input can be processed, and any unconsumed data will be stored
in the attribute \member{unconsumed_tail}.  This string must be passed
to a subsequent call to \method{decompress()} if decompression is to
continue.  If \var{max_length} is not supplied then the whole input is
decompressed, and \member{unconsumed_tail} is an empty string.
\end{methoddesc}

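Bounding the size of each returned string with \var{max_length} can be
sketched as below.  The generator \code{bounded_decompress()} and the
sample data are hypothetical, and the code assumes a current Python 3
interpreter (bytes objects).

```python
import zlib

def bounded_decompress(data, max_length=1024):
    """Yield decompressed pieces of at most max_length bytes each."""
    d = zlib.decompressobj()
    pending = data
    while pending:
        piece = d.decompress(pending, max_length)
        if piece:
            yield piece
        # Input that did not fit under the limit must be fed back in.
        pending = d.unconsumed_tail
    # Collect anything still held in internal buffers.
    tail = d.flush()
    if tail:
        yield tail

original = b"0123456789" * 5000      # hypothetical input
pieces = list(bounded_decompress(zlib.compress(original), 1024))
print(len(pieces), max(len(p) for p in pieces))
```

Limiting the output this way keeps memory use predictable when a small
compressed input expands to a very large result.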
\begin{methoddesc}[Decompress]{flush}{}
All pending input is processed, and a string containing the remaining
uncompressed output is returned.  After calling \method{flush()}, the
\method{decompress()} method cannot be called again; the only realistic
action is to delete the object.
\end{methoddesc}

\begin{seealso}
  \seemodule{gzip}{Reading and writing \program{gzip}-format files.}
  \seeurl{http://www.gzip.org/zlib/}{The zlib library home page.}
\end{seealso}