:mod:`bz2` --- Support for :program:`bzip2` compression
=======================================================

.. module:: bz2
   :synopsis: Interfaces for bzip2 compression and decompression.
.. moduleauthor:: Gustavo Niemeyer <niemeyer@conectiva.com>
.. moduleauthor:: Nadeem Vawda <nadeem.vawda@gmail.com>
.. sectionauthor:: Gustavo Niemeyer <niemeyer@conectiva.com>
.. sectionauthor:: Nadeem Vawda <nadeem.vawda@gmail.com>


This module provides a comprehensive interface for compressing and
decompressing data using the bzip2 compression algorithm.

The :mod:`bz2` module contains:

* The :class:`BZ2File` class for reading and writing compressed files.
* The :class:`BZ2Compressor` and :class:`BZ2Decompressor` classes for
  incremental (de)compression.
* The :func:`compress` and :func:`decompress` functions for one-shot
  (de)compression.

All of the classes in this module may safely be accessed from multiple threads.


(De)compression of files
------------------------

.. class:: BZ2File(filename=None, mode='r', buffering=None, compresslevel=9, \*, fileobj=None)

   Open a bzip2-compressed file.

   The :class:`BZ2File` can wrap an existing :term:`file object` (given by
   *fileobj*), or operate directly on a named file (named by *filename*).
   Exactly one of these two parameters should be provided.

   The *mode* argument can be either ``'r'`` for reading (default), ``'w'`` for
   overwriting, or ``'a'`` for appending. If *fileobj* is provided, a mode of
   ``'w'`` does not truncate the file, and is instead equivalent to ``'a'``.

   The *buffering* argument is ignored. Its use is deprecated.

   If *mode* is ``'w'`` or ``'a'``, *compresslevel* can be a number between
   ``1`` and ``9`` specifying the level of compression: ``1`` produces the
   least compression, and ``9`` (default) produces the most compression.

   If *mode* is ``'r'``, the input file may be the concatenation of multiple
   compressed streams.

   :class:`BZ2File` provides all of the members specified by
   :class:`io.BufferedIOBase`, except for :meth:`detach` and :meth:`truncate`.
   Iteration and the :keyword:`with` statement are supported.

   :class:`BZ2File` also provides the following method:

   .. method:: peek([n])

      Return buffered data without advancing the file position. At least one
      byte of data will be returned (unless at EOF). The exact number of bytes
      returned is unspecified.

      .. versionadded:: 3.3

   .. versionchanged:: 3.1
      Support for the :keyword:`with` statement was added.

   .. versionchanged:: 3.3
      The :meth:`fileno`, :meth:`readable`, :meth:`seekable`, :meth:`writable`,
      :meth:`read1` and :meth:`readinto` methods were added.

   .. versionchanged:: 3.3
      The *fileobj* argument to the constructor was added.

   .. versionchanged:: 3.3
      The ``'a'`` (append) mode was added, along with support for reading
      multi-stream files.

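As an illustration of the modes described above, the following sketch writes
one compressed stream, appends a second, and reads both back. The scratch path
and the sample data are made up for the example; only the *filename*, *mode*
and *compresslevel* arguments are used.

```python
import bz2
import os
import tempfile

# Hypothetical scratch path; any writable location would do.
path = os.path.join(tempfile.mkdtemp(), "example.bz2")

# 'w' creates (or overwrites) the file and writes one compressed stream.
with bz2.BZ2File(path, "w", compresslevel=9) as f:
    f.write(b"first stream\n")

# 'a' appends a second, independent compressed stream to the same file.
with bz2.BZ2File(path, "a") as f:
    f.write(b"second stream\n")

# 'r' reads the concatenated streams back as one continuous byte sequence.
with bz2.BZ2File(path, "r") as f:
    head = f.peek(1)   # buffered data; the file position does not advance
    lines = list(f)    # iteration yields the decompressed lines as bytes
```

Because ``peek()`` only promises at least one byte before EOF, code should not
rely on the length of ``head``.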

Incremental (de)compression
---------------------------

.. class:: BZ2Compressor(compresslevel=9)

   Create a new compressor object. This object may be used to compress data
   incrementally. For one-shot compression, use the :func:`compress` function
   instead.

   *compresslevel*, if given, must be a number between ``1`` and ``9``. The
   default is ``9``.

   .. method:: compress(data)

      Provide data to the compressor object. Returns a chunk of compressed data
      if possible, or an empty byte string otherwise.

      When you have finished providing data to the compressor, call the
      :meth:`flush` method to finish the compression process.


   .. method:: flush()

      Finish the compression process. Returns the compressed data left in
      internal buffers.

      The compressor object may not be used after this method has been called.

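A minimal sketch of the ``compress()`` / ``flush()`` sequence described above.
The helper name ``compress_chunks`` and the sample chunks are illustrative and
not part of the module.

```python
import bz2


def compress_chunks(chunks, compresslevel=9):
    """Compress an iterable of bytes objects into a single bzip2 stream."""
    compressor = bz2.BZ2Compressor(compresslevel)
    parts = []
    for chunk in chunks:
        # May return b"" while the compressor is still buffering input.
        parts.append(compressor.compress(chunk))
    # flush() returns whatever is left in the internal buffers and ends the
    # stream; the compressor must not be used afterwards.
    parts.append(compressor.flush())
    return b"".join(parts)


payload = compress_chunks([b"spam " * 100, b"eggs " * 100])
```

The joined result is an ordinary bzip2 stream, so it can be passed to
:func:`decompress` or written to a file as-is.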

.. class:: BZ2Decompressor()

   Create a new decompressor object. This object may be used to decompress data
   incrementally. For one-shot decompression, use the :func:`decompress`
   function instead.

   .. note::
      This class does not transparently handle inputs containing multiple
      compressed streams, unlike :func:`decompress` and :class:`BZ2File`. If
      you need to decompress a multi-stream input with :class:`BZ2Decompressor`,
      you must use a new decompressor for each stream.

   .. method:: decompress(data)

      Provide data to the decompressor object. Returns a chunk of decompressed
      data if possible, or an empty byte string otherwise.

      Attempting to decompress data after the end of the current stream is
      reached raises an :exc:`EOFError`. If any data is found after the end of
      the stream, it is ignored and saved in the :attr:`unused_data` attribute.


   .. attribute:: eof

      True if the end-of-stream marker has been reached.

      .. versionadded:: 3.3


   .. attribute:: unused_data

      Data found after the end of the compressed stream.

      If this attribute is accessed before the end of the stream has been
      reached, its value will be ``b''``.

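One possible way to handle a multi-stream input with
:class:`BZ2Decompressor`, following the note above: create a fresh
decompressor for each stream and feed it the previous one's ``unused_data``.
The helper name ``decompress_all`` and the error it raises on truncated input
are choices made for this sketch.

```python
import bz2


def decompress_all(data):
    """Decompress every bzip2 stream concatenated in *data*."""
    results = []
    while data:
        decompressor = bz2.BZ2Decompressor()  # a new object for each stream
        results.append(decompressor.decompress(data))
        if not decompressor.eof:
            # The input ran out before the end-of-stream marker was seen.
            raise ValueError("compressed data is truncated")
        # Bytes found after the end of this stream belong to the next one.
        data = decompressor.unused_data
    return b"".join(results)


two_streams = bz2.compress(b"alpha") + bz2.compress(b"beta")
combined = decompress_all(two_streams)
```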

One-shot (de)compression
------------------------

.. function:: compress(data, compresslevel=9)

   Compress *data*.

   *compresslevel*, if given, must be a number between ``1`` and ``9``. The
   default is ``9``.

   For incremental compression, use a :class:`BZ2Compressor` instead.


.. function:: decompress(data)

   Decompress *data*.

   If *data* is the concatenation of multiple compressed streams, decompress
   all of the streams.

   For incremental decompression, use a :class:`BZ2Decompressor` instead.

   .. versionchanged:: 3.3
      Support for multi-stream inputs was added.

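A short round-trip sketch using the one-shot functions. The sample data is
arbitrary, and the variable names are illustrative.

```python
import bz2

original = b"bzip2 works best on repetitive input. " * 50

fast = bz2.compress(original, 1)   # compresslevel 1: least compression
small = bz2.compress(original)     # compresslevel defaults to 9

restored = bz2.decompress(small)

# Concatenated streams decompress to the concatenation of their contents.
joined = bz2.decompress(fast + small)
```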