:mod:`base64` --- Base16, Base32, Base64, Base85 Data Encodings
===============================================================

.. module:: base64
   :synopsis: RFC 3548: Base16, Base32, Base64 Data Encodings;
              Base85 and Ascii85


.. index::
   pair: base64; encoding
   single: MIME; base64 encoding

This module provides functions for encoding binary data to printable
ASCII characters and decoding such encodings back to binary data.
It provides encoding and decoding functions for the encodings specified in
:rfc:`3548`, which defines the Base16, Base32, and Base64 algorithms,
and for the de-facto standard Ascii85 and Base85 encodings.

The :rfc:`3548` encodings are suitable for encoding binary data so that it can
be safely sent by email, used as parts of URLs, or included as part of an HTTP
POST request.  The encoding algorithm is not the same as the one used by the
:program:`uuencode` program.

There are two :rfc:`3548` interfaces provided by this module.  The modern
interface supports encoding and decoding ASCII byte string objects using all
three :rfc:`3548` defined alphabets (normal, URL-safe, and filesystem-safe).
Additionally, the decoding functions of the modern interface also accept
Unicode strings containing only ASCII characters.  The legacy interface provides
for encoding and decoding to and from file-like objects as well as byte
strings, but only using the Base64 standard alphabet.

.. versionchanged:: 3.3
   ASCII-only Unicode strings are now accepted by the decoding functions of
   the modern interface.

.. versionchanged:: 3.4
   Any :term:`bytes-like objects <bytes-like object>` are now accepted by all
   encoding and decoding functions in this module.  Ascii85/Base85 support added.

The modern interface provides:

.. function:: b64encode(s, altchars=None)

   Encode a byte string using Base64.

   *s* is the byte string to encode.  Optional *altchars* must be a byte string
   of at least length 2 (additional characters are ignored) which specifies an
   alternative alphabet for the ``+`` and ``/`` characters.  This allows an
   application to e.g. generate URL or filesystem safe Base64 strings.  The
   default is ``None``, for which the standard Base64 alphabet is used.

   The encoded byte string is returned.

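As a minimal sketch of the *altchars* parameter, the snippet below encodes
three bytes that were picked (as an illustration only) because their standard
encoding contains both ``+`` and ``/``.  The variable names are arbitrary.

```python
import base64

# Three bytes chosen so that the standard encoding uses both '+' and '/'.
data = b"\xfb\xff\xbe"

standard = base64.b64encode(data)
# altchars substitutes '-' for '+' and '_' for '/' in the output.
custom = base64.b64encode(data, altchars=b"-_")
# The same altchars value has to be supplied again when decoding.
restored = base64.b64decode(custom, altchars=b"-_")

print(standard, custom, restored == data)
```

Passing ``b"-_"`` here produces the same alphabet that the URL-safe functions
use.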

.. function:: b64decode(s, altchars=None, validate=False)

   Decode a Base64 encoded byte string.

   *s* is the byte string to decode.  Optional *altchars* must be a byte string
   of at least length 2 (additional characters are ignored) which specifies the
   alternative alphabet used instead of the ``+`` and ``/`` characters.

   The decoded byte string is returned.  A :exc:`binascii.Error` exception is
   raised if *s* is incorrectly padded.

   If *validate* is ``False`` (the default), non-base64-alphabet characters are
   discarded prior to the padding check.  If *validate* is ``True``,
   non-base64-alphabet characters in the input result in a
   :exc:`binascii.Error`.

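The difference between the two *validate* settings can be sketched as
follows.  The input with an embedded newline and the truncated input are
made-up test values, not part of the module.

```python
import base64
import binascii

# Valid Base64 text with a newline inserted in the middle.
noisy = b"ZGF0YSB0byBi\nZSBlbmNvZGVk"

# validate=False (the default): the newline is discarded before decoding.
lenient = base64.b64decode(noisy)

# validate=True: any character outside the alphabet is an error.
try:
    base64.b64decode(noisy, validate=True)
    strict_failed = False
except binascii.Error:
    strict_failed = True

# Missing '=' padding is rejected regardless of validate.
try:
    base64.b64decode(b"ZGF0YQ")
    padding_failed = False
except binascii.Error:
    padding_failed = True

print(lenient, strict_failed, padding_failed)
```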

.. function:: standard_b64encode(s)

   Encode byte string *s* using the standard Base64 alphabet.


.. function:: standard_b64decode(s)

   Decode byte string *s* using the standard Base64 alphabet.


.. function:: urlsafe_b64encode(s)

   Encode byte string *s* using a URL-safe alphabet, which substitutes ``-``
   for ``+`` and ``_`` for ``/`` in the standard Base64 alphabet.  The result
   can still contain ``=``.


.. function:: urlsafe_b64decode(s)

   Decode byte string *s* using a URL-safe alphabet, which substitutes ``-``
   for ``+`` and ``_`` for ``/`` in the standard Base64 alphabet.

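A short sketch of the URL-safe pair, using two arbitrary bytes whose encoding
needs padding, shows that ``=`` is kept in the output even though ``+`` and
``/`` are replaced:

```python
import base64

# Two bytes: the encoding needs one '=' of padding.
data = b"\xfb\xff"

plain = base64.standard_b64encode(data)
token = base64.urlsafe_b64encode(data)
restored = base64.urlsafe_b64decode(token)

# '+' and '/' are replaced, but the trailing '=' is still present.
print(plain, token, restored == data)
```

An application that embeds the result in a URL may therefore still need to
handle or escape the padding character itself.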

.. function:: b32encode(s)

   Encode a byte string using Base32.  *s* is the byte string to encode.  The
   encoded byte string is returned.


.. function:: b32decode(s, casefold=False, map01=None)

   Decode a Base32 encoded byte string.

   *s* is the byte string to decode.  Optional *casefold* is a flag specifying
   whether a lowercase alphabet is acceptable as input.  For security purposes,
   the default is ``False``.

   :rfc:`3548` allows for optional mapping of the digit 0 (zero) to the letter O
   (oh), and for optional mapping of the digit 1 (one) to either the letter I (eye)
   or letter L (el).  The optional argument *map01*, when not ``None``, specifies
   which letter the digit 1 should be mapped to (when *map01* is not ``None``, the
   digit 0 is always mapped to the letter O).  For security purposes the default is
   ``None``, so that 0 and 1 are not allowed in the input.

   The decoded byte string is returned.  A :exc:`binascii.Error` is raised if *s* is
   incorrectly padded or if there are non-alphabet characters present in the
   string.

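The *casefold* and *map01* options can be illustrated with a small sketch.
The five input bytes are an assumption chosen so that the Base32 text consists
of the letters ``O`` and ``I``, which a person might mistype as ``0`` and
``1``.

```python
import base64
import binascii

# Five bytes whose Base32 encoding contains only the letters 'O' and 'I'.
data = bytes.fromhex("721c8721c8")
encoded = base64.b32encode(data)

# Lowercase input is rejected unless casefold=True.
try:
    base64.b32decode(encoded.lower())
    lower_rejected = False
except binascii.Error:
    lower_rejected = True
folded = base64.b32decode(encoded.lower(), casefold=True)

# Simulate a transcription that used digits in place of the letters.
typed = encoded.replace(b"O", b"0").replace(b"I", b"1")
# map01=b"I" maps 1 to 'I'; 0 is then always mapped to 'O'.
mapped = base64.b32decode(typed, map01=b"I")

print(encoded, lower_rejected, folded == data, mapped == data)
```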

.. function:: b16encode(s)

   Encode a byte string using Base16.

   *s* is the byte string to encode.  The encoded byte string is returned.


.. function:: b16decode(s, casefold=False)

   Decode a Base16 encoded byte string.

   *s* is the byte string to decode.  Optional *casefold* is a flag specifying
   whether a lowercase alphabet is acceptable as input.  For security purposes,
   the default is ``False``.

   The decoded byte string is returned.  A :exc:`binascii.Error` is raised if
   *s* is incorrectly padded or if there are non-alphabet characters present in
   the string.

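A brief sketch of Base16 round-tripping, with an arbitrary four-byte value,
shows the effect of *casefold* on lowercase hexadecimal input:

```python
import base64
import binascii

data = b"\xde\xad\xbe\xef"
encoded = base64.b16encode(data)  # uppercase hexadecimal digits

# Lowercase digits are outside the default alphabet.
try:
    base64.b16decode(b"deadbeef")
    lower_rejected = False
except binascii.Error:
    lower_rejected = True

# casefold=True accepts them.
folded = base64.b16decode(b"deadbeef", casefold=True)

print(encoded, lower_rejected, folded == data)
```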

.. function:: a85encode(s, *, foldspaces=False, wrapcol=0, pad=False, adobe=False)

   Encode a byte string using Ascii85.

   *s* is the byte string to encode.  The encoded byte string is returned.

   *foldspaces* is an optional flag that uses the special short sequence ``y``
   instead of 4 consecutive spaces (ASCII 0x20) as supported by ``btoa``.  This
   feature is not supported by the "standard" Ascii85 encoding.

   *wrapcol* controls whether the output should have newline (``b'\n'``)
   characters added to it.  If this is non-zero, each output line will be
   at most this many characters long.

   *pad* controls whether the input string is padded to a multiple of 4
   before encoding.  Note that the ``btoa`` implementation always pads.

   *adobe* controls whether the encoded byte sequence is framed with ``<~``
   and ``~>``, which is used by the Adobe implementation.

   .. versionadded:: 3.4

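The keyword arguments of :func:`a85encode` can be sketched as below.  The
payloads are illustrative values only; the 64-byte input was chosen so that
the wrapped output spans several lines.

```python
import base64

data = b"Ascii85 example payload"

plain = base64.a85encode(data)
# adobe=True frames the output with <~ and ~>.
framed = base64.a85encode(data, adobe=True)

# wrapcol=20 inserts newlines so that no line exceeds 20 characters.
wrapped = base64.a85encode(bytes(range(1, 65)), wrapcol=20)
longest_line = max(len(line) for line in wrapped.split(b"\n"))

# With foldspaces=True, four consecutive spaces collapse to 'y'.
spaces = base64.a85encode(b"    ", foldspaces=True)

print(framed[:2], framed[-2:], longest_line, spaces)
```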

.. function:: a85decode(s, *, foldspaces=False, adobe=False, ignorechars=b' \t\n\r\v')

   Decode an Ascii85 encoded byte string.

   *s* is the byte string to decode.

   *foldspaces* is a flag that specifies whether the ``y`` short sequence
   should be accepted as shorthand for 4 consecutive spaces (ASCII 0x20).
   This feature is not supported by the "standard" Ascii85 encoding.

   *adobe* controls whether the input sequence is in Adobe Ascii85 format
   (i.e. is framed with ``<~`` and ``~>``).

   *ignorechars* should be a byte string containing characters to ignore
   from the input.  This should only contain whitespace characters, and by
   default contains all whitespace characters in ASCII.

   .. versionadded:: 3.4

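The decoding options mirror the encoding ones.  The following sketch, again
with an arbitrary payload, decodes Adobe-framed input, input with interleaved
whitespace, and the ``y`` shorthand:

```python
import base64

data = b"Ascii85 example payload"

# Adobe-framed input needs adobe=True so the <~ ~> frame is stripped.
framed = base64.a85encode(data, adobe=True)
from_framed = base64.a85decode(framed, adobe=True)

# Whitespace listed in ignorechars (the default set here) is skipped.
encoded = base64.a85encode(data)
noisy = encoded[:5] + b" \n\t" + encoded[5:]
from_noisy = base64.a85decode(noisy)

# With foldspaces=True, 'y' expands to four spaces.
from_y = base64.a85decode(b"y", foldspaces=True)

print(from_framed == data, from_noisy == data, from_y)
```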

.. function:: b85encode(s, pad=False)

   Encode a byte string using base85, as used in e.g. git-style binary
   diffs.

   If *pad* is true, the input is padded with ``b'\0'`` so its length is a
   multiple of 4 bytes before encoding.

   .. versionadded:: 3.4


.. function:: b85decode(b)

   Decode base85-encoded byte string.  Padding is implicitly removed, if
   necessary.

   .. versionadded:: 3.4

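The effect of *pad* can be sketched with a three-byte input, which is one
byte short of a full 4-byte group.  Note that when *pad* is true the added
null byte becomes part of the data and comes back on decoding:

```python
import base64

data = b"abc"  # 3 bytes, not a multiple of 4

unpadded = base64.b85encode(data)
padded = base64.b85encode(data, pad=True)

# Without pad, the original bytes come back unchanged.
plain_roundtrip = base64.b85decode(unpadded)
# With pad=True, the null byte added before encoding is preserved.
padded_roundtrip = base64.b85decode(padded)

print(len(unpadded), len(padded), plain_roundtrip, padded_roundtrip)
```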

.. note::
   Both Base85 and Ascii85 have an expansion factor of 5 to 4 (5 Base85 or
   Ascii85 characters can encode 4 binary bytes), while the better-known
   Base64 has an expansion factor of 4 to 3 (4 Base64 characters encode 3
   binary bytes).  They are therefore more efficient when space is expensive.
   They differ by details such as the character map used for encoding.

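The size difference can be checked directly.  This sketch measures the encoded
length of an arbitrary 12-byte payload (a length divisible by both 3 and 4, so
no padding is involved):

```python
import base64

# 12 bytes; no 4-byte group is all zeros, so Ascii85 uses no 'z' shorthand.
payload = bytes(range(1, 13))

sizes = {
    "b64": len(base64.b64encode(payload)),
    "a85": len(base64.a85encode(payload)),
    "b85": len(base64.b85encode(payload)),
}

print(sizes)
```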

The legacy interface:

.. function:: decode(input, output)

   Decode the contents of the binary *input* file and write the resulting binary
   data to the *output* file.  *input* and *output* must be :term:`file objects
   <file object>`.  *input* will be read until ``input.readline()`` returns an
   empty bytes object.


.. function:: decodebytes(s)
              decodestring(s)

   Decode the byte string *s*, which must contain one or more lines of base64
   encoded data, and return a byte string containing the resulting binary data.
   ``decodestring`` is a deprecated alias.

   .. versionadded:: 3.1


.. function:: encode(input, output)

   Encode the contents of the binary *input* file and write the resulting base64
   encoded data to the *output* file.  *input* and *output* must be :term:`file
   objects <file object>`.  *input* will be read until ``input.read()`` returns
   an empty bytes object.  :func:`encode` writes the encoded data followed by a
   trailing newline character (``b'\n'``).

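As a rough sketch of the file-based functions, the example below uses
in-memory :class:`io.BytesIO` objects as stand-ins for real binary files; any
binary file object should work the same way.

```python
import base64
import io

source = io.BytesIO(b"data to be encoded")
encoded_file = io.BytesIO()

# Read raw bytes from source and write Base64 text to encoded_file.
base64.encode(source, encoded_file)
encoded_text = encoded_file.getvalue()

# Rewind before reading the encoded text back.
encoded_file.seek(0)
decoded_file = io.BytesIO()
base64.decode(encoded_file, decoded_file)

print(encoded_text, decoded_file.getvalue())
```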

.. function:: encodebytes(s)
              encodestring(s)

   Encode the byte string *s*, which can contain arbitrary binary data, and
   return a byte string containing one or more lines of base64-encoded data.
   The result always ends with an extra trailing newline (``b'\n'``).
   ``encodestring`` is a deprecated alias.

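The line-oriented output of :func:`encodebytes` can be seen with an input long
enough to span more than one line.  The 100-byte payload is an arbitrary
choice for illustration.

```python
import base64

data = bytes(range(100))

encoded = base64.encodebytes(data)
lines = encoded.splitlines()

# decodebytes accepts the multi-line output directly.
restored = base64.decodebytes(encoded)

print(len(lines), [len(line) for line in lines], encoded.endswith(b"\n"))
```

Unlike :func:`b64encode`, the result contains newline characters, so it is
not suitable where a single-line token is expected.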

An example usage of the module:

   >>> import base64
   >>> encoded = base64.b64encode(b'data to be encoded')
   >>> encoded
   b'ZGF0YSB0byBiZSBlbmNvZGVk'
   >>> data = base64.b64decode(encoded)
   >>> data
   b'data to be encoded'


.. seealso::

   Module :mod:`binascii`
      Support module containing ASCII-to-binary and binary-to-ASCII conversions.

   :rfc:`1521` - MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies
      Section 5.2, "Base64 Content-Transfer-Encoding," provides the definition of the
      base64 encoding.
