:mod:`hashlib` --- Secure hashes and message digests
====================================================

.. module:: hashlib
   :synopsis: Secure hash and message digest algorithms.
.. moduleauthor:: Gregory P. Smith <greg@krypto.org>
.. sectionauthor:: Gregory P. Smith <greg@krypto.org>


.. index::
   single: message digest, MD5
   single: secure hash algorithm, SHA1, SHA224, SHA256, SHA384, SHA512

**Source code:** :source:`Lib/hashlib.py`

--------------

This module implements a common interface to many different secure hash and
message digest algorithms. Included are the FIPS secure hash algorithms SHA1,
SHA224, SHA256, SHA384, and SHA512 (defined in FIPS 180-2) as well as RSA's MD5
algorithm (defined in Internet :rfc:`1321`). The terms "secure hash" and
"message digest" are interchangeable. Older algorithms were called message
digests. The modern term is secure hash.

.. note::

   If you want the adler32 or crc32 hash functions, they are available in
   the :mod:`zlib` module.

.. warning::

   Some algorithms have known hash collision weaknesses; refer to the "See
   also" section at the end.


.. _hash-algorithms:

Hash algorithms
---------------

There is one constructor method named for each type of :dfn:`hash`. All return
a hash object with the same simple interface. For example, use :func:`sha1` to
create a SHA1 hash object. You can then feed this object with :term:`bytes-like
object`\ s (normally :class:`bytes`) using the :meth:`update` method.
At any point you can ask it for the :dfn:`digest` of the
concatenation of the data fed to it so far using the :meth:`digest` or
:meth:`hexdigest` methods.

.. note::

   For better multithreading performance, the Python :term:`GIL` is released for
   data larger than 2047 bytes at object creation or on update.

.. note::

   Feeding string objects into :meth:`update` is not supported, as hashes work
   on bytes, not on characters.

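Because hashes work on bytes, text has to be encoded before it is hashed. The following is a minimal sketch of that step; the choice of UTF-8 and the variable names are assumptions made for illustration only.

```python
import hashlib

text = "Nobody inspects the spammish repetition"

# Encode the text explicitly; UTF-8 is an assumption made for this sketch.
data = text.encode("utf-8")
digest = hashlib.sha256(data).hexdigest()

# Passing the str object itself is rejected with TypeError.
try:
    hashlib.sha256(text)
except TypeError as exc:
    str_error = exc
else:
    str_error = None
```

The same text encoded with a different codec generally produces a different digest, so both sides of an exchange need to agree on the encoding.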

.. index:: single: OpenSSL; (use in module hashlib)

Constructors for hash algorithms that are always present in this module are
:func:`md5`, :func:`sha1`, :func:`sha224`, :func:`sha256`, :func:`sha384`,
and :func:`sha512`. Additional algorithms may also be available depending upon
the OpenSSL library that Python uses on your platform.

For example, to obtain the digest of the byte string ``b'Nobody inspects the
spammish repetition'``::

   >>> import hashlib
   >>> m = hashlib.md5()
   >>> m.update(b"Nobody inspects")
   >>> m.update(b" the spammish repetition")
   >>> m.digest()
   b'\xbbd\x9c\x83\xdd\x1e\xa5\xc9\xd9\xde\xc9\xa1\x8d\xf0\xff\xe9'
   >>> m.digest_size
   16
   >>> m.block_size
   64

More condensed:

   >>> hashlib.sha224(b"Nobody inspects the spammish repetition").hexdigest()
   'a4337bc45a8fc544c03f52dc550cd6e1e87021bc896588bd79e901e2'

.. function:: new(name[, data])

   A generic constructor that takes the string name of the desired
   algorithm as its first parameter. It exists to allow access to the
   hashes listed above as well as any other algorithms that your OpenSSL
   library may offer. The named constructors are much faster than :func:`new`
   and should be preferred.

Using :func:`new` with an algorithm provided by OpenSSL:

   >>> h = hashlib.new('ripemd160')
   >>> h.update(b"Nobody inspects the spammish repetition")
   >>> h.hexdigest()
   'cc4a5ce1b3df48aec5d22d1f16b894a0b894eccc'

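As a rough illustration of how :func:`new` relates to the named constructors, the sketch below hashes the same bytes both ways and also shows what happens with a name that no build is expected to support. The name ``"no-such-algorithm"`` is made up for this example.

```python
import hashlib

data = b"Nobody inspects the spammish repetition"

# The generic constructor and the named constructor give the same digest.
via_new = hashlib.new("sha256", data).hexdigest()
via_named = hashlib.sha256(data).hexdigest()

# An unsupported algorithm name is rejected with ValueError.
try:
    hashlib.new("no-such-algorithm")
except ValueError:
    unknown_rejected = True
else:
    unknown_rejected = False
```

Algorithms that come only from OpenSSL, such as the ``ripemd160`` example above, may be missing on some builds, so code that depends on them should be prepared for the same :exc:`ValueError`.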

Hashlib provides the following constant attributes:

.. data:: algorithms_guaranteed

   A set containing the names of the hash algorithms guaranteed to be supported
   by this module on all platforms.

   .. versionadded:: 3.2

.. data:: algorithms_available

   A set containing the names of the hash algorithms that are available in the
   running Python interpreter. These names will be recognized when passed to
   :func:`new`. :attr:`algorithms_guaranteed` will always be a subset. The
   same algorithm may appear multiple times in this set under different names
   (thanks to OpenSSL).

   .. versionadded:: 3.2

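One possible use of these sets is to choose an algorithm at run time and fall back to a guaranteed one. The helper ``new_with_fallback`` below is hypothetical and is not part of :mod:`hashlib`; it is only a sketch of the idea.

```python
import hashlib

# Every guaranteed name is also an available name.
guaranteed_is_subset = (
    hashlib.algorithms_guaranteed <= hashlib.algorithms_available
)


def new_with_fallback(preferred, fallback="sha256"):
    """Hypothetical helper: try *preferred*, else use *fallback*.

    The fallback should be a name from algorithms_guaranteed.
    """
    if preferred in hashlib.algorithms_available:
        try:
            return hashlib.new(preferred)
        except ValueError:
            # Defensive: treat a constructor failure as "not usable".
            pass
    return hashlib.new(fallback)


h = new_with_fallback("no-such-algorithm")
```

The ``try``/``except`` is a defensive assumption of this sketch: it keeps the helper usable even if a listed name cannot actually be constructed on a given platform.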

The following values are provided as constant attributes of the hash objects
returned by the constructors:


.. data:: hash.digest_size

   The size of the resulting hash in bytes.

.. data:: hash.block_size

   The internal block size of the hash algorithm in bytes.

A hash object has the following attributes:

.. attribute:: hash.name

   The canonical name of this hash, always lowercase and always suitable as a
   parameter to :func:`new` to create another hash of this type.

   .. versionchanged:: 3.4
      The name attribute has been present in CPython since its inception, but
      until Python 3.4 was not formally specified, so may not exist on some
      platforms.

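A small sketch of how these attributes fit together, using SHA-256 purely as an example: ``name`` is passed back to :func:`new` to build a second object of the same type, and ``digest_size`` matches the length of the digest.

```python
import hashlib

original = hashlib.sha256(b"abc")

# name can be handed to new() to create a fresh hash of the same type.
fresh = hashlib.new(original.name)
fresh.update(b"abc")

same_type = fresh.name == original.name
same_digest = fresh.digest() == original.digest()
size_matches = len(original.digest()) == original.digest_size
```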

A hash object has the following methods:


.. method:: hash.update(arg)

   Update the hash object with the object *arg*, which must be interpretable as
   a buffer of bytes. Repeated calls are equivalent to a single call with the
   concatenation of all the arguments: ``m.update(a); m.update(b)`` is
   equivalent to ``m.update(a+b)``.

   .. versionchanged:: 3.1
      The Python GIL is released to allow other threads to run while hash
      updates on data larger than 2047 bytes are taking place when using hash
      algorithms supplied by OpenSSL.

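Because repeated :meth:`update` calls are equivalent to hashing the concatenation, large inputs can be hashed in pieces without holding them in memory. The sketch below assumes a binary file-like object; ``hash_stream``, ``algorithm`` and ``chunk_size`` are illustrative names, not part of :mod:`hashlib`.

```python
import hashlib
import io


def hash_stream(stream, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a binary stream, read in chunks.

    Hypothetical helper for illustration; not part of hashlib.
    """
    h = hashlib.new(algorithm)
    # read() returns b"" at end of stream, which stops the loop.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


payload = b"Nobody inspects the spammish repetition"

# A tiny chunk size forces several update() calls.
streamed = hash_stream(io.BytesIO(payload), chunk_size=8)
one_shot = hashlib.sha256(payload).hexdigest()
```

For a real file the stream would typically come from ``open(path, "rb")``; the chunk size only affects performance, not the resulting digest.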

.. method:: hash.digest()

   Return the digest of the data passed to the :meth:`update` method so far.
   This is a bytes object of size :attr:`digest_size` which may contain bytes in
   the whole range from 0 to 255.


.. method:: hash.hexdigest()

   Like :meth:`digest` except the digest is returned as a string object of
   double length, containing only hexadecimal digits. This may be used to
   exchange the value safely in email or other non-binary environments.

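The relationship between the two methods can be checked directly. This is only a sketch using SHA-256 as an arbitrary example; :func:`binascii.hexlify` is used to convert the raw digest by hand.

```python
import binascii
import hashlib

h = hashlib.sha256(b"Nobody inspects the spammish repetition")

raw = h.digest()         # bytes, length == h.digest_size
text = h.hexdigest()     # str, two hexadecimal digits per byte

# Converting the raw digest by hand gives the same string.
converted = binascii.hexlify(raw).decode("ascii")
```

Calling :meth:`digest` or :meth:`hexdigest` does not reset the object, so further :meth:`update` calls continue from the same state.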

.. method:: hash.copy()

   Return a copy ("clone") of the hash object. This can be used to efficiently
   compute the digests of data sharing a common initial substring.

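A minimal sketch of the common-prefix use case: the shared prefix is hashed once, and each copy continues independently. The byte strings are arbitrary example data.

```python
import hashlib

# Hash the shared prefix once.
prefix = hashlib.sha256(b"common header: ")

# Each copy continues from the shared state without affecting the others.
first = prefix.copy()
first.update(b"first body")

second = prefix.copy()
second.update(b"second body")
```

The original ``prefix`` object is left unchanged by updates made to its copies.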

Key derivation
--------------

Key derivation and key stretching algorithms are designed for secure password
hashing. Naive algorithms such as ``sha1(password)`` are not resistant against
brute-force attacks. A good password hashing function must be tunable and slow,
and must include a `salt <https://en.wikipedia.org/wiki/Salt_%28cryptography%29>`_.


.. function:: pbkdf2_hmac(hash_name, password, salt, iterations, dklen=None)

   The function provides PKCS#5 password-based key derivation function 2. It
   uses HMAC as the pseudorandom function.

   The string *hash_name* is the desired name of the hash digest algorithm for
   HMAC, e.g. 'sha1' or 'sha256'. *password* and *salt* are interpreted as
   buffers of bytes. Applications and libraries should limit *password* to
   a sensible length (e.g. 1024). *salt* should be about 16 or more bytes from
   a proper source, e.g. :func:`os.urandom`.

   The number of *iterations* should be chosen based on the hash algorithm and
   computing power. As of 2013, at least 100,000 iterations of SHA-256 are
   suggested.

   *dklen* is the length of the derived key in bytes. If *dklen* is ``None``
   then the digest size of the hash algorithm *hash_name* is used, e.g. 64 for
   SHA-512.

   >>> import hashlib, binascii
   >>> dk = hashlib.pbkdf2_hmac('sha256', b'password', b'salt', 100000)
   >>> binascii.hexlify(dk)
   b'0394a2ede332c9a13eb82e9b24631604c31df978b4e2f0fbd2c549944f9d79a5'

   .. versionadded:: 3.4

   .. note::

      A fast implementation of *pbkdf2_hmac* is available with OpenSSL. The
      Python implementation uses an inline version of :mod:`hmac`. It is about
      three times slower and doesn't release the GIL.

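The doctest above uses a fixed salt so that its output is reproducible. In practice each password would get its own random salt, stored next to the derived key. The following is a sketch under that assumption; ``hash_password``, ``verify_password`` and ``ITERATIONS`` are hypothetical names, and the iteration count is simply the figure quoted above.

```python
import hashlib
import hmac
import os

# Figure taken from the text above; tune it for your own hardware.
ITERATIONS = 100000


def hash_password(password, salt=None):
    """Hypothetical helper: return (salt, derived_key) for *password*."""
    if salt is None:
        salt = os.urandom(16)  # 16 random bytes, as suggested above
    dk = hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS)
    return salt, dk


def verify_password(password, salt, expected):
    """Hypothetical helper: re-derive the key and compare it."""
    _, dk = hash_password(password, salt)
    # compare_digest avoids leaking where the first mismatch occurs.
    return hmac.compare_digest(dk, expected)


salt, stored = hash_password(b"correct horse")
ok = verify_password(b"correct horse", salt, stored)
bad = verify_password(b"wrong guess", salt, stored)
```

Storing the salt and iteration count alongside the derived key allows the same derivation to be repeated at verification time.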

.. seealso::

   Module :mod:`hmac`
      A module to generate message authentication codes using hashes.

   Module :mod:`base64`
      Another way to encode binary hashes for non-binary environments.

   http://csrc.nist.gov/publications/fips/fips180-2/fips180-2.pdf
      The FIPS 180-2 publication on Secure Hash Algorithms.

   https://en.wikipedia.org/wiki/Cryptographic_hash_function#Cryptographic_hash_algorithms
      Wikipedia article with information on which algorithms have known issues and
      what that means regarding their use.

   http://www.ietf.org/rfc/rfc2898.txt
      PKCS #5: Password-Based Cryptography Specification Version 2.0