:mod:`hashlib` --- Secure hashes and message digests
====================================================

.. module:: hashlib
   :synopsis: Secure hash and message digest algorithms.

.. moduleauthor:: Gregory P. Smith <greg@krypto.org>
.. sectionauthor:: Gregory P. Smith <greg@krypto.org>

**Source code:** :source:`Lib/hashlib.py`

.. index::
   single: message digest, MD5
   single: secure hash algorithm, SHA1, SHA224, SHA256, SHA384, SHA512

.. testsetup::

   import hashlib


--------------

This module implements a common interface to many different secure hash and
message digest algorithms. Included are the FIPS secure hash algorithms SHA1,
SHA224, SHA256, SHA384, and SHA512 (defined in FIPS 180-2) as well as RSA's MD5
algorithm (defined in Internet :rfc:`1321`). The terms "secure hash" and
"message digest" are interchangeable. Older algorithms were called message
digests. The modern term is secure hash.

.. note::

   If you want the adler32 or crc32 hash functions, they are available in
   the :mod:`zlib` module.

.. warning::

   Some algorithms have known hash collision weaknesses; refer to the "See
   also" section at the end.
Georg Brandl116aa622007-08-15 14:28:22 +000039
Christian Heimese92ef132013-10-13 00:52:43 +020040
R David Murraycde1a062013-12-20 16:33:52 -050041.. _hash-algorithms:
42
Christian Heimese92ef132013-10-13 00:52:43 +020043Hash algorithms
44---------------
45
Georg Brandl116aa622007-08-15 14:28:22 +000046There is one constructor method named for each type of :dfn:`hash`. All return
Gregory P. Smith8907dcd2016-06-11 17:56:12 -070047a hash object with the same simple interface. For example: use :func:`sha256` to
48create a SHA-256 hash object. You can now feed this object with :term:`bytes-like
Serhiy Storchakae5ea1ab2016-05-18 13:54:54 +030049objects <bytes-like object>` (normally :class:`bytes`) using the :meth:`update` method.
Ezio Melottic228e962013-05-04 18:06:34 +030050At any point you can ask it for the :dfn:`digest` of the
Georg Brandl67ced422007-09-06 14:09:10 +000051concatenation of the data fed to it so far using the :meth:`digest` or
52:meth:`hexdigest` methods.

.. note::

   For better multithreading performance, the Python :term:`GIL` is released for
   data larger than 2047 bytes at object creation or on update.

.. note::

   Feeding string objects into :meth:`update` is not supported, as hashes work
   on bytes, not on characters.
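Because :meth:`update` only accepts bytes-like objects, text has to be encoded
before it is hashed. The following is a minimal sketch of that step; the helper
name ``sha256_of_text`` and the default of UTF-8 are assumptions made for
illustration and are not part of this module.

```python
import hashlib


def sha256_of_text(text, encoding="utf-8"):
    """Return the SHA-256 hex digest of *text* after encoding it to bytes."""
    # Hash objects operate on bytes, so the string is encoded explicitly.
    return hashlib.sha256(text.encode(encoding)).hexdigest()


# Passing a str directly is rejected with TypeError.
try:
    hashlib.sha256().update("spam")
except TypeError:
    str_rejected = True
else:
    str_rejected = False
```

Note that the digest depends on the chosen encoding: the same text encoded as
UTF-8 and as Latin-1 generally produces different digests, so both sides of an
exchange need to agree on the encoding.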
Georg Brandl116aa622007-08-15 14:28:22 +000063
Thomas Wouters1b7f8912007-09-19 03:06:30 +000064.. index:: single: OpenSSL; (use in module hashlib)
Georg Brandl116aa622007-08-15 14:28:22 +000065
66Constructors for hash algorithms that are always present in this module are
Gregory P. Smith8907dcd2016-06-11 17:56:12 -070067:func:`sha1`, :func:`sha224`, :func:`sha256`, :func:`sha384`,
Christian Heimes121b9482016-09-06 22:03:25 +020068:func:`sha512`, :func:`blake2b`, and :func:`blake2s`.
69:func:`md5` is normally available as well, though it
Gregory P. Smith8907dcd2016-06-11 17:56:12 -070070may be missing if you are using a rare "FIPS compliant" build of Python.
71Additional algorithms may also be available depending upon the OpenSSL
Christian Heimes6fe2a752016-09-07 11:58:24 +020072library that Python uses on your platform. On most platforms the
73:func:`sha3_224`, :func:`sha3_256`, :func:`sha3_384`, :func:`sha3_512`,
74:func:`shake_128`, :func:`shake_256` are also available.
75
76.. versionadded:: 3.6
77 SHA3 (Keccak) and SHAKE constructors :func:`sha3_224`, :func:`sha3_256`,
78 :func:`sha3_384`, :func:`sha3_512`, :func:`shake_128`, :func:`shake_256`.
Christian Heimes4a0270d2012-10-06 02:23:36 +020079
Christian Heimes121b9482016-09-06 22:03:25 +020080.. versionadded:: 3.6
81 :func:`blake2b` and :func:`blake2s` were added.

For example, to obtain the digest of the byte string ``b'Nobody inspects the
spammish repetition'``::

   >>> import hashlib
   >>> m = hashlib.sha256()
   >>> m.update(b"Nobody inspects")
   >>> m.update(b" the spammish repetition")
   >>> m.digest()
   b'\x03\x1e\xdd}Ae\x15\x93\xc5\xfe\\\x00o\xa5u+7\xfd\xdf\xf7\xbcN\x84:\xa6\xaf\x0c\x95\x0fK\x94\x06'
   >>> m.digest_size
   32
   >>> m.block_size
   64

More condensed:

   >>> hashlib.sha224(b"Nobody inspects the spammish repetition").hexdigest()
   'a4337bc45a8fc544c03f52dc550cd6e1e87021bc896588bd79e901e2'
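The incremental interface is what makes it practical to hash data that does
not fit in memory. The sketch below hashes a file in fixed-size chunks; the
function name ``file_sha256`` and the 64 KiB default chunk size are
illustrative assumptions, not recommendations from this module.

```python
import hashlib


def file_sha256(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at *path*."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # read() returns b"" at end of file, which stops the iteration.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Because repeated :meth:`update` calls are equivalent to one call on the
concatenated data, the result does not depend on the chunk size.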

.. function:: new(name[, data])

   A generic constructor that takes the string name of the desired
   algorithm as its first parameter. It exists to allow access to the
   above listed hashes as well as any other algorithms that your OpenSSL
   library may offer. The named constructors are much faster than :func:`new`
   and should be preferred.

Using :func:`new` with an algorithm provided by OpenSSL:

   >>> h = hashlib.new('ripemd160')
   >>> h.update(b"Nobody inspects the spammish repetition")
   >>> h.hexdigest()
   'cc4a5ce1b3df48aec5d22d1f16b894a0b894eccc'

Hashlib provides the following constant attributes:

.. data:: algorithms_guaranteed

   A set containing the names of the hash algorithms guaranteed to be supported
   by this module on all platforms. Note that 'md5' is in this set despite
   some upstream vendors offering an odd "FIPS compliant" Python build that
   excludes it.

   .. versionadded:: 3.2

.. data:: algorithms_available

   A set containing the names of the hash algorithms that are available in the
   running Python interpreter. These names will be recognized when passed to
   :func:`new`. :attr:`algorithms_guaranteed` will always be a subset. The
   same algorithm may appear multiple times in this set under different names
   (thanks to OpenSSL).

   .. versionadded:: 3.2
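When the algorithm name comes from configuration or user input, it can be
checked against :data:`algorithms_available` before calling :func:`new`. This
is only a sketch: the wrapper name ``make_hash`` and its error message are
hypothetical, and a membership check is one possible policy among several.

```python
import hashlib


def make_hash(name, data=b""):
    """Create a hash object for *name*, raising ValueError if unsupported."""
    if name not in hashlib.algorithms_available:
        raise ValueError("unsupported hash algorithm: %r" % name)
    return hashlib.new(name, data)


# Names in algorithms_guaranteed can be requested on any platform.
h = make_hash("sha256", b"Nobody inspects the spammish repetition")
print(h.name, h.digest_size)  # prints: sha256 32
```

For algorithms known in advance, the named constructors remain the better
choice, as noted in the description of :func:`new`.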

The following values are provided as constant attributes of the hash objects
returned by the constructors:


.. data:: hash.digest_size

   The size of the resulting hash in bytes.

.. data:: hash.block_size

   The internal block size of the hash algorithm in bytes.

A hash object has the following attributes:

.. attribute:: hash.name

   The canonical name of this hash, always lowercase and always suitable as a
   parameter to :func:`new` to create another hash of this type.

   .. versionchanged:: 3.4
      The name attribute has been present in CPython since its inception, but
      until Python 3.4 was not formally specified, so may not exist on some
      platforms.
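As a brief illustration of these attributes, the following sketch inspects a
SHA-512 hash object and uses its ``name`` to build a second object of the same
type. The variable names are arbitrary.

```python
import hashlib

h = hashlib.sha512(b"Nobody inspects the spammish repetition")

# digest() returns exactly digest_size bytes (64 for SHA-512).
assert len(h.digest()) == h.digest_size == 64

# name can be passed back to new() to create a fresh object of the same type.
fresh = hashlib.new(h.name)
assert fresh.digest_size == h.digest_size
assert fresh.block_size == h.block_size == 128
```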
161
Georg Brandl116aa622007-08-15 14:28:22 +0000162A hash object has the following methods:
163
164
165.. method:: hash.update(arg)
166
Georg Brandl67ced422007-09-06 14:09:10 +0000167 Update the hash object with the object *arg*, which must be interpretable as
168 a buffer of bytes. Repeated calls are equivalent to a single call with the
169 concatenation of all the arguments: ``m.update(a); m.update(b)`` is
170 equivalent to ``m.update(a+b)``.
Georg Brandl116aa622007-08-15 14:28:22 +0000171
Georg Brandl705d9d52009-05-05 09:29:50 +0000172 .. versionchanged:: 3.1
Georg Brandl67b21b72010-08-17 15:07:14 +0000173 The Python GIL is released to allow other threads to run while hash
Jesus Cea5b22dd82013-10-04 04:20:37 +0200174 updates on data larger than 2047 bytes is taking place when using hash
Georg Brandl67b21b72010-08-17 15:07:14 +0000175 algorithms supplied by OpenSSL.


.. method:: hash.digest()

   Return the digest of the data passed to the :meth:`update` method so far.
   This is a bytes object of size :attr:`digest_size` which may contain bytes in
   the whole range from 0 to 255.


.. method:: hash.hexdigest()

   Like :meth:`digest` except the digest is returned as a string object of
   double length, containing only hexadecimal digits. This may be used to
   exchange the value safely in email or other non-binary environments.


.. method:: hash.copy()

   Return a copy ("clone") of the hash object. This can be used to efficiently
   compute the digests of data sharing a common initial substring.
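The methods above can be combined as in the following sketch, which hashes a
shared prefix once and then derives two digests from copies of that state. The
second message is made up for the example.

```python
import hashlib

prefix = hashlib.sha256(b"Nobody inspects")

# Each copy continues from the shared state without rehashing the prefix.
a = prefix.copy()
a.update(b" the spammish repetition")
b = prefix.copy()
b.update(b" the Spanish Inquisition")

# The original object is unaffected by updates made to its copies.
prefix_only = prefix.hexdigest()

# hexdigest() is the hexadecimal form of digest(), hence twice as long.
assert a.hexdigest() == a.digest().hex()
assert len(a.hexdigest()) == 2 * a.digest_size
```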


SHAKE variable length digests
-----------------------------

The :func:`shake_128` and :func:`shake_256` algorithms provide variable
length digests with ``length_in_bits//2`` bits of security, up to a maximum of
128 or 256 bits respectively. As such, their digest methods require a length.
Maximum length is not limited by the SHAKE algorithm.

.. method:: shake.digest(length)

   Return the digest of the data passed to the :meth:`update` method so far.
   This is a bytes object of size ``length`` which may contain bytes in
   the whole range from 0 to 255.


.. method:: shake.hexdigest(length)

   Like :meth:`digest` except the digest is returned as a string object of
   double length, containing only hexadecimal digits. This may be used to
   exchange the value safely in email or other non-binary environments.
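A short sketch of the SHAKE interface follows. The lengths 16 and 32 are
arbitrary values chosen for the example, and the *length* argument is given in
bytes.

```python
import hashlib

h = hashlib.shake_128(b"Nobody inspects the spammish repetition")

# The caller chooses the output length in bytes.
short = h.digest(16)
longer = h.digest(32)

# SHAKE is an extendable-output function: a shorter digest is a prefix
# of a longer one computed over the same input.
assert longer[:16] == short

# hexdigest(length) returns 2 * length hexadecimal characters.
assert len(h.hexdigest(16)) == 32
```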


Key derivation
--------------

Key derivation and key stretching algorithms are designed for secure password
hashing. Naive algorithms such as ``sha1(password)`` are not resistant against
brute-force attacks. A good password hashing function must be tunable, slow, and
include a `salt <https://en.wikipedia.org/wiki/Salt_%28cryptography%29>`_.


.. function:: pbkdf2_hmac(hash_name, password, salt, iterations, dklen=None)

   The function provides PKCS#5 password-based key derivation function 2. It
   uses HMAC as the pseudorandom function.

   The string *hash_name* is the desired name of the hash digest algorithm for
   HMAC, e.g. 'sha1' or 'sha256'. *password* and *salt* are interpreted as
   buffers of bytes. Applications and libraries should limit *password* to
   a sensible length (e.g. 1024). *salt* should be about 16 or more bytes from
   a proper source, e.g. :func:`os.urandom`.

   The number of *iterations* should be chosen based on the hash algorithm and
   computing power. As of 2013, at least 100,000 iterations of SHA-256 are
   suggested.

   *dklen* is the length of the derived key. If *dklen* is ``None`` then the
   digest size of the hash algorithm *hash_name* is used, e.g. 64 for SHA-512.

   >>> import hashlib, binascii
   >>> dk = hashlib.pbkdf2_hmac('sha256', b'password', b'salt', 100000)
   >>> binascii.hexlify(dk)
   b'0394a2ede332c9a13eb82e9b24631604c31df978b4e2f0fbd2c549944f9d79a5'

   .. versionadded:: 3.4

   .. note::

      A fast implementation of *pbkdf2_hmac* is available with OpenSSL. The
      Python implementation uses an inline version of :mod:`hmac`. It is about
      three times slower and doesn't release the GIL.
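In practice the salt is generated randomly and stored next to the derived key
so that a password can be checked later. The following sketch shows one way to
do that. The helper names ``hash_password`` and ``verify_password``, the
storage layout, and the use of :func:`hmac.compare_digest` for the comparison
are assumptions made for this example, not part of :func:`pbkdf2_hmac` itself.

```python
import hashlib
import hmac
import os

ITERATIONS = 100000  # the 2013 suggestion quoted above; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, derived_key) for the bytes *password*."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each password
    dk = hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS)
    return salt, dk


def verify_password(password, salt, expected_dk):
    """Check *password* against a stored salt and derived key."""
    _, dk = hash_password(password, salt)
    # compare_digest takes time independent of where the values differ.
    return hmac.compare_digest(dk, expected_dk)
```

A caller would store both values returned by ``hash_password`` and pass them
back to ``verify_password`` when the user logs in.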

.. function:: scrypt(password, *, salt, n, r, p, maxmem=0, dklen=64)

   The function provides the scrypt password-based key derivation function as
   defined in :rfc:`7914`.

   *password* and *salt* must be bytes-like objects. Applications and
   libraries should limit *password* to a sensible length (e.g. 1024). *salt*
   should be about 16 or more bytes from a proper source, e.g. :func:`os.urandom`.

   *n* is the CPU/Memory cost factor, *r* the block size, *p* the parallelization
   factor and *maxmem* limits memory (OpenSSL 1.1.0 defaults to 32 MB).
   *dklen* is the length of the derived key.

   Availability: OpenSSL 1.1+

   .. versionadded:: 3.6
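A minimal sketch of calling :func:`scrypt` is shown below. The wrapper name
``scrypt_key`` and the parameter values ``n=2**14``, ``r=8``, ``p=1`` are
illustrative assumptions rather than recommendations. They need roughly
``128 * n * r`` bytes (about 16 MiB), which stays under the 32 MB OpenSSL
default mentioned above. The availability check reflects the OpenSSL 1.1+
requirement.

```python
import hashlib
import os


def scrypt_key(password, salt=None, n=2**14, r=8, p=1, dklen=64):
    """Return (salt, derived_key) using scrypt; parameter values are illustrative."""
    if not hasattr(hashlib, "scrypt"):
        raise RuntimeError("hashlib.scrypt requires Python built with OpenSSL 1.1+")
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each password
    # salt, n, r and p are keyword-only arguments.
    dk = hashlib.scrypt(password, salt=salt, n=n, r=r, p=p, dklen=dklen)
    return salt, dk
```

Larger values of *n* or *r* may require passing an explicit *maxmem* to
:func:`scrypt`.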


BLAKE2
------

BLAKE2 takes additional arguments; see :ref:`hashlib-blake2`.


.. seealso::

   Module :mod:`hmac`
      A module to generate message authentication codes using hashes.

   Module :mod:`base64`
      Another way to encode binary hashes for non-binary environments.

   See :ref:`hashlib-blake2`.

   http://csrc.nist.gov/publications/fips/fips180-2/fips180-2.pdf
      The FIPS 180-2 publication on Secure Hash Algorithms.

   https://en.wikipedia.org/wiki/Cryptographic_hash_function#Cryptographic_hash_algorithms
      Wikipedia article with information on which algorithms have known issues and
      what that means regarding their use.

   https://www.ietf.org/rfc/rfc2898.txt
      PKCS #5: Password-Based Cryptography Specification Version 2.0