:mod:`hashlib` --- Secure hashes and message digests
====================================================

.. module:: hashlib
   :synopsis: Secure hash and message digest algorithms.

.. moduleauthor:: Gregory P. Smith <greg@krypto.org>
.. sectionauthor:: Gregory P. Smith <greg@krypto.org>

**Source code:** :source:`Lib/hashlib.py`

.. index::
   single: message digest, MD5
   single: secure hash algorithm, SHA1, SHA224, SHA256, SHA384, SHA512

.. testsetup::

   import hashlib


--------------

This module implements a common interface to many different secure hash and
message digest algorithms. Included are the FIPS secure hash algorithms SHA1,
SHA224, SHA256, SHA384, and SHA512 (defined in FIPS 180-2) as well as RSA's MD5
algorithm (defined in Internet :rfc:`1321`). The terms "secure hash" and
"message digest" are interchangeable. Older algorithms were called message
digests. The modern term is secure hash.

.. note::

   If you want the adler32 or crc32 hash functions, they are available in
   the :mod:`zlib` module.

.. warning::

   Some algorithms have known hash collision weaknesses; refer to the "See
   also" section at the end.


.. _hash-algorithms:

Hash algorithms
---------------

There is one constructor method named for each type of :dfn:`hash`. All return
a hash object with the same simple interface. For example, use :func:`sha256` to
create a SHA-256 hash object. You can then feed this object with :term:`bytes-like
objects <bytes-like object>` (normally :class:`bytes`) using the :meth:`update` method.
At any point you can ask it for the :dfn:`digest` of the
concatenation of the data fed to it so far using the :meth:`digest` or
:meth:`hexdigest` methods.

.. note::

   For better multithreading performance, the Python :term:`GIL` is released for
   data larger than 2047 bytes at object creation or on update.

.. note::

   Feeding string objects into :meth:`update` is not supported, as hashes work
   on bytes, not on characters.
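
A minimal sketch of what this means in practice: text has to be encoded to
bytes before it is hashed. The choice of UTF-8 below is an assumption made for
illustration; the right encoding is whatever your data actually uses.

.. code-block:: python

   import hashlib

   text = "Nobody inspects the spammish repetition"

   # update() rejects str, so encode the text to bytes first.
   # UTF-8 is an assumption here, not something hashlib chooses for you.
   h = hashlib.sha256()
   h.update(text.encode("utf-8"))
   digest_from_text = h.hexdigest()

   # Hashing the equivalent bytes literal gives the same digest.
   digest_from_bytes = hashlib.sha256(
       b"Nobody inspects the spammish repetition").hexdigest()

   # Passing the str itself raises TypeError.
   try:
       hashlib.sha256().update(text)
   except TypeError:
       str_rejected = True
   else:
       str_rejected = False

Two programs that encode the same text differently will obtain different
digests, so the encoding is effectively part of the data being hashed.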

.. index:: single: OpenSSL; (use in module hashlib)

Constructors for hash algorithms that are always present in this module are
:func:`sha1`, :func:`sha224`, :func:`sha256`, :func:`sha384`,
:func:`sha512`, :func:`blake2b`, and :func:`blake2s`.
:func:`md5` is normally available as well, though it
may be missing if you are using a rare "FIPS compliant" build of Python.
Additional algorithms may also be available depending upon the OpenSSL
library that Python uses on your platform. On most platforms the
:func:`sha3_224`, :func:`sha3_256`, :func:`sha3_384`, :func:`sha3_512`,
:func:`shake_128`, and :func:`shake_256` constructors are also available.

.. versionadded:: 3.6
   SHA3 (Keccak) and SHAKE constructors :func:`sha3_224`, :func:`sha3_256`,
   :func:`sha3_384`, :func:`sha3_512`, :func:`shake_128`, :func:`shake_256`.

.. versionadded:: 3.6
   :func:`blake2b` and :func:`blake2s` were added.

For example, to obtain the digest of the byte string ``b'Nobody inspects the
spammish repetition'``::

   >>> import hashlib
   >>> m = hashlib.sha256()
   >>> m.update(b"Nobody inspects")
   >>> m.update(b" the spammish repetition")
   >>> m.digest()
   b'\x03\x1e\xdd}Ae\x15\x93\xc5\xfe\\\x00o\xa5u+7\xfd\xdf\xf7\xbcN\x84:\xa6\xaf\x0c\x95\x0fK\x94\x06'
   >>> m.digest_size
   32
   >>> m.block_size
   64

More condensed:

   >>> hashlib.sha224(b"Nobody inspects the spammish repetition").hexdigest()
   'a4337bc45a8fc544c03f52dc550cd6e1e87021bc896588bd79e901e2'

.. function:: new(name[, data])

   A generic constructor that takes the string name of the desired
   algorithm as its first parameter. It also exists to allow access to the
   above listed hashes as well as any other algorithms that your OpenSSL
   library may offer. The named constructors are much faster than :func:`new`
   and should be preferred.

Using :func:`new` with an algorithm provided by OpenSSL:

   >>> h = hashlib.new('ripemd160')
   >>> h.update(b"Nobody inspects the spammish repetition")
   >>> h.hexdigest()
   'cc4a5ce1b3df48aec5d22d1f16b894a0b894eccc'

Hashlib provides the following constant attributes:

.. data:: algorithms_guaranteed

   A set containing the names of the hash algorithms guaranteed to be supported
   by this module on all platforms. Note that 'md5' is in this set despite
   some upstream vendors offering an odd "FIPS compliant" Python build that
   excludes it.

   .. versionadded:: 3.2

.. data:: algorithms_available

   A set containing the names of the hash algorithms that are available in the
   running Python interpreter. These names will be recognized when passed to
   :func:`new`. :attr:`algorithms_guaranteed` will always be a subset. The
   same algorithm may appear multiple times in this set under different names
   (thanks to OpenSSL).

   .. versionadded:: 3.2
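
As a rough sketch, the two sets can be used to pick an algorithm at run time
and fall back to a guaranteed one. The helper name ``make_hash`` and the choice
of ``sha256`` as the fallback are assumptions made for this example, and the
``try``/``except`` is purely defensive.

.. code-block:: python

   import hashlib

   def make_hash(preferred, fallback="sha256"):
       """Return a hash object for *preferred*, or for *fallback* if the
       preferred algorithm cannot be constructed (hypothetical helper)."""
       if preferred in hashlib.algorithms_available:
           try:
               return hashlib.new(preferred)
           except ValueError:
               # Defensive only: fall through to the fallback algorithm.
               pass
       return hashlib.new(fallback)

   h = make_hash("sha512")
   h.update(b"Nobody inspects the spammish repetition")

   # An unknown name silently falls back to SHA-256 in this sketch.
   fallback_hash = make_hash("no-such-algorithm")

Whether a silent fallback is acceptable depends on the application; code that
must interoperate with a fixed format should probably raise instead.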

The following values are provided as constant attributes of the hash objects
returned by the constructors:


.. data:: hash.digest_size

   The size of the resulting hash in bytes.

.. data:: hash.block_size

   The internal block size of the hash algorithm in bytes.

A hash object has the following attributes:

.. attribute:: hash.name

   The canonical name of this hash, always lowercase and always suitable as a
   parameter to :func:`new` to create another hash of this type.

   .. versionchanged:: 3.4
      The name attribute has been present in CPython since its inception, but
      until Python 3.4 was not formally specified, so may not exist on some
      platforms.
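
The following short sketch reads these attributes from a SHA-256 object and
uses :attr:`name` to build a second object of the same type. The variable names
are illustrative only.

.. code-block:: python

   import hashlib

   h = hashlib.sha256()

   # SHA-256 produces 32-byte digests and works on 64-byte blocks.
   info = (h.name, h.digest_size, h.block_size)

   # The name can be passed back to new() to get a fresh, empty hash object
   # of the same type.
   h2 = hashlib.new(h.name)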

A hash object has the following methods:


.. method:: hash.update(arg)

   Update the hash object with the object *arg*, which must be interpretable as
   a buffer of bytes. Repeated calls are equivalent to a single call with the
   concatenation of all the arguments: ``m.update(a); m.update(b)`` is
   equivalent to ``m.update(a+b)``.

   .. versionchanged:: 3.1
      The Python GIL is released to allow other threads to run while hash
      updates on data larger than 2047 bytes are taking place when using hash
      algorithms supplied by OpenSSL.
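
Because repeated :meth:`update` calls are equivalent to hashing the
concatenation, large inputs can be hashed in chunks without loading them into
memory at once. The sketch below assumes a hypothetical helper ``hash_stream``
and an arbitrary chunk size; an in-memory :class:`io.BytesIO` stands in for a
file opened in binary mode.

.. code-block:: python

   import hashlib
   import io

   def hash_stream(stream, chunk_size=65536):
       """Hash a binary stream chunk by chunk (hypothetical helper)."""
       h = hashlib.sha256()
       while True:
           chunk = stream.read(chunk_size)
           if not chunk:          # empty bytes means end of stream
               break
           h.update(chunk)
       return h.hexdigest()

   data = b"Nobody inspects the spammish repetition" * 1000

   # Feeding the data in 4096-byte pieces...
   chunked = hash_stream(io.BytesIO(data), chunk_size=4096)
   # ...gives the same result as hashing it in one call.
   one_shot = hashlib.sha256(data).hexdigest()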


.. method:: hash.digest()

   Return the digest of the data passed to the :meth:`update` method so far.
   This is a bytes object of size :attr:`digest_size` which may contain bytes in
   the whole range from 0 to 255.


.. method:: hash.hexdigest()

   Like :meth:`digest` except the digest is returned as a string object of
   double length, containing only hexadecimal digits. This may be used to
   exchange the value safely in email or other non-binary environments.
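
To illustrate the relationship between the two methods, this small sketch
checks that the hexadecimal string is simply the hex encoding of the raw
digest. The variable names are illustrative.

.. code-block:: python

   import hashlib

   h = hashlib.sha256(b"Nobody inspects the spammish repetition")

   raw = h.digest()        # bytes object, h.digest_size bytes long
   text = h.hexdigest()    # str, twice as long, hexadecimal digits only

   # Decoding the hex string recovers the raw digest.
   round_trip = bytes.fromhex(text)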


.. method:: hash.copy()

   Return a copy ("clone") of the hash object. This can be used to efficiently
   compute the digests of data sharing a common initial substring.
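
A sketch of the common-prefix use case: the shared prefix is hashed once, and
each message continues from a copy of that state. The prefix and messages are
made-up sample data.

.. code-block:: python

   import hashlib

   prefix = b"common header: "
   base = hashlib.sha256(prefix)    # the prefix is processed only once

   digests = {}
   for suffix in (b"first message", b"second message"):
       h = base.copy()              # independent clone of the current state
       h.update(suffix)
       digests[suffix] = h.hexdigest()

   # The original object is unaffected by updates made to its copies.
   base_digest = base.hexdigest()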


SHAKE variable length digests
-----------------------------

The :func:`shake_128` and :func:`shake_256` algorithms provide variable
length digests with ``length_in_bits//2`` up to 128 or 256 bits of security.
As such, their digest methods require a length. Maximum length is not limited
by the SHAKE algorithm.

.. method:: shake.digest(length)

   Return the digest of the data passed to the :meth:`update` method so far.
   This is a bytes object of size ``length`` which may contain bytes in
   the whole range from 0 to 255.


.. method:: shake.hexdigest(length)

   Like :meth:`digest` except the digest is returned as a string object of
   double length, containing only hexadecimal digits. This may be used to
   exchange the value safely in email or other non-binary environments.
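
A brief sketch of requesting digests of different lengths from one SHAKE
object. The lengths 16 and 32 are arbitrary choices for illustration.

.. code-block:: python

   import hashlib

   h = hashlib.shake_128(b"Nobody inspects the spammish repetition")

   short = h.digest(16)         # 16 bytes of output
   longer = h.digest(32)        # 32 bytes of output from the same state
   hex_short = h.hexdigest(16)  # 32 hexadecimal characters

   # SHAKE is an extendable-output function, so a shorter digest is a
   # prefix of a longer one computed over the same data.
   is_prefix = longer[:16] == short

Note that this prefix behaviour is specific to extendable-output functions; it
does not hold for BLAKE2 digests of different sizes.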


Key derivation
--------------

Key derivation and key stretching algorithms are designed for secure password
hashing. Naive algorithms such as ``sha1(password)`` are not resistant against
brute-force attacks. A good password hashing function must be tunable, slow, and
include a `salt <https://en.wikipedia.org/wiki/Salt_%28cryptography%29>`_.


.. function:: pbkdf2_hmac(hash_name, password, salt, iterations, dklen=None)

   The function provides PKCS#5 password-based key derivation function 2. It
   uses HMAC as the pseudorandom function.

   The string *hash_name* is the desired name of the hash digest algorithm for
   HMAC, e.g. 'sha1' or 'sha256'. *password* and *salt* are interpreted as
   buffers of bytes. Applications and libraries should limit *password* to
   a sensible length (e.g. 1024). *salt* should be about 16 or more bytes from
   a proper source, e.g. :func:`os.urandom`.

   The number of *iterations* should be chosen based on the hash algorithm and
   computing power. As of 2013, at least 100,000 iterations of SHA-256 are
   suggested.

   *dklen* is the length of the derived key. If *dklen* is ``None`` then the
   digest size of the hash algorithm *hash_name* is used, e.g. 64 for SHA-512.

   >>> import hashlib, binascii
   >>> dk = hashlib.pbkdf2_hmac('sha256', b'password', b'salt', 100000)
   >>> binascii.hexlify(dk)
   b'0394a2ede332c9a13eb82e9b24631604c31df978b4e2f0fbd2c549944f9d79a5'

   .. versionadded:: 3.4

   .. note::

      A fast implementation of *pbkdf2_hmac* is available with OpenSSL. The
      Python implementation uses an inline version of :mod:`hmac`. It is about
      three times slower and doesn't release the GIL.
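
The sketch below shows one way :func:`pbkdf2_hmac` might be combined with a
random salt and a constant-time comparison to store and check a password. The
helper names ``hash_password`` and ``verify_password`` are hypothetical, and
the iteration count simply reuses the figure quoted above; it should be tuned
for the target hardware.

.. code-block:: python

   import hashlib
   import hmac
   import os

   ITERATIONS = 100000  # figure quoted above; an assumption, tune as needed

   def hash_password(password, salt=None):
       """Derive a key from *password*; return (salt, derived_key)."""
       if salt is None:
           salt = os.urandom(16)      # fresh 16-byte salt per password
       dk = hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS)
       return salt, dk

   def verify_password(password, salt, expected):
       """Re-derive the key with the stored salt and compare safely."""
       _, dk = hash_password(password, salt)
       return hmac.compare_digest(dk, expected)

   salt, stored = hash_password(b"correct horse")
   ok = verify_password(b"correct horse", salt, stored)
   bad = verify_password(b"wrong guess", salt, stored)

The salt is not secret and is stored next to the derived key; its purpose is
to make each stored value unique even when two passwords are the same.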

.. function:: scrypt(password, *, salt, n, r, p, maxmem=0, dklen=64)

   The function provides the scrypt password-based key derivation function as
   defined in :rfc:`7914`.

   *password* and *salt* must be bytes-like objects. Applications and
   libraries should limit *password* to a sensible length (e.g. 1024). *salt*
   should be about 16 or more bytes from a proper source, e.g. :func:`os.urandom`.

   *n* is the CPU/Memory cost factor, *r* the block size, *p* the parallelization
   factor and *maxmem* limits memory (OpenSSL 1.1.0 defaults to 32 MiB).
   *dklen* is the length of the derived key.

   Availability: OpenSSL 1.1+

   .. versionadded:: 3.6
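
A hedged sketch of calling :func:`scrypt`. The cost parameters ``n=2**14``,
``r=8``, ``p=1`` and the 32-byte key length are illustrative assumptions, not
recommendations from this document, and the ``hasattr`` check covers builds
where the function is unavailable.

.. code-block:: python

   import hashlib
   import os

   salt = os.urandom(16)

   if hasattr(hashlib, "scrypt"):
       # n, r and p are example values only; with r=8 and n=2**14 the
       # working memory is about 16 MiB, below the default limit noted above.
       key = hashlib.scrypt(b"password", salt=salt, n=2**14, r=8, p=1,
                            dklen=32)
   else:
       key = None  # scrypt requires Python built against OpenSSL 1.1+

Note that *salt*, *n*, *r* and *p* are keyword-only and have no defaults, so
they must always be passed by name.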


BLAKE2
------

.. sectionauthor:: Dmitry Chestnykh

.. index::
   single: blake2b, blake2s

BLAKE2_ is a cryptographic hash function defined in RFC-7693_ that comes in two
flavors:

* **BLAKE2b**, optimized for 64-bit platforms, which produces digests of any
  size between 1 and 64 bytes,

* **BLAKE2s**, optimized for 8- to 32-bit platforms, which produces digests of
  any size between 1 and 32 bytes.

BLAKE2 supports **keyed mode** (a faster and simpler replacement for HMAC_),
**salted hashing**, **personalization**, and **tree hashing**.

Hash objects from this module follow the API of the standard library's
:mod:`hashlib` objects.


Creating hash objects
^^^^^^^^^^^^^^^^^^^^^

New hash objects are created by calling constructor functions:


.. function:: blake2b(data=b'', digest_size=64, key=b'', salt=b'', \
                person=b'', fanout=1, depth=1, leaf_size=0, node_offset=0, \
                node_depth=0, inner_size=0, last_node=False)

.. function:: blake2s(data=b'', digest_size=32, key=b'', salt=b'', \
                person=b'', fanout=1, depth=1, leaf_size=0, node_offset=0, \
                node_depth=0, inner_size=0, last_node=False)


These functions return the corresponding hash objects for calculating
BLAKE2b or BLAKE2s. They optionally take these general parameters:

* *data*: initial chunk of data to hash, which must be interpretable as a
  buffer of bytes.

* *digest_size*: size of output digest in bytes.

* *key*: key for keyed hashing (up to 64 bytes for BLAKE2b, up to 32 bytes for
  BLAKE2s).

* *salt*: salt for randomized hashing (up to 16 bytes for BLAKE2b, up to 8
  bytes for BLAKE2s).

* *person*: personalization string (up to 16 bytes for BLAKE2b, up to 8 bytes
  for BLAKE2s).

The following table shows limits for general parameters (in bytes):

======= =========== ======== ========= ===========
Hash    digest_size len(key) len(salt) len(person)
======= =========== ======== ========= ===========
BLAKE2b 64          64       16        16
BLAKE2s 32          32       8         8
======= =========== ======== ========= ===========

.. note::

   The BLAKE2 specification defines constant lengths for salt and
   personalization parameters; however, for convenience, this implementation
   accepts byte strings of any size up to the specified length. If the length
   of the parameter is less than specified, it is padded with zeros; thus, for
   example, ``b'salt'`` and ``b'salt\x00'`` are the same value. (This is not
   the case for *key*.)
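
The padding rule in the note can be checked with a small sketch like the one
below. The message and the sample salt and key values are arbitrary.

.. code-block:: python

   from hashlib import blake2b

   msg = b"some message"

   # A short salt is zero-padded to the full salt length, so a salt with an
   # explicit trailing zero byte is the same salt.
   same_salt = (blake2b(msg, salt=b"salt").digest()
                == blake2b(msg, salt=b"salt\x00").digest())

   # Keys are not treated this way: a trailing zero byte makes a different key.
   same_key = (blake2b(msg, key=b"key").digest()
               == blake2b(msg, key=b"key\x00").digest())

Here ``same_salt`` is true and ``same_key`` is false, which is why salts that
differ only in trailing zero bytes should not be relied on to be distinct.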

These sizes are available as module `constants`_ described below.

Constructor functions also accept the following tree hashing parameters:

* *fanout*: fanout (0 to 255, 0 if unlimited, 1 in sequential mode).

* *depth*: maximal depth of tree (1 to 255, 255 if unlimited, 1 in
  sequential mode).

* *leaf_size*: maximal byte length of leaf (0 to 2**32-1, 0 if unlimited or in
  sequential mode).

* *node_offset*: node offset (0 to 2**64-1 for BLAKE2b, 0 to 2**48-1 for
  BLAKE2s, 0 for the first, leftmost, leaf, or in sequential mode).

* *node_depth*: node depth (0 to 255, 0 for leaves, or in sequential mode).

* *inner_size*: inner digest size (0 to 64 for BLAKE2b, 0 to 32 for
  BLAKE2s, 0 in sequential mode).

* *last_node*: boolean indicating whether the processed node is the last
  one (``False`` for sequential mode).

.. figure:: hashlib-blake2-tree.png
   :alt: Explanation of tree mode parameters.

See section 2.10 in the `BLAKE2 specification
<https://blake2.net/blake2_20130129.pdf>`_ for a comprehensive review of tree
hashing.


Constants
^^^^^^^^^

.. data:: blake2b.SALT_SIZE
.. data:: blake2s.SALT_SIZE

Salt length (maximum length accepted by constructors).


.. data:: blake2b.PERSON_SIZE
.. data:: blake2s.PERSON_SIZE

Personalization string length (maximum length accepted by constructors).


.. data:: blake2b.MAX_KEY_SIZE
.. data:: blake2s.MAX_KEY_SIZE

Maximum key size.


.. data:: blake2b.MAX_DIGEST_SIZE
.. data:: blake2s.MAX_DIGEST_SIZE

Maximum digest size that the hash function can output.
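
As an illustration, the constants can be read directly from the constructors
and used to validate input before hashing. The dictionary layout below is an
arbitrary choice for this sketch.

.. code-block:: python

   from hashlib import blake2b, blake2s

   # (max digest size, max key size, salt size, person size), in bytes
   limits = {
       "blake2b": (blake2b.MAX_DIGEST_SIZE, blake2b.MAX_KEY_SIZE,
                   blake2b.SALT_SIZE, blake2b.PERSON_SIZE),
       "blake2s": (blake2s.MAX_DIGEST_SIZE, blake2s.MAX_KEY_SIZE,
                   blake2s.SALT_SIZE, blake2s.PERSON_SIZE),
   }

   # A salt longer than SALT_SIZE is rejected by the constructor.
   try:
       blake2b(salt=b"x" * (blake2b.SALT_SIZE + 1))
   except ValueError:
       oversized_rejected = True
   else:
       oversized_rejected = False

The values match the limits table in the previous section.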


Examples
^^^^^^^^

Simple hashing
""""""""""""""

To calculate the hash of some data, you should first construct a hash object by
calling the appropriate constructor function (:func:`blake2b` or
:func:`blake2s`), then update it with the data by calling :meth:`update` on the
object, and, finally, get the digest out of the object by calling
:meth:`digest` (or :meth:`hexdigest` for a hex-encoded string).

    >>> from hashlib import blake2b
    >>> h = blake2b()
    >>> h.update(b'Hello world')
    >>> h.hexdigest()
    '6ff843ba685842aa82031d3f53c48b66326df7639a63d128974c5c14f31a0f33343a8c65551134ed1ae0f2b0dd2bb495dc81039e3eeb0aa1bb0388bbeac29183'


As a shortcut, you can pass the first chunk of data to update directly to the
constructor as the first argument (or as the *data* keyword argument):

    >>> from hashlib import blake2b
    >>> blake2b(b'Hello world').hexdigest()
    '6ff843ba685842aa82031d3f53c48b66326df7639a63d128974c5c14f31a0f33343a8c65551134ed1ae0f2b0dd2bb495dc81039e3eeb0aa1bb0388bbeac29183'

You can call :meth:`hash.update` as many times as you need to iteratively
update the hash:

    >>> from hashlib import blake2b
    >>> items = [b'Hello', b' ', b'world']
    >>> h = blake2b()
    >>> for item in items:
    ...     h.update(item)
    >>> h.hexdigest()
    '6ff843ba685842aa82031d3f53c48b66326df7639a63d128974c5c14f31a0f33343a8c65551134ed1ae0f2b0dd2bb495dc81039e3eeb0aa1bb0388bbeac29183'


Using different digest sizes
""""""""""""""""""""""""""""

BLAKE2 has a configurable digest size of up to 64 bytes for BLAKE2b and up to 32
bytes for BLAKE2s. For example, to replace SHA-1 with BLAKE2b without changing
the size of output, we can tell BLAKE2b to produce 20-byte digests:

    >>> from hashlib import blake2b
    >>> h = blake2b(digest_size=20)
    >>> h.update(b'Replacing SHA1 with the more secure function')
    >>> h.hexdigest()
    'd24f26cf8de66472d58d4e1b1774b4c9158b1f4c'
    >>> h.digest_size
    20
    >>> len(h.digest())
    20

Hash objects with different digest sizes have completely different outputs
(shorter hashes are *not* prefixes of longer hashes); BLAKE2b and BLAKE2s
produce different outputs even if the output length is the same:

    >>> from hashlib import blake2b, blake2s
    >>> blake2b(digest_size=10).hexdigest()
    '6fa1d8fcfd719046d762'
    >>> blake2b(digest_size=11).hexdigest()
    'eb6ec15daf9546254f0809'
    >>> blake2s(digest_size=10).hexdigest()
    '1bf21a98c78a1c376ae9'
    >>> blake2s(digest_size=11).hexdigest()
    '567004bf96e4a25773ebf4'


Keyed hashing
"""""""""""""

Keyed hashing can be used for authentication as a faster and simpler
replacement for `Hash-based message authentication code
<https://en.wikipedia.org/wiki/Hash-based_message_authentication_code>`_ (HMAC).
BLAKE2 can be securely used in prefix-MAC mode thanks to the
indifferentiability property inherited from BLAKE.

This example shows how to get a (hex-encoded) 128-bit authentication code for
message ``b'message data'`` with key ``b'pseudorandom key'``::

    >>> from hashlib import blake2b
    >>> h = blake2b(key=b'pseudorandom key', digest_size=16)
    >>> h.update(b'message data')
    >>> h.hexdigest()
    '3d363ff7401e02026f4a4687d4863ced'


As a practical example, a web application can symmetrically sign cookies sent
to users and later verify them to make sure they weren't tampered with::

    >>> from hashlib import blake2b
    >>> from hmac import compare_digest
    >>>
    >>> SECRET_KEY = b'pseudorandomly generated server secret key'
    >>> AUTH_SIZE = 16
    >>>
    >>> def sign(cookie):
    ...     h = blake2b(digest_size=AUTH_SIZE, key=SECRET_KEY)
    ...     h.update(cookie)
    ...     return h.hexdigest().encode('utf-8')
    >>>
    >>> def verify(cookie, sig):
    ...     good_sig = sign(cookie)
    ...     return compare_digest(good_sig, sig)
    >>>
    >>> cookie = b'user-alice'
    >>> sig = sign(cookie)
    >>> print("{0},{1}".format(cookie.decode('utf-8'), sig))
    user-alice,b'43b3c982cf697e0c5ab22172d1ca7421'
    >>> verify(cookie, sig)
    True
    >>> verify(b'user-bob', sig)
    False
    >>> verify(cookie, b'0102030405060708090a0b0c0d0e0f00')
    False

Even though there's a native keyed hashing mode, BLAKE2 can, of course, be used
in the HMAC construction with the :mod:`hmac` module::

    >>> import hmac, hashlib
    >>> m = hmac.new(b'secret key', digestmod=hashlib.blake2s)
    >>> m.update(b'message')
    >>> m.hexdigest()
    'e3c8102868d28b5ff85fc35dda07329970d1a01e273c37481326fe0c861c8142'


Randomized hashing
""""""""""""""""""

By setting the *salt* parameter, users can introduce randomization to the hash
function. Randomized hashing is useful for protecting against collision attacks
on the hash function used in digital signatures.

    Randomized hashing is designed for situations where one party, the message
    preparer, generates all or part of a message to be signed by a second
    party, the message signer. If the message preparer is able to find
    cryptographic hash function collisions (i.e., two messages producing the
    same hash value), then she might prepare meaningful versions of the message
    that would produce the same hash value and digital signature, but with
    different results (e.g., transferring $1,000,000 to an account, rather than
    $10). Cryptographic hash functions have been designed with collision
    resistance as a major goal, but the current concentration on attacking
    cryptographic hash functions may result in a given cryptographic hash
    function providing less collision resistance than expected. Randomized
    hashing offers the signer additional protection by reducing the likelihood
    that a preparer can generate two or more messages that ultimately yield the
    same hash value during the digital signature generation process --- even if
    it is practical to find collisions for the hash function. However, the use
    of randomized hashing may reduce the amount of security provided by a
    digital signature when all portions of the message are prepared
    by the signer.

    (`NIST SP-800-106 "Randomized Hashing for Digital Signatures"
    <https://csrc.nist.gov/publications/detail/sp/800-106/final>`_)

In BLAKE2 the salt is processed as a one-time input to the hash function during
initialization, rather than as an input to each compression function.

.. warning::

    *Salted hashing* (or just hashing) with BLAKE2 or any other general-purpose
    cryptographic hash function, such as SHA-256, is not suitable for hashing
    passwords. See `BLAKE2 FAQ <https://blake2.net/#qa>`_ for more
    information.
..

    >>> import os
    >>> from hashlib import blake2b
    >>> msg = b'some message'
    >>> # Calculate the first hash with a random salt.
    >>> salt1 = os.urandom(blake2b.SALT_SIZE)
    >>> h1 = blake2b(salt=salt1)
    >>> h1.update(msg)
    >>> # Calculate the second hash with a different random salt.
    >>> salt2 = os.urandom(blake2b.SALT_SIZE)
    >>> h2 = blake2b(salt=salt2)
    >>> h2.update(msg)
    >>> # The digests are different.
    >>> h1.digest() != h2.digest()
    True


Personalization
"""""""""""""""

Sometimes it is useful to force the hash function to produce different digests
for the same input for different purposes. Quoting the authors of the Skein hash
function:

    We recommend that all application designers seriously consider doing this;
    we have seen many protocols where a hash that is computed in one part of
    the protocol can be used in an entirely different part because two hash
    computations were done on similar or related data, and the attacker can
    force the application to make the hash inputs the same. Personalizing each
    hash function used in the protocol summarily stops this type of attack.

    (`The Skein Hash Function Family
    <http://www.skein-hash.info/sites/default/files/skein1.3.pdf>`_,
    p. 21)

BLAKE2 can be personalized by passing bytes to the *person* argument::

    >>> from hashlib import blake2b
    >>> FILES_HASH_PERSON = b'MyApp Files Hash'
    >>> BLOCK_HASH_PERSON = b'MyApp Block Hash'
    >>> h = blake2b(digest_size=32, person=FILES_HASH_PERSON)
    >>> h.update(b'the same content')
    >>> h.hexdigest()
    '20d9cd024d4fb086aae819a1432dd2466de12947831b75c5a30cf2676095d3b4'
    >>> h = blake2b(digest_size=32, person=BLOCK_HASH_PERSON)
    >>> h.update(b'the same content')
    >>> h.hexdigest()
    'cf68fb5761b9c44e7878bfb2c4c9aea52264a80b75005e65619778de59f383a3'

Personalization together with the keyed mode can also be used to derive different
keys from a single one.

    >>> from hashlib import blake2s
    >>> from base64 import b64decode, b64encode
    >>> orig_key = b64decode(b'Rm5EPJai72qcK3RGBpW3vPNfZy5OZothY+kHY6h21KM=')
    >>> enc_key = blake2s(key=orig_key, person=b'kEncrypt').digest()
    >>> mac_key = blake2s(key=orig_key, person=b'kMAC').digest()
    >>> print(b64encode(enc_key).decode('utf-8'))
    rbPb15S/Z9t+agffno5wuhB77VbRi6F9Iv2qIxU7WHw=
    >>> print(b64encode(mac_key).decode('utf-8'))
    G9GtHFE1YluXY1zWPlYk1e/nWfu0WSEb0KRcjhDeP/o=

Tree mode
"""""""""

Here's an example of hashing a minimal tree with two leaf nodes::

       10
      /  \
     00  01

This example uses 64-byte internal digests, and returns the 32-byte final
digest::

    >>> from hashlib import blake2b
    >>>
    >>> FANOUT = 2
    >>> DEPTH = 2
    >>> LEAF_SIZE = 4096
    >>> INNER_SIZE = 64
    >>>
    >>> buf = bytearray(6000)
    >>>
    >>> # Left leaf
    ... h00 = blake2b(buf[0:LEAF_SIZE], fanout=FANOUT, depth=DEPTH,
    ...               leaf_size=LEAF_SIZE, inner_size=INNER_SIZE,
    ...               node_offset=0, node_depth=0, last_node=False)
    >>> # Right leaf
    ... h01 = blake2b(buf[LEAF_SIZE:], fanout=FANOUT, depth=DEPTH,
    ...               leaf_size=LEAF_SIZE, inner_size=INNER_SIZE,
    ...               node_offset=1, node_depth=0, last_node=True)
    >>> # Root node
    ... h10 = blake2b(digest_size=32, fanout=FANOUT, depth=DEPTH,
    ...               leaf_size=LEAF_SIZE, inner_size=INNER_SIZE,
    ...               node_offset=0, node_depth=1, last_node=True)
    >>> h10.update(h00.digest())
    >>> h10.update(h01.digest())
    >>> h10.hexdigest()
    '3ad2a9b37c6070e374c7a8c508fe20ca86b6ed54e286e93a0318e95e881db5aa'

Credits
^^^^^^^

BLAKE2_ was designed by *Jean-Philippe Aumasson*, *Samuel Neves*, *Zooko
Wilcox-O'Hearn*, and *Christian Winnerlein* based on SHA-3_ finalist BLAKE_
created by *Jean-Philippe Aumasson*, *Luca Henzen*, *Willi Meier*, and
*Raphael C.-W. Phan*.

It uses the core algorithm from the ChaCha_ cipher designed by *Daniel J.
Bernstein*.

The stdlib implementation is based on the pyblake2_ module. It was written by
*Dmitry Chestnykh* based on the C implementation written by *Samuel Neves*. The
documentation was copied from pyblake2_ and written by *Dmitry Chestnykh*.

The C code was partly rewritten for Python by *Christian Heimes*.

The following public domain dedication applies to the C hash function
implementation, the extension code, and this documentation:

   To the extent possible under law, the author(s) have dedicated all copyright
   and related and neighboring rights to this software to the public domain
   worldwide. This software is distributed without any warranty.

   You should have received a copy of the CC0 Public Domain Dedication along
   with this software. If not, see
   https://creativecommons.org/publicdomain/zero/1.0/.

The following people have helped with development or contributed their changes
to the project and the public domain according to the Creative Commons Public
Domain Dedication 1.0 Universal:

* *Alexandr Sokolovskiy*

.. _RFC-7693: https://tools.ietf.org/html/rfc7693
.. _BLAKE2: https://blake2.net
.. _HMAC: https://en.wikipedia.org/wiki/Hash-based_message_authentication_code
.. _BLAKE: https://131002.net/blake/
.. _SHA-3: https://en.wikipedia.org/wiki/NIST_hash_function_competition
.. _ChaCha: https://cr.yp.to/chacha.html
.. _pyblake2: https://pythonhosted.org/pyblake2/



.. seealso::

   Module :mod:`hmac`
      A module to generate message authentication codes using hashes.

   Module :mod:`base64`
      Another way to encode binary hashes for non-binary environments.

   https://blake2.net
      Official BLAKE2 website.

   https://csrc.nist.gov/csrc/media/publications/fips/180/2/archive/2002-08-01/documents/fips180-2.pdf
      The FIPS 180-2 publication on Secure Hash Algorithms.

   https://en.wikipedia.org/wiki/Cryptographic_hash_function#Cryptographic_hash_algorithms
      Wikipedia article with information on which algorithms have known issues and
      what that means regarding their use.

   https://www.ietf.org/rfc/rfc2898.txt
      PKCS #5: Password-Based Cryptography Specification Version 2.0