:mod:`hashlib` --- Secure hashes and message digests
====================================================

.. module:: hashlib
   :synopsis: Secure hash and message digest algorithms.

.. moduleauthor:: Gregory P. Smith <greg@krypto.org>
.. sectionauthor:: Gregory P. Smith <greg@krypto.org>

**Source code:** :source:`Lib/hashlib.py`

.. index::
   single: message digest, MD5
   single: secure hash algorithm, SHA1, SHA224, SHA256, SHA384, SHA512

--------------

This module implements a common interface to many different secure hash and
message digest algorithms. Included are the FIPS secure hash algorithms SHA1,
SHA224, SHA256, SHA384, and SHA512 (defined in FIPS 180-2) as well as RSA's MD5
algorithm (defined in Internet :rfc:`1321`). The terms "secure hash" and
"message digest" are interchangeable. Older algorithms were called message
digests. The modern term is secure hash.

.. note::

   If you want the adler32 or crc32 hash functions, they are available in
   the :mod:`zlib` module.

.. warning::

   Some algorithms have known hash collision weaknesses; refer to the "See
   also" section at the end.


.. _hash-algorithms:

Hash algorithms
---------------

There is one constructor method named for each type of :dfn:`hash`. All return
a hash object with the same simple interface. For example, use :func:`sha256` to
create a SHA-256 hash object. You can then feed this object with :term:`bytes-like
objects <bytes-like object>` (normally :class:`bytes`) using the :meth:`update` method.
At any point you can ask it for the :dfn:`digest` of the
concatenation of the data fed to it so far using the :meth:`digest` or
:meth:`hexdigest` methods.

.. note::

   For better multithreading performance, the Python :term:`GIL` is released for
   data larger than 2047 bytes at object creation or on update.

.. note::

   Feeding string objects into :meth:`update` is not supported, as hashes work
   on bytes, not on characters.

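As an illustration of the note above, the following sketch shows a string
being rejected by :meth:`update` and then accepted once it is encoded. The
choice of UTF-8 is an assumption made for this example; use whatever encoding
your application or protocol specifies.

```python
import hashlib

text = "Nobody inspects the spammish repetition"  # a str, not bytes

h = hashlib.sha256()

# Passing the str directly is expected to raise TypeError.
try:
    h.update(text)
    str_accepted = True
except TypeError:
    str_accepted = False

# Encode explicitly before hashing. UTF-8 is an assumption for this example.
h.update(text.encode("utf-8"))
encoded_digest = h.hexdigest()
```

Because the failed call does not change the hash state, ``encoded_digest``
matches the digest of the equivalent byte string.
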
.. index:: single: OpenSSL; (use in module hashlib)

Constructors for hash algorithms that are always present in this module are
:func:`sha1`, :func:`sha224`, :func:`sha256`, :func:`sha384`,
and :func:`sha512`. :func:`md5` is normally available as well, though it
may be missing if you are using a rare "FIPS compliant" build of Python.
Additional algorithms may also be available depending upon the OpenSSL
library that Python uses on your platform.

For example, to obtain the digest of the byte string ``b'Nobody inspects the
spammish repetition'``::

   >>> import hashlib
   >>> m = hashlib.sha256()
   >>> m.update(b"Nobody inspects")
   >>> m.update(b" the spammish repetition")
   >>> m.digest()
   b'\x03\x1e\xdd}Ae\x15\x93\xc5\xfe\\\x00o\xa5u+7\xfd\xdf\xf7\xbcN\x84:\xa6\xaf\x0c\x95\x0fK\x94\x06'
   >>> m.digest_size
   32
   >>> m.block_size
   64

More condensed:

   >>> hashlib.sha224(b"Nobody inspects the spammish repetition").hexdigest()
   'a4337bc45a8fc544c03f52dc550cd6e1e87021bc896588bd79e901e2'

.. function:: new(name[, data])

   A generic constructor that takes the string *name* of the desired
   algorithm as its first parameter. It allows access to the hashes listed
   above as well as any other algorithms that your OpenSSL library may
   offer. The named constructors are much faster than :func:`new`
   and should be preferred.

Using :func:`new` with an algorithm provided by OpenSSL:

   >>> h = hashlib.new('ripemd160')
   >>> h.update(b"Nobody inspects the spammish repetition")
   >>> h.hexdigest()
   'cc4a5ce1b3df48aec5d22d1f16b894a0b894eccc'

Hashlib provides the following constant attributes:

.. data:: algorithms_guaranteed

   A set containing the names of the hash algorithms guaranteed to be supported
   by this module on all platforms. Note that 'md5' is in this list despite
   some upstream vendors offering an odd "FIPS compliant" Python build that
   excludes it.

   .. versionadded:: 3.2

.. data:: algorithms_available

   A set containing the names of the hash algorithms that are available in the
   running Python interpreter. These names will be recognized when passed to
   :func:`new`. :attr:`algorithms_guaranteed` will always be a subset. The
   same algorithm may appear multiple times in this set under different names
   (thanks to OpenSSL).

   .. versionadded:: 3.2

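One way these sets might be used is to check for an optional algorithm before
calling :func:`new`, falling back to a guaranteed one otherwise. This is a
minimal sketch; ``make_hash`` is a hypothetical helper written for this
example and is not part of the module.

```python
import hashlib


def make_hash(preferred, fallback="sha256"):
    """Return a hash object for *preferred* if it is available.

    Hypothetical helper for illustration: falls back to *fallback*
    (assumed to be one of the guaranteed algorithms) otherwise.
    """
    if preferred in hashlib.algorithms_available:
        name = preferred
    else:
        name = fallback
    return hashlib.new(name)


# An unknown name falls back to SHA-256 instead of raising ValueError.
h = make_hash("no-such-algorithm")
fallback_name = h.name
```

Checking membership first avoids having to catch the :exc:`ValueError` that
:func:`new` raises for an unknown algorithm name.
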
The following values are provided as constant attributes of the hash objects
returned by the constructors:


.. data:: hash.digest_size

   The size of the resulting hash in bytes.

.. data:: hash.block_size

   The internal block size of the hash algorithm in bytes.

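The short sketch below collects both attributes for the constructors that are
always present, simply to show how the values differ between algorithms. The
``sizes`` dictionary is a name chosen for this example.

```python
import hashlib

# Map each always-present algorithm to (digest_size, block_size) in bytes.
sizes = {}
for name in ("sha1", "sha224", "sha256", "sha384", "sha512"):
    h = hashlib.new(name)
    sizes[name] = (h.digest_size, h.block_size)
    # The raw digest always has exactly digest_size bytes.
    assert len(h.digest()) == h.digest_size
```

For instance, ``sizes["sha256"]`` is ``(32, 64)``, matching the interactive
example earlier in this section.
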
A hash object has the following attributes:

.. attribute:: hash.name

   The canonical name of this hash, always lowercase and always suitable as a
   parameter to :func:`new` to create another hash of this type.

   .. versionchanged:: 3.4
      The name attribute has been present in CPython since its inception, but
      until Python 3.4 was not formally specified, so may not exist on some
      platforms.

A hash object has the following methods:


.. method:: hash.update(arg)

   Update the hash object with the object *arg*, which must be interpretable as
   a buffer of bytes. Repeated calls are equivalent to a single call with the
   concatenation of all the arguments: ``m.update(a); m.update(b)`` is
   equivalent to ``m.update(a+b)``.

   .. versionchanged:: 3.1
      The Python GIL is released to allow other threads to run while hash
      updates on data larger than 2047 bytes are taking place when using hash
      algorithms supplied by OpenSSL.


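Because repeated :meth:`update` calls are equivalent to hashing the
concatenation, large inputs can be hashed in chunks without loading them into
memory at once. The sketch below assumes a binary file-like object;
``hash_stream`` and the chunk size are illustrative choices, and an in-memory
:class:`io.BytesIO` stands in for a real file.

```python
import hashlib
import io


def hash_stream(stream, chunk_size=65536):
    """Return the SHA-256 hex digest of a binary stream, read in chunks.

    Hypothetical helper for illustration; *stream* must return bytes
    from read() and b"" at end of input.
    """
    h = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


data = b"Nobody inspects the spammish repetition" * 1000
streamed = hash_stream(io.BytesIO(data), chunk_size=4096)
one_shot = hashlib.sha256(data).hexdigest()
```

Both ``streamed`` and ``one_shot`` hold the same digest, regardless of the
chunk size chosen.
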
.. method:: hash.digest()

   Return the digest of the data passed to the :meth:`update` method so far.
   This is a bytes object of size :attr:`digest_size` which may contain bytes in
   the whole range from 0 to 255.


.. method:: hash.hexdigest()

   Like :meth:`digest` except the digest is returned as a string object of
   double length, containing only hexadecimal digits. This may be used to
   exchange the value safely in email or other non-binary environments.


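The relationship between the two return values can be checked directly. This
is a small sketch; the variable names are chosen for the example only.

```python
import hashlib

h = hashlib.sha256(b"Nobody inspects the spammish repetition")

raw = h.digest()         # bytes object of length h.digest_size
as_text = h.hexdigest()  # str with two hexadecimal digits per byte

# Converting the hexadecimal string back yields the raw digest.
round_trip = bytes.fromhex(as_text)
```

Calling :meth:`digest` or :meth:`hexdigest` does not reset the object, so both
can be called on the same hash and further :meth:`update` calls remain
possible afterwards.
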
.. method:: hash.copy()

   Return a copy ("clone") of the hash object. This can be used to efficiently
   compute the digests of data sharing a common initial substring.


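A minimal sketch of the common-prefix use case follows. The prefix and
suffixes are made-up sample data; the point is that the shared prefix is
hashed only once.

```python
import hashlib

# Hash the shared prefix once.
prefix = hashlib.sha256(b"common header: ")

# Each copy continues independently from the prefix state.
first = prefix.copy()
first.update(b"first message")

second = prefix.copy()
second.update(b"second message")

first_digest = first.hexdigest()
second_digest = second.hexdigest()
```

Updating a copy leaves the original object untouched, so ``prefix`` can be
cloned again for further messages.
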
Key derivation
--------------

Key derivation and key stretching algorithms are designed for secure password
hashing. Naive algorithms such as ``sha1(password)`` are not resistant against
brute-force attacks. A good password hashing function must be tunable, slow, and
include a `salt <https://en.wikipedia.org/wiki/Salt_%28cryptography%29>`_.


.. function:: pbkdf2_hmac(hash_name, password, salt, iterations, dklen=None)

   The function provides PKCS#5 password-based key derivation function 2. It
   uses HMAC as the pseudorandom function.

   The string *hash_name* is the desired name of the hash digest algorithm for
   HMAC, e.g. 'sha1' or 'sha256'. *password* and *salt* are interpreted as
   buffers of bytes. Applications and libraries should limit *password* to
   a sensible length (e.g. 1024). *salt* should be about 16 or more bytes from
   a proper source, e.g. :func:`os.urandom`.

   The number of *iterations* should be chosen based on the hash algorithm and
   computing power. As of 2013, at least 100,000 iterations of SHA-256 are
   suggested.

   *dklen* is the length of the derived key in bytes. If *dklen* is ``None``
   then the digest size of the hash algorithm *hash_name* is used, e.g. 64 for
   SHA-512.

   >>> import hashlib, binascii
   >>> dk = hashlib.pbkdf2_hmac('sha256', b'password', b'salt', 100000)
   >>> binascii.hexlify(dk)
   b'0394a2ede332c9a13eb82e9b24631604c31df978b4e2f0fbd2c549944f9d79a5'

   .. versionadded:: 3.4

   .. note::

      A fast implementation of *pbkdf2_hmac* is available with OpenSSL. The
      Python implementation uses an inline version of :mod:`hmac`. It is about
      three times slower and doesn't release the GIL.


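The following sketch shows one way the pieces described above might fit
together for storing and checking a password: a random salt from
:func:`os.urandom`, :func:`pbkdf2_hmac` for derivation, and
:func:`hmac.compare_digest` for comparison. The helper names, the 16-byte
salt, and the iteration count are assumptions for this example, not
recommendations beyond what the text above states.

```python
import hashlib
import hmac
import os

ITERATIONS = 100000  # assumption: the 2013 suggestion quoted above
SALT_BYTES = 16      # assumption: "about 16 or more bytes"


def hash_password(password, salt=None):
    """Return (salt, derived_key) for a text password.

    Hypothetical helper for illustration. A fresh random salt is
    generated unless one is supplied (as when verifying).
    """
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, key


def verify_password(password, salt, expected_key):
    """Re-derive the key and compare it to the stored one."""
    _, key = hash_password(password, salt)
    # compare_digest avoids leaking where the first mismatch occurs.
    return hmac.compare_digest(key, expected_key)


salt, stored_key = hash_password("correct horse")
```

The salt is stored alongside the derived key; it does not need to be secret,
but it should be unique per password.
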
.. seealso::

   Module :mod:`hmac`
      A module to generate message authentication codes using hashes.

   Module :mod:`base64`
      Another way to encode binary hashes for non-binary environments.

   http://csrc.nist.gov/publications/fips/fips180-2/fips180-2.pdf
      The FIPS 180-2 publication on Secure Hash Algorithms.

   https://en.wikipedia.org/wiki/Cryptographic_hash_function#Cryptographic_hash_algorithms
      Wikipedia article with information on which algorithms have known issues and
      what that means regarding their use.

   https://www.ietf.org/rfc/rfc2898.txt
      PKCS #5: Password-Based Cryptography Specification Version 2.0