:mod:`hashlib` --- Secure hashes and message digests
====================================================

.. module:: hashlib
   :synopsis: Secure hash and message digest algorithms.
.. moduleauthor:: Gregory P. Smith <greg@krypto.org>
.. sectionauthor:: Gregory P. Smith <greg@krypto.org>


.. index::
   single: message digest, MD5
   single: secure hash algorithm, SHA1, SHA224, SHA256, SHA384, SHA512

**Source code:** :source:`Lib/hashlib.py`

--------------

This module implements a common interface to many different secure hash and
message digest algorithms. Included are the FIPS secure hash algorithms SHA1,
SHA224, SHA256, SHA384, and SHA512 (defined in FIPS 180-2) as well as RSA's MD5
algorithm (defined in Internet :rfc:`1321`). The terms "secure hash" and
"message digest" are interchangeable. Older algorithms were called message
digests. The modern term is secure hash.

.. note::

   If you want the adler32 or crc32 hash functions, they are available in
   the :mod:`zlib` module.

.. warning::

   Some algorithms have known hash collision weaknesses; refer to the "See
   also" section at the end.


Hash algorithms
---------------

There is one constructor method named for each type of :dfn:`hash`. All return
a hash object with the same simple interface. For example, use :func:`sha1` to
create a SHA1 hash object. You can then feed this object with :term:`bytes-like
object`\ s (normally :class:`bytes`) using the :meth:`update` method.
At any point you can ask it for the :dfn:`digest` of the
concatenation of the data fed to it so far using the :meth:`digest` or
:meth:`hexdigest` methods.

.. note::

   For better multithreading performance, the Python :term:`GIL` is released for
   data larger than 2047 bytes at object creation or on update.

.. note::

   Feeding string objects into :meth:`update` is not supported, as hashes work
   on bytes, not on characters.

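Because :meth:`update` accepts only bytes-like objects, text has to be encoded
before it is hashed. The following is a minimal sketch, assuming UTF-8 is the
desired encoding; the helper name ``hash_text`` is hypothetical and not part of
this module.

```python
import hashlib


def hash_text(text, encoding="utf-8"):
    """Return the SHA-256 hex digest of *text* after encoding it to bytes."""
    h = hashlib.sha256()
    # A str must be encoded explicitly; update() rejects str with TypeError.
    h.update(text.encode(encoding))
    return h.hexdigest()


digest = hash_text("Nobody inspects the spammish repetition")
```

The choice of encoding matters: the same text encoded differently produces a
different digest, so both sides of an exchange need to agree on it.
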
.. index:: single: OpenSSL; (use in module hashlib)

Constructors for hash algorithms that are always present in this module are
:func:`md5`, :func:`sha1`, :func:`sha224`, :func:`sha256`, :func:`sha384`,
:func:`sha512`, :func:`sha3_224`, :func:`sha3_256`, :func:`sha3_384`, and
:func:`sha3_512`. Additional algorithms may also be available depending upon
the OpenSSL library that Python uses on your platform.

.. versionchanged:: 3.4
   Added the sha3 family of hash algorithms.

For example, to obtain the digest of the byte string ``b'Nobody inspects the
spammish repetition'``::

   >>> import hashlib
   >>> m = hashlib.md5()
   >>> m.update(b"Nobody inspects")
   >>> m.update(b" the spammish repetition")
   >>> m.digest()
   b'\xbbd\x9c\x83\xdd\x1e\xa5\xc9\xd9\xde\xc9\xa1\x8d\xf0\xff\xe9'
   >>> m.digest_size
   16
   >>> m.block_size
   64

More condensed:

   >>> hashlib.sha224(b"Nobody inspects the spammish repetition").hexdigest()
   'a4337bc45a8fc544c03f52dc550cd6e1e87021bc896588bd79e901e2'

.. function:: new(name[, data])

   A generic constructor that takes the string name of the desired
   algorithm as its first parameter. It also exists to allow access to the
   above listed hashes as well as any other algorithms that your OpenSSL
   library may offer. The named constructors are much faster than :func:`new`
   and should be preferred.

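One situation where :func:`new` is useful is when the algorithm name is only
known at run time. The sketch below assumes the name comes from a
configuration value (the variable ``algorithm`` is purely illustrative) and
shows that the result matches the named constructor.

```python
import hashlib

data = b"Nobody inspects the spammish repetition"

# Hypothetical configuration value selecting the algorithm at run time.
algorithm = "sha256"
by_name = hashlib.new(algorithm, data)

# The named constructor produces the same digest for the same input.
direct = hashlib.sha256(data)
same = by_name.digest() == direct.digest()
```
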
Using :func:`new` with an algorithm provided by OpenSSL:

   >>> h = hashlib.new('ripemd160')
   >>> h.update(b"Nobody inspects the spammish repetition")
   >>> h.hexdigest()
   'cc4a5ce1b3df48aec5d22d1f16b894a0b894eccc'

Hashlib provides the following constant attributes:

.. data:: algorithms_guaranteed

   Contains the names of the hash algorithms guaranteed to be supported
   by this module on all platforms.

   .. versionadded:: 3.2

.. data:: algorithms_available

   Contains the names of the hash algorithms that are available
   in the running Python interpreter. These names will be recognized
   when passed to :func:`new`. :attr:`algorithms_guaranteed`
   will always be a subset. Duplicate algorithms with different
   name formats may appear in this set (thanks to OpenSSL).

   .. versionadded:: 3.2

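These two sets can be used to choose an algorithm defensively. The following
is a rough sketch; ``pick_algorithm`` and the name ``"no-such-hash"`` are
hypothetical and only serve the illustration.

```python
import hashlib

# Every guaranteed name is also reported as available.
guaranteed_is_subset = (
    hashlib.algorithms_guaranteed <= hashlib.algorithms_available
)


def pick_algorithm(preferred):
    """Return the first name in *preferred* that this interpreter supports."""
    for name in preferred:
        if name in hashlib.algorithms_available:
            return name
    raise ValueError("none of the requested algorithms is available")


# Falls back to sha256 because the first name is not a real algorithm.
chosen = pick_algorithm(["no-such-hash", "sha256"])
h = hashlib.new(chosen, b"data")
```
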
The following values are provided as constant attributes of the hash objects
returned by the constructors:


.. data:: hash.digest_size

   The size of the resulting hash in bytes.

.. data:: hash.block_size

   The internal block size of the hash algorithm in bytes.

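As an illustration, the snippet below collects both attributes for a few of
the always-present algorithms; the dictionary ``sizes`` is just a convenience
for this example.

```python
import hashlib

sizes = {}
for name in ("sha1", "sha256", "sha512"):
    h = hashlib.new(name)
    # (digest size in bytes, internal block size in bytes)
    sizes[name] = (h.digest_size, h.block_size)
    # The raw digest always has exactly digest_size bytes.
    assert len(h.digest()) == h.digest_size
```

Here ``sizes`` maps ``"sha1"`` to ``(20, 64)``, ``"sha256"`` to ``(32, 64)``
and ``"sha512"`` to ``(64, 128)``.
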
A hash object has the following attributes:

.. attribute:: hash.name

   The canonical name of this hash, always lowercase and always suitable as a
   parameter to :func:`new` to create another hash of this type.

   .. versionchanged:: 3.4
      The name attribute has been present in CPython since its inception, but
      until Python 3.4 was not formally specified, so may not exist on some
      platforms.

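A short sketch of using the name to create a second, empty hash object of the
same type (the message contents are arbitrary):

```python
import hashlib

original = hashlib.sha256(b"first message")

# Create a fresh, empty hash of the same type from the name attribute.
fresh = hashlib.new(original.name)
fresh.update(b"second message")
```

Unlike :meth:`copy`, this does not carry over any data already fed to the
original object.
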
A hash object has the following methods:


.. method:: hash.update(arg)

   Update the hash object with the object *arg*, which must be interpretable as
   a buffer of bytes. Repeated calls are equivalent to a single call with the
   concatenation of all the arguments: ``m.update(a); m.update(b)`` is
   equivalent to ``m.update(a+b)``.

   .. versionchanged:: 3.1
      The Python GIL is released to allow other threads to run while hash
      updates on data larger than 2047 bytes are taking place when using hash
      algorithms supplied by OpenSSL.


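The equivalence of repeated :meth:`update` calls makes it possible to hash
large inputs without holding them in memory. The following is a minimal
sketch; the helper ``hash_stream`` and its default chunk size are assumptions
made for this example, not part of the module.

```python
import hashlib
import io


def hash_stream(stream, chunk_size=65536):
    """Return the SHA-256 hex digest of a binary stream, read in chunks."""
    h = hashlib.sha256()
    # iter() with a sentinel stops once read() returns b"" at end of stream.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


payload = b"spam" * 100000
streamed = hash_stream(io.BytesIO(payload), chunk_size=4096)
one_shot = hashlib.sha256(payload).hexdigest()
```

Both ``streamed`` and ``one_shot`` hold the same digest. The same helper works
for a file opened in binary mode.
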
.. method:: hash.digest()

   Return the digest of the data passed to the :meth:`update` method so far.
   This is a bytes object of size :attr:`digest_size` which may contain bytes in
   the whole range from 0 to 255.


.. method:: hash.hexdigest()

   Like :meth:`digest` except the digest is returned as a string object of
   double length, containing only hexadecimal digits. This may be used to
   exchange the value safely in email or other non-binary environments.


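The relationship between the two methods can be checked directly. This sketch
uses :func:`binascii.hexlify` to encode the raw digest for comparison.

```python
import binascii
import hashlib

h = hashlib.sha1(b"Nobody inspects the spammish repetition")
raw = h.digest()       # bytes object of length h.digest_size
text = h.hexdigest()   # str of twice that length

# hexdigest() is the hexadecimal encoding of digest().
matches = text == binascii.hexlify(raw).decode("ascii")
```
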
.. method:: hash.copy()

   Return a copy ("clone") of the hash object. This can be used to efficiently
   compute the digests of data sharing a common initial substring.


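A sketch of the common-prefix use case for :meth:`copy`; the prefix and
suffixes are made up for the illustration.

```python
import hashlib

prefix = b"common header: "
base = hashlib.sha256(prefix)   # the prefix is hashed only once

digests = {}
for suffix in (b"alpha", b"beta"):
    h = base.copy()             # clone the state reached after the prefix
    h.update(suffix)
    digests[suffix] = h.hexdigest()
```

Each entry equals the digest of ``prefix + suffix``, and ``base`` itself is
left unchanged by the updates applied to its copies.
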
Key Derivation Function
-----------------------

Key derivation and key stretching algorithms are designed for secure password
hashing. Naive algorithms such as ``sha1(password)`` are not resistant
against brute-force attacks. A good password hashing function must be tunable,
slow, and include a salt.


.. function:: pbkdf2_hmac(name, password, salt, rounds, dklen=None)

   This function provides the PKCS#5 password-based key derivation function 2.
   It uses HMAC as the pseudorandom function.

   The string *name* is the desired name of the hash digest algorithm for
   HMAC, e.g. 'sha1' or 'sha256'. *password* and *salt* are interpreted as
   buffers of bytes. Applications and libraries should limit *password* to
   a sensible length (e.g. 1024). *salt* should be about 16 or more bytes from
   a proper source, e.g. :func:`os.urandom`.

   The number of *rounds* should be chosen based on the hash algorithm and
   computing power. As of 2013, a value of at least 100,000 rounds of SHA-256
   is suggested.

   *dklen* is the length of the derived key. If *dklen* is ``None`` then the
   digest size of the hash algorithm *name* is used, e.g. 64 for SHA-512.

   >>> import hashlib, binascii
   >>> dk = hashlib.pbkdf2_hmac('sha256', b'password', b'salt', 100000)
   >>> binascii.hexlify(dk)
   b'0394a2ede332c9a13eb82e9b24631604c31df978b4e2f0fbd2c549944f9d79a5'

   .. versionadded:: 3.4

   .. note:: A fast implementation of *pbkdf2_hmac* is available with OpenSSL.
      The Python implementation uses an inline version of :mod:`hmac`. It is
      about three times slower and doesn't release the GIL.


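The pieces described above (a random salt, a tunable round count and a derived
key) can be combined as in the following sketch. It is an illustration under
stated assumptions rather than a recommended design: the helper names and the
``ROUNDS`` constant are hypothetical, the password is assumed to be text
encoded as UTF-8, and the comparison uses :func:`hmac.compare_digest`.

```python
import hashlib
import hmac
import os

ROUNDS = 100000  # illustrative value; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, derived_key) for *password* using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ROUNDS)
    return salt, key


def verify_password(password, salt, expected_key):
    """Re-derive the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)


salt, stored_key = hash_password("correct horse")
accepted = verify_password("correct horse", salt, stored_key)
rejected = verify_password("wrong guess", salt, stored_key)
```

The salt and the derived key are stored together; the password itself is never
stored.
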
.. seealso::

   Module :mod:`hmac`
      A module to generate message authentication codes using hashes.

   Module :mod:`base64`
      Another way to encode binary hashes for non-binary environments.

   http://csrc.nist.gov/publications/fips/fips180-2/fips180-2.pdf
      The FIPS 180-2 publication on Secure Hash Algorithms.

   http://en.wikipedia.org/wiki/Cryptographic_hash_function#Cryptographic_hash_algorithms
      Wikipedia article with information on which algorithms have known issues and
      what that means regarding their use.

   http://www.ietf.org/rfc/rfc2898.txt
      PKCS #5: Password-Based Cryptography Specification Version 2.0