Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 1 | :mod:`random` --- Generate pseudo-random numbers |
| 2 | ================================================ |
| 3 | |
| 4 | .. module:: random |
| 5 | :synopsis: Generate pseudo-random numbers with various common distributions. |
| 6 | |
Raymond Hettinger | 1048094 | 2011-01-10 03:26:08 +0000 | [diff] [blame] | 7 | **Source code:** :source:`Lib/random.py` |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 8 | |
Raymond Hettinger | 4f707fd | 2011-01-10 19:54:11 +0000 | [diff] [blame] | 9 | -------------- |
| 10 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 11 | This module implements pseudo-random number generators for various |
| 12 | distributions. |
| 13 | |
Raymond Hettinger | b21dac1 | 2010-09-07 05:32:49 +0000 | [diff] [blame] | 14 | For integers, there is uniform selection from a range. For sequences, there is |
| 15 | uniform selection of a random element, a function to generate a random |
| 16 | permutation of a list in-place, and a function for random sampling without |
| 17 | replacement. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 18 | |
| 19 | On the real line, there are functions to compute uniform, normal (Gaussian), |
| 20 | lognormal, negative exponential, gamma, and beta distributions. For generating |
| 21 | distributions of angles, the von Mises distribution is available. |
| 22 | |
Georg Brandl | 92849d1 | 2016-02-19 08:57:38 +0100 | [diff] [blame] | 23 | Almost all module functions depend on the basic function :func:`.random`, which |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 24 | generates a random float uniformly in the semi-open range [0.0, 1.0). Python |
| 25 | uses the Mersenne Twister as the core generator. It produces 53-bit precision |
| 26 | floats and has a period of 2\*\*19937-1. The underlying implementation in C is |
| 27 | both fast and threadsafe. The Mersenne Twister is one of the most extensively |
| 28 | tested random number generators in existence. However, being completely |
| 29 | deterministic, it is not suitable for all purposes, and is completely unsuitable |
| 30 | for cryptographic purposes. |
| 31 | |
| 32 | The functions supplied by this module are actually bound methods of a hidden |
| 33 | instance of the :class:`random.Random` class. You can instantiate your own |
Raymond Hettinger | 28de64f | 2008-01-13 23:40:30 +0000 | [diff] [blame] | 34 | instances of :class:`Random` to get generators that don't share state. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 35 | |
| 36 | Class :class:`Random` can also be subclassed if you want to use a different |
Georg Brandl | 92849d1 | 2016-02-19 08:57:38 +0100 | [diff] [blame] | 37 | basic generator of your own devising: in that case, override the :meth:`~Random.random`, |
| 38 | :meth:`~Random.seed`, :meth:`~Random.getstate`, and :meth:`~Random.setstate` methods. |
| 39 | Optionally, a new generator can supply a :meth:`~Random.getrandbits` method --- this |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 40 | allows :meth:`randrange` to produce selections over an arbitrarily large range. |
| 41 | |
Benjamin Peterson | 21896a3 | 2010-03-21 22:03:03 +0000 | [diff] [blame] | 42 | The :mod:`random` module also provides the :class:`SystemRandom` class which |
| 43 | uses the system function :func:`os.urandom` to generate random numbers |
| 44 | from sources provided by the operating system. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 45 | |
Raymond Hettinger | c89a451 | 2014-05-11 02:26:23 -0700 | [diff] [blame] | 46 | .. warning:: |
| 47 | |
| 48 | The pseudo-random generators of this module should not be used for |
Steven D'Aprano | b2871fa | 2016-04-17 01:42:33 +1000 | [diff] [blame] | 49 | security purposes. For security or cryptographic uses, see the |
| 50 | :mod:`secrets` module. |
Raymond Hettinger | c89a451 | 2014-05-11 02:26:23 -0700 | [diff] [blame] | 51 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 52 | .. seealso:: |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 53 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 54 | M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally |
| 55 | equidistributed uniform pseudorandom number generator", ACM Transactions on |
Serhiy Storchaka | 0264e46 | 2016-11-26 13:49:59 +0200 | [diff] [blame] | 56 | Modeling and Computer Simulation Vol. 8, No. 1, January pp.3--30 1998. |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 57 | |
| 58 | |
| 59 | `Complementary-Multiply-with-Carry recipe |
Andre Delfino | e8a2076 | 2020-09-26 21:47:25 -0300 | [diff] [blame] | 60 | <https://code.activestate.com/recipes/576707/>`_ for a compatible alternative |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 61 | random number generator with a long period and comparatively simple update |
| 62 | operations. |
| 63 | |
| 64 | |
| 65 | Bookkeeping functions |
| 66 | --------------------- |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 67 | |
Ezio Melotti | e0add76 | 2012-09-14 06:32:35 +0300 | [diff] [blame] | 68 | .. function:: seed(a=None, version=2) |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 69 | |
Raymond Hettinger | f763a72 | 2010-09-07 00:38:15 +0000 | [diff] [blame] | 70 | Initialize the random number generator. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 71 | |
Ezio Melotti | e0add76 | 2012-09-14 06:32:35 +0300 | [diff] [blame] | 72 | If *a* is omitted or ``None``, the current system time is used. If |
Raymond Hettinger | f763a72 | 2010-09-07 00:38:15 +0000 | [diff] [blame] | 73 | randomness sources are provided by the operating system, they are used |
| 74 | instead of the system time (see the :func:`os.urandom` function for details |
| 75 | on availability). |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 76 | |
Ezio Melotti | e0add76 | 2012-09-14 06:32:35 +0300 | [diff] [blame] | 77 | If *a* is an int, it is used directly. |
Raymond Hettinger | f763a72 | 2010-09-07 00:38:15 +0000 | [diff] [blame] | 78 | |
| 79 | With version 2 (the default), a :class:`str`, :class:`bytes`, or :class:`bytearray` |
Raymond Hettinger | 16eb827 | 2016-09-04 11:17:28 -0700 | [diff] [blame] | 80 | object gets converted to an :class:`int` and all of its bits are used. |
| 81 | |
| 82 | With version 1 (provided for reproducing random sequences from older versions |
| 83 | of Python), the algorithm for :class:`str` and :class:`bytes` generates a |
| 84 | narrower range of seeds. |
Raymond Hettinger | f763a72 | 2010-09-07 00:38:15 +0000 | [diff] [blame] | 85 | |
| 86 | .. versionchanged:: 3.2 |
| 87 | Moved to the version 2 scheme which uses all of the bits in a string seed. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 88 | |
Raymond Hettinger | d0cdeaa | 2019-08-22 09:19:36 -0700 | [diff] [blame] | 89 | .. deprecated:: 3.9 |
| 90 | In the future, the *seed* must be one of the following types: |
| 91 | *NoneType*, :class:`int`, :class:`float`, :class:`str`, |
| 92 | :class:`bytes`, or :class:`bytearray`. |
| 93 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 94 | .. function:: getstate() |
| 95 | |
| 96 | Return an object capturing the current internal state of the generator. This |
| 97 | object can be passed to :func:`setstate` to restore the state. |
| 98 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 99 | |
| 100 | .. function:: setstate(state) |
| 101 | |
| 102 | *state* should have been obtained from a previous call to :func:`getstate`, and |
| 103 | :func:`setstate` restores the internal state of the generator to what it was at |
Sandro Tosi | 985104a | 2012-08-12 15:12:15 +0200 | [diff] [blame] | 104 | the time :func:`getstate` was called. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 105 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 106 | |
Raymond Hettinger | f01d1be | 2020-05-04 22:52:13 -0700 | [diff] [blame] | 107 | Functions for bytes |
| 108 | ------------------- |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 109 | |
Victor Stinner | 9f5fe79 | 2020-04-17 19:05:35 +0200 | [diff] [blame] | 110 | .. function:: randbytes(n) |
| 111 | |
| 112 | Generate *n* random bytes. |
| 113 | |
Raymond Hettinger | f01d1be | 2020-05-04 22:52:13 -0700 | [diff] [blame] | 114 | This method should not be used for generating security tokens. |
| 115 | Use :func:`secrets.token_bytes` instead. |
| 116 | |
Victor Stinner | 9f5fe79 | 2020-04-17 19:05:35 +0200 | [diff] [blame] | 117 | .. versionadded:: 3.9 |
| 118 | |
| 119 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 120 | Functions for integers |
| 121 | ---------------------- |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 122 | |
Ezio Melotti | e0add76 | 2012-09-14 06:32:35 +0300 | [diff] [blame] | 123 | .. function:: randrange(stop) |
| 124 | randrange(start, stop[, step]) |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 125 | |
| 126 | Return a randomly selected element from ``range(start, stop, step)``. This is |
| 127 | equivalent to ``choice(range(start, stop, step))``, but doesn't actually build a |
| 128 | range object. |
| 129 | |
Raymond Hettinger | 0515661 | 2010-09-07 04:44:52 +0000 | [diff] [blame] | 130 | The positional argument pattern matches that of :func:`range`. Keyword arguments |
| 131 | should not be used because the function may use them in unexpected ways. |
| 132 | |
| 133 | .. versionchanged:: 3.2 |
| 134 | :meth:`randrange` is more sophisticated about producing equally distributed |
| 135 | values. Formerly it used a style like ``int(random()*n)`` which could produce |
| 136 | slightly uneven distributions. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 137 | |
| 138 | .. function:: randint(a, b) |
| 139 | |
Raymond Hettinger | afd3045 | 2009-02-24 10:57:02 +0000 | [diff] [blame] | 140 | Return a random integer *N* such that ``a <= N <= b``. Alias for |
| 141 | ``randrange(a, b+1)``. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 142 | |
Raymond Hettinger | f01d1be | 2020-05-04 22:52:13 -0700 | [diff] [blame] | 143 | .. function:: getrandbits(k) |
| 144 | |
| 145 | Returns a Python integer with *k* random bits. This method is supplied with |
| 146 | the MersenneTwister generator and some other generators may also provide it |
| 147 | as an optional part of the API. When available, :meth:`getrandbits` enables |
| 148 | :meth:`randrange` to handle arbitrarily large ranges. |
| 149 | |
| 150 | .. versionchanged:: 3.9 |
| 151 | This method now accepts zero for *k*. |
| 152 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 153 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 154 | Functions for sequences |
| 155 | ----------------------- |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 156 | |
| 157 | .. function:: choice(seq) |
| 158 | |
| 159 | Return a random element from the non-empty sequence *seq*. If *seq* is empty, |
| 160 | raises :exc:`IndexError`. |
| 161 | |
Raymond Hettinger | 9016f28 | 2016-09-26 21:45:57 -0700 | [diff] [blame] | 162 | .. function:: choices(population, weights=None, *, cum_weights=None, k=1) |
Raymond Hettinger | e8f1e00 | 2016-09-06 17:15:29 -0700 | [diff] [blame] | 163 | |
| 164 | Return a *k* sized list of elements chosen from the *population* with replacement. |
| 165 | If the *population* is empty, raises :exc:`IndexError`. |
| 166 | |
| 167 | If a *weights* sequence is specified, selections are made according to the |
| 168 | relative weights. Alternatively, if a *cum_weights* sequence is given, the |
Raymond Hettinger | 1c3a121 | 2016-10-12 01:42:10 -0400 | [diff] [blame] | 169 | selections are made according to the cumulative weights (perhaps computed |
| 170 | using :func:`itertools.accumulate`). For example, the relative weights |
| 171 | ``[10, 5, 30, 5]`` are equivalent to the cumulative weights |
| 172 | ``[10, 15, 45, 50]``. Internally, the relative weights are converted to |
| 173 | cumulative weights before making selections, so supplying the cumulative |
| 174 | weights saves work. |
Raymond Hettinger | e8f1e00 | 2016-09-06 17:15:29 -0700 | [diff] [blame] | 175 | |
| 176 | If neither *weights* nor *cum_weights* are specified, selections are made |
| 177 | with equal probability. If a weights sequence is supplied, it must be |
| 178 | the same length as the *population* sequence. It is a :exc:`TypeError` |
| 179 | to specify both *weights* and *cum_weights*. |
| 180 | |
| 181 | The *weights* or *cum_weights* can use any numeric type that interoperates |
| 182 | with the :class:`float` values returned by :func:`random` (that includes |
Ram Rachum | b0dfc75 | 2020-09-29 04:32:10 +0300 | [diff] [blame] | 183 | integers, floats, and fractions but excludes decimals). Weights are assumed |
| 184 | to be non-negative and finite. A :exc:`ValueError` is raised if all |
Raymond Hettinger | 041d8b4 | 2019-11-23 02:22:13 -0800 | [diff] [blame] | 185 | weights are zero. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 186 | |
Raymond Hettinger | 40ebe94 | 2019-01-30 13:30:20 -0800 | [diff] [blame] | 187 | For a given seed, the :func:`choices` function with equal weighting |
| 188 | typically produces a different sequence than repeated calls to |
| 189 | :func:`choice`. The algorithm used by :func:`choices` uses floating |
| 190 | point arithmetic for internal consistency and speed. The algorithm used |
| 191 | by :func:`choice` defaults to integer arithmetic with repeated selections |
| 192 | to avoid small biases from round-off error. |
| 193 | |
Raymond Hettinger | 1c3a121 | 2016-10-12 01:42:10 -0400 | [diff] [blame] | 194 | .. versionadded:: 3.6 |
| 195 | |
Raymond Hettinger | 041d8b4 | 2019-11-23 02:22:13 -0800 | [diff] [blame] | 196 | .. versionchanged:: 3.9 |
| 197 | Raises a :exc:`ValueError` if all weights are zero. |
| 198 | |
Raymond Hettinger | 1c3a121 | 2016-10-12 01:42:10 -0400 | [diff] [blame] | 199 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 200 | .. function:: shuffle(x[, random]) |
| 201 | |
Raymond Hettinger | a3950e4 | 2016-11-17 01:49:54 -0800 | [diff] [blame] | 202 | Shuffle the sequence *x* in place. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 203 | |
Raymond Hettinger | a3950e4 | 2016-11-17 01:49:54 -0800 | [diff] [blame] | 204 | The optional argument *random* is a 0-argument function returning a random |
| 205 | float in [0.0, 1.0); by default, this is the function :func:`.random`. |
| 206 | |
| 207 | To shuffle an immutable sequence and return a new shuffled list, use |
| 208 | ``sample(x, k=len(x))`` instead. |
| 209 | |
| 210 | Note that even for small ``len(x)``, the total number of permutations of *x* |
| 211 | can quickly grow larger than the period of most random number generators. |
| 212 | This implies that most permutations of a long sequence can never be |
| 213 | generated. For example, a sequence of length 2080 is the largest that |
| 214 | can fit within the period of the Mersenne Twister random number generator. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 215 | |
Raymond Hettinger | 190fac9 | 2020-05-02 16:45:32 -0700 | [diff] [blame] | 216 | .. deprecated-removed:: 3.9 3.11 |
| 217 | The optional parameter *random*. |
| 218 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 219 | |
Raymond Hettinger | 81a5fc3 | 2020-05-08 07:53:15 -0700 | [diff] [blame] | 220 | .. function:: sample(population, k, *, counts=None) |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 221 | |
Raymond Hettinger | 1acde19 | 2008-01-14 01:00:53 +0000 | [diff] [blame] | 222 | Return a *k* length list of unique elements chosen from the population sequence |
| 223 | or set. Used for random sampling without replacement. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 224 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 225 | Returns a new list containing elements from the population while leaving the |
| 226 | original population unchanged. The resulting list is in selection order so that |
| 227 | all sub-slices will also be valid random samples. This allows raffle winners |
| 228 | (the sample) to be partitioned into grand prize and second place winners (the |
| 229 | subslices). |
| 230 | |
Guido van Rossum | 2cc30da | 2007-11-02 23:46:40 +0000 | [diff] [blame] | 231 | Members of the population need not be :term:`hashable` or unique. If the population |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 232 | contains repeats, then each occurrence is a possible selection in the sample. |
| 233 | |
Raymond Hettinger | 81a5fc3 | 2020-05-08 07:53:15 -0700 | [diff] [blame] | 234 | Repeated elements can be specified one at a time or with the optional |
| 235 | keyword-only *counts* parameter. For example, ``sample(['red', 'blue'], |
| 236 | counts=[4, 2], k=5)`` is equivalent to ``sample(['red', 'red', 'red', 'red', |
| 237 | 'blue', 'blue'], k=5)``. |
| 238 | |
Raymond Hettinger | a3950e4 | 2016-11-17 01:49:54 -0800 | [diff] [blame] | 239 | To choose a sample from a range of integers, use a :func:`range` object as an |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 240 | argument. This is especially fast and space efficient for sampling from a large |
Raymond Hettinger | a3950e4 | 2016-11-17 01:49:54 -0800 | [diff] [blame] | 241 | population: ``sample(range(10000000), k=60)``. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 242 | |
Raymond Hettinger | f07d949 | 2012-07-09 12:43:57 -0700 | [diff] [blame] | 243 | If the sample size is larger than the population size, a :exc:`ValueError` |
Raymond Hettinger | 86a20f8 | 2012-07-08 16:01:53 -0700 | [diff] [blame] | 244 | is raised. |
| 245 | |
Raymond Hettinger | 81a5fc3 | 2020-05-08 07:53:15 -0700 | [diff] [blame] | 246 | .. versionchanged:: 3.9 |
| 247 | Added the *counts* parameter. |
| 248 | |
Raymond Hettinger | 4fe0020 | 2020-04-19 00:36:42 -0700 | [diff] [blame] | 249 | .. deprecated:: 3.9 |
| 250 | In the future, the *population* must be a sequence. Instances of |
| 251 | :class:`set` are no longer supported. The set must first be converted |
| 252 | to a :class:`list` or :class:`tuple`, preferably in a deterministic |
| 253 | order so that the sample is reproducible. |
| 254 | |
| 255 | |
Raymond Hettinger | f2bd04f | 2020-10-13 16:41:26 -0700 | [diff] [blame] | 256 | .. _real-valued-distributions: |
| 257 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 258 | Real-valued distributions |
| 259 | ------------------------- |
| 260 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 261 | The following functions generate specific real-valued distributions. Function |
| 262 | parameters are named after the corresponding variables in the distribution's |
| 263 | equation, as used in common mathematical practice; most of these equations can |
| 264 | be found in any statistics text. |
| 265 | |
| 266 | |
| 267 | .. function:: random() |
| 268 | |
| 269 | Return the next random floating point number in the range [0.0, 1.0). |
| 270 | |
| 271 | |
| 272 | .. function:: uniform(a, b) |
| 273 | |
Benjamin Peterson | b58dda7 | 2009-01-18 22:27:04 +0000 | [diff] [blame] | 274 | Return a random floating point number *N* such that ``a <= N <= b`` for |
| 275 | ``a <= b`` and ``b <= N <= a`` for ``b < a``. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 276 | |
Raymond Hettinger | be40db0 | 2009-06-11 23:12:14 +0000 | [diff] [blame] | 277 | The end-point value ``b`` may or may not be included in the range |
| 278 | depending on floating-point rounding in the equation ``a + (b-a) * random()``. |
Benjamin Peterson | 35e8c46 | 2008-04-24 02:34:53 +0000 | [diff] [blame] | 279 | |
Georg Brandl | 73dd7c7 | 2011-09-17 20:36:28 +0200 | [diff] [blame] | 280 | |
Christian Heimes | fe337bf | 2008-03-23 21:54:12 +0000 | [diff] [blame] | 281 | .. function:: triangular(low, high, mode) |
| 282 | |
Benjamin Peterson | b58dda7 | 2009-01-18 22:27:04 +0000 | [diff] [blame] | 283 | Return a random floating point number *N* such that ``low <= N <= high`` and |
Christian Heimes | cc47b05 | 2008-03-25 14:56:36 +0000 | [diff] [blame] | 284 | with the specified *mode* between those bounds. The *low* and *high* bounds |
| 285 | default to zero and one. The *mode* argument defaults to the midpoint |
| 286 | between the bounds, giving a symmetric distribution. |
Christian Heimes | fe337bf | 2008-03-23 21:54:12 +0000 | [diff] [blame] | 287 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 288 | |
| 289 | .. function:: betavariate(alpha, beta) |
| 290 | |
Benjamin Peterson | b58dda7 | 2009-01-18 22:27:04 +0000 | [diff] [blame] | 291 | Beta distribution. Conditions on the parameters are ``alpha > 0`` and |
| 292 | ``beta > 0``. Returned values range between 0 and 1. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 293 | |
| 294 | |
| 295 | .. function:: expovariate(lambd) |
| 296 | |
Mark Dickinson | 2f94736 | 2009-01-07 17:54:07 +0000 | [diff] [blame] | 297 | Exponential distribution. *lambd* is 1.0 divided by the desired |
| 298 | mean. It should be nonzero. (The parameter would be called |
| 299 | "lambda", but that is a reserved word in Python.) Returned values |
| 300 | range from 0 to positive infinity if *lambd* is positive, and from |
| 301 | negative infinity to 0 if *lambd* is negative. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 302 | |
| 303 | |
| 304 | .. function:: gammavariate(alpha, beta) |
| 305 | |
Benjamin Peterson | b58dda7 | 2009-01-18 22:27:04 +0000 | [diff] [blame] | 306 | Gamma distribution. (*Not* the gamma function!) Conditions on the |
| 307 | parameters are ``alpha > 0`` and ``beta > 0``. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 308 | |
Georg Brandl | 73dd7c7 | 2011-09-17 20:36:28 +0200 | [diff] [blame] | 309 | The probability distribution function is:: |
| 310 | |
| 311 | x ** (alpha - 1) * math.exp(-x / beta) |
| 312 | pdf(x) = -------------------------------------- |
| 313 | math.gamma(alpha) * beta ** alpha |
| 314 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 315 | |
| 316 | .. function:: gauss(mu, sigma) |
| 317 | |
Benjamin Peterson | b58dda7 | 2009-01-18 22:27:04 +0000 | [diff] [blame] | 318 | Gaussian distribution. *mu* is the mean, and *sigma* is the standard |
| 319 | deviation. This is slightly faster than the :func:`normalvariate` function |
| 320 | defined below. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 321 | |
Raymond Hettinger | 3cde378 | 2020-10-25 07:59:01 -0700 | [diff] [blame] | 322 | Multithreading note: When two threads call this function |
| 323 | simultaneously, it is possible that they will receive the |
| 324 | same return value. This can be avoided in three ways. |
| 325 | 1) Have each thread use a different instance of the random |
| 326 | number generator. 2) Put locks around all calls. 3) Use the |
| 327 | slower, but thread-safe :func:`normalvariate` function instead. |
| 328 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 329 | |
| 330 | .. function:: lognormvariate(mu, sigma) |
| 331 | |
| 332 | Log normal distribution. If you take the natural logarithm of this |
| 333 | distribution, you'll get a normal distribution with mean *mu* and standard |
| 334 | deviation *sigma*. *mu* can have any value, and *sigma* must be greater than |
| 335 | zero. |
| 336 | |
| 337 | |
| 338 | .. function:: normalvariate(mu, sigma) |
| 339 | |
| 340 | Normal distribution. *mu* is the mean, and *sigma* is the standard deviation. |
| 341 | |
| 342 | |
| 343 | .. function:: vonmisesvariate(mu, kappa) |
| 344 | |
| 345 | *mu* is the mean angle, expressed in radians between 0 and 2\*\ *pi*, and *kappa* |
| 346 | is the concentration parameter, which must be greater than or equal to zero. If |
| 347 | *kappa* is equal to zero, this distribution reduces to a uniform random angle |
| 348 | over the range 0 to 2\*\ *pi*. |
| 349 | |
| 350 | |
| 351 | .. function:: paretovariate(alpha) |
| 352 | |
| 353 | Pareto distribution. *alpha* is the shape parameter. |
| 354 | |
| 355 | |
| 356 | .. function:: weibullvariate(alpha, beta) |
| 357 | |
| 358 | Weibull distribution. *alpha* is the scale parameter and *beta* is the shape |
| 359 | parameter. |
| 360 | |
| 361 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 362 | Alternative Generator |
| 363 | --------------------- |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 364 | |
Matthias Bussonnier | 31e8d69 | 2019-04-16 09:47:11 -0700 | [diff] [blame] | 365 | .. class:: Random([seed]) |
| 366 | |
| 367 | Class that implements the default pseudo-random number generator used by the |
| 368 | :mod:`random` module. |
| 369 | |
Raymond Hettinger | d0cdeaa | 2019-08-22 09:19:36 -0700 | [diff] [blame] | 370 | .. deprecated:: 3.9 |
| 371 | In the future, the *seed* must be one of the following types: |
| 372 | :class:`NoneType`, :class:`int`, :class:`float`, :class:`str`, |
| 373 | :class:`bytes`, or :class:`bytearray`. |
| 374 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 375 | .. class:: SystemRandom([seed]) |
| 376 | |
| 377 | Class that uses the :func:`os.urandom` function for generating random numbers |
| 378 | from sources provided by the operating system. Not available on all systems. |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 379 | Does not rely on software state, and sequences are not reproducible. Accordingly, |
Raymond Hettinger | afd3045 | 2009-02-24 10:57:02 +0000 | [diff] [blame] | 380 | the :meth:`seed` method has no effect and is ignored. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 381 | The :meth:`getstate` and :meth:`setstate` methods raise |
| 382 | :exc:`NotImplementedError` if called. |
| 383 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 384 | |
Raymond Hettinger | 435cb0f | 2010-09-06 23:36:31 +0000 | [diff] [blame] | 385 | Notes on Reproducibility |
Antoine Pitrou | e72b586 | 2010-12-12 20:13:31 +0000 | [diff] [blame] | 386 | ------------------------ |
Raymond Hettinger | 435cb0f | 2010-09-06 23:36:31 +0000 | [diff] [blame] | 387 | |
Julien Palard | 58a4054 | 2020-01-31 10:50:14 +0100 | [diff] [blame] | 388 | Sometimes it is useful to be able to reproduce the sequences given by a |
| 389 | pseudo-random number generator. By re-using a seed value, the same sequence should be |
Raymond Hettinger | 435cb0f | 2010-09-06 23:36:31 +0000 | [diff] [blame] | 390 | reproducible from run to run as long as multiple threads are not running. |
| 391 | |
| 392 | Most of the random module's algorithms and seeding functions are subject to |
| 393 | change across Python versions, but two aspects are guaranteed not to change: |
| 394 | |
| 395 | * If a new seeding method is added, then a backward compatible seeder will be |
| 396 | offered. |
| 397 | |
Georg Brandl | 92849d1 | 2016-02-19 08:57:38 +0100 | [diff] [blame] | 398 | * The generator's :meth:`~Random.random` method will continue to produce the same |
Raymond Hettinger | 435cb0f | 2010-09-06 23:36:31 +0000 | [diff] [blame] | 399 | sequence when the compatible seeder is given the same seed. |
Raymond Hettinger | 2fdc7b1 | 2010-12-02 02:41:33 +0000 | [diff] [blame] | 400 | |
Raymond Hettinger | 6e35394 | 2010-12-04 23:42:12 +0000 | [diff] [blame] | 401 | .. _random-examples: |
Raymond Hettinger | 2fdc7b1 | 2010-12-02 02:41:33 +0000 | [diff] [blame] | 402 | |
Raymond Hettinger | 8b2ff4c | 2020-10-13 11:54:21 -0700 | [diff] [blame] | 403 | Examples |
| 404 | -------- |
Raymond Hettinger | 2fdc7b1 | 2010-12-02 02:41:33 +0000 | [diff] [blame] | 405 | |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 406 | Basic examples:: |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 407 | |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 408 | >>> random() # Random float: 0.0 <= x < 1.0 |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 409 | 0.37444887175646646 |
| 410 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 411 | >>> uniform(2.5, 10.0) # Random float: 2.5 <= x < 10.0 |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 412 | 3.1800146073117523 |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 413 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 414 | >>> expovariate(1 / 5) # Interval between arrivals averaging 5 seconds |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 415 | 5.148957571865031 |
| 416 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 417 | >>> randrange(10) # Integer from 0 to 9 inclusive |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 418 | 7 |
| 419 | |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 420 | >>> randrange(0, 101, 2) # Even integer from 0 to 100 inclusive |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 421 | 26 |
| 422 | |
Raymond Hettinger | 6befb64 | 2016-11-21 01:59:39 -0800 | [diff] [blame] | 423 | >>> choice(['win', 'lose', 'draw']) # Single random element from a sequence |
| 424 | 'draw' |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 425 | |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 426 | >>> deck = 'ace two three four'.split() |
| 427 | >>> shuffle(deck) # Shuffle a list |
Raymond Hettinger | 1c3a121 | 2016-10-12 01:42:10 -0400 | [diff] [blame] | 428 | >>> deck |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 429 | ['four', 'two', 'ace', 'three'] |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 430 | |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 431 | >>> sample([10, 20, 30, 40, 50], k=4) # Four samples without replacement |
| 432 | [40, 10, 50, 30] |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 433 | |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 434 | Simulations:: |
| 435 | |
Raymond Hettinger | 71c62e1 | 2016-12-04 11:00:34 -0800 | [diff] [blame] | 436 | >>> # Six roulette wheel spins (weighted sampling with replacement) |
Raymond Hettinger | 1c3a121 | 2016-10-12 01:42:10 -0400 | [diff] [blame] | 437 | >>> choices(['red', 'black', 'green'], [18, 18, 2], k=6) |
| 438 | ['red', 'green', 'black', 'black', 'red', 'black'] |
Raymond Hettinger | 2fdc7b1 | 2010-12-02 02:41:33 +0000 | [diff] [blame] | 439 | |
Raymond Hettinger | 81a5fc3 | 2020-05-08 07:53:15 -0700 | [diff] [blame] | 440 | >>> # Deal 20 cards without replacement from a deck |
| 441 | >>> # of 52 playing cards, and determine the proportion of cards |
| 442 | >>> # with a ten-value: ten, jack, queen, or king. |
| 443 | >>> dealt = sample(['tens', 'low cards'], counts=[16, 36], k=20) |
| 444 | >>> dealt.count('tens') / 20 |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 445 | 0.15 |
| 446 | |
Raymond Hettinger | 71c62e1 | 2016-12-04 11:00:34 -0800 | [diff] [blame] | 447 | >>> # Estimate the probability of getting 5 or more heads from 7 spins |
| 448 | >>> # of a biased coin that settles on heads 60% of the time. |
Raymond Hettinger | 9abb725 | 2019-02-15 12:40:18 -0800 | [diff] [blame] | 449 | >>> def trial(): |
| 450 | ... return choices('HT', cum_weights=(0.60, 1.00), k=7).count('H') >= 5 |
| 451 | ... |
Raymond Hettinger | d3a8d61 | 2020-04-21 16:11:00 -0700 | [diff] [blame] | 452 | >>> sum(trial() for i in range(10_000)) / 10_000 |
Raymond Hettinger | 16ef5d4 | 2016-10-31 22:53:52 -0700 | [diff] [blame] | 453 | 0.4169 |
| 454 | |
Raymond Hettinger | 71c62e1 | 2016-12-04 11:00:34 -0800 | [diff] [blame] | 455 | >>> # Probability of the median of 5 samples being in middle two quartiles |
Raymond Hettinger | 9abb725 | 2019-02-15 12:40:18 -0800 | [diff] [blame] | 456 | >>> def trial(): |
Raymond Hettinger | d3a8d61 | 2020-04-21 16:11:00 -0700 | [diff] [blame] | 457 | ... return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500 |
Raymond Hettinger | 9abb725 | 2019-02-15 12:40:18 -0800 | [diff] [blame] | 458 | ... |
Raymond Hettinger | d3a8d61 | 2020-04-21 16:11:00 -0700 | [diff] [blame] | 459 | >>> sum(trial() for i in range(10_000)) / 10_000 |
Raymond Hettinger | 71c62e1 | 2016-12-04 11:00:34 -0800 | [diff] [blame] | 460 | 0.7958 |
| 461 | |
Raymond Hettinger | 1c3a121 | 2016-10-12 01:42:10 -0400 | [diff] [blame] | 462 | Example of `statistical bootstrapping |
| 463 | <https://en.wikipedia.org/wiki/Bootstrapping_(statistics)>`_ using resampling |
Raymond Hettinger | d3a8d61 | 2020-04-21 16:11:00 -0700 | [diff] [blame] | 464 | with replacement to estimate a confidence interval for the mean of a sample:: |
Raymond Hettinger | 2fdc7b1 | 2010-12-02 02:41:33 +0000 | [diff] [blame] | 465 | |
Raymond Hettinger | 1c3a121 | 2016-10-12 01:42:10 -0400 | [diff] [blame] | 466 | # http://statistics.about.com/od/Applications/a/Example-Of-Bootstrapping.htm |
Raymond Hettinger | 47d9987 | 2019-02-21 15:06:29 -0800 | [diff] [blame] | 467 | from statistics import fmean as mean |
Raymond Hettinger | 1c3a121 | 2016-10-12 01:42:10 -0400 | [diff] [blame] | 468 | from random import choices |
Raymond Hettinger | f5b7c7b | 2016-09-05 13:15:02 -0700 | [diff] [blame] | 469 | |
Raymond Hettinger | d3a8d61 | 2020-04-21 16:11:00 -0700 | [diff] [blame] | 470 | data = [41, 50, 29, 37, 81, 30, 73, 63, 20, 35, 68, 22, 60, 31, 95] |
| 471 | means = sorted(mean(choices(data, k=len(data))) for i in range(100)) |
Raymond Hettinger | 2589ee3 | 2016-11-16 21:34:17 -0800 | [diff] [blame] | 472 | print(f'The sample mean of {mean(data):.1f} has a 90% confidence ' |
Raymond Hettinger | d3a8d61 | 2020-04-21 16:11:00 -0700 | [diff] [blame] | 473 | f'interval from {means[5]:.1f} to {means[94]:.1f}') |
Raymond Hettinger | 2589ee3 | 2016-11-16 21:34:17 -0800 | [diff] [blame] | 474 | |
Raymond Hettinger | 00305ad | 2016-11-16 22:56:11 -0800 | [diff] [blame] | 475 | Example of a `resampling permutation test |
| 476 | <https://en.wikipedia.org/wiki/Resampling_(statistics)#Permutation_tests>`_ |
| 477 | to determine the statistical significance or `p-value |
| 478 | <https://en.wikipedia.org/wiki/P-value>`_ of an observed difference |
| 479 | between the effects of a drug versus a placebo:: |
| 480 | |
| 481 | # Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson |
Raymond Hettinger | 47d9987 | 2019-02-21 15:06:29 -0800 | [diff] [blame] | 482 | from statistics import fmean as mean |
Raymond Hettinger | 00305ad | 2016-11-16 22:56:11 -0800 | [diff] [blame] | 483 | from random import shuffle |
| 484 | |
| 485 | drug = [54, 73, 53, 70, 73, 68, 52, 65, 65] |
| 486 | placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46] |
| 487 | observed_diff = mean(drug) - mean(placebo) |
| 488 | |
Raymond Hettinger | d3a8d61 | 2020-04-21 16:11:00 -0700 | [diff] [blame] | 489 | n = 10_000 |
Raymond Hettinger | 00305ad | 2016-11-16 22:56:11 -0800 | [diff] [blame] | 490 | count = 0 |
| 491 | combined = drug + placebo |
| 492 | for i in range(n): |
| 493 | shuffle(combined) |
| 494 | new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):]) |
| 495 | count += (new_diff >= observed_diff) |
| 496 | |
| 497 | print(f'{n} label reshufflings produced only {count} instances with a difference') |
| 498 | print(f'at least as extreme as the observed difference of {observed_diff:.1f}.') |
| 499 | print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null') |
Raymond Hettinger | 6befb64 | 2016-11-21 01:59:39 -0800 | [diff] [blame] | 500 | print(f'hypothesis that there is no difference between the drug and the placebo.') |
| 501 | |
Raymond Hettinger | d3a8d61 | 2020-04-21 16:11:00 -0700 | [diff] [blame] | 502 | Simulation of arrival times and service deliveries for a multiserver queue:: |
Raymond Hettinger | 6befb64 | 2016-11-21 01:59:39 -0800 | [diff] [blame] | 503 | |
Raymond Hettinger | d3a8d61 | 2020-04-21 16:11:00 -0700 | [diff] [blame] | 504 | from heapq import heappush, heappop |
Raymond Hettinger | 1149d93 | 2016-11-21 14:13:07 -0800 | [diff] [blame] | 505 | from random import expovariate, gauss |
Raymond Hettinger | e16d2f7 | 2020-05-21 01:37:38 -0700 | [diff] [blame] | 506 | from statistics import mean, quantiles |
Raymond Hettinger | 6befb64 | 2016-11-21 01:59:39 -0800 | [diff] [blame] | 507 | |
| 508 | average_arrival_interval = 5.6 |
Raymond Hettinger | d3a8d61 | 2020-04-21 16:11:00 -0700 | [diff] [blame] | 509 | average_service_time = 15.0 |
| 510 | stdev_service_time = 3.5 |
| 511 | num_servers = 3 |
Raymond Hettinger | 6befb64 | 2016-11-21 01:59:39 -0800 | [diff] [blame] | 512 | |
Raymond Hettinger | d3a8d61 | 2020-04-21 16:11:00 -0700 | [diff] [blame] | 513 | waits = [] |
| 514 | arrival_time = 0.0 |
| 515 | servers = [0.0] * num_servers # time when each server becomes available |
| 516 | for i in range(100_000): |
| 517 | arrival_time += expovariate(1.0 / average_arrival_interval) |
| 518 | next_server_available = heappop(servers) |
| 519 | wait = max(0.0, next_server_available - arrival_time) |
| 520 | waits.append(wait) |
| 521 | service_duration = gauss(average_service_time, stdev_service_time) |
| 522 | service_completed = arrival_time + wait + service_duration |
| 523 | heappush(servers, service_completed) |
Raymond Hettinger | 1149d93 | 2016-11-21 14:13:07 -0800 | [diff] [blame] | 524 | |
Raymond Hettinger | e16d2f7 | 2020-05-21 01:37:38 -0700 | [diff] [blame] | 525 | print(f'Mean wait: {mean(waits):.1f} Max wait: {max(waits):.1f}') |
| 526 | print('Quartiles:', [round(q, 1) for q in quantiles(waits)]) |
Raymond Hettinger | 6befb64 | 2016-11-21 01:59:39 -0800 | [diff] [blame] | 527 | |
Raymond Hettinger | 0537405 | 2016-11-21 10:52:04 -0800 | [diff] [blame] | 528 | .. seealso:: |
| 529 | |
| 530 | `Statistics for Hackers <https://www.youtube.com/watch?v=Iq9DzN6mvYA>`_ |
| 531 | a video tutorial by |
| 532 | `Jake Vanderplas <https://us.pycon.org/2016/speaker/profile/295/>`_ |
| 533 | on statistical analysis using just a few fundamental concepts |
| 534 | including simulation, sampling, shuffling, and cross-validation. |
| 535 | |
| 536 | `Economics Simulation |
| 537 | <http://nbviewer.jupyter.org/url/norvig.com/ipython/Economics.ipynb>`_ |
| 538 | a simulation of a marketplace by |
| 539 | `Peter Norvig <http://norvig.com/bio.html>`_ that shows effective |
Raymond Hettinger | 7f94619 | 2016-11-21 15:13:18 -0800 | [diff] [blame] | 540 | use of many of the tools and distributions provided by this module |
Raymond Hettinger | 0537405 | 2016-11-21 10:52:04 -0800 | [diff] [blame] | 541 | (gauss, uniform, sample, betavariate, choice, triangular, and randrange). |
| 542 | |
| 543 | `A Concrete Introduction to Probability (using Python) |
| 544 | <http://nbviewer.jupyter.org/url/norvig.com/ipython/Probability.ipynb>`_ |
| 545 | a tutorial by `Peter Norvig <http://norvig.com/bio.html>`_ covering |
| 546 | the basics of probability theory, how to write simulations, and |
Raymond Hettinger | 7f94619 | 2016-11-21 15:13:18 -0800 | [diff] [blame] | 547 | how to perform data analysis using Python. |
Raymond Hettinger | 8b2ff4c | 2020-10-13 11:54:21 -0700 | [diff] [blame] | 548 | |
Raymond Hettinger | f2bd04f | 2020-10-13 16:41:26 -0700 | [diff] [blame] | 549 | |
| 550 | Recipes |
| 551 | ------- |
| 552 | |
| 553 | The default :func:`.random` returns multiples of 2⁻⁵³ in the range |
| 554 | *0.0 ≤ x < 1.0*. All such numbers are evenly spaced and are exactly |
Raymond Hettinger | b67cbbd | 2020-10-14 23:41:55 -0700 | [diff] [blame] | 555 | representable as Python floats. However, many other representable |
| 556 | floats in that interval are not possible selections. For example, |
| 557 | ``0.05954861408025609`` isn't an integer multiple of 2⁻⁵³. |
Raymond Hettinger | f2bd04f | 2020-10-13 16:41:26 -0700 | [diff] [blame] | 558 | |
| 559 | The following recipe takes a different approach. All floats in the |
| 560 | interval are possible selections. The mantissa comes from a uniform |
| 561 | distribution of integers in the range *2⁵² ≤ mantissa < 2⁵³*. The |
| 562 | exponent comes from a geometric distribution where exponents smaller |
| 563 | than *-53* occur half as often as the next larger exponent. |
| 564 | |
| 565 | :: |
| 566 | |
| 567 | from random import Random |
| 568 | from math import ldexp |
| 569 | |
| 570 | class FullRandom(Random): |
| 571 | |
| 572 | def random(self): |
| 573 | mantissa = 0x10_0000_0000_0000 | self.getrandbits(52) |
| 574 | exponent = -53 |
| 575 | x = 0 |
| 576 | while not x: |
| 577 | x = self.getrandbits(32) |
| 578 | exponent += x.bit_length() - 32 |
| 579 | return ldexp(mantissa, exponent) |
| 580 | |
| 581 | All :ref:`real valued distributions <real-valued-distributions>` |
| 582 | in the class will use the new method:: |
| 583 | |
| 584 | >>> fr = FullRandom() |
| 585 | >>> fr.random() |
| 586 | 0.05954861408025609 |
| 587 | >>> fr.expovariate(0.25) |
| 588 | 8.87925541791544 |
| 589 | |
| 590 | The recipe is conceptually equivalent to an algorithm that chooses from |
| 591 | all the multiples of 2⁻¹⁰⁷⁴ in the range *0.0 ≤ x < 1.0*. All such |
| 592 | numbers are evenly spaced, but most have to be rounded down to the |
| 593 | nearest representable Python float. (The value 2⁻¹⁰⁷⁴ is the smallest |
| 594 | positive unnormalized float and is equal to ``math.ulp(0.0)``.) |
| 595 | |
| 596 | |
| 597 | .. seealso:: |
| 598 | |
Raymond Hettinger | 8b2ff4c | 2020-10-13 11:54:21 -0700 | [diff] [blame] | 599 | `Generating Pseudo-random Floating-Point Values |
| 600 | <https://allendowney.com/research/rand/downey07randfloat.pdf>`_ a |
| 601 | paper by Allen B. Downey describing ways to generate more |
| 602 | fine-grained floats than normally generated by :func:`.random`. |