:mod:`random` --- Generate pseudo-random numbers
================================================

.. module:: random
   :synopsis: Generate pseudo-random numbers with various common distributions.

**Source code:** :source:`Lib/random.py`

--------------

This module implements pseudo-random number generators for various
distributions.

For integers, there is uniform selection from a range. For sequences, there is
uniform selection of a random element, a function to generate a random
permutation of a list in-place, and a function for random sampling without
replacement.

On the real line, there are functions to compute uniform, normal (Gaussian),
lognormal, negative exponential, gamma, and beta distributions. For generating
distributions of angles, the von Mises distribution is available.

Almost all module functions depend on the basic function :func:`.random`, which
generates a random float uniformly in the semi-open range [0.0, 1.0). Python
uses the Mersenne Twister as the core generator. It produces 53-bit precision
floats and has a period of 2\*\*19937-1. The underlying implementation in C is
both fast and threadsafe. The Mersenne Twister is one of the most extensively
tested random number generators in existence. However, being completely
deterministic, it is not suitable for all purposes, and is completely unsuitable
for cryptographic purposes.

The functions supplied by this module are actually bound methods of a hidden
instance of the :class:`random.Random` class. You can instantiate your own
instances of :class:`Random` to get generators that don't share state.

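As a minimal sketch of generators that don't share state, the following creates
two private :class:`Random` instances. The variable names and the seed value
``12345`` are illustrative assumptions, not part of the module.

```python
import random

# Two private generators; neither shares state with the other
# or with the hidden module-level instance.
gen_a = random.Random(12345)
gen_b = random.Random(12345)

first_a = [gen_a.random() for _ in range(3)]

# Drawing from the module-level function does not advance gen_b.
random.random()
first_b = [gen_b.random() for _ in range(3)]

print(first_a == first_b)   # identically seeded private generators agree
```

Giving each component of a program its own instance keeps one component's draws
from disturbing the sequence seen by another.
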
Class :class:`Random` can also be subclassed if you want to use a different
basic generator of your own devising: in that case, override the :meth:`~Random.random`,
:meth:`~Random.seed`, :meth:`~Random.getstate`, and :meth:`~Random.setstate` methods.
Optionally, a new generator can supply a :meth:`~Random.getrandbits` method --- this
allows :meth:`randrange` to produce selections over an arbitrarily large range.

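The subclassing pattern can be sketched as follows. The class name
``LCGRandom`` and its linear congruential constants are hypothetical choices
made for illustration; this toy generator is far weaker than the Mersenne
Twister and only accepts ``None`` or integer-like seeds.

```python
import random

class LCGRandom(random.Random):
    """Toy linear congruential generator (illustration only)."""

    _MODULUS = 2 ** 32

    def seed(self, a=None, version=2):
        # Hypothetical seeding rule: None means 0, anything else must be int-like.
        self._state = (0 if a is None else int(a)) % self._MODULUS

    def random(self):
        # Advance the state and scale it into the range [0.0, 1.0).
        self._state = (1664525 * self._state + 1013904223) % self._MODULUS
        return self._state / self._MODULUS

    def getstate(self):
        return self._state

    def setstate(self, state):
        self._state = state

gen = LCGRandom(42)
print(gen.randrange(10), gen.choice(["win", "lose", "draw"]))
```

Inherited methods such as :meth:`randrange` and :meth:`choice` draw from the
overridden :meth:`~Random.random`, so the subclass only has to supply the four
basic methods.
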
The :mod:`random` module also provides the :class:`SystemRandom` class which
uses the system function :func:`os.urandom` to generate random numbers
from sources provided by the operating system.

.. warning::

   The pseudo-random generators of this module should not be used for
   security purposes. For security or cryptographic uses, see the
   :mod:`secrets` module.

.. seealso::

   M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally
   equidistributed uniform pseudorandom number generator", ACM Transactions on
   Modeling and Computer Simulation Vol. 8, No. 1, January pp.3--30 1998.

   `Complementary-Multiply-with-Carry recipe
   <https://code.activestate.com/recipes/576707/>`_ for a compatible alternative
   random number generator with a long period and comparatively simple update
   operations.


Bookkeeping functions
---------------------

.. function:: seed(a=None, version=2)

   Initialize the random number generator.

   If *a* is omitted or ``None``, the current system time is used. If
   randomness sources are provided by the operating system, they are used
   instead of the system time (see the :func:`os.urandom` function for details
   on availability).

   If *a* is an int, it is used directly.

   With version 2 (the default), a :class:`str`, :class:`bytes`, or :class:`bytearray`
   object gets converted to an :class:`int` and all of its bits are used.

   With version 1 (provided for reproducing random sequences from older versions
   of Python), the algorithm for :class:`str` and :class:`bytes` generates a
   narrower range of seeds.

   .. versionchanged:: 3.2
      Moved to the version 2 scheme which uses all of the bits in a string seed.

   .. deprecated:: 3.9
      In the future, the *seed* must be one of the following types:
      *NoneType*, :class:`int`, :class:`float`, :class:`str`,
      :class:`bytes`, or :class:`bytearray`.

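A short sketch of :func:`seed` in use, showing that reseeding with the same
value restarts the same sequence. The seed values ``2024`` and
``"experiment-7"`` are arbitrary examples.

```python
import random

random.seed(2024)                     # an int is used directly
first_run = [random.randint(1, 6) for _ in range(5)]

random.seed(2024)                     # reseeding restarts the same sequence
second_run = [random.randint(1, 6) for _ in range(5)]
print(first_run == second_run)

# A str seed is also accepted; version 2 uses all of its bits.
random.seed("experiment-7")
from_text = random.random()
random.seed("experiment-7")
print(from_text == random.random())
```
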
.. function:: getstate()

   Return an object capturing the current internal state of the generator. This
   object can be passed to :func:`setstate` to restore the state.


.. function:: setstate(state)

   *state* should have been obtained from a previous call to :func:`getstate`, and
   :func:`setstate` restores the internal state of the generator to what it was at
   the time :func:`getstate` was called.


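The pairing of :func:`getstate` and :func:`setstate` can be sketched as a
checkpoint and rewind. The variable names are illustrative.

```python
import random

checkpoint = random.getstate()            # snapshot of the generator
original = [random.random() for _ in range(3)]

random.setstate(checkpoint)               # rewind to the snapshot
replayed = [random.random() for _ in range(3)]

print(original == replayed)               # the same values are produced again
```
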
.. function:: getrandbits(k)

   Returns a Python integer with *k* random bits. This method is supplied with
   the Mersenne Twister generator and some other generators may also provide it
   as an optional part of the API. When available, :meth:`getrandbits` enables
   :meth:`randrange` to handle arbitrarily large ranges.

   .. versionchanged:: 3.9
      This method now accepts zero for *k*.


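A brief sketch of :func:`getrandbits`: the result is a non-negative integer
below ``2**k``. The bit widths chosen here are arbitrary, and the hexadecimal
identifier is only an illustration; it is not suitable as a security token.

```python
import random

k = 16
value = random.getrandbits(k)             # non-negative int below 2**k
print(0 <= value < 2 ** k)

# A 128-bit value rendered as 32 zero-padded hexadecimal digits.
token = f"{random.getrandbits(128):032x}"
print(len(token))
```
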
.. function:: randbytes(n)

   Generate *n* random bytes.

   .. versionadded:: 3.9


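A small sketch of :func:`randbytes`, assuming Python 3.9 or later. The seed
``99`` is an arbitrary value used to make the bytes reproducible. For security
tokens, ``secrets.token_bytes`` is the appropriate tool instead.

```python
import random

rng = random.Random(99)                   # seeded, so the bytes are reproducible
payload = rng.randbytes(8)

print(type(payload).__name__, len(payload))
```
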
Functions for integers
----------------------

.. function:: randrange(stop)
              randrange(start, stop[, step])

   Return a randomly selected element from ``range(start, stop, step)``. This is
   equivalent to ``choice(range(start, stop, step))``, but doesn't actually build a
   range object.

   The positional argument pattern matches that of :func:`range`. Keyword arguments
   should not be used because the function may use them in unexpected ways.

   .. versionchanged:: 3.2
      :meth:`randrange` is more sophisticated about producing equally distributed
      values. Formerly it used a style like ``int(random()*n)`` which could produce
      slightly uneven distributions.

.. function:: randint(a, b)

   Return a random integer *N* such that ``a <= N <= b``. Alias for
   ``randrange(a, b+1)``.


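The difference in end-point handling between :func:`randint` and
:func:`randrange` can be sketched as below. The seed and sample counts are
arbitrary.

```python
import random

rng = random.Random(7)

rolls = [rng.randint(1, 6) for _ in range(1000)]          # 6 can be returned
indices = [rng.randrange(6) for _ in range(1000)]         # 6 is never returned
evens = [rng.randrange(0, 101, 2) for _ in range(1000)]   # step of 2

print(all(1 <= n <= 6 for n in rolls))
print(all(0 <= n <= 5 for n in indices))
print(all(n % 2 == 0 for n in evens))
```
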
Functions for sequences
-----------------------

.. function:: choice(seq)

   Return a random element from the non-empty sequence *seq*. If *seq* is empty,
   raises :exc:`IndexError`.

.. function:: choices(population, weights=None, *, cum_weights=None, k=1)

   Return a *k* sized list of elements chosen from the *population* with replacement.
   If the *population* is empty, raises :exc:`IndexError`.

   If a *weights* sequence is specified, selections are made according to the
   relative weights. Alternatively, if a *cum_weights* sequence is given, the
   selections are made according to the cumulative weights (perhaps computed
   using :func:`itertools.accumulate`). For example, the relative weights
   ``[10, 5, 30, 5]`` are equivalent to the cumulative weights
   ``[10, 15, 45, 50]``. Internally, the relative weights are converted to
   cumulative weights before making selections, so supplying the cumulative
   weights saves work.

   If neither *weights* nor *cum_weights* are specified, selections are made
   with equal probability. If a weights sequence is supplied, it must be
   the same length as the *population* sequence. It is a :exc:`TypeError`
   to specify both *weights* and *cum_weights*.

   The *weights* or *cum_weights* can use any numeric type that interoperates
   with the :class:`float` values returned by :func:`random` (that includes
   integers, floats, and fractions but excludes decimals). Behavior is
   undefined if any weight is negative. A :exc:`ValueError` is raised if all
   weights are zero.

   For a given seed, the :func:`choices` function with equal weighting
   typically produces a different sequence than repeated calls to
   :func:`choice`. The algorithm used by :func:`choices` uses floating
   point arithmetic for internal consistency and speed. The algorithm used
   by :func:`choice` defaults to integer arithmetic with repeated selections
   to avoid small biases from round-off error.

   .. versionadded:: 3.6

   .. versionchanged:: 3.9
      Raises a :exc:`ValueError` if all weights are zero.


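The equivalence of relative and cumulative weights in :func:`choices` can be
sketched with the weights quoted above. The population letters and the seed
``1`` are illustrative assumptions.

```python
import random
from itertools import accumulate

population = ["a", "b", "c", "d"]
weights = [10, 5, 30, 5]
cum_weights = list(accumulate(weights))          # [10, 15, 45, 50]

# Identically seeded generators, one call per weighting style.
by_weights = random.Random(1).choices(population, weights=weights, k=10)
by_cumulative = random.Random(1).choices(population, cum_weights=cum_weights, k=10)

print(by_weights == by_cumulative)
```

Precomputing ``cum_weights`` once is worthwhile when :func:`choices` is called
repeatedly with the same weights.
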
.. function:: shuffle(x[, random])

   Shuffle the sequence *x* in place.

   The optional argument *random* is a 0-argument function returning a random
   float in [0.0, 1.0); by default, this is the function :func:`.random`.

   To shuffle an immutable sequence and return a new shuffled list, use
   ``sample(x, k=len(x))`` instead.

   Note that even for small ``len(x)``, the total number of permutations of *x*
   can quickly grow larger than the period of most random number generators.
   This implies that most permutations of a long sequence can never be
   generated. For example, a sequence of length 2080 is the largest that
   can fit within the period of the Mersenne Twister random number generator.


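Two points from the :func:`shuffle` description can be checked with a short
sketch: the length-2080 bound against the Mersenne Twister period, and the use
of :func:`sample` for an immutable sequence. The ``suits`` tuple is an
illustrative example.

```python
import math
import random

# Number of orderings versus the Mersenne Twister period of 2**19937 - 1.
period = 2 ** 19937 - 1
print(math.factorial(2080) < period < math.factorial(2081))

# A tuple cannot be shuffled in place, so draw a full-length sample instead.
suits = ("clubs", "diamonds", "hearts", "spades")
shuffled = random.sample(suits, k=len(suits))
print(sorted(shuffled) == sorted(suits))
```
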
.. function:: sample(population, k)

   Return a *k* length list of unique elements chosen from the population sequence
   or set. Used for random sampling without replacement.

   Returns a new list containing elements from the population while leaving the
   original population unchanged. The resulting list is in selection order so that
   all sub-slices will also be valid random samples. This allows raffle winners
   (the sample) to be partitioned into grand prize and second place winners (the
   subslices).

   Members of the population need not be :term:`hashable` or unique. If the population
   contains repeats, then each occurrence is a possible selection in the sample.

   To choose a sample from a range of integers, use a :func:`range` object as an
   argument. This is especially fast and space efficient for sampling from a large
   population: ``sample(range(10000000), k=60)``.

   If the sample size is larger than the population size, a :exc:`ValueError`
   is raised.

   .. deprecated:: 3.9
      In the future, the *population* must be a sequence. Instances of
      :class:`set` will no longer be supported. The set must first be converted
      to a :class:`list` or :class:`tuple`, preferably in a deterministic
      order so that the sample is reproducible.


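The raffle partition and the set conversion described for :func:`sample` can
be sketched as follows. The entrant names and tag colours are made-up data.

```python
import random

entrants = ["ana", "ben", "cy", "dee", "eli", "fay", "gus", "hal"]
winners = random.sample(entrants, k=3)

grand_prize = winners[:1]          # first pick
second_place = winners[1:]         # remaining picks, still a valid random sample

# A set has no defined order, so sort it into a list before sampling.
tags = {"red", "green", "blue", "amber"}
picked = random.sample(sorted(tags), k=2)

print(len(winners), len(picked))
```
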
Real-valued distributions
-------------------------

The following functions generate specific real-valued distributions. Function
parameters are named after the corresponding variables in the distribution's
equation, as used in common mathematical practice; most of these equations can
be found in any statistics text.


.. function:: random()

   Return the next random floating point number in the range [0.0, 1.0).


.. function:: uniform(a, b)

   Return a random floating point number *N* such that ``a <= N <= b`` for
   ``a <= b`` and ``b <= N <= a`` for ``b < a``.

   The end-point value ``b`` may or may not be included in the range
   depending on floating-point rounding in the equation ``a + (b-a) * random()``.


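The equation given for :func:`uniform` can be compared with the function
itself by reseeding between the two computations. This sketch assumes the
CPython implementation, which evaluates exactly that expression; the bounds
and seed are arbitrary.

```python
import random

a, b = 2.5, 10.0

random.seed(3)
via_uniform = random.uniform(a, b)

random.seed(3)
via_formula = a + (b - a) * random.random()

print(via_uniform == via_formula)
print(a <= via_uniform <= b)
```
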
.. function:: triangular(low, high, mode)

   Return a random floating point number *N* such that ``low <= N <= high`` and
   with the specified *mode* between those bounds. The *low* and *high* bounds
   default to zero and one. The *mode* argument defaults to the midpoint
   between the bounds, giving a symmetric distribution.


.. function:: betavariate(alpha, beta)

   Beta distribution. Conditions on the parameters are ``alpha > 0`` and
   ``beta > 0``. Returned values range between 0 and 1.


.. function:: expovariate(lambd)

   Exponential distribution. *lambd* is 1.0 divided by the desired
   mean. It should be nonzero. (The parameter would be called
   "lambda", but that is a reserved word in Python.) Returned values
   range from 0 to positive infinity if *lambd* is positive, and from
   negative infinity to 0 if *lambd* is negative.


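The relationship between *lambd* and the mean in :func:`expovariate` can be
sketched with a simulation. The desired mean of 5.0, the seed, and the sample
size are arbitrary; the sample mean is only expected to land near the target.

```python
import random
from statistics import fmean

rng = random.Random(11)
desired_mean = 5.0

# lambd is 1.0 divided by the desired mean.
gaps = [rng.expovariate(1.0 / desired_mean) for _ in range(20_000)]

print(round(fmean(gaps), 1))       # close to 5.0, though not exactly equal
print(min(gaps) >= 0.0)            # positive lambd gives non-negative values
```
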
.. function:: gammavariate(alpha, beta)

   Gamma distribution. (*Not* the gamma function!) Conditions on the
   parameters are ``alpha > 0`` and ``beta > 0``.

   The probability density function is::

                 x ** (alpha - 1) * math.exp(-x / beta)
       pdf(x) =  --------------------------------------
                   math.gamma(alpha) * beta ** alpha


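The density formula for :func:`gammavariate` can be transcribed directly into
Python. The helper name ``gamma_pdf`` is hypothetical and is not part of the
:mod:`random` module.

```python
import math

def gamma_pdf(x, alpha, beta):
    """Gamma density, transcribed from the formula in the text."""
    numerator = x ** (alpha - 1) * math.exp(-x / beta)
    denominator = math.gamma(alpha) * beta ** alpha
    return numerator / denominator

# With alpha == 1 the density reduces to an exponential with mean beta.
exponential_density = math.exp(-2.0 / 3.0) / 3.0
print(math.isclose(gamma_pdf(2.0, 1.0, 3.0), exponential_density))
```
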
.. function:: gauss(mu, sigma)

   Gaussian distribution. *mu* is the mean, and *sigma* is the standard
   deviation. This is slightly faster than the :func:`normalvariate` function
   defined below.


.. function:: lognormvariate(mu, sigma)

   Log normal distribution. If you take the natural logarithm of this
   distribution, you'll get a normal distribution with mean *mu* and standard
   deviation *sigma*. *mu* can have any value, and *sigma* must be greater than
   zero.


.. function:: normalvariate(mu, sigma)

   Normal distribution. *mu* is the mean, and *sigma* is the standard deviation.


.. function:: vonmisesvariate(mu, kappa)

   *mu* is the mean angle, expressed in radians between 0 and 2\*\ *pi*, and *kappa*
   is the concentration parameter, which must be greater than or equal to zero. If
   *kappa* is equal to zero, this distribution reduces to a uniform random angle
   over the range 0 to 2\*\ *pi*.


.. function:: paretovariate(alpha)

   Pareto distribution. *alpha* is the shape parameter.


.. function:: weibullvariate(alpha, beta)

   Weibull distribution. *alpha* is the scale parameter and *beta* is the shape
   parameter.


Alternative Generator
---------------------

.. class:: Random([seed])

   Class that implements the default pseudo-random number generator used by the
   :mod:`random` module.

   .. deprecated:: 3.9
      In the future, the *seed* must be one of the following types:
      :class:`NoneType`, :class:`int`, :class:`float`, :class:`str`,
      :class:`bytes`, or :class:`bytearray`.

.. class:: SystemRandom([seed])

   Class that uses the :func:`os.urandom` function for generating random numbers
   from sources provided by the operating system. Not available on all systems.
   Does not rely on software state, and sequences are not reproducible. Accordingly,
   the :meth:`seed` method has no effect and is ignored.
   The :meth:`getstate` and :meth:`setstate` methods raise
   :exc:`NotImplementedError` if called.


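The behaviour described for :class:`SystemRandom` can be sketched as below,
assuming a platform where :func:`os.urandom` is available. The variable names
are illustrative.

```python
import random

sys_rng = random.SystemRandom()
sys_rng.seed(1234)                         # accepted, but has no effect
print(0 <= sys_rng.randrange(100) < 100)

try:
    sys_rng.getstate()
except NotImplementedError:
    state_supported = False                # there is no software state to capture
else:
    state_supported = True
print(state_supported)
```
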
Notes on Reproducibility
------------------------

Sometimes it is useful to be able to reproduce the sequences given by a
pseudo-random number generator. By re-using a seed value, the same sequence should be
reproducible from run to run as long as multiple threads are not running.

Most of the random module's algorithms and seeding functions are subject to
change across Python versions, but two aspects are guaranteed not to change:

* If a new seeding method is added, then a backward compatible seeder will be
  offered.

* The generator's :meth:`~Random.random` method will continue to produce the same
  sequence when the compatible seeder is given the same seed.

.. _random-examples:

Examples and Recipes
--------------------

Basic examples::

   >>> from random import *
   >>> random()                             # Random float:  0.0 <= x < 1.0
   0.37444887175646646

   >>> uniform(2.5, 10.0)                   # Random float:  2.5 <= x <= 10.0
   3.1800146073117523

   >>> expovariate(1 / 5)                   # Interval between arrivals averaging 5 seconds
   5.148957571865031

   >>> randrange(10)                        # Integer from 0 to 9 inclusive
   7

   >>> randrange(0, 101, 2)                 # Even integer from 0 to 100 inclusive
   26

   >>> choice(['win', 'lose', 'draw'])      # Single random element from a sequence
   'draw'

   >>> deck = 'ace two three four'.split()
   >>> shuffle(deck)                        # Shuffle a list
   >>> deck
   ['four', 'two', 'ace', 'three']

   >>> sample([10, 20, 30, 40, 50], k=4)    # Four samples without replacement
   [40, 10, 50, 30]

Simulations::

   >>> # Six roulette wheel spins (weighted sampling with replacement)
   >>> choices(['red', 'black', 'green'], [18, 18, 2], k=6)
   ['red', 'green', 'black', 'black', 'red', 'black']

   >>> # Deal 20 cards without replacement from a deck of 52 playing cards
   >>> # and determine the proportion of cards with a ten-value
   >>> # (a ten, jack, queen, or king).
   >>> import collections
   >>> deck = collections.Counter(tens=16, low_cards=36)
   >>> seen = sample(list(deck.elements()), k=20)
   >>> seen.count('tens') / 20
   0.15

   >>> # Estimate the probability of getting 5 or more heads from 7 spins
   >>> # of a biased coin that settles on heads 60% of the time.
   >>> def trial():
   ...     return choices('HT', cum_weights=(0.60, 1.00), k=7).count('H') >= 5
   ...
   >>> sum(trial() for i in range(10000)) / 10000
   0.4169

   >>> # Probability of the median of 5 samples being in middle two quartiles
   >>> def trial():
   ...     return 2500 <= sorted(choices(range(10000), k=5))[2] < 7500
   ...
   >>> sum(trial() for i in range(10000)) / 10000
   0.7958

Example of `statistical bootstrapping
<https://en.wikipedia.org/wiki/Bootstrapping_(statistics)>`_ using resampling
with replacement to estimate a confidence interval for the mean of a sample of
size five::

   # http://statistics.about.com/od/Applications/a/Example-Of-Bootstrapping.htm
   from statistics import fmean as mean
   from random import choices

   data = 1, 2, 4, 4, 10
   means = sorted(mean(choices(data, k=5)) for i in range(20))
   print(f'The sample mean of {mean(data):.1f} has a 90% confidence '
         f'interval from {means[1]:.1f} to {means[-2]:.1f}')

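In the example above, ``means[1]`` and ``means[-2]`` drop the lowest and
highest of the 20 resampled means, leaving the central 18 of 20 (90%). A
minimal sketch of the same percentile approach for other resample counts and
confidence levels might look like this; the function ``bootstrap_ci`` and its
parameters are hypothetical and are not part of this module::

    from random import choices
    from statistics import fmean

    def bootstrap_ci(data, n_resamples=1000, confidence=0.90):
        "Return a (low, high) percentile bootstrap interval for the mean."
        means = sorted(fmean(choices(data, k=len(data)))
                       for _ in range(n_resamples))
        # Number of resampled means to discard from each tail.
        tail = round(n_resamples * (1.0 - confidence) / 2.0)
        return means[tail], means[-tail - 1]

    low, high = bootstrap_ci([1, 2, 4, 4, 10])
    print(f'90% confidence interval from {low:.1f} to {high:.1f}')

With ``n_resamples=20`` and ``confidence=0.90`` this selects the same two
indices as the example above.
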
Example of a `resampling permutation test
<https://en.wikipedia.org/wiki/Resampling_(statistics)#Permutation_tests>`_
to determine the statistical significance or `p-value
<https://en.wikipedia.org/wiki/P-value>`_ of an observed difference
between the effects of a drug versus a placebo::

    # Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson
    from statistics import fmean as mean
    from random import shuffle

    drug = [54, 73, 53, 70, 73, 68, 52, 65, 65]
    placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46]
    observed_diff = mean(drug) - mean(placebo)

    n = 10000
    count = 0
    combined = drug + placebo
    for i in range(n):
        shuffle(combined)
        new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):])
        count += (new_diff >= observed_diff)

    print(f'{n} label reshufflings produced only {count} instances with a difference')
    print(f'at least as extreme as the observed difference of {observed_diff:.1f}.')
    print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null')
    print('hypothesis that there is no difference between the drug and the placebo.')

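The permutation loop above can be wrapped in a reusable function. This is a
sketch only: the name ``permutation_p_value`` and its ``seed`` parameter are
hypothetical. It uses a private ``random.Random`` instance so that a run can be
repeated without touching the module-level generator::

    from random import Random
    from statistics import fmean

    def permutation_p_value(a, b, n=10_000, seed=None):
        "One-sided p-value for mean(a) - mean(b) under random relabeling."
        rng = Random(seed)              # seed=None seeds from the system
        observed_diff = fmean(a) - fmean(b)
        combined = list(a) + list(b)
        count = 0
        for _ in range(n):
            rng.shuffle(combined)
            new_diff = fmean(combined[:len(a)]) - fmean(combined[len(a):])
            count += (new_diff >= observed_diff)
        return count / n

    drug = [54, 73, 53, 70, 73, 68, 52, 65, 65]
    placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46]
    p = permutation_p_value(drug, placebo, seed=12345)
    print(f'One-sided p-value: {p:.4f}')
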
Simulation of arrival times and service deliveries in a single-server queue::

    from random import expovariate, gauss
    from statistics import mean, median, stdev

    average_arrival_interval = 5.6
    average_service_time = 5.0
    stdev_service_time = 0.5

    num_waiting = 0
    arrivals = []
    starts = []
    arrival = service_end = 0.0
    for i in range(20000):
        if arrival <= service_end:
            num_waiting += 1
            arrival += expovariate(1.0 / average_arrival_interval)
            arrivals.append(arrival)
        else:
            num_waiting -= 1
            service_start = service_end if num_waiting else arrival
            service_time = gauss(average_service_time, stdev_service_time)
            service_end = service_start + service_time
            starts.append(service_start)

    waits = [start - arrival for arrival, start in zip(arrivals, starts)]
    print(f'Mean wait: {mean(waits):.1f}. Stdev wait: {stdev(waits):.1f}.')
    print(f'Median wait: {median(waits):.1f}. Max wait: {max(waits):.1f}.')

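For a single first-come, first-served server, waiting times can also be
computed directly with Lindley's recursion: each customer waits for the
previous customer's wait plus service time, less the time that elapsed between
the two arrivals, or not at all if that amount is negative. The sketch below is
an alternative formulation, not the method used in the example above; the
function name ``simulate_waits`` and its parameters are hypothetical::

    from random import Random
    from statistics import mean, median

    def simulate_waits(n, mean_interval=5.6, mean_service=5.0,
                       stdev_service=0.5, seed=None):
        "Return a list of n waiting times for a single FIFO server."
        rng = Random(seed)
        waits = []
        wait = 0.0                      # the first customer never waits
        for _ in range(n):
            waits.append(wait)
            service_time = rng.gauss(mean_service, stdev_service)
            interval = rng.expovariate(1.0 / mean_interval)
            # Lindley's recursion: W[i+1] = max(0, W[i] + S[i] - A[i+1])
            wait = max(0.0, wait + service_time - interval)
        return waits

    waits = simulate_waits(20_000, seed=8675309)
    print(f'Mean wait: {mean(waits):.1f}. Median wait: {median(waits):.1f}.')
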
.. seealso::

   `Statistics for Hackers <https://www.youtube.com/watch?v=Iq9DzN6mvYA>`_
   a video tutorial by
   `Jake Vanderplas <https://us.pycon.org/2016/speaker/profile/295/>`_
   on statistical analysis using just a few fundamental concepts
   including simulation, sampling, shuffling, and cross-validation.

   `Economics Simulation
   <http://nbviewer.jupyter.org/url/norvig.com/ipython/Economics.ipynb>`_
   a simulation of a marketplace by
   `Peter Norvig <http://norvig.com/bio.html>`_ that shows effective
   use of many of the tools and distributions provided by this module
   (gauss, uniform, sample, betavariate, choice, triangular, and randrange).

   `A Concrete Introduction to Probability (using Python)
   <http://nbviewer.jupyter.org/url/norvig.com/ipython/Probability.ipynb>`_
   a tutorial by `Peter Norvig <http://norvig.com/bio.html>`_ covering
   the basics of probability theory, how to write simulations, and
   how to perform data analysis using Python.