Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 1 | :mod:`random` --- Generate pseudo-random numbers |
| 2 | ================================================ |
| 3 | |
| 4 | .. module:: random |
| 5 | :synopsis: Generate pseudo-random numbers with various common distributions. |
| 6 | |
Raymond Hettinger | 1048094 | 2011-01-10 03:26:08 +0000 | [diff] [blame] | 7 | **Source code:** :source:`Lib/random.py` |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 8 | |
Raymond Hettinger | 4f707fd | 2011-01-10 19:54:11 +0000 | [diff] [blame] | 9 | -------------- |
| 10 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 11 | This module implements pseudo-random number generators for various |
| 12 | distributions. |
| 13 | |
Raymond Hettinger | b21dac1 | 2010-09-07 05:32:49 +0000 | [diff] [blame] | 14 | For integers, there is uniform selection from a range. For sequences, there is |
| 15 | uniform selection of a random element, a function to generate a random |
| 16 | permutation of a list in-place, and a function for random sampling without |
| 17 | replacement. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 18 | |
| 19 | On the real line, there are functions to compute uniform, normal (Gaussian), |
| 20 | lognormal, negative exponential, gamma, and beta distributions. For generating |
| 21 | distributions of angles, the von Mises distribution is available. |
| 22 | |
Georg Brandl | 92849d1 | 2016-02-19 08:57:38 +0100 | [diff] [blame] | 23 | Almost all module functions depend on the basic function :func:`.random`, which |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 24 | generates a random float uniformly in the semi-open range [0.0, 1.0). Python |
| 25 | uses the Mersenne Twister as the core generator. It produces 53-bit precision |
| 26 | floats and has a period of 2\*\*19937-1. The underlying implementation in C is |
| 27 | both fast and threadsafe. The Mersenne Twister is one of the most extensively |
| 28 | tested random number generators in existence. However, being completely |
| 29 | deterministic, it is not suitable for all purposes, and is completely unsuitable |
| 30 | for cryptographic purposes. |
| 31 | |
| 32 | The functions supplied by this module are actually bound methods of a hidden |
| 33 | instance of the :class:`random.Random` class. You can instantiate your own |
Raymond Hettinger | 28de64f | 2008-01-13 23:40:30 +0000 | [diff] [blame] | 34 | instances of :class:`Random` to get generators that don't share state. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 35 | |
| 36 | Class :class:`Random` can also be subclassed if you want to use a different |
Georg Brandl | 92849d1 | 2016-02-19 08:57:38 +0100 | [diff] [blame] | 37 | basic generator of your own devising: in that case, override the :meth:`~Random.random`, |
| 38 | :meth:`~Random.seed`, :meth:`~Random.getstate`, and :meth:`~Random.setstate` methods. |
| 39 | Optionally, a new generator can supply a :meth:`~Random.getrandbits` method --- this |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 40 | allows :meth:`randrange` to produce selections over an arbitrarily large range. |
| 41 | |
Benjamin Peterson | 21896a3 | 2010-03-21 22:03:03 +0000 | [diff] [blame] | 42 | The :mod:`random` module also provides the :class:`SystemRandom` class which |
| 43 | uses the system function :func:`os.urandom` to generate random numbers |
| 44 | from sources provided by the operating system. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 45 | |
Raymond Hettinger | c89a451 | 2014-05-11 02:26:23 -0700 | [diff] [blame] | 46 | .. warning:: |
| 47 | |
| 48 | The pseudo-random generators of this module should not be used for |
Steven D'Aprano | b2871fa | 2016-04-17 01:42:33 +1000 | [diff] [blame] | 49 | security purposes. For security or cryptographic uses, see the |
| 50 | :mod:`secrets` module. |
Raymond Hettinger | c89a451 | 2014-05-11 02:26:23 -0700 | [diff] [blame] | 51 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 52 | .. seealso:: |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 53 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 54 | M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally |
| 55 | equidistributed uniform pseudorandom number generator", ACM Transactions on |
Serhiy Storchaka | 0264e46 | 2016-11-26 13:49:59 +0200 | [diff] [blame] | 56 | Modeling and Computer Simulation Vol. 8, No. 1, January pp.3--30 1998. |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 57 | |
| 58 | |
| 59 | `Complementary-Multiply-with-Carry recipe |
| 60 | <https://code.activestate.com/recipes/576707/>`_ for a compatible alternative |
| 61 | random number generator with a long period and comparatively simple update |
| 62 | operations. |
| 63 | |
| 64 | |
| 65 | Bookkeeping functions |
| 66 | --------------------- |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 67 | |
Ezio Melotti | e0add76 | 2012-09-14 06:32:35 +0300 | [diff] [blame] | 68 | .. function:: seed(a=None, version=2) |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 69 | |
Raymond Hettinger | f763a72 | 2010-09-07 00:38:15 +0000 | [diff] [blame] | 70 | Initialize the random number generator. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 71 | |
Ezio Melotti | e0add76 | 2012-09-14 06:32:35 +0300 | [diff] [blame] | 72 | If *a* is omitted or ``None``, the current system time is used. If |
Raymond Hettinger | f763a72 | 2010-09-07 00:38:15 +0000 | [diff] [blame] | 73 | randomness sources are provided by the operating system, they are used |
| 74 | instead of the system time (see the :func:`os.urandom` function for details |
| 75 | on availability). |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 76 | |
Ezio Melotti | e0add76 | 2012-09-14 06:32:35 +0300 | [diff] [blame] | 77 | If *a* is an int, it is used directly. |
Raymond Hettinger | f763a72 | 2010-09-07 00:38:15 +0000 | [diff] [blame] | 78 | |
| 79 | With version 2 (the default), a :class:`str`, :class:`bytes`, or :class:`bytearray` |
Raymond Hettinger | 16eb827 | 2016-09-04 11:17:28 -0700 | [diff] [blame] | 80 | object gets converted to an :class:`int` and all of its bits are used. |
| 81 | |
| 82 | With version 1 (provided for reproducing random sequences from older versions |
| 83 | of Python), the algorithm for :class:`str` and :class:`bytes` generates a |
| 84 | narrower range of seeds. |
Raymond Hettinger | f763a72 | 2010-09-07 00:38:15 +0000 | [diff] [blame] | 85 | |
| 86 | .. versionchanged:: 3.2 |
| 87 | Moved to the version 2 scheme which uses all of the bits in a string seed. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 88 | |
| 89 | .. function:: getstate() |
| 90 | |
| 91 | Return an object capturing the current internal state of the generator. This |
| 92 | object can be passed to :func:`setstate` to restore the state. |
| 93 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 94 | |
| 95 | .. function:: setstate(state) |
| 96 | |
| 97 | *state* should have been obtained from a previous call to :func:`getstate`, and |
| 98 | :func:`setstate` restores the internal state of the generator to what it was at |
Sandro Tosi | 985104a | 2012-08-12 15:12:15 +0200 | [diff] [blame] | 99 | the time :func:`getstate` was called. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 100 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 101 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 102 | .. function:: getrandbits(k) |
| 103 | |
Ezio Melotti | 0639d5a | 2009-12-19 23:26:38 +0000 | [diff] [blame] | 104 | Returns a Python integer with *k* random bits. This method is supplied with |
Georg Brandl | 5c10664 | 2007-11-29 17:41:05 +0000 | [diff] [blame] | 105 | the MersenneTwister generator and some other generators may also provide it |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 106 | as an optional part of the API. When available, :meth:`getrandbits` enables |
| 107 | :meth:`randrange` to handle arbitrarily large ranges. |
| 108 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 109 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 110 | Functions for integers |
| 111 | ---------------------- |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 112 | |
Ezio Melotti | e0add76 | 2012-09-14 06:32:35 +0300 | [diff] [blame] | 113 | .. function:: randrange(stop) |
| 114 | randrange(start, stop[, step]) |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 115 | |
| 116 | Return a randomly selected element from ``range(start, stop, step)``. This is |
| 117 | equivalent to ``choice(range(start, stop, step))``, but doesn't actually build a |
| 118 | range object. |
| 119 | |
Raymond Hettinger | 0515661 | 2010-09-07 04:44:52 +0000 | [diff] [blame] | 120 | The positional argument pattern matches that of :func:`range`. Keyword arguments |
| 121 | should not be used because the function may use them in unexpected ways. |
| 122 | |
| 123 | .. versionchanged:: 3.2 |
| 124 | :meth:`randrange` is more sophisticated about producing equally distributed |
| 125 | values. Formerly it used a style like ``int(random()*n)`` which could produce |
| 126 | slightly uneven distributions. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 127 | |
| 128 | .. function:: randint(a, b) |
| 129 | |
Raymond Hettinger | afd3045 | 2009-02-24 10:57:02 +0000 | [diff] [blame] | 130 | Return a random integer *N* such that ``a <= N <= b``. Alias for |
| 131 | ``randrange(a, b+1)``. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 132 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 133 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 134 | Functions for sequences |
| 135 | ----------------------- |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 136 | |
| 137 | .. function:: choice(seq) |
| 138 | |
| 139 | Return a random element from the non-empty sequence *seq*. If *seq* is empty, |
| 140 | raises :exc:`IndexError`. |
| 141 | |
Raymond Hettinger | 9016f28 | 2016-09-26 21:45:57 -0700 | [diff] [blame] | 142 | .. function:: choices(population, weights=None, *, cum_weights=None, k=1) |
Raymond Hettinger | e8f1e00 | 2016-09-06 17:15:29 -0700 | [diff] [blame] | 143 | |
| 144 | Return a *k* sized list of elements chosen from the *population* with replacement. |
| 145 | If the *population* is empty, raises :exc:`IndexError`. |
| 146 | |
| 147 | If a *weights* sequence is specified, selections are made according to the |
| 148 | relative weights. Alternatively, if a *cum_weights* sequence is given, the |
Raymond Hettinger | 1c3a121 | 2016-10-12 01:42:10 -0400 | [diff] [blame] | 149 | selections are made according to the cumulative weights (perhaps computed |
| 150 | using :func:`itertools.accumulate`). For example, the relative weights |
| 151 | ``[10, 5, 30, 5]`` are equivalent to the cumulative weights |
| 152 | ``[10, 15, 45, 50]``. Internally, the relative weights are converted to |
| 153 | cumulative weights before making selections, so supplying the cumulative |
| 154 | weights saves work. |
Raymond Hettinger | e8f1e00 | 2016-09-06 17:15:29 -0700 | [diff] [blame] | 155 | |
| 156 | If neither *weights* nor *cum_weights* are specified, selections are made |
| 157 | with equal probability. If a weights sequence is supplied, it must be |
| 158 | the same length as the *population* sequence. It is a :exc:`TypeError` |
| 159 | to specify both *weights* and *cum_weights*. |
| 160 | |
| 161 | The *weights* or *cum_weights* can use any numeric type that interoperates |
| 162 | with the :class:`float` values returned by :func:`random` (that includes |
| 163 | integers, floats, and fractions but excludes decimals). |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 164 | |
Raymond Hettinger | 1c3a121 | 2016-10-12 01:42:10 -0400 | [diff] [blame] | 165 | .. versionadded:: 3.6 |
| 166 | |
| 167 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 168 | .. function:: shuffle(x[, random]) |
| 169 | |
Raymond Hettinger | a3950e4 | 2016-11-17 01:49:54 -0800 | [diff] [blame] | 170 | Shuffle the sequence *x* in place. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 171 | |
Raymond Hettinger | a3950e4 | 2016-11-17 01:49:54 -0800 | [diff] [blame] | 172 | The optional argument *random* is a 0-argument function returning a random |
| 173 | float in [0.0, 1.0); by default, this is the function :func:`.random`. |
| 174 | |
| 175 | To shuffle an immutable sequence and return a new shuffled list, use |
| 176 | ``sample(x, k=len(x))`` instead. |
| 177 | |
| 178 | Note that even for small ``len(x)``, the total number of permutations of *x* |
| 179 | can quickly grow larger than the period of most random number generators. |
| 180 | This implies that most permutations of a long sequence can never be |
| 181 | generated. For example, a sequence of length 2080 is the largest that |
| 182 | can fit within the period of the Mersenne Twister random number generator. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 183 | |
| 184 | |
| 185 | .. function:: sample(population, k) |
| 186 | |
Raymond Hettinger | 1acde19 | 2008-01-14 01:00:53 +0000 | [diff] [blame] | 187 | Return a *k* length list of unique elements chosen from the population sequence |
| 188 | or set. Used for random sampling without replacement. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 189 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 190 | Returns a new list containing elements from the population while leaving the |
| 191 | original population unchanged. The resulting list is in selection order so that |
| 192 | all sub-slices will also be valid random samples. This allows raffle winners |
| 193 | (the sample) to be partitioned into grand prize and second place winners (the |
| 194 | subslices). |
| 195 | |
Guido van Rossum | 2cc30da | 2007-11-02 23:46:40 +0000 | [diff] [blame] | 196 | Members of the population need not be :term:`hashable` or unique. If the population |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 197 | contains repeats, then each occurrence is a possible selection in the sample. |
| 198 | |
Raymond Hettinger | a3950e4 | 2016-11-17 01:49:54 -0800 | [diff] [blame] | 199 | To choose a sample from a range of integers, use a :func:`range` object as an |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 200 | argument. This is especially fast and space efficient for sampling from a large |
Raymond Hettinger | a3950e4 | 2016-11-17 01:49:54 -0800 | [diff] [blame] | 201 | population: ``sample(range(10000000), k=60)``. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 202 | |
Raymond Hettinger | f07d949 | 2012-07-09 12:43:57 -0700 | [diff] [blame] | 203 | If the sample size is larger than the population size, a :exc:`ValueError` |
Raymond Hettinger | 86a20f8 | 2012-07-08 16:01:53 -0700 | [diff] [blame] | 204 | is raised. |
| 205 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 206 | Real-valued distributions |
| 207 | ------------------------- |
| 208 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 209 | The following functions generate specific real-valued distributions. Function |
| 210 | parameters are named after the corresponding variables in the distribution's |
| 211 | equation, as used in common mathematical practice; most of these equations can |
| 212 | be found in any statistics text. |
| 213 | |
| 214 | |
| 215 | .. function:: random() |
| 216 | |
| 217 | Return the next random floating point number in the range [0.0, 1.0). |
| 218 | |
| 219 | |
| 220 | .. function:: uniform(a, b) |
| 221 | |
Benjamin Peterson | b58dda7 | 2009-01-18 22:27:04 +0000 | [diff] [blame] | 222 | Return a random floating point number *N* such that ``a <= N <= b`` for |
| 223 | ``a <= b`` and ``b <= N <= a`` for ``b < a``. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 224 | |
Raymond Hettinger | be40db0 | 2009-06-11 23:12:14 +0000 | [diff] [blame] | 225 | The end-point value ``b`` may or may not be included in the range |
| 226 | depending on floating-point rounding in the equation ``a + (b-a) * random()``. |
Benjamin Peterson | 35e8c46 | 2008-04-24 02:34:53 +0000 | [diff] [blame] | 227 | |
Georg Brandl | 73dd7c7 | 2011-09-17 20:36:28 +0200 | [diff] [blame] | 228 | |
Christian Heimes | fe337bf | 2008-03-23 21:54:12 +0000 | [diff] [blame] | 229 | .. function:: triangular(low, high, mode) |
| 230 | |
Benjamin Peterson | b58dda7 | 2009-01-18 22:27:04 +0000 | [diff] [blame] | 231 | Return a random floating point number *N* such that ``low <= N <= high`` and |
Christian Heimes | cc47b05 | 2008-03-25 14:56:36 +0000 | [diff] [blame] | 232 | with the specified *mode* between those bounds. The *low* and *high* bounds |
| 233 | default to zero and one. The *mode* argument defaults to the midpoint |
| 234 | between the bounds, giving a symmetric distribution. |
Christian Heimes | fe337bf | 2008-03-23 21:54:12 +0000 | [diff] [blame] | 235 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 236 | |
| 237 | .. function:: betavariate(alpha, beta) |
| 238 | |
Benjamin Peterson | b58dda7 | 2009-01-18 22:27:04 +0000 | [diff] [blame] | 239 | Beta distribution. Conditions on the parameters are ``alpha > 0`` and |
| 240 | ``beta > 0``. Returned values range between 0 and 1. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 241 | |
| 242 | |
| 243 | .. function:: expovariate(lambd) |
| 244 | |
Mark Dickinson | 2f94736 | 2009-01-07 17:54:07 +0000 | [diff] [blame] | 245 | Exponential distribution. *lambd* is 1.0 divided by the desired |
| 246 | mean. It should be nonzero. (The parameter would be called |
| 247 | "lambda", but that is a reserved word in Python.) Returned values |
| 248 | range from 0 to positive infinity if *lambd* is positive, and from |
| 249 | negative infinity to 0 if *lambd* is negative. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 250 | |
| 251 | |
| 252 | .. function:: gammavariate(alpha, beta) |
| 253 | |
Benjamin Peterson | b58dda7 | 2009-01-18 22:27:04 +0000 | [diff] [blame] | 254 | Gamma distribution. (*Not* the gamma function!) Conditions on the |
| 255 | parameters are ``alpha > 0`` and ``beta > 0``. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 256 | |
Georg Brandl | 73dd7c7 | 2011-09-17 20:36:28 +0200 | [diff] [blame] | 257 | The probability distribution function is:: |
| 258 | |
| 259 | x ** (alpha - 1) * math.exp(-x / beta) |
| 260 | pdf(x) = -------------------------------------- |
| 261 | math.gamma(alpha) * beta ** alpha |
| 262 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 263 | |
| 264 | .. function:: gauss(mu, sigma) |
| 265 | |
Benjamin Peterson | b58dda7 | 2009-01-18 22:27:04 +0000 | [diff] [blame] | 266 | Gaussian distribution. *mu* is the mean, and *sigma* is the standard |
| 267 | deviation. This is slightly faster than the :func:`normalvariate` function |
| 268 | defined below. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 269 | |
| 270 | |
| 271 | .. function:: lognormvariate(mu, sigma) |
| 272 | |
| 273 | Log normal distribution. If you take the natural logarithm of this |
| 274 | distribution, you'll get a normal distribution with mean *mu* and standard |
| 275 | deviation *sigma*. *mu* can have any value, and *sigma* must be greater than |
| 276 | zero. |
| 277 | |
| 278 | |
| 279 | .. function:: normalvariate(mu, sigma) |
| 280 | |
| 281 | Normal distribution. *mu* is the mean, and *sigma* is the standard deviation. |
| 282 | |
| 283 | |
| 284 | .. function:: vonmisesvariate(mu, kappa) |
| 285 | |
| 286 | *mu* is the mean angle, expressed in radians between 0 and 2\*\ *pi*, and *kappa* |
| 287 | is the concentration parameter, which must be greater than or equal to zero. If |
| 288 | *kappa* is equal to zero, this distribution reduces to a uniform random angle |
| 289 | over the range 0 to 2\*\ *pi*. |
| 290 | |
| 291 | |
| 292 | .. function:: paretovariate(alpha) |
| 293 | |
| 294 | Pareto distribution. *alpha* is the shape parameter. |
| 295 | |
| 296 | |
| 297 | .. function:: weibullvariate(alpha, beta) |
| 298 | |
| 299 | Weibull distribution. *alpha* is the scale parameter and *beta* is the shape |
| 300 | parameter. |
| 301 | |
| 302 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 303 | Alternative Generator |
| 304 | --------------------- |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 305 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 306 | .. class:: SystemRandom([seed]) |
| 307 | |
| 308 | Class that uses the :func:`os.urandom` function for generating random numbers |
| 309 | from sources provided by the operating system. Not available on all systems. |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 310 | Does not rely on software state, and sequences are not reproducible. Accordingly, |
Raymond Hettinger | afd3045 | 2009-02-24 10:57:02 +0000 | [diff] [blame] | 311 | the :meth:`seed` method has no effect and is ignored. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 312 | The :meth:`getstate` and :meth:`setstate` methods raise |
| 313 | :exc:`NotImplementedError` if called. |
| 314 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 315 | |
Raymond Hettinger | 435cb0f | 2010-09-06 23:36:31 +0000 | [diff] [blame] | 316 | Notes on Reproducibility |
Antoine Pitrou | e72b586 | 2010-12-12 20:13:31 +0000 | [diff] [blame] | 317 | ------------------------ |
Raymond Hettinger | 435cb0f | 2010-09-06 23:36:31 +0000 | [diff] [blame] | 318 | |
| 319 | Sometimes it is useful to be able to reproduce the sequences given by a pseudo |
| 320 | random number generator. By re-using a seed value, the same sequence should be |
| 321 | reproducible from run to run as long as multiple threads are not running. |
| 322 | |
| 323 | Most of the random module's algorithms and seeding functions are subject to |
| 324 | change across Python versions, but two aspects are guaranteed not to change: |
| 325 | |
| 326 | * If a new seeding method is added, then a backward compatible seeder will be |
| 327 | offered. |
| 328 | |
Georg Brandl | 92849d1 | 2016-02-19 08:57:38 +0100 | [diff] [blame] | 329 | * The generator's :meth:`~Random.random` method will continue to produce the same |
Raymond Hettinger | 435cb0f | 2010-09-06 23:36:31 +0000 | [diff] [blame] | 330 | sequence when the compatible seeder is given the same seed. |
Raymond Hettinger | 2fdc7b1 | 2010-12-02 02:41:33 +0000 | [diff] [blame] | 331 | |
Raymond Hettinger | 6e35394 | 2010-12-04 23:42:12 +0000 | [diff] [blame] | 332 | .. _random-examples: |
Raymond Hettinger | 2fdc7b1 | 2010-12-02 02:41:33 +0000 | [diff] [blame] | 333 | |
Raymond Hettinger | 2fdc7b1 | 2010-12-02 02:41:33 +0000 | [diff] [blame] | 334 | Examples and Recipes |
Antoine Pitrou | e72b586 | 2010-12-12 20:13:31 +0000 | [diff] [blame] | 335 | -------------------- |
Raymond Hettinger | 2fdc7b1 | 2010-12-02 02:41:33 +0000 | [diff] [blame] | 336 | |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 337 | Basic examples:: |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 338 | |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 339 | >>> random() # Random float: 0.0 <= x < 1.0 |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 340 | 0.37444887175646646 |
| 341 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 342 | >>> uniform(2.5, 10.0) # Random float: 2.5 <= x < 10.0 |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 343 | 3.1800146073117523 |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 344 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 345 | >>> expovariate(1 / 5) # Interval between arrivals averaging 5 seconds |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 346 | 5.148957571865031 |
| 347 | |
Raymond Hettinger | e132910 | 2016-11-21 12:33:50 -0800 | [diff] [blame] | 348 | >>> randrange(10) # Integer from 0 to 9 inclusive |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 349 | 7 |
| 350 | |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 351 | >>> randrange(0, 101, 2) # Even integer from 0 to 100 inclusive |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 352 | 26 |
| 353 | |
Raymond Hettinger | 6befb64 | 2016-11-21 01:59:39 -0800 | [diff] [blame] | 354 | >>> choice(['win', 'lose', 'draw']) # Single random element from a sequence |
| 355 | 'draw' |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 356 | |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 357 | >>> deck = 'ace two three four'.split() |
| 358 | >>> shuffle(deck) # Shuffle a list |
Raymond Hettinger | 1c3a121 | 2016-10-12 01:42:10 -0400 | [diff] [blame] | 359 | >>> deck |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 360 | ['four', 'two', 'ace', 'three'] |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 361 | |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 362 | >>> sample([10, 20, 30, 40, 50], k=4) # Four samples without replacement |
| 363 | [40, 10, 50, 30] |
Raymond Hettinger | 3cdf871 | 2010-12-02 05:35:35 +0000 | [diff] [blame] | 364 | |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 365 | Simulations:: |
| 366 | |
Raymond Hettinger | 71c62e1 | 2016-12-04 11:00:34 -0800 | [diff] [blame] | 367 | >>> # Six roulette wheel spins (weighted sampling with replacement) |
Raymond Hettinger | 1c3a121 | 2016-10-12 01:42:10 -0400 | [diff] [blame] | 368 | >>> choices(['red', 'black', 'green'], [18, 18, 2], k=6) |
| 369 | ['red', 'green', 'black', 'black', 'red', 'black'] |
Raymond Hettinger | 2fdc7b1 | 2010-12-02 02:41:33 +0000 | [diff] [blame] | 370 | |
Raymond Hettinger | 71c62e1 | 2016-12-04 11:00:34 -0800 | [diff] [blame] | 371 | >>> # Deal 20 cards without replacement from a deck of 52 playing cards |
| 372 | >>> # and determine the proportion of cards with a ten-value |
| 373 | >>> # (a ten, jack, queen, or king). |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 374 | >>> deck = collections.Counter(tens=16, low_cards=36) |
| 375 | >>> seen = sample(list(deck.elements()), k=20) |
Raymond Hettinger | 71c62e1 | 2016-12-04 11:00:34 -0800 | [diff] [blame] | 376 | >>> seen.count('tens') / 20 |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 377 | 0.15 |
| 378 | |
Raymond Hettinger | 71c62e1 | 2016-12-04 11:00:34 -0800 | [diff] [blame] | 379 | >>> # Estimate the probability of getting 5 or more heads from 7 spins |
| 380 | >>> # of a biased coin that settles on heads 60% of the time. |
| 381 | >>> trial = lambda: choices('HT', cum_weights=(0.60, 1.00), k=7).count('H') >= 5 |
| 382 | >>> sum(trial() for i in range(10000)) / 10000 |
Raymond Hettinger | 16ef5d4 | 2016-10-31 22:53:52 -0700 | [diff] [blame] | 383 | 0.4169 |
| 384 | |
Raymond Hettinger | 71c62e1 | 2016-12-04 11:00:34 -0800 | [diff] [blame] | 385 | >>> # Probability of the median of 5 samples being in middle two quartiles |
| 386 | >>> trial = lambda : 2500 <= sorted(choices(range(10000), k=5))[2] < 7500 |
| 387 | >>> sum(trial() for i in range(10000)) / 10000 |
| 388 | 0.7958 |
| 389 | |
Raymond Hettinger | 1c3a121 | 2016-10-12 01:42:10 -0400 | [diff] [blame] | 390 | Example of `statistical bootstrapping |
| 391 | <https://en.wikipedia.org/wiki/Bootstrapping_(statistics)>`_ using resampling |
Raymond Hettinger | 0a1a909 | 2016-11-17 00:45:35 -0800 | [diff] [blame] | 392 | with replacement to estimate a confidence interval for the mean of a sample of |
| 393 | size five:: |
Raymond Hettinger | 2fdc7b1 | 2010-12-02 02:41:33 +0000 | [diff] [blame] | 394 | |
Raymond Hettinger | 1c3a121 | 2016-10-12 01:42:10 -0400 | [diff] [blame] | 395 | # http://statistics.about.com/od/Applications/a/Example-Of-Bootstrapping.htm |
| 396 | from statistics import mean |
| 397 | from random import choices |
Raymond Hettinger | f5b7c7b | 2016-09-05 13:15:02 -0700 | [diff] [blame] | 398 | |
Raymond Hettinger | 1c3a121 | 2016-10-12 01:42:10 -0400 | [diff] [blame] | 399 | data = 1, 2, 4, 4, 10 |
| 400 | means = sorted(mean(choices(data, k=5)) for i in range(20)) |
Raymond Hettinger | 2589ee3 | 2016-11-16 21:34:17 -0800 | [diff] [blame] | 401 | print(f'The sample mean of {mean(data):.1f} has a 90% confidence ' |
| 402 | f'interval from {means[1]:.1f} to {means[-2]:.1f}') |
| 403 | |
Raymond Hettinger | 00305ad | 2016-11-16 22:56:11 -0800 | [diff] [blame] | 404 | Example of a `resampling permutation test |
| 405 | <https://en.wikipedia.org/wiki/Resampling_(statistics)#Permutation_tests>`_ |
| 406 | to determine the statistical significance or `p-value |
| 407 | <https://en.wikipedia.org/wiki/P-value>`_ of an observed difference |
| 408 | between the effects of a drug versus a placebo:: |
| 409 | |
| 410 | # Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson |
| 411 | from statistics import mean |
| 412 | from random import shuffle |
| 413 | |
| 414 | drug = [54, 73, 53, 70, 73, 68, 52, 65, 65] |
| 415 | placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46] |
| 416 | observed_diff = mean(drug) - mean(placebo) |
| 417 | |
| 418 | n = 10000 |
| 419 | count = 0 |
| 420 | combined = drug + placebo |
| 421 | for i in range(n): |
| 422 | shuffle(combined) |
| 423 | new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):]) |
| 424 | count += (new_diff >= observed_diff) |
| 425 | |
| 426 | print(f'{n} label reshufflings produced only {count} instances with a difference') |
| 427 | print(f'at least as extreme as the observed difference of {observed_diff:.1f}.') |
| 428 | print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null') |
Raymond Hettinger | 6befb64 | 2016-11-21 01:59:39 -0800 | [diff] [blame] | 429 | print(f'hypothesis that there is no difference between the drug and the placebo.') |
| 430 | |
| 431 | Simulation of arrival times and service deliveries in a single server queue:: |
| 432 | |
Raymond Hettinger | 1149d93 | 2016-11-21 14:13:07 -0800 | [diff] [blame] | 433 | from random import expovariate, gauss |
| 434 | from statistics import mean, median, stdev |
Raymond Hettinger | 6befb64 | 2016-11-21 01:59:39 -0800 | [diff] [blame] | 435 | |
| 436 | average_arrival_interval = 5.6 |
| 437 | average_service_time = 5.0 |
| 438 | stdev_service_time = 0.5 |
| 439 | |
| 440 | num_waiting = 0 |
Raymond Hettinger | 1149d93 | 2016-11-21 14:13:07 -0800 | [diff] [blame] | 441 | arrivals = [] |
| 442 | starts = [] |
Raymond Hettinger | 6befb64 | 2016-11-21 01:59:39 -0800 | [diff] [blame] | 443 | arrival = service_end = 0.0 |
Raymond Hettinger | 8ab1258 | 2016-11-21 10:16:01 -0800 | [diff] [blame] | 444 | for i in range(20000): |
| 445 | if arrival <= service_end: |
| 446 | num_waiting += 1 |
| 447 | arrival += expovariate(1.0 / average_arrival_interval) |
Raymond Hettinger | 1149d93 | 2016-11-21 14:13:07 -0800 | [diff] [blame] | 448 | arrivals.append(arrival) |
Raymond Hettinger | 8ab1258 | 2016-11-21 10:16:01 -0800 | [diff] [blame] | 449 | else: |
Raymond Hettinger | 6befb64 | 2016-11-21 01:59:39 -0800 | [diff] [blame] | 450 | num_waiting -= 1 |
| 451 | service_start = service_end if num_waiting else arrival |
| 452 | service_time = gauss(average_service_time, stdev_service_time) |
| 453 | service_end = service_start + service_time |
Raymond Hettinger | 1149d93 | 2016-11-21 14:13:07 -0800 | [diff] [blame] | 454 | starts.append(service_start) |
| 455 | |
| 456 | waits = [start - arrival for arrival, start in zip(arrivals, starts)] |
| 457 | print(f'Mean wait: {mean(waits):.1f}. Stdev wait: {stdev(waits):.1f}.') |
| 458 | print(f'Median wait: {median(waits):.1f}. Max wait: {max(waits):.1f}.') |
Raymond Hettinger | 6befb64 | 2016-11-21 01:59:39 -0800 | [diff] [blame] | 459 | |
Raymond Hettinger | 0537405 | 2016-11-21 10:52:04 -0800 | [diff] [blame] | 460 | .. seealso:: |
| 461 | |
| 462 | `Statistics for Hackers <https://www.youtube.com/watch?v=Iq9DzN6mvYA>`_ |
| 463 | a video tutorial by |
| 464 | `Jake Vanderplas <https://us.pycon.org/2016/speaker/profile/295/>`_ |
| 465 | on statistical analysis using just a few fundamental concepts |
| 466 | including simulation, sampling, shuffling, and cross-validation. |
| 467 | |
| 468 | `Economics Simulation |
| 469 | <http://nbviewer.jupyter.org/url/norvig.com/ipython/Economics.ipynb>`_ |
| 470 | a simulation of a marketplace by |
| 471 | `Peter Norvig <http://norvig.com/bio.html>`_ that shows effective |
Raymond Hettinger | 7f94619 | 2016-11-21 15:13:18 -0800 | [diff] [blame] | 472 | use of many of the tools and distributions provided by this module |
Raymond Hettinger | 0537405 | 2016-11-21 10:52:04 -0800 | [diff] [blame] | 473 | (gauss, uniform, sample, betavariate, choice, triangular, and randrange). |
| 474 | |
| 475 | `A Concrete Introduction to Probability (using Python) |
| 476 | <http://nbviewer.jupyter.org/url/norvig.com/ipython/Probability.ipynb>`_ |
| 477 | a tutorial by `Peter Norvig <http://norvig.com/bio.html>`_ covering |
| 478 | the basics of probability theory, how to write simulations, and |
Raymond Hettinger | 7f94619 | 2016-11-21 15:13:18 -0800 | [diff] [blame] | 479 | how to perform data analysis using Python. |