| :mod:`random` --- Generate pseudo-random numbers |
| ================================================ |
| |
| .. module:: random |
| :synopsis: Generate pseudo-random numbers with various common distributions. |
| |
| **Source code:** :source:`Lib/random.py` |
| |
| -------------- |
| |
| This module implements pseudo-random number generators for various |
| distributions. |
| |
| For integers, there is uniform selection from a range. For sequences, there is |
| uniform selection of a random element, a function to generate a random |
| permutation of a list in-place, and a function for random sampling without |
| replacement. |
| |
| On the real line, there are functions to compute uniform, normal (Gaussian), |
| lognormal, negative exponential, gamma, and beta distributions. For generating |
| distributions of angles, the von Mises distribution is available. |
| |
| Almost all module functions depend on the basic function :func:`.random`, which |
| generates a random float uniformly in the semi-open range [0.0, 1.0). Python |
| uses the Mersenne Twister as the core generator. It produces 53-bit precision |
| floats and has a period of 2\*\*19937-1. The underlying implementation in C is |
| both fast and threadsafe. The Mersenne Twister is one of the most extensively |
| tested random number generators in existence. However, being completely |
| deterministic, it is not suitable for all purposes, and is completely unsuitable |
| for cryptographic purposes. |
| |
| The functions supplied by this module are actually bound methods of a hidden |
| instance of the :class:`random.Random` class. You can instantiate your own |
| instances of :class:`Random` to get generators that don't share state. |
| |
| Class :class:`Random` can also be subclassed if you want to use a different |
| basic generator of your own devising: in that case, override the :meth:`~Random.random`, |
| :meth:`~Random.seed`, :meth:`~Random.getstate`, and :meth:`~Random.setstate` methods. |
| Optionally, a new generator can supply a :meth:`~Random.getrandbits` method --- this |
| allows :meth:`randrange` to produce selections over an arbitrarily large range. |
| |
| The :mod:`random` module also provides the :class:`SystemRandom` class which |
| uses the system function :func:`os.urandom` to generate random numbers |
| from sources provided by the operating system. |
| |
| .. warning:: |
| |
| The pseudo-random generators of this module should not be used for |
| security purposes. For security or cryptographic uses, see the |
| :mod:`secrets` module. |
| |
| .. seealso:: |
| |
| M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally |
| equidistributed uniform pseudorandom number generator", ACM Transactions on |
| Modeling and Computer Simulation Vol. 8, No. 1, January pp.3--30 1998. |
| |
| |
| `Complementary-Multiply-with-Carry recipe |
| <https://code.activestate.com/recipes/576707/>`_ for a compatible alternative |
| random number generator with a long period and comparatively simple update |
| operations. |
| |
| |
| Bookkeeping functions |
| --------------------- |
| |
| .. function:: seed(a=None, version=2) |
| |
| Initialize the random number generator. |
| |
| If *a* is omitted or ``None``, the current system time is used. If |
| randomness sources are provided by the operating system, they are used |
| instead of the system time (see the :func:`os.urandom` function for details |
| on availability). |
| |
| If *a* is an int, it is used directly. |
| |
| With version 2 (the default), a :class:`str`, :class:`bytes`, or :class:`bytearray` |
| object gets converted to an :class:`int` and all of its bits are used. |
| |
| With version 1 (provided for reproducing random sequences from older versions |
| of Python), the algorithm for :class:`str` and :class:`bytes` generates a |
| narrower range of seeds. |
| |
| .. versionchanged:: 3.2 |
| Moved to the version 2 scheme which uses all of the bits in a string seed. |
| |
| .. deprecated:: 3.9 |
| In the future, the *seed* must be one of the following types: |
| *NoneType*, :class:`int`, :class:`float`, :class:`str`, |
| :class:`bytes`, or :class:`bytearray`. |
| |
| .. function:: getstate() |
| |
| Return an object capturing the current internal state of the generator. This |
| object can be passed to :func:`setstate` to restore the state. |
| |
| |
| .. function:: setstate(state) |
| |
| *state* should have been obtained from a previous call to :func:`getstate`, and |
| :func:`setstate` restores the internal state of the generator to what it was at |
| the time :func:`getstate` was called. |
| |
| |
| Functions for bytes |
| ------------------- |
| |
| .. function:: randbytes(n) |
| |
| Generate *n* random bytes. |
| |
| This method should not be used for generating security tokens. |
| Use :func:`secrets.token_bytes` instead. |
| |
| .. versionadded:: 3.9 |
| |
| |
| Functions for integers |
| ---------------------- |
| |
| .. function:: randrange(stop) |
| randrange(start, stop[, step]) |
| |
| Return a randomly selected element from ``range(start, stop, step)``. This is |
| equivalent to ``choice(range(start, stop, step))``, but doesn't actually build a |
| range object. |
| |
| The positional argument pattern matches that of :func:`range`. Keyword arguments |
| should not be used because the function may use them in unexpected ways. |
| |
| .. versionchanged:: 3.2 |
| :meth:`randrange` is more sophisticated about producing equally distributed |
| values. Formerly it used a style like ``int(random()*n)`` which could produce |
| slightly uneven distributions. |
| |
| .. deprecated:: 3.10 |
| The automatic conversion of non-integer types to equivalent integers is |
| deprecated. Currently ``randrange(10.0)`` is losslessly converted to |
| ``randrange(10)``. In the future, this will raise a :exc:`TypeError`. |
| |
| .. deprecated:: 3.10 |
| The exception raised for non-integral values such as ``randrange(10.5)`` |
| or ``randrange('10')`` will be changed from :exc:`ValueError` to |
| :exc:`TypeError`. |
| |
| .. function:: randint(a, b) |
| |
| Return a random integer *N* such that ``a <= N <= b``. Alias for |
| ``randrange(a, b+1)``. |
| |
| .. function:: getrandbits(k) |
| |
| Returns a non-negative Python integer with *k* random bits. This method |
| is supplied with the MersenneTwister generator and some other generators |
| may also provide it as an optional part of the API. When available, |
| :meth:`getrandbits` enables :meth:`randrange` to handle arbitrarily large |
| ranges. |
| |
| .. versionchanged:: 3.9 |
| This method now accepts zero for *k*. |
| |
| |
| Functions for sequences |
| ----------------------- |
| |
| .. function:: choice(seq) |
| |
| Return a random element from the non-empty sequence *seq*. If *seq* is empty, |
| raises :exc:`IndexError`. |
| |
| .. function:: choices(population, weights=None, *, cum_weights=None, k=1) |
| |
| Return a *k* sized list of elements chosen from the *population* with replacement. |
| If the *population* is empty, raises :exc:`IndexError`. |
| |
| If a *weights* sequence is specified, selections are made according to the |
| relative weights. Alternatively, if a *cum_weights* sequence is given, the |
| selections are made according to the cumulative weights (perhaps computed |
| using :func:`itertools.accumulate`). For example, the relative weights |
| ``[10, 5, 30, 5]`` are equivalent to the cumulative weights |
| ``[10, 15, 45, 50]``. Internally, the relative weights are converted to |
| cumulative weights before making selections, so supplying the cumulative |
| weights saves work. |
| |
| If neither *weights* nor *cum_weights* are specified, selections are made |
| with equal probability. If a weights sequence is supplied, it must be |
| the same length as the *population* sequence. It is a :exc:`TypeError` |
| to specify both *weights* and *cum_weights*. |
| |
| The *weights* or *cum_weights* can use any numeric type that interoperates |
| with the :class:`float` values returned by :func:`random` (that includes |
| integers, floats, and fractions but excludes decimals). Weights are assumed |
| to be non-negative and finite. A :exc:`ValueError` is raised if all |
| weights are zero. |
| |
| For a given seed, the :func:`choices` function with equal weighting |
| typically produces a different sequence than repeated calls to |
| :func:`choice`. The algorithm used by :func:`choices` uses floating |
| point arithmetic for internal consistency and speed. The algorithm used |
| by :func:`choice` defaults to integer arithmetic with repeated selections |
| to avoid small biases from round-off error. |
| |
| .. versionadded:: 3.6 |
| |
| .. versionchanged:: 3.9 |
| Raises a :exc:`ValueError` if all weights are zero. |
| |
| |
| .. function:: shuffle(x[, random]) |
| |
| Shuffle the sequence *x* in place. |
| |
| The optional argument *random* is a 0-argument function returning a random |
| float in [0.0, 1.0); by default, this is the function :func:`.random`. |
| |
| To shuffle an immutable sequence and return a new shuffled list, use |
| ``sample(x, k=len(x))`` instead. |
| |
| Note that even for small ``len(x)``, the total number of permutations of *x* |
| can quickly grow larger than the period of most random number generators. |
| This implies that most permutations of a long sequence can never be |
| generated. For example, a sequence of length 2080 is the largest that |
| can fit within the period of the Mersenne Twister random number generator. |
| |
| .. deprecated-removed:: 3.9 3.11 |
| The optional parameter *random*. |
| |
| |
| .. function:: sample(population, k, *, counts=None) |
| |
| Return a *k* length list of unique elements chosen from the population sequence |
| or set. Used for random sampling without replacement. |
| |
| Returns a new list containing elements from the population while leaving the |
| original population unchanged. The resulting list is in selection order so that |
| all sub-slices will also be valid random samples. This allows raffle winners |
| (the sample) to be partitioned into grand prize and second place winners (the |
| subslices). |
| |
| Members of the population need not be :term:`hashable` or unique. If the population |
| contains repeats, then each occurrence is a possible selection in the sample. |
| |
| Repeated elements can be specified one at a time or with the optional |
| keyword-only *counts* parameter. For example, ``sample(['red', 'blue'], |
| counts=[4, 2], k=5)`` is equivalent to ``sample(['red', 'red', 'red', 'red', |
| 'blue', 'blue'], k=5)``. |
| |
| To choose a sample from a range of integers, use a :func:`range` object as an |
| argument. This is especially fast and space efficient for sampling from a large |
| population: ``sample(range(10000000), k=60)``. |
| |
| If the sample size is larger than the population size, a :exc:`ValueError` |
| is raised. |
| |
| .. versionchanged:: 3.9 |
| Added the *counts* parameter. |
| |
| .. deprecated:: 3.9 |
| In the future, the *population* must be a sequence. Instances of |
| :class:`set` are no longer supported. The set must first be converted |
| to a :class:`list` or :class:`tuple`, preferably in a deterministic |
| order so that the sample is reproducible. |
| |
| |
| .. _real-valued-distributions: |
| |
| Real-valued distributions |
| ------------------------- |
| |
| The following functions generate specific real-valued distributions. Function |
| parameters are named after the corresponding variables in the distribution's |
| equation, as used in common mathematical practice; most of these equations can |
| be found in any statistics text. |
| |
| |
| .. function:: random() |
| |
| Return the next random floating point number in the range [0.0, 1.0). |
| |
| |
| .. function:: uniform(a, b) |
| |
| Return a random floating point number *N* such that ``a <= N <= b`` for |
| ``a <= b`` and ``b <= N <= a`` for ``b < a``. |
| |
| The end-point value ``b`` may or may not be included in the range |
| depending on floating-point rounding in the equation ``a + (b-a) * random()``. |
| |
| |
| .. function:: triangular(low, high, mode) |
| |
| Return a random floating point number *N* such that ``low <= N <= high`` and |
| with the specified *mode* between those bounds. The *low* and *high* bounds |
| default to zero and one. The *mode* argument defaults to the midpoint |
| between the bounds, giving a symmetric distribution. |
| |
| |
| .. function:: betavariate(alpha, beta) |
| |
| Beta distribution. Conditions on the parameters are ``alpha > 0`` and |
| ``beta > 0``. Returned values range between 0 and 1. |
| |
| |
| .. function:: expovariate(lambd) |
| |
| Exponential distribution. *lambd* is 1.0 divided by the desired |
| mean. It should be nonzero. (The parameter would be called |
| "lambda", but that is a reserved word in Python.) Returned values |
| range from 0 to positive infinity if *lambd* is positive, and from |
| negative infinity to 0 if *lambd* is negative. |
| |
| |
| .. function:: gammavariate(alpha, beta) |
| |
| Gamma distribution. (*Not* the gamma function!) Conditions on the |
| parameters are ``alpha > 0`` and ``beta > 0``. |
| |
| The probability distribution function is:: |
| |
| x ** (alpha - 1) * math.exp(-x / beta) |
| pdf(x) = -------------------------------------- |
| math.gamma(alpha) * beta ** alpha |
| |
| |
| .. function:: gauss(mu, sigma) |
| |
| Gaussian distribution. *mu* is the mean, and *sigma* is the standard |
| deviation. This is slightly faster than the :func:`normalvariate` function |
| defined below. |
| |
| Multithreading note: When two threads call this function |
| simultaneously, it is possible that they will receive the |
| same return value. This can be avoided in three ways. |
| 1) Have each thread use a different instance of the random |
| number generator. 2) Put locks around all calls. 3) Use the |
| slower, but thread-safe :func:`normalvariate` function instead. |
| |
| |
| .. function:: lognormvariate(mu, sigma) |
| |
| Log normal distribution. If you take the natural logarithm of this |
| distribution, you'll get a normal distribution with mean *mu* and standard |
| deviation *sigma*. *mu* can have any value, and *sigma* must be greater than |
| zero. |
| |
| |
| .. function:: normalvariate(mu, sigma) |
| |
| Normal distribution. *mu* is the mean, and *sigma* is the standard deviation. |
| |
| |
| .. function:: vonmisesvariate(mu, kappa) |
| |
| *mu* is the mean angle, expressed in radians between 0 and 2\*\ *pi*, and *kappa* |
| is the concentration parameter, which must be greater than or equal to zero. If |
| *kappa* is equal to zero, this distribution reduces to a uniform random angle |
| over the range 0 to 2\*\ *pi*. |
| |
| |
| .. function:: paretovariate(alpha) |
| |
| Pareto distribution. *alpha* is the shape parameter. |
| |
| |
| .. function:: weibullvariate(alpha, beta) |
| |
| Weibull distribution. *alpha* is the scale parameter and *beta* is the shape |
| parameter. |
| |
| |
| Alternative Generator |
| --------------------- |
| |
| .. class:: Random([seed]) |
| |
| Class that implements the default pseudo-random number generator used by the |
| :mod:`random` module. |
| |
| .. deprecated:: 3.9 |
| In the future, the *seed* must be one of the following types: |
| :class:`NoneType`, :class:`int`, :class:`float`, :class:`str`, |
| :class:`bytes`, or :class:`bytearray`. |
| |
| .. class:: SystemRandom([seed]) |
| |
| Class that uses the :func:`os.urandom` function for generating random numbers |
| from sources provided by the operating system. Not available on all systems. |
| Does not rely on software state, and sequences are not reproducible. Accordingly, |
| the :meth:`seed` method has no effect and is ignored. |
| The :meth:`getstate` and :meth:`setstate` methods raise |
| :exc:`NotImplementedError` if called. |
| |
| |
| Notes on Reproducibility |
| ------------------------ |
| |
| Sometimes it is useful to be able to reproduce the sequences given by a |
| pseudo-random number generator. By re-using a seed value, the same sequence should be |
| reproducible from run to run as long as multiple threads are not running. |
| |
| Most of the random module's algorithms and seeding functions are subject to |
| change across Python versions, but two aspects are guaranteed not to change: |
| |
| * If a new seeding method is added, then a backward compatible seeder will be |
| offered. |
| |
| * The generator's :meth:`~Random.random` method will continue to produce the same |
| sequence when the compatible seeder is given the same seed. |
| |
| .. _random-examples: |
| |
| Examples |
| -------- |
| |
| Basic examples:: |
| |
| >>> random() # Random float: 0.0 <= x < 1.0 |
| 0.37444887175646646 |
| |
| >>> uniform(2.5, 10.0) # Random float: 2.5 <= x < 10.0 |
| 3.1800146073117523 |
| |
| >>> expovariate(1 / 5) # Interval between arrivals averaging 5 seconds |
| 5.148957571865031 |
| |
| >>> randrange(10) # Integer from 0 to 9 inclusive |
| 7 |
| |
| >>> randrange(0, 101, 2) # Even integer from 0 to 100 inclusive |
| 26 |
| |
| >>> choice(['win', 'lose', 'draw']) # Single random element from a sequence |
| 'draw' |
| |
| >>> deck = 'ace two three four'.split() |
| >>> shuffle(deck) # Shuffle a list |
| >>> deck |
| ['four', 'two', 'ace', 'three'] |
| |
| >>> sample([10, 20, 30, 40, 50], k=4) # Four samples without replacement |
| [40, 10, 50, 30] |
| |
| Simulations:: |
| |
| >>> # Six roulette wheel spins (weighted sampling with replacement) |
| >>> choices(['red', 'black', 'green'], [18, 18, 2], k=6) |
| ['red', 'green', 'black', 'black', 'red', 'black'] |
| |
| >>> # Deal 20 cards without replacement from a deck |
| >>> # of 52 playing cards, and determine the proportion of cards |
| >>> # with a ten-value: ten, jack, queen, or king. |
| >>> dealt = sample(['tens', 'low cards'], counts=[16, 36], k=20) |
| >>> dealt.count('tens') / 20 |
| 0.15 |
| |
| >>> # Estimate the probability of getting 5 or more heads from 7 spins |
| >>> # of a biased coin that settles on heads 60% of the time. |
| >>> def trial(): |
| ... return choices('HT', cum_weights=(0.60, 1.00), k=7).count('H') >= 5 |
| ... |
| >>> sum(trial() for i in range(10_000)) / 10_000 |
| 0.4169 |
| |
| >>> # Probability of the median of 5 samples being in middle two quartiles |
| >>> def trial(): |
| ... return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500 |
| ... |
| >>> sum(trial() for i in range(10_000)) / 10_000 |
| 0.7958 |
| |
| Example of `statistical bootstrapping |
| <https://en.wikipedia.org/wiki/Bootstrapping_(statistics)>`_ using resampling |
| with replacement to estimate a confidence interval for the mean of a sample:: |
| |
| # http://statistics.about.com/od/Applications/a/Example-Of-Bootstrapping.htm |
| from statistics import fmean as mean |
| from random import choices |
| |
| data = [41, 50, 29, 37, 81, 30, 73, 63, 20, 35, 68, 22, 60, 31, 95] |
| means = sorted(mean(choices(data, k=len(data))) for i in range(100)) |
| print(f'The sample mean of {mean(data):.1f} has a 90% confidence ' |
| f'interval from {means[5]:.1f} to {means[94]:.1f}') |
| |
| Example of a `resampling permutation test |
| <https://en.wikipedia.org/wiki/Resampling_(statistics)#Permutation_tests>`_ |
| to determine the statistical significance or `p-value |
| <https://en.wikipedia.org/wiki/P-value>`_ of an observed difference |
| between the effects of a drug versus a placebo:: |
| |
| # Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson |
| from statistics import fmean as mean |
| from random import shuffle |
| |
| drug = [54, 73, 53, 70, 73, 68, 52, 65, 65] |
| placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46] |
| observed_diff = mean(drug) - mean(placebo) |
| |
| n = 10_000 |
| count = 0 |
| combined = drug + placebo |
| for i in range(n): |
| shuffle(combined) |
| new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):]) |
| count += (new_diff >= observed_diff) |
| |
| print(f'{n} label reshufflings produced only {count} instances with a difference') |
| print(f'at least as extreme as the observed difference of {observed_diff:.1f}.') |
| print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null') |
| print(f'hypothesis that there is no difference between the drug and the placebo.') |
| |
| Simulation of arrival times and service deliveries for a multiserver queue:: |
| |
| from heapq import heappush, heappop |
| from random import expovariate, gauss |
| from statistics import mean, quantiles |
| |
| average_arrival_interval = 5.6 |
| average_service_time = 15.0 |
| stdev_service_time = 3.5 |
| num_servers = 3 |
| |
| waits = [] |
| arrival_time = 0.0 |
| servers = [0.0] * num_servers # time when each server becomes available |
| for i in range(100_000): |
| arrival_time += expovariate(1.0 / average_arrival_interval) |
| next_server_available = heappop(servers) |
| wait = max(0.0, next_server_available - arrival_time) |
| waits.append(wait) |
| service_duration = gauss(average_service_time, stdev_service_time) |
| service_completed = arrival_time + wait + service_duration |
| heappush(servers, service_completed) |
| |
| print(f'Mean wait: {mean(waits):.1f} Max wait: {max(waits):.1f}') |
| print('Quartiles:', [round(q, 1) for q in quantiles(waits)]) |
| |
| .. seealso:: |
| |
| `Statistics for Hackers <https://www.youtube.com/watch?v=Iq9DzN6mvYA>`_ |
| a video tutorial by |
| `Jake Vanderplas <https://us.pycon.org/2016/speaker/profile/295/>`_ |
| on statistical analysis using just a few fundamental concepts |
| including simulation, sampling, shuffling, and cross-validation. |
| |
| `Economics Simulation |
| <http://nbviewer.jupyter.org/url/norvig.com/ipython/Economics.ipynb>`_ |
| a simulation of a marketplace by |
| `Peter Norvig <http://norvig.com/bio.html>`_ that shows effective |
| use of many of the tools and distributions provided by this module |
| (gauss, uniform, sample, betavariate, choice, triangular, and randrange). |
| |
| `A Concrete Introduction to Probability (using Python) |
| <http://nbviewer.jupyter.org/url/norvig.com/ipython/Probability.ipynb>`_ |
| a tutorial by `Peter Norvig <http://norvig.com/bio.html>`_ covering |
| the basics of probability theory, how to write simulations, and |
| how to perform data analysis using Python. |
| |
| |
| Recipes |
| ------- |
| |
| The default :func:`.random` returns multiples of 2⁻⁵³ in the range |
| *0.0 ≤ x < 1.0*. All such numbers are evenly spaced and are exactly |
| representable as Python floats. However, many other representable |
| floats in that interval are not possible selections. For example, |
| ``0.05954861408025609`` isn't an integer multiple of 2⁻⁵³. |
| |
| The following recipe takes a different approach. All floats in the |
| interval are possible selections. The mantissa comes from a uniform |
| distribution of integers in the range *2⁵² ≤ mantissa < 2⁵³*. The |
| exponent comes from a geometric distribution where exponents smaller |
| than *-53* occur half as often as the next larger exponent. |
| |
| :: |
| |
| from random import Random |
| from math import ldexp |
| |
| class FullRandom(Random): |
| |
| def random(self): |
| mantissa = 0x10_0000_0000_0000 | self.getrandbits(52) |
| exponent = -53 |
| x = 0 |
| while not x: |
| x = self.getrandbits(32) |
| exponent += x.bit_length() - 32 |
| return ldexp(mantissa, exponent) |
| |
| All :ref:`real valued distributions <real-valued-distributions>` |
| in the class will use the new method:: |
| |
| >>> fr = FullRandom() |
| >>> fr.random() |
| 0.05954861408025609 |
| >>> fr.expovariate(0.25) |
| 8.87925541791544 |
| |
| The recipe is conceptually equivalent to an algorithm that chooses from |
| all the multiples of 2⁻¹⁰⁷⁴ in the range *0.0 ≤ x < 1.0*. All such |
| numbers are evenly spaced, but most have to be rounded down to the |
| nearest representable Python float. (The value 2⁻¹⁰⁷⁴ is the smallest |
| positive unnormalized float and is equal to ``math.ulp(0.0)``.) |
| |
| |
| .. seealso:: |
| |
| `Generating Pseudo-random Floating-Point Values |
| <https://allendowney.com/research/rand/downey07randfloat.pdf>`_ a |
| paper by Allen B. Downey describing ways to generate more |
| fine-grained floats than normally generated by :func:`.random`. |