Blame - Doc/library/statistics.rst - platform/external/python/cpython3

blob: ea3d7dab0f17375a16a2620d563a2d4d65a3858b

Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	1	:mod:`statistics` --- Mathematical statistics functions
				2	=======================================================
				3
				4	.. module:: statistics
				5	:synopsis: mathematical statistics functions
Terry Jan Reedy	fa089b9	2016-06-11 15:02:54 -0400	[diff] [blame]	6
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	7	.. moduleauthor:: Steven D'Aprano <steve+python@pearwood.info>
				8	.. sectionauthor:: Steven D'Aprano <steve+python@pearwood.info>
				9
				10	.. versionadded:: 3.4
				11
Terry Jan Reedy	fa089b9	2016-06-11 15:02:54 -0400	[diff] [blame]	12	Source code: :source:`Lib/statistics.py`
				13
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	14	.. testsetup:: *
				15
				16	from statistics import *
				17	__name__ = '<doctest>'
				18
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	19	--------------
				20
				21	This module provides functions for calculating mathematical statistics of
				22	numeric (:class:`Real`-valued) data.
				23
Nick Coghlan	73afe2a	2014-02-08 19:58:04 +1000	[diff] [blame]	24	.. note::
				25
				26	Unless explicitly noted otherwise, these functions support :class:`int`,
				27	:class:`float`, :class:`decimal.Decimal` and :class:`fractions.Fraction`.
				28	Behaviour with other types (whether in the numeric tower or not) is
				29	currently unsupported. Mixed types are also undefined and
				30	implementation-dependent. If your input data consists of mixed types,
				31	you may be able to use :func:`map` to ensure a consistent result, e.g.
				32	``map(float, input_data)``.
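
As a minimal sketch of the conversion suggested above (the input values are
invented for illustration), mixed input can be normalised to floats before a
statistic is computed:

.. code-block:: python

   from fractions import Fraction
   from statistics import mean

   # Hypothetical input mixing int, float and Fraction values.
   input_data = [1, 2.5, Fraction(3, 2)]

   # Convert every item to float first so the result type is predictable.
   result = mean(map(float, input_data))

Converting to a single type trades some exactness (for example, of
``Fraction`` values) for a consistent result type.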
				33
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	34	Averages and measures of central location
				35	-----------------------------------------
				36
				37	These functions calculate an average or typical value from a population
				38	or sample.
				39
				40	======================= =============================================
				41	:func:`mean`            Arithmetic mean ("average") of data.
				42	:func:`median`          Median (middle value) of data.
				43	:func:`median_low`      Low median of data.
				44	:func:`median_high`     High median of data.
				45	:func:`median_grouped`  Median, or 50th percentile, of grouped data.
				46	:func:`mode`            Mode (most common value) of discrete data.
				47	======================= =============================================
				48
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	49	Measures of spread
				50	------------------
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	51
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	52	These functions calculate a measure of how much the population or sample
				53	tends to deviate from the typical or average values.
				54
				55	======================= =============================================
				56	:func:`pstdev`          Population standard deviation of data.
				57	:func:`pvariance`       Population variance of data.
				58	:func:`stdev`           Sample standard deviation of data.
				59	:func:`variance`        Sample variance of data.
				60	======================= =============================================
				61
				62
				63	Function details
				64	----------------
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	65
Georg Brandl	e051b55	2013-11-04 07:30:50 +0100	[diff] [blame]	66	Note: The functions do not require the data given to them to be sorted.
				67	However, for reading convenience, most of the examples show sorted sequences.
				68
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	69	.. function:: mean(data)
				70
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	71	Return the sample arithmetic mean of data, a sequence or iterator of
				72	real-valued numbers.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	73
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	74	The arithmetic mean is the sum of the data divided by the number of data
				75	points. It is commonly called "the average", although it is only one of many
				76	different mathematical averages. It is a measure of the central location of
				77	the data.
				78
				79	If data is empty, :exc:`StatisticsError` will be raised.
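
The definition can be checked by hand. The following sketch, which uses
illustrative data, compares ``sum(data) / len(data)`` with :func:`mean` and
shows the empty-data case:

.. code-block:: python

   from statistics import StatisticsError, mean

   data = [1, 2, 3, 4, 4]

   # The definition: sum of the data divided by the number of data points.
   by_hand = sum(data) / len(data)
   from_module = mean(data)

   # Empty input is rejected rather than silently returning a value.
   try:
       mean([])
   except StatisticsError:
       empty_rejected = True
   else:
       empty_rejected = False
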
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	80
				81	Some examples of use:
				82
				83	.. doctest::
				84
				85	>>> mean([1, 2, 3, 4, 4])
				86	2.8
				87	>>> mean([-1.0, 2.5, 3.25, 5.75])
				88	2.625
				89
				90	>>> from fractions import Fraction as F
				91	>>> mean([F(3, 7), F(1, 21), F(5, 3), F(1, 3)])
				92	Fraction(13, 21)
				93
				94	>>> from decimal import Decimal as D
				95	>>> mean([D("0.5"), D("0.75"), D("0.625"), D("0.375")])
				96	Decimal('0.5625')
				97
				98	.. note::
				99
Georg Brandl	a3fdcaa	2013-10-21 09:08:39 +0200	[diff] [blame]	100	The mean is strongly affected by outliers and is not a robust estimator
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	101	for central location: the mean is not necessarily a typical example of the
				102	data points. For more robust, although less efficient, measures of
				103	central location, see :func:`median` and :func:`mode`. (In this case,
				104	"efficient" refers to statistical efficiency rather than computational
				105	efficiency.)
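
To illustrate the effect of an outlier, the following sketch uses an invented
sample and compares how far a single extreme value moves the mean and the
median:

.. code-block:: python

   from statistics import mean, median

   # Hypothetical sample: five similar values, then the same with one outlier.
   typical = [10, 11, 12, 13, 14]
   with_outlier = typical + [1000]

   mean_before = mean(typical)          # 12
   mean_after = mean(with_outlier)      # pulled far away from the typical values
   median_before = median(typical)      # 12
   median_after = median(with_outlier)  # 12.5
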
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	106
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	107	The sample mean gives an unbiased estimate of the true population mean,
				108	which means that, taken on average over all the possible samples,
				109	``mean(sample)`` converges on the true mean of the entire population. If
				110	data represents the entire population rather than a sample, then
				111	``mean(data)`` is equivalent to calculating the true population mean μ.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	112
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	113
				114	.. function:: median(data)
				115
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	116	Return the median (middle value) of numeric data, using the common "mean of
				117	middle two" method. If data is empty, :exc:`StatisticsError` is raised.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	118
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	119	The median is a robust measure of central location, and is less affected by
				120	the presence of outliers in your data. When the number of data points is
				121	odd, the middle data point is returned:
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	122
				123	.. doctest::
				124
				125	>>> median([1, 3, 5])
				126	3
				127
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	128	When the number of data points is even, the median is interpolated by taking
				129	the average of the two middle values:
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	130
				131	.. doctest::
				132
				133	>>> median([1, 3, 5, 7])
				134	4.0
				135
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	136	This is suitable when your data is discrete and you don't mind that the
				137	median may not be an actual data point.
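
The "mean of middle two" method can be sketched as follows. The helper
``median_sketch`` is hypothetical and only illustrates the rule described
above; it is not the module's implementation:

.. code-block:: python

   def median_sketch(data):
       """Illustrative "mean of middle two" median (hypothetical helper)."""
       ordered = sorted(data)
       n = len(ordered)
       if n == 0:
           raise ValueError("no median for empty data")
       mid = n // 2
       if n % 2 == 1:
           # Odd number of points: return the middle one unchanged.
           return ordered[mid]
       # Even number of points: average the two middle values.
       return (ordered[mid - 1] + ordered[mid]) / 2

   odd_case = median_sketch([5, 1, 3])
   even_case = median_sketch([7, 1, 5, 3])

Because the input is sorted inside the helper, unsorted data gives the same
result as sorted data.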
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	138
Berker Peksag	9c1dba2	2014-09-28 00:00:58 +0300	[diff] [blame]	139	.. seealso:: :func:`median_low`, :func:`median_high`, :func:`median_grouped`
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	140
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	141
				142	.. function:: median_low(data)
				143
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	144	Return the low median of numeric data. If data is empty,
				145	:exc:`StatisticsError` is raised.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	146
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	147	The low median is always a member of the data set. When the number of data
				148	points is odd, the middle value is returned. When it is even, the smaller of
				149	the two middle values is returned.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	150
				151	.. doctest::
				152
				153	>>> median_low([1, 3, 5])
				154	3
				155	>>> median_low([1, 3, 5, 7])
				156	3
				157
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	158	Use the low median when your data are discrete and you prefer the median to
				159	be an actual data point rather than interpolated.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	160
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	161
				162	.. function:: median_high(data)
				163
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	164	Return the high median of data. If data is empty, :exc:`StatisticsError`
				165	is raised.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	166
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	167	The high median is always a member of the data set. When the number of data
				168	points is odd, the middle value is returned. When it is even, the larger of
				169	the two middle values is returned.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	170
				171	.. doctest::
				172
				173	>>> median_high([1, 3, 5])
				174	3
				175	>>> median_high([1, 3, 5, 7])
				176	5
				177
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	178	Use the high median when your data are discrete and you prefer the median to
				179	be an actual data point rather than interpolated.
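
The low and high medians differ only in which of the two middle elements is
picked. A minimal sketch, using a hypothetical helper ``low_and_high``:

.. code-block:: python

   def low_and_high(data):
       """Return (low median, high median); hypothetical illustrative helper."""
       ordered = sorted(data)
       n = len(ordered)
       if n == 0:
           raise ValueError("no median for empty data")
       # For odd n both indexes point at the single middle element.
       return ordered[(n - 1) // 2], ordered[n // 2]

   odd_pair = low_and_high([1, 3, 5])
   even_pair = low_and_high([1, 3, 5, 7])
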
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	180
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	181
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	182	.. function:: median_grouped(data, interval=1)
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	183
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	184	Return the median of grouped continuous data, calculated as the 50th
				185	percentile, using interpolation. If data is empty, :exc:`StatisticsError`
				186	is raised.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	187
				188	.. doctest::
				189
				190	>>> median_grouped([52, 52, 53, 54])
				191	52.5
				192
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	193	In the following example, the data are rounded, so that each value represents
				194	the midpoint of data classes, e.g. 1 is the midpoint of the class 0.5-1.5, 2
				195	is the midpoint of 1.5-2.5, 3 is the midpoint of 2.5-3.5, etc. With the data
				196	given, the middle value falls somewhere in the class 3.5-4.5, and
				197	interpolation is used to estimate it:
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	198
				199	.. doctest::
				200
				201	>>> median_grouped([1, 2, 2, 3, 4, 4, 4, 4, 4, 5])
				202	3.7
				203
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	204	Optional argument interval represents the class interval, and defaults
				205	to 1. Changing the class interval naturally will change the interpolation:
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	206
				207	.. doctest::
				208
				209	>>> median_grouped([1, 3, 3, 5, 7], interval=1)
				210	3.25
				211	>>> median_grouped([1, 3, 3, 5, 7], interval=2)
				212	3.5
				213
				214	This function does not check whether the data points are at least
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	215	interval apart.
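
The interpolation can be sketched with the usual textbook formula
``L + interval * (n/2 - cf) / f``, where ``L`` is the lower limit of the
median class, ``cf`` the number of values below that class and ``f`` the
number of values in it. The helper below is hypothetical; it reproduces the
results shown above, but the module's own implementation may differ in
detail:

.. code-block:: python

   from bisect import bisect_left, bisect_right

   def median_grouped_sketch(data, interval=1):
       """Illustrative grouped median (hypothetical helper)."""
       ordered = sorted(data)
       n = len(ordered)
       if n == 0:
           raise ValueError("no median for empty data")
       x = ordered[n // 2]                # midpoint of the median class
       lower = x - interval / 2           # lower limit L of that class
       cf = bisect_left(ordered, x)       # values below the median class
       f = bisect_right(ordered, x) - cf  # values inside the median class
       return lower + interval * (n / 2 - cf) / f

   grouped = median_grouped_sketch([1, 2, 2, 3, 4, 4, 4, 4, 4, 5])
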
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	216
				217	.. impl-detail::
				218
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	219	Under some circumstances, :func:`median_grouped` may coerce data points to
				220	floats. This behaviour is likely to change in the future.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	221
				222	.. seealso::
				223
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	224	* "Statistics for the Behavioral Sciences", Frederick J Gravetter and
				225	Larry B Wallnau (8th Edition).
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	226
Serhiy Storchaka	6dff020	2016-05-07 10:49:07 +0300	[diff] [blame]	227	* Calculating the `median <https://www.ualberta.ca/~opscan/median.html>`_.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	228
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	229	* The `SSMEDIAN
Georg Brandl	525d355	2014-10-29 10:26:56 +0100	[diff] [blame]	230	<https://help.gnome.org/users/gnumeric/stable/gnumeric.html#gnumeric-function-SSMEDIAN>`_
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	231	function in the Gnome Gnumeric spreadsheet, including `this discussion
				232	<https://mail.gnome.org/archives/gnumeric-list/2011-April/msg00018.html>`_.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	233
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	234
				235	.. function:: mode(data)
				236
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	237	Return the most common data point from discrete or nominal data. The mode
				238	(when it exists) is the most typical value, and is a robust measure of
				239	central location.
				240
				241	If data is empty, or if there is not exactly one most common value,
				242	:exc:`StatisticsError` is raised.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	243
				244	``mode`` assumes discrete data, and returns a single value. This is the
				245	standard treatment of the mode as commonly taught in schools:
				246
				247	.. doctest::
				248
				249	>>> mode([1, 1, 2, 3, 3, 3, 3, 4])
				250	3
				251
				252	The mode is unique in that it is the only statistic in this module which
				253	also applies to nominal (non-numeric) data:
				254
				255	.. doctest::
				256
				257	>>> mode(["red", "blue", "blue", "red", "green", "red", "red"])
				258	'red'
				259
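The rules described above (a single most common value, otherwise an error) can
be sketched with :class:`collections.Counter`. The helper ``strict_mode`` is
hypothetical and is not the module's implementation:

.. code-block:: python

   from collections import Counter
   from statistics import StatisticsError

   def strict_mode(data):
       """Return the unique most common value (hypothetical helper)."""
       top_two = Counter(data).most_common(2)
       if not top_two:
           raise StatisticsError("no mode for empty data")
       if len(top_two) == 2 and top_two[0][1] == top_two[1][1]:
           # Two values share the highest count, so there is no unique mode.
           raise StatisticsError("no unique mode")
       return top_two[0][0]

   numeric_mode = strict_mode([1, 1, 2, 3, 3, 3, 3, 4])
   nominal_mode = strict_mode(["red", "blue", "blue", "red", "green", "red", "red"])
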
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	260
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	261	.. function:: pstdev(data, mu=None)
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	262
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	263	Return the population standard deviation (the square root of the population
				264	variance). See :func:`pvariance` for arguments and other details.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	265
				266	.. doctest::
				267
				268	>>> pstdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
				269	0.986893273527251
				270
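As a quick check of the relationship stated above, this sketch compares
:func:`pstdev` with the square root of :func:`pvariance` on the same data:

.. code-block:: python

   import math
   from statistics import pstdev, pvariance

   data = [1.5, 2.5, 2.5, 2.75, 3.25, 4.75]

   sd = pstdev(data)
   # The square root of the population variance should agree with pstdev,
   # up to floating-point rounding.
   root_of_variance = math.sqrt(pvariance(data))
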
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	271
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	272	.. function:: pvariance(data, mu=None)
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	273
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	274	Return the population variance of data, a non-empty iterable of real-valued
				275	numbers. Variance, or second moment about the mean, is a measure of the
				276	variability (spread or dispersion) of data. A large variance indicates that
				277	the data is spread out; a small variance indicates it is clustered closely
				278	around the mean.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	279
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	280	If the optional second argument mu is given, it should be the mean of
				281	data. If it is missing or ``None`` (the default), the mean is
Ned Deily	3586673	2013-10-19 12:10:01 -0700	[diff] [blame]	282	automatically calculated.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	283
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	284	Use this function to calculate the variance from the entire population. To
				285	estimate the variance from a sample, the :func:`variance` function is usually
				286	a better choice.
				287
				288	Raises :exc:`StatisticsError` if data is empty.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	289
				290	Examples:
				291
				292	.. doctest::
				293
				294	>>> data = [0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25]
				295	>>> pvariance(data)
				296	1.25
				297
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	298	If you have already calculated the mean of your data, you can pass it as the
				299	optional second argument mu to avoid recalculation:
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	300
				301	.. doctest::
				302
				303	>>> mu = mean(data)
				304	>>> pvariance(data, mu)
				305	1.25
				306
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	307	This function does not attempt to verify that you have passed the actual mean
				308	as mu. Using arbitrary values for mu may lead to invalid or impossible
				309	results.
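
For reference, the population variance is the mean of the squared deviations
from the mean. A minimal sketch computing it by hand, for comparison with
:func:`pvariance`:

.. code-block:: python

   from statistics import mean, pvariance

   data = [0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25]
   mu = mean(data)

   # Sum of squared deviations from the mean, divided by N.
   by_hand = sum((x - mu) ** 2 for x in data) / len(data)
   from_module = pvariance(data, mu)
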
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	310
				311	Decimals and Fractions are supported:
				312
				313	.. doctest::
				314
				315	>>> from decimal import Decimal as D
				316	>>> pvariance([D("27.5"), D("30.25"), D("30.25"), D("34.5"), D("41.75")])
				317	Decimal('24.815')
				318
				319	>>> from fractions import Fraction as F
				320	>>> pvariance([F(1, 4), F(5, 4), F(1, 2)])
				321	Fraction(13, 72)
				322
				323	.. note::
				324
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	325	When called with the entire population, this gives the population variance
				326	σ². When called on a sample instead, this is the biased sample variance
				327	s², also known as variance with N degrees of freedom.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	328
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	329	If you somehow know the true population mean μ, you may use this function
				330	to calculate the variance of a sample, giving the known population mean as
				331	the second argument. Provided the data points are representative
				332	(e.g. independent and identically distributed), the result will be an
				333	unbiased estimate of the population variance.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	334
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	335
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	336	.. function:: stdev(data, xbar=None)
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	337
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	338	Return the sample standard deviation (the square root of the sample
				339	variance). See :func:`variance` for arguments and other details.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	340
				341	.. doctest::
				342
				343	>>> stdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
				344	1.0810874155219827
				345
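As with the population versions, the sample standard deviation can be checked
against the square root of the sample variance. A minimal sketch:

.. code-block:: python

   import math
   from statistics import stdev, variance

   data = [1.5, 2.5, 2.5, 2.75, 3.25, 4.75]

   sd = stdev(data)
   # Should agree with stdev up to floating-point rounding.
   root_of_variance = math.sqrt(variance(data))
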
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	346
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	347	.. function:: variance(data, xbar=None)
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	348
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	349	Return the sample variance of data, an iterable of at least two real-valued
				350	numbers. Variance, or second moment about the mean, is a measure of the
				351	variability (spread or dispersion) of data. A large variance indicates that
				352	the data is spread out; a small variance indicates it is clustered closely
				353	around the mean.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	354
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	355	If the optional second argument xbar is given, it should be the mean of
				356	data. If it is missing or ``None`` (the default), the mean is
Ned Deily	3586673	2013-10-19 12:10:01 -0700	[diff] [blame]	357	automatically calculated.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	358
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	359	Use this function when your data is a sample from a population. To calculate
				360	the variance from the entire population, see :func:`pvariance`.
				361
				362	Raises :exc:`StatisticsError` if data has fewer than two values.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	363
				364	Examples:
				365
				366	.. doctest::
				367
				368	>>> data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
				369	>>> variance(data)
				370	1.3720238095238095
				371
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	372	If you have already calculated the mean of your data, you can pass it as the
				373	optional second argument xbar to avoid recalculation:
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	374
				375	.. doctest::
				376
				377	>>> m = mean(data)
				378	>>> variance(data, m)
				379	1.3720238095238095
				380
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	381	This function does not attempt to verify that you have passed the actual mean
				382	as xbar. Using arbitrary values for xbar can lead to invalid or
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	383	impossible results.
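
For reference, the sample variance divides the sum of squared deviations by
``N - 1`` rather than ``N``. A minimal sketch computing it by hand, for
comparison with :func:`variance`:

.. code-block:: python

   from statistics import mean, variance

   data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
   m = mean(data)

   # Sum of squared deviations from the sample mean, divided by N - 1.
   by_hand = sum((x - m) ** 2 for x in data) / (len(data) - 1)
   from_module = variance(data, m)
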
				384
				385	Decimal and Fraction values are supported:
				386
				387	.. doctest::
				388
				389	>>> from decimal import Decimal as D
				390	>>> variance([D("27.5"), D("30.25"), D("30.25"), D("34.5"), D("41.75")])
				391	Decimal('31.01875')
				392
				393	>>> from fractions import Fraction as F
				394	>>> variance([F(1, 6), F(1, 2), F(5, 3)])
				395	Fraction(67, 108)
				396
				397	.. note::
				398
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	399	This is the sample variance s² with Bessel's correction, also known as
				400	variance with N-1 degrees of freedom. Provided that the data points are
				401	representative (e.g. independent and identically distributed), the result
				402	should be an unbiased estimate of the true population variance.
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	403
Georg Brandl	eb2aeec	2013-10-21 08:57:26 +0200	[diff] [blame]	404	If you somehow know the actual population mean μ, you should pass it to the
				405	:func:`pvariance` function as the mu parameter to get the variance of a
				406	sample.
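
Bessel's correction amounts to rescaling the ``N``-denominator result by
``N / (N - 1)``. The following sketch checks that relationship between
:func:`pvariance` and :func:`variance` when neither is given an explicit mean:

.. code-block:: python

   from statistics import pvariance, variance

   data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
   n = len(data)

   sample_var = variance(data)
   # Rescale the N-denominator result to the N - 1 denominator.
   rescaled = pvariance(data) * n / (n - 1)
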
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	407
				408	Exceptions
				409	----------
				410
				411	A single exception is defined:
				412
Benjamin Peterson	4ea16e5	2013-10-20 17:52:54 -0400	[diff] [blame]	413	.. exception:: StatisticsError
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	414
Benjamin Peterson	44c3065	2013-10-20 17:52:09 -0400	[diff] [blame]	415	Subclass of :exc:`ValueError` for statistics-related exceptions.
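
Because :exc:`StatisticsError` is a subclass of :exc:`ValueError`, it can be
caught under either name. A minimal sketch, where ``safe_mean`` is a
hypothetical helper and not part of the module:

.. code-block:: python

   from statistics import StatisticsError, mean

   def safe_mean(data, default=None):
       """Return mean(data), or default if the data is empty (hypothetical)."""
       try:
           return mean(data)
       except StatisticsError:
           return default

   empty_result = safe_mean([])
   normal_result = safe_mean([1, 2, 3])
   is_value_error = issubclass(StatisticsError, ValueError)
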
Larry Hastings	f5e987b	2013-10-19 11:50:09 -0700	[diff] [blame]	416
				417	..
				418	# This modeline must appear within the last ten lines of the file.
				419	kate: indent-width 3; remove-trailing-space on; replace-tabs on; encoding utf-8;