:mod:`statistics` --- Mathematical statistics functions
=======================================================

.. module:: statistics
   :synopsis: mathematical statistics functions

.. moduleauthor:: Steven D'Aprano <steve+python@pearwood.info>
.. sectionauthor:: Steven D'Aprano <steve+python@pearwood.info>

.. versionadded:: 3.4

**Source code:** :source:`Lib/statistics.py`

.. testsetup:: *

   from statistics import *
   __name__ = '<doctest>'

--------------

This module provides functions for calculating mathematical statistics of
numeric (:class:`Real`-valued) data.

.. note::

   Unless explicitly noted otherwise, these functions support :class:`int`,
   :class:`float`, :class:`decimal.Decimal` and :class:`fractions.Fraction`.
   Behaviour with other types (whether in the numeric tower or not) is
   currently unsupported.  Mixed types are also undefined and
   implementation-dependent.  If your input data consists of mixed types,
   you may be able to use :func:`map` to ensure a consistent result, e.g.
   ``map(float, input_data)``.

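
As a minimal sketch of the conversion suggested in the note above, mixed
input can be coerced to a single type before it is passed to a function.
The variable names are illustrative only, and converting to :class:`float`
is just one possible choice:

.. code-block:: python

   from fractions import Fraction
   from statistics import mean

   # Hypothetical input that mixes int, float and Fraction values.
   mixed = [1, 2.5, Fraction(3, 4)]

   # Convert every value to float first so the result type is predictable.
   as_floats = list(map(float, mixed))
   result = mean(as_floats)

   assert isinstance(result, float)
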
Averages and measures of central location
-----------------------------------------

These functions calculate an average or typical value from a population
or sample.

=======================  =============================================
:func:`mean`             Arithmetic mean ("average") of data.
:func:`geometric_mean`   Geometric mean of data.
:func:`harmonic_mean`    Harmonic mean of data.
:func:`median`           Median (middle value) of data.
:func:`median_low`       Low median of data.
:func:`median_high`      High median of data.
:func:`median_grouped`   Median, or 50th percentile, of grouped data.
:func:`mode`             Mode (most common value) of discrete data.
=======================  =============================================

Measures of spread
------------------

These functions calculate a measure of how much the population or sample
tends to deviate from the typical or average values.

=======================  =============================================
:func:`pstdev`           Population standard deviation of data.
:func:`pvariance`        Population variance of data.
:func:`stdev`            Sample standard deviation of data.
:func:`variance`         Sample variance of data.
=======================  =============================================


Function details
----------------

Note: The functions do not require the data given to them to be sorted.
However, for reading convenience, most of the examples show sorted sequences.

.. function:: mean(data)

   Return the sample arithmetic mean of *data*, a sequence or iterator of
   real-valued numbers.

   The arithmetic mean is the sum of the data divided by the number of data
   points.  It is commonly called "the average", although it is only one of many
   different mathematical averages.  It is a measure of the central location of
   the data.

   If *data* is empty, :exc:`StatisticsError` will be raised.

   Some examples of use:

   .. doctest::

      >>> mean([1, 2, 3, 4, 4])
      2.8
      >>> mean([-1.0, 2.5, 3.25, 5.75])
      2.625

      >>> from fractions import Fraction as F
      >>> mean([F(3, 7), F(1, 21), F(5, 3), F(1, 3)])
      Fraction(13, 21)

      >>> from decimal import Decimal as D
      >>> mean([D("0.5"), D("0.75"), D("0.625"), D("0.375")])
      Decimal('0.5625')

   .. note::

      The mean is strongly affected by outliers and is not a robust estimator
      for central location: the mean is not necessarily a typical example of the
      data points.  For more robust, although less efficient, measures of
      central location, see :func:`median` and :func:`mode`.  (In this case,
      "efficient" refers to statistical efficiency rather than computational
      efficiency.)

      The sample mean gives an unbiased estimate of the true population mean,
      which means that, taken on average over all the possible samples,
      ``mean(sample)`` converges on the true mean of the entire population.  If
      *data* represents the entire population rather than a sample, then
      ``mean(data)`` is equivalent to calculating the true population mean μ.

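
   As an illustration of the sensitivity to outliers described in the note
   above, the following sketch compares :func:`mean` and :func:`median` on
   made-up data, with and without one extreme value:

   .. code-block:: python

      from statistics import mean, median

      # Hypothetical measurements; the second list adds one extreme value.
      typical = [10, 11, 12, 13, 14]
      with_outlier = typical + [1000]

      # The single outlier drags the mean far away from the bulk of the data ...
      assert mean(typical) == 12
      assert mean(with_outlier) > 100

      # ... while the median barely moves.
      assert median(typical) == 12
      assert median(with_outlier) == 12.5
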

.. function:: geometric_mean(data)

   Return the geometric mean of *data*, a sequence or iterator of
   real-valued numbers.

   The geometric mean is the *n*-th root of the product of *n* data points.
   It is a type of average, a measure of the central location of the data.

   The geometric mean is appropriate when averaging quantities which
   are multiplied together rather than added, for example growth rates.
   Suppose an investment grows by 10% in the first year, falls by 5% in
   the second, then grows by 12% in the third.  What is the average rate
   of growth over the three years?

   .. doctest::

      >>> geometric_mean([1.10, 0.95, 1.12])
      1.0538483123382172

   This gives an average growth of about 5.385% per year.  Using the
   arithmetic mean would give approximately 5.667%, which is too high.

   :exc:`StatisticsError` is raised if *data* is empty, or any
   element is less than zero.

   .. versionadded:: 3.6

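
   The definition above can be sketched in a few lines by averaging
   logarithms, which avoids forming a very large or very small product.
   ``geometric_mean_sketch`` is a hypothetical name used only for this
   illustration; it is not the implementation in :source:`Lib/statistics.py`,
   and unlike the function documented here it also rejects zero and raises a
   plain :exc:`ValueError`:

   .. code-block:: python

      from math import exp, isclose, log

      def geometric_mean_sketch(data):
          """Illustrative only: n-th root of the product, via logarithms."""
          values = list(data)
          if not values:
              raise ValueError("no data points")
          if any(x <= 0 for x in values):
              raise ValueError("this sketch only handles positive values")
          # exp(mean(log(x))) equals the n-th root of the product of the x.
          return exp(sum(log(x) for x in values) / len(values))

      growth = [1.10, 0.95, 1.12]
      rate = geometric_mean_sketch(growth)
      arithmetic = sum(growth) / len(growth)

      assert isclose(rate, 1.0538483123382172)
      assert arithmetic > rate  # the arithmetic mean overstates the growth
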

.. function:: harmonic_mean(data)

   Return the harmonic mean of *data*, a sequence or iterator of
   real-valued numbers.

   The harmonic mean, sometimes called the subcontrary mean, is the
   reciprocal of the arithmetic :func:`mean` of the reciprocals of the
   data.  For example, the harmonic mean of three values *a*, *b* and *c*
   will be equivalent to ``3/(1/a + 1/b + 1/c)``.

   The harmonic mean is a type of average, a measure of the central
   location of the data.  It is often appropriate when averaging quantities
   which are rates or ratios, such as speeds.

   For example, suppose an investor purchases an equal value of shares in
   each of three companies, with P/E (price/earning) ratios of 2.5, 3 and 10.
   What is the average P/E ratio for the investor's portfolio?

   .. doctest::

      >>> harmonic_mean([2.5, 3, 10])  # For an equal investment portfolio.
      3.6

   Using the arithmetic mean would give an average of about 5.167, which
   is too high.

   :exc:`StatisticsError` is raised if *data* is empty, or any element
   is less than zero.

   .. versionadded:: 3.6

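
   The formula ``3/(1/a + 1/b + 1/c)`` generalises to *n* values as *n*
   divided by the sum of the reciprocals.  The following is a minimal sketch
   of that formula under the assumption that all values are strictly
   positive; ``harmonic_mean_sketch`` is a hypothetical name and not the
   module's implementation:

   .. code-block:: python

      from math import isclose

      def harmonic_mean_sketch(data):
          """Illustrative only: n divided by the sum of the reciprocals."""
          values = list(data)
          if not values:
              raise ValueError("no data points")
          if any(x <= 0 for x in values):
              raise ValueError("this sketch only handles positive values")
          return len(values) / sum(1 / x for x in values)

      # The P/E ratios from the portfolio example above.
      pe_ratio = harmonic_mean_sketch([2.5, 3, 10])
      assert isclose(pe_ratio, 3.6)
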

.. function:: median(data)

   Return the median (middle value) of numeric data, using the common "mean of
   middle two" method.  If *data* is empty, :exc:`StatisticsError` is raised.

   The median is a robust measure of central location, and is less affected by
   the presence of outliers in your data.  When the number of data points is
   odd, the middle data point is returned:

   .. doctest::

      >>> median([1, 3, 5])
      3

   When the number of data points is even, the median is interpolated by taking
   the average of the two middle values:

   .. doctest::

      >>> median([1, 3, 5, 7])
      4.0

   This is suitable when your data are discrete and you don't mind that the
   median may not be an actual data point.

   .. seealso:: :func:`median_low`, :func:`median_high`, :func:`median_grouped`


.. function:: median_low(data)

   Return the low median of numeric data.  If *data* is empty,
   :exc:`StatisticsError` is raised.

   The low median is always a member of the data set.  When the number of data
   points is odd, the middle value is returned.  When it is even, the smaller of
   the two middle values is returned.

   .. doctest::

      >>> median_low([1, 3, 5])
      3
      >>> median_low([1, 3, 5, 7])
      3

   Use the low median when your data are discrete and you prefer the median to
   be an actual data point rather than interpolated.


.. function:: median_high(data)

   Return the high median of data.  If *data* is empty, :exc:`StatisticsError`
   is raised.

   The high median is always a member of the data set.  When the number of data
   points is odd, the middle value is returned.  When it is even, the larger of
   the two middle values is returned.

   .. doctest::

      >>> median_high([1, 3, 5])
      3
      >>> median_high([1, 3, 5, 7])
      5

   Use the high median when your data are discrete and you prefer the median to
   be an actual data point rather than interpolated.

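
   To make the difference between the three median functions concrete, this
   short sketch applies :func:`median`, :func:`median_low` and
   :func:`median_high` to the same data.  The variable names are
   illustrative only:

   .. code-block:: python

      from statistics import median, median_high, median_low

      even = [1, 3, 5, 7]

      # With an even number of points, median() interpolates ...
      assert median(even) == 4.0
      assert median(even) not in even

      # ... while the low and high medians are always actual data points.
      assert median_low(even) == 3
      assert median_high(even) == 5

      # With an odd number of points, all three agree.
      odd = [1, 3, 5]
      assert median(odd) == median_low(odd) == median_high(odd) == 3
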

.. function:: median_grouped(data, interval=1)

   Return the median of grouped continuous data, calculated as the 50th
   percentile, using interpolation.  If *data* is empty, :exc:`StatisticsError`
   is raised.

   .. doctest::

      >>> median_grouped([52, 52, 53, 54])
      52.5

   In the following example, the data are rounded, so that each value represents
   the midpoint of data classes, e.g. 1 is the midpoint of the class 0.5-1.5, 2
   is the midpoint of 1.5-2.5, 3 is the midpoint of 2.5-3.5, etc.  With the data
   given, the middle value falls somewhere in the class 3.5-4.5, and
   interpolation is used to estimate it:

   .. doctest::

      >>> median_grouped([1, 2, 2, 3, 4, 4, 4, 4, 4, 5])
      3.7

   Optional argument *interval* represents the class interval, and defaults
   to 1.  Changing the class interval naturally will change the interpolation:

   .. doctest::

      >>> median_grouped([1, 3, 3, 5, 7], interval=1)
      3.25
      >>> median_grouped([1, 3, 3, 5, 7], interval=2)
      3.5

   This function does not check whether the data points are at least
   *interval* apart.

   .. impl-detail::

      Under some circumstances, :func:`median_grouped` may coerce data points to
      floats.  This behaviour is likely to change in the future.

   .. seealso::

      * "Statistics for the Behavioral Sciences", Frederick J Gravetter and
        Larry B Wallnau (8th Edition).

      * Calculating the `median <https://www.ualberta.ca/~opscan/median.html>`_.

      * The `SSMEDIAN
        <https://help.gnome.org/users/gnumeric/stable/gnumeric.html#gnumeric-function-SSMEDIAN>`_
        function in the Gnome Gnumeric spreadsheet, including `this discussion
        <https://mail.gnome.org/archives/gnumeric-list/2011-April/msg00018.html>`_.

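
   The interpolation described for :func:`median_grouped` can be sketched with
   the usual textbook formula ``L + interval * (n/2 - cf) / f``, where ``L``
   is the lower limit of the class containing the median, ``cf`` is the number
   of values below that class, and ``f`` is the number of values in it.  This
   is an assumed formulation for illustration (``median_grouped_sketch`` is a
   hypothetical name), not a copy of the module's code, although it
   reproduces the examples above:

   .. code-block:: python

      from math import isclose

      def median_grouped_sketch(data, interval=1):
          """Illustrative only: 50th percentile of grouped data."""
          values = sorted(data)
          n = len(values)
          if n == 0:
              raise ValueError("no median for empty data")
          x = values[n // 2]              # midpoint of the median class
          lower = x - interval / 2        # lower limit of that class
          below = sum(1 for v in values if v < x)  # cumulative frequency
          within = values.count(x)        # frequency of the median class
          return lower + interval * (n / 2 - below) / within

      assert median_grouped_sketch([52, 52, 53, 54]) == 52.5
      assert isclose(median_grouped_sketch([1, 2, 2, 3, 4, 4, 4, 4, 4, 5]), 3.7)
      assert median_grouped_sketch([1, 3, 3, 5, 7], interval=1) == 3.25
      assert median_grouped_sketch([1, 3, 3, 5, 7], interval=2) == 3.5
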

.. function:: mode(data)

   Return the most common data point from discrete or nominal *data*.  The mode
   (when it exists) is the most typical value, and is a robust measure of
   central location.

   If *data* is empty, or if there is not exactly one most common value,
   :exc:`StatisticsError` is raised.

   ``mode`` assumes discrete data, and returns a single value.  This is the
   standard treatment of the mode as commonly taught in schools:

   .. doctest::

      >>> mode([1, 1, 2, 3, 3, 3, 3, 4])
      3

   The mode is unique in that it is the only statistic which also applies
   to nominal (non-numeric) data:

   .. doctest::

      >>> mode(["red", "blue", "blue", "red", "green", "red", "red"])
      'red'

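
   Because ``mode`` is documented as returning a single value, a caller who
   wants every value tied for most common has to count the data directly.
   The helper below is a hypothetical sketch built on
   :class:`collections.Counter`; it is not part of this module:

   .. code-block:: python

      from collections import Counter

      def all_modes(data):
          """Hypothetical helper: return every value tied for most common."""
          counts = Counter(data)
          if not counts:
              return []
          top = max(counts.values())
          return [value for value, count in counts.items() if count == top]

      # A single most common value, as in the example above.
      assert all_modes([1, 1, 2, 3, 3, 3, 3, 4]) == [3]

      # Two values tied for most common.
      assert sorted(all_modes(["red", "blue", "blue", "red"])) == ["blue", "red"]
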

.. function:: pstdev(data, mu=None)

   Return the population standard deviation (the square root of the population
   variance).  See :func:`pvariance` for arguments and other details.

   .. doctest::

      >>> pstdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
      0.986893273527251


.. function:: pvariance(data, mu=None)

   Return the population variance of *data*, a non-empty iterable of real-valued
   numbers.  Variance, or second moment about the mean, is a measure of the
   variability (spread or dispersion) of data.  A large variance indicates that
   the data is spread out; a small variance indicates it is clustered closely
   around the mean.

   If the optional second argument *mu* is given, it should be the mean of
   *data*.  If it is missing or ``None`` (the default), the mean is
   automatically calculated.

   Use this function to calculate the variance from the entire population.  To
   estimate the variance from a sample, the :func:`variance` function is usually
   a better choice.

   Raises :exc:`StatisticsError` if *data* is empty.

   Examples:

   .. doctest::

      >>> data = [0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25]
      >>> pvariance(data)
      1.25

   If you have already calculated the mean of your data, you can pass it as the
   optional second argument *mu* to avoid recalculation:

   .. doctest::

      >>> mu = mean(data)
      >>> pvariance(data, mu)
      1.25

   This function does not attempt to verify that you have passed the actual mean
   as *mu*.  Using arbitrary values for *mu* may lead to invalid or impossible
   results.

   Decimals and Fractions are supported:

   .. doctest::

      >>> from decimal import Decimal as D
      >>> pvariance([D("27.5"), D("30.25"), D("30.25"), D("34.5"), D("41.75")])
      Decimal('24.815')

      >>> from fractions import Fraction as F
      >>> pvariance([F(1, 4), F(5, 4), F(1, 2)])
      Fraction(13, 72)

   .. note::

      When called with the entire population, this gives the population variance
      σ².  When called on a sample instead, this is the biased sample variance
      s², also known as variance with N degrees of freedom.

      If you somehow know the true population mean μ, you may use this function
      to calculate the variance of a sample, giving the known population mean as
      the second argument.  Provided the data points are representative
      (e.g. independent and identically distributed), the result will be an
      unbiased estimate of the population variance.

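
   As a rough sketch of what "variance with N degrees of freedom" means, the
   population variance is the mean of the squared deviations from the mean.
   ``population_variance_sketch`` is a hypothetical name for this naive
   formula, which is shown only to illustrate the definition and makes no
   attempt at the numerical care a library implementation may take:

   .. code-block:: python

      from math import isclose, sqrt
      from statistics import mean, pstdev, pvariance

      def population_variance_sketch(data):
          """Illustrative only: mean squared deviation from the mean."""
          values = list(data)
          mu = mean(values)  # raises StatisticsError if values is empty
          return sum((x - mu) ** 2 for x in values) / len(values)

      data = [0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25]
      sigma_squared = population_variance_sketch(data)

      assert isclose(sigma_squared, pvariance(data))
      # The population standard deviation is the square root of the variance.
      assert isclose(sqrt(sigma_squared), pstdev(data))
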

.. function:: stdev(data, xbar=None)

   Return the sample standard deviation (the square root of the sample
   variance).  See :func:`variance` for arguments and other details.

   .. doctest::

      >>> stdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
      1.0810874155219827

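
   The relationship between the sample and population measures can be
   sketched as follows.  The sample variance divides the sum of squared
   deviations by ``n - 1`` instead of ``n`` (Bessel's correction), and the
   sample standard deviation is its square root.
   ``sample_variance_sketch`` is a hypothetical name for this naive formula,
   not the module's implementation:

   .. code-block:: python

      from math import isclose, sqrt
      from statistics import mean, pvariance, stdev, variance

      def sample_variance_sketch(data):
          """Illustrative only: squared deviations divided by n - 1."""
          values = list(data)
          n = len(values)
          if n < 2:
              raise ValueError("at least two data points are required")
          xbar = mean(values)
          return sum((x - xbar) ** 2 for x in values) / (n - 1)

      data = [1.5, 2.5, 2.5, 2.75, 3.25, 4.75]
      n = len(data)
      s_squared = sample_variance_sketch(data)

      assert isclose(s_squared, variance(data))
      assert isclose(sqrt(s_squared), stdev(data))
      # Bessel's correction rescales the population variance by n / (n - 1).
      assert isclose(s_squared, pvariance(data) * n / (n - 1))
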

.. function:: variance(data, xbar=None)

   Return the sample variance of *data*, an iterable of at least two real-valued
   numbers.  Variance, or second moment about the mean, is a measure of the
   variability (spread or dispersion) of data.  A large variance indicates that
   the data is spread out; a small variance indicates it is clustered closely
   around the mean.

   If the optional second argument *xbar* is given, it should be the mean of
   *data*.  If it is missing or ``None`` (the default), the mean is
   automatically calculated.

   Use this function when your data is a sample from a population.  To calculate
   the variance from the entire population, see :func:`pvariance`.

   Raises :exc:`StatisticsError` if *data* has fewer than two values.

   Examples:

   .. doctest::

      >>> data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
      >>> variance(data)
      1.3720238095238095

   If you have already calculated the mean of your data, you can pass it as the
   optional second argument *xbar* to avoid recalculation:

   .. doctest::

      >>> m = mean(data)
      >>> variance(data, m)
      1.3720238095238095

   This function does not attempt to verify that you have passed the actual mean
   as *xbar*.  Using arbitrary values for *xbar* can lead to invalid or
   impossible results.

   Decimal and Fraction values are supported:

   .. doctest::

      >>> from decimal import Decimal as D
      >>> variance([D("27.5"), D("30.25"), D("30.25"), D("34.5"), D("41.75")])
      Decimal('31.01875')

      >>> from fractions import Fraction as F
      >>> variance([F(1, 6), F(1, 2), F(5, 3)])
      Fraction(67, 108)

   .. note::

      This is the sample variance s² with Bessel's correction, also known as
      variance with N-1 degrees of freedom.  Provided that the data points are
      representative (e.g. independent and identically distributed), the result
      should be an unbiased estimate of the true population variance.

      If you somehow know the actual population mean μ you should pass it to the
      :func:`pvariance` function as the *mu* parameter to get the variance of a
      sample.

Exceptions
----------

A single exception is defined:

.. exception:: StatisticsError

   Subclass of :exc:`ValueError` for statistics-related exceptions.

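
   A minimal sketch of handling this exception follows.  ``safe_mean`` is a
   hypothetical helper, not part of this module; because
   :exc:`StatisticsError` is a subclass of :exc:`ValueError`, existing
   ``except ValueError`` handlers catch it as well:

   .. code-block:: python

      from statistics import StatisticsError, mean, variance

      def safe_mean(data, default=None):
          """Hypothetical helper: return *default* for empty data."""
          try:
              return mean(data)
          except StatisticsError:
              return default

      assert safe_mean([]) is None
      assert safe_mean([1, 2, 3]) == 2

      # variance() needs at least two values; the error is also a ValueError.
      try:
          variance([1])
      except ValueError as exc:
          caught = exc
      assert isinstance(caught, StatisticsError)
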
..
   # This modeline must appear within the last ten lines of the file.
   kate: indent-width 3; remove-trailing-space on; replace-tabs on; encoding utf-8;