.. XXX: reference/datamodel and this have quite a few overlaps!


.. _bltin-types:

**************
Built-in Types
**************

The following sections describe the standard types that are built into the
interpreter.

.. index:: pair: built-in; types

The principal built-in types are numerics, sequences, mappings, classes,
instances and exceptions.

Some collection classes are mutable. The methods that add, subtract, or
rearrange their members in place, and don't return a specific item, never return
the collection instance itself but ``None``.

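As a small illustration of the rule about in-place methods, the sketch below
uses :class:`list`; the variable names are arbitrary and chosen only for this
example.

```python
items = [3, 1, 2]

# In-place methods mutate the list and return None.
result = items.sort()
appended = items.append(4)

# pop() returns a specific item, so the rule does not apply to it.
last = items.pop()
```

Because of this, chaining such as ``items.sort().append(4)`` fails: the first
call has already returned ``None``.
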
Some operations are supported by several object types; in particular,
practically all objects can be compared, tested for truth value, and converted
to a string (with the :func:`repr` function or the slightly different
:func:`str` function). The latter function is implicitly used when an object is
written by the :func:`print` function.


.. _truth:

Truth Value Testing
===================

.. index::
   statement: if
   statement: while
   pair: truth; value
   pair: Boolean; operations
   single: false

Any object can be tested for truth value, for use in an :keyword:`if` or
:keyword:`while` condition or as operand of the Boolean operations below. The
following values are considered false:

  .. index:: single: None (Built-in object)

* ``None``

  .. index:: single: False (Built-in object)

* ``False``

* zero of any numeric type, for example, ``0``, ``0.0``, ``0j``.

* any empty sequence, for example, ``''``, ``()``, ``[]``.

* any empty mapping, for example, ``{}``.

* instances of user-defined classes, if the class defines a :meth:`__bool__` or
  :meth:`__len__` method, when that method returns the integer zero or
  :class:`bool` value ``False``. [1]_

.. index:: single: true

All other values are considered true --- so objects of many types are always
true.

.. index::
   operator: or
   operator: and
   single: False
   single: True

Operations and built-in functions that have a Boolean result always return ``0``
or ``False`` for false and ``1`` or ``True`` for true, unless otherwise stated.
(Important exception: the Boolean operations ``or`` and ``and`` always return
one of their operands.)


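The truth-testing rules can be checked with a short sketch. ``Basket`` is a
hypothetical class introduced only for this example; it defines
:meth:`__len__` and no :meth:`__bool__`, so its truth value follows its length.

```python
class Basket:
    """Hypothetical container whose truth value follows its length."""

    def __init__(self, items=()):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


# Every value listed above as false, plus an empty Basket.
falsy_values = [None, False, 0, 0.0, 0j, '', (), [], {}, Basket()]
all_false = not any(falsy_values)

full_is_true = bool(Basket(['apple']))

# ``or`` and ``and`` return one of their operands, not necessarily a bool.
name = '' or 'anonymous'
count = [] and 42
```

Here ``name`` ends up as ``'anonymous'`` and ``count`` as the empty list,
because each operator hands back the operand that decided the result.
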
.. _boolean:

Boolean Operations --- :keyword:`and`, :keyword:`or`, :keyword:`not`
====================================================================

.. index:: pair: Boolean; operations

These are the Boolean operations, ordered by ascending priority:

+-------------+---------------------------------+-------+
| Operation   | Result                          | Notes |
+=============+=================================+=======+
| ``x or y``  | if *x* is false, then *y*, else | \(1)  |
|             | *x*                             |       |
+-------------+---------------------------------+-------+
| ``x and y`` | if *x* is false, then *x*, else | \(2)  |
|             | *y*                             |       |
+-------------+---------------------------------+-------+
| ``not x``   | if *x* is false, then ``True``, | \(3)  |
|             | else ``False``                  |       |
+-------------+---------------------------------+-------+

.. index::
   operator: and
   operator: or
   operator: not

Notes:

(1)
   This is a short-circuit operator, so it only evaluates the second
   argument if the first one is false.

(2)
   This is a short-circuit operator, so it only evaluates the second
   argument if the first one is true.

(3)
   ``not`` has a lower priority than non-Boolean operators, so ``not a == b`` is
   interpreted as ``not (a == b)``, and ``a == not b`` is a syntax error.


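Short-circuit evaluation can be observed by recording which operands are
actually evaluated. The helper ``check`` below is hypothetical and exists only
to log its calls.

```python
evaluated = []

def check(name, result):
    """Hypothetical helper: note the call, then return ``result``."""
    evaluated.append(name)
    return result

# ``or`` stops at the first true operand; ``and`` stops at the first false one.
or_result = check('a', True) or check('b', True)
and_result = check('c', False) and check('d', True)

# ``not`` binds more loosely than ``==``.
a, b = 1, 2
negated = not a == b          # parsed as not (a == b)
```

After running this, ``evaluated`` contains only ``'a'`` and ``'c'``: the second
operand of each expression was never called.
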
.. _stdcomparisons:

Comparisons
===========

.. index::
   pair: chaining; comparisons
   pair: operator; comparison
   operator: ==
   operator: <
   operator: <=
   operator: >
   operator: >=
   operator: !=
   operator: is
   operator: is not

There are eight comparison operations in Python. They all have the same
priority (which is higher than that of the Boolean operations). Comparisons can
be chained arbitrarily; for example, ``x < y <= z`` is equivalent to ``x < y and
y <= z``, except that *y* is evaluated only once (but in both cases *z* is not
evaluated at all when ``x < y`` is found to be false).

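The evaluation order of a chained comparison can be made visible with a
logging helper. ``traced`` is a hypothetical function used only in this sketch.

```python
calls = []

def traced(label, value):
    """Hypothetical helper: record that an operand was evaluated."""
    calls.append(label)
    return value

# y is evaluated once even though it takes part in two comparisons.
in_range = traced('x', 1) < traced('y', 5) <= traced('z', 10)
calls_when_true = list(calls)

# z is never evaluated because the first comparison is already false.
calls.clear()
out_of_range = traced('x', 9) < traced('y', 5) <= traced('z', 10)
calls_when_false = list(calls)
```
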
This table summarizes the comparison operations:

+------------+-------------------------+
| Operation  | Meaning                 |
+============+=========================+
| ``<``      | strictly less than      |
+------------+-------------------------+
| ``<=``     | less than or equal      |
+------------+-------------------------+
| ``>``      | strictly greater than   |
+------------+-------------------------+
| ``>=``     | greater than or equal   |
+------------+-------------------------+
| ``==``     | equal                   |
+------------+-------------------------+
| ``!=``     | not equal               |
+------------+-------------------------+
| ``is``     | object identity         |
+------------+-------------------------+
| ``is not`` | negated object identity |
+------------+-------------------------+

.. index::
   pair: object; numeric
   pair: objects; comparing

Objects of different types, except different numeric types, never compare equal.
Furthermore, some types (for example, function objects) support only a degenerate
notion of comparison where any two objects of that type are unequal. The ``<``,
``<=``, ``>`` and ``>=`` operators will raise a :exc:`TypeError` exception when
comparing a complex number with another built-in numeric type, when the objects
are of different types that cannot be compared, or in other cases where there is
no defined ordering.

.. index::
   single: __eq__() (instance method)
   single: __ne__() (instance method)
   single: __lt__() (instance method)
   single: __le__() (instance method)
   single: __gt__() (instance method)
   single: __ge__() (instance method)

Non-identical instances of a class normally compare as non-equal unless the
class defines the :meth:`__eq__` method.

Instances of a class cannot be ordered with respect to other instances of the
same class, or other types of object, unless the class defines enough of the
methods :meth:`__lt__`, :meth:`__le__`, :meth:`__gt__`, and :meth:`__ge__` (in
general, :meth:`__lt__` and :meth:`__eq__` are sufficient, if you want the
conventional meanings of the comparison operators).

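A minimal sketch of a class that defines only :meth:`__eq__` and
:meth:`__lt__` follows. ``Version`` is a hypothetical class invented for this
example. Sorting and ``>`` work with just these two methods, but Python does
not derive ``<=`` or ``>=`` automatically; :func:`functools.total_ordering` is
one way to fill in the remaining methods.

```python
class Version:
    """Hypothetical class that defines only __eq__ and __lt__."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key == other.key

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key < other.key


a, b = Version(1, 2), Version(1, 10)
same_value = a == Version(1, 2)      # uses __eq__
ordered = sorted([b, a])             # sorting relies on __lt__
b_greater = b > a                    # falls back to the reflected a < b

try:
    a <= b                           # __le__ is not defined
    le_supported = True
except TypeError:
    le_supported = False
```
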
The behavior of the :keyword:`is` and :keyword:`is not` operators cannot be
customized; also they can be applied to any two objects and never raise an
exception.

.. index::
   operator: in
   operator: not in

Two more operations with the same syntactic priority, :keyword:`in` and
:keyword:`not in`, are supported by types that are iterable or that implement
the :meth:`__contains__` method, such as the sequence types (below).


.. _typesnumeric:

Numeric Types --- :class:`int`, :class:`float`, :class:`complex`
================================================================

.. index::
   object: numeric
   object: Boolean
   object: integer
   object: floating point
   object: complex number
   pair: C; language

There are three distinct numeric types: :dfn:`integers`, :dfn:`floating
point numbers`, and :dfn:`complex numbers`. In addition, Booleans are a
subtype of integers. Integers have unlimited precision. Floating point
numbers are usually implemented using :c:type:`double` in C; information
about the precision and internal representation of floating point
numbers for the machine on which your program is running is available
in :data:`sys.float_info`. Complex numbers have a real and imaginary
part, which are each a floating point number. To extract these parts
from a complex number *z*, use ``z.real`` and ``z.imag``. (The standard
library includes additional numeric types: :class:`fractions.Fraction`, which
holds rationals, and :class:`decimal.Decimal`, which holds floating-point
numbers with user-definable precision.)

.. index::
   pair: numeric; literals
   pair: integer; literals
   pair: floating point; literals
   pair: complex number; literals
   pair: hexadecimal; literals
   pair: octal; literals
   pair: binary; literals

Numbers are created by numeric literals or as the result of built-in functions
and operators. Unadorned integer literals (including hex, octal and binary
numbers) yield integers. Numeric literals containing a decimal point or an
exponent sign yield floating point numbers. Appending ``'j'`` or ``'J'`` to a
numeric literal yields an imaginary number (a complex number with a zero real
part) which you can add to an integer or float to get a complex number with real
and imaginary parts.

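The literal forms described above can be tried directly. The following sketch
uses arbitrary values and names; it only illustrates which type each literal
produces.

```python
import sys

hex_value, octal_value, binary_value = 0x1F, 0o17, 0b101   # all ints
with_point, with_exponent = 1.0, 1e3                         # floats
imaginary = 2j                                               # zero real part
z = 3 + imaginary                                            # complex number

bool_is_int = issubclass(bool, int)
big = 2 ** 200                       # integers have unlimited precision
float_digits = sys.float_info.dig    # decimal digits a float holds reliably
```
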
.. index::
   single: arithmetic
   builtin: int
   builtin: float
   builtin: complex
   operator: +
   operator: -
   operator: *
   operator: /
   operator: //
   operator: %
   operator: **

Python fully supports mixed arithmetic: when a binary arithmetic operator has
operands of different numeric types, the operand with the "narrower" type is
widened to that of the other, where integer is narrower than floating point,
which is narrower than complex. Comparisons between numbers of mixed type use
the same rule. [2]_ The constructors :func:`int`, :func:`float`, and
:func:`complex` can be used to produce numbers of a specific type.

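A brief sketch of the widening rule and of the three constructors, using
arbitrary example values:

```python
int_plus_float = 1 + 2.5          # int is widened to float
float_times_complex = 2.0 * 1j    # float is widened to complex
mixed_equal = (1 == 1.0 == 1 + 0j)

as_int = int(3.99)                # int() truncates toward zero
as_float = float(7)
as_complex = complex(2, -1)
```
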
All numeric types (except complex) support the following operations, sorted by
ascending priority (all numeric operations have a higher priority than
comparison operations):

+---------------------+---------------------------------+---------+--------------------+
| Operation           | Result                          | Notes   | Full documentation |
+=====================+=================================+=========+====================+
| ``x + y``           | sum of *x* and *y*              |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``x - y``           | difference of *x* and *y*       |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``x * y``           | product of *x* and *y*          |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``x / y``           | quotient of *x* and *y*         |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``x // y``          | floored quotient of *x* and     | \(1)    |                    |
|                     | *y*                             |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``x % y``           | remainder of ``x / y``          | \(2)    |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``-x``              | *x* negated                     |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``+x``              | *x* unchanged                   |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``abs(x)``          | absolute value or magnitude of  |         | :func:`abs`        |
|                     | *x*                             |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``int(x)``          | *x* converted to integer        | \(3)\(6)| :func:`int`        |
+---------------------+---------------------------------+---------+--------------------+
| ``float(x)``        | *x* converted to floating point | \(4)\(6)| :func:`float`      |
+---------------------+---------------------------------+---------+--------------------+
| ``complex(re, im)`` | a complex number with real part | \(6)    | :func:`complex`    |
|                     | *re*, imaginary part *im*.      |         |                    |
|                     | *im* defaults to zero.          |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``c.conjugate()``   | conjugate of the complex number |         |                    |
|                     | *c*                             |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``divmod(x, y)``    | the pair ``(x // y, x % y)``    | \(2)    | :func:`divmod`     |
+---------------------+---------------------------------+---------+--------------------+
| ``pow(x, y)``       | *x* to the power *y*            | \(5)    | :func:`pow`        |
+---------------------+---------------------------------+---------+--------------------+
| ``x ** y``          | *x* to the power *y*            | \(5)    |                    |
+---------------------+---------------------------------+---------+--------------------+

.. index::
   triple: operations on; numeric; types
   single: conjugate() (complex number method)

Notes:

(1)
   Also referred to as integer division. The resultant value is a whole
   integer, though the result's type is not necessarily int. The result is
   always rounded towards minus infinity: ``1//2`` is ``0``, ``(-1)//2`` is
   ``-1``, ``1//(-2)`` is ``-1``, and ``(-1)//(-2)`` is ``0``.

(2)
   Not for complex numbers. Instead convert to floats using :func:`abs` if
   appropriate.

(3)
   .. index::
      module: math
      single: floor() (in module math)
      single: ceil() (in module math)
      single: trunc() (in module math)
      pair: numeric; conversions
      pair: C; language

   Conversion from floating point to integer may round or truncate
   as in C; see functions :func:`math.floor` and :func:`math.ceil` for
   well-defined conversions.

(4)
   float also accepts the strings "nan" and "inf" with an optional prefix "+"
   or "-" for Not a Number (NaN) and positive or negative infinity.

(5)
   Python defines ``pow(0, 0)`` and ``0 ** 0`` to be ``1``, as is common for
   programming languages.

(6)
   The numeric literals accepted include the digits ``0`` to ``9`` or any
   Unicode equivalent (code points with the ``Nd`` property).

   See http://www.unicode.org/Public/8.0.0/ucd/extracted/DerivedNumericType.txt
   for a complete list of code points with the ``Nd`` property.


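Several of the notes above can be confirmed with a few expressions. This is an
illustrative sketch with arbitrary operands, not an exhaustive check.

```python
# Note (1): floor division rounds toward minus infinity.
floor_quotients = (1 // 2, (-1) // 2, 1 // (-2), (-1) // (-2))
float_floor = 7.5 // 2            # whole value, but the type is float

# divmod() pairs the floored quotient with the matching remainder,
# so that x == (x // y) * y + (x % y) holds.
quotient, remainder = divmod(-7, 2)

# Note (5): zero to the power zero is defined as 1.
zero_power = (pow(0, 0), 0 ** 0)

conjugate = (3 + 4j).conjugate()
magnitude = abs(3 + 4j)

# Note (4): float() accepts "nan" and signed "inf" strings.
nan_value = float('nan')
neg_inf = float('-inf')
```
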
All :class:`numbers.Real` types (:class:`int` and :class:`float`) also include
the following operations:

+--------------------+---------------------------------------------+
| Operation          | Result                                      |
+====================+=============================================+
| :func:`math.trunc(\| *x* truncated to :class:`~numbers.Integral` |
| x) <math.trunc>`   |                                             |
+--------------------+---------------------------------------------+
| :func:`round(x[,   | *x* rounded to *n* digits,                  |
| n]) <round>`       | rounding half to even. If *n* is            |
|                    | omitted, it defaults to 0.                  |
+--------------------+---------------------------------------------+
| :func:`math.floor(\| the greatest :class:`~numbers.Integral`     |
| x) <math.floor>`   | <= *x*                                      |
+--------------------+---------------------------------------------+
| :func:`math.ceil(x)| the least :class:`~numbers.Integral` >= *x* |
| <math.ceil>`       |                                             |
+--------------------+---------------------------------------------+

For additional numeric operations see the :mod:`math` and :mod:`cmath`
modules.

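The four rounding operations differ mainly for negative inputs and exact
halves, as this short sketch with arbitrary values shows.

```python
import math

truncated = (math.trunc(2.7), math.trunc(-2.7))   # toward zero
floored = (math.floor(2.7), math.floor(-2.7))     # toward minus infinity
ceiled = (math.ceil(2.2), math.ceil(-2.2))        # toward plus infinity

# round() sends exact halves to the nearest even integer.
halves = (round(0.5), round(1.5), round(2.5))
two_digits = round(3.14159, 2)
```
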
.. XXXJH exceptions: overflow (when? what operations?) zerodivision


.. _bitstring-ops:

Bitwise Operations on Integer Types
-----------------------------------

.. index::
   triple: operations on; integer; types
   pair: bitwise; operations
   pair: shifting; operations
   pair: masking; operations
   operator: ^
   operator: &
   operator: <<
   operator: >>

Bitwise operations only make sense for integers. Negative numbers are treated
as their 2's complement value (this assumes that there are enough bits so that
no overflow occurs during the operation).

The priorities of the binary bitwise operations are all lower than the numeric
operations and higher than the comparisons; the unary operation ``~`` has the
same priority as the other unary numeric operations (``+`` and ``-``).

This table lists the bitwise operations sorted in ascending priority:

+------------+--------------------------------+----------+
| Operation  | Result                         | Notes    |
+============+================================+==========+
| ``x | y``  | bitwise :dfn:`or` of *x* and   |          |
|            | *y*                            |          |
+------------+--------------------------------+----------+
| ``x ^ y``  | bitwise :dfn:`exclusive or` of |          |
|            | *x* and *y*                    |          |
+------------+--------------------------------+----------+
| ``x & y``  | bitwise :dfn:`and` of *x* and  |          |
|            | *y*                            |          |
+------------+--------------------------------+----------+
| ``x << n`` | *x* shifted left by *n* bits   | (1)(2)   |
+------------+--------------------------------+----------+
| ``x >> n`` | *x* shifted right by *n* bits  | (1)(3)   |
+------------+--------------------------------+----------+
| ``~x``     | the bits of *x* inverted       |          |
+------------+--------------------------------+----------+

Notes:

(1)
   Negative shift counts are illegal and cause a :exc:`ValueError` to be raised.

(2)
   A left shift by *n* bits is equivalent to multiplication by ``pow(2, n)``
   without overflow check.

(3)
   A right shift by *n* bits is equivalent to floor division by ``pow(2, n)``
   without overflow check.


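The equivalences in the notes can be verified directly. The operands below are
arbitrary small integers chosen so the bit patterns are easy to read.

```python
x = 0b1100
y = 0b1010

or_, xor, and_ = x | y, x ^ y, x & y

shift_left_matches = (x << 3) == x * pow(2, 3)
# For negative numbers a right shift rounds toward minus infinity,
# just like floor division.
shift_right_matches = (-13 >> 2) == -13 // pow(2, 2)

# Two's complement view: inverting the bits of x gives -x - 1.
inverted = ~x

try:
    1 << -1
    negative_shift_error = False
except ValueError:
    negative_shift_error = True
```
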
Additional Methods on Integer Types
-----------------------------------

The int type implements the :class:`numbers.Integral` :term:`abstract base
class`. In addition, it provides a few more methods:

.. method:: int.bit_length()

    Return the number of bits necessary to represent an integer in binary,
    excluding the sign and leading zeros::

        >>> n = -37
        >>> bin(n)
        '-0b100101'
        >>> n.bit_length()
        6

    More precisely, if ``x`` is nonzero, then ``x.bit_length()`` is the
    unique positive integer ``k`` such that ``2**(k-1) <= abs(x) < 2**k``.
    Equivalently, when ``abs(x)`` is small enough to have a correctly
    rounded logarithm, then ``k = 1 + int(log(abs(x), 2))``.
    If ``x`` is zero, then ``x.bit_length()`` returns ``0``.

    Equivalent to::

        def bit_length(self):
            s = bin(self)       # binary representation:  bin(-37) --> '-0b100101'
            s = s.lstrip('-0b') # remove leading zeros and minus sign
            return len(s)       # len('100101') --> 6

    .. versionadded:: 3.1

.. method:: int.to_bytes(length, byteorder, \*, signed=False)

    Return an array of bytes representing an integer.

        >>> (1024).to_bytes(2, byteorder='big')
        b'\x04\x00'
        >>> (1024).to_bytes(10, byteorder='big')
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00'
        >>> (-1024).to_bytes(10, byteorder='big', signed=True)
        b'\xff\xff\xff\xff\xff\xff\xff\xff\xfc\x00'
        >>> x = 1000
        >>> x.to_bytes((x.bit_length() + 7) // 8, byteorder='little')
        b'\xe8\x03'

    The integer is represented using *length* bytes. An :exc:`OverflowError`
    is raised if the integer is not representable with the given number of
    bytes.

    The *byteorder* argument determines the byte order used to represent the
    integer. If *byteorder* is ``"big"``, the most significant byte is at the
    beginning of the byte array. If *byteorder* is ``"little"``, the most
    significant byte is at the end of the byte array. To request the native
    byte order of the host system, use :data:`sys.byteorder` as the byte order
    value.

    The *signed* argument determines whether two's complement is used to
    represent the integer. If *signed* is ``False`` and a negative integer is
    given, an :exc:`OverflowError` is raised. The default value for *signed*
    is ``False``.

    .. versionadded:: 3.2

.. classmethod:: int.from_bytes(bytes, byteorder, \*, signed=False)

    Return the integer represented by the given array of bytes.

        >>> int.from_bytes(b'\x00\x10', byteorder='big')
        16
        >>> int.from_bytes(b'\x00\x10', byteorder='little')
        4096
        >>> int.from_bytes(b'\xfc\x00', byteorder='big', signed=True)
        -1024
        >>> int.from_bytes(b'\xfc\x00', byteorder='big', signed=False)
        64512
        >>> int.from_bytes([255, 0, 0], byteorder='big')
        16711680

    The argument *bytes* must either be a :term:`bytes-like object` or an
    iterable producing bytes.

    The *byteorder* argument determines the byte order used to represent the
    integer. If *byteorder* is ``"big"``, the most significant byte is at the
    beginning of the byte array. If *byteorder* is ``"little"``, the most
    significant byte is at the end of the byte array. To request the native
    byte order of the host system, use :data:`sys.byteorder` as the byte order
    value.

    The *signed* argument indicates whether two's complement is used to
    represent the integer.

    .. versionadded:: 3.2


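The two methods are inverses of each other when called with the same
*byteorder* and *signed* arguments. The sketch below round-trips a negative
value; the sizing expression for ``length`` is an assumption of this example
(one extra bit reserved for the sign), not something the methods require.

```python
import sys

value = -1024

# A byte count large enough for the value in two's complement:
# bit_length() bits for the magnitude plus one sign bit, rounded up.
length = (value.bit_length() + 8) // 8

encoded = value.to_bytes(length, byteorder='big', signed=True)
decoded = int.from_bytes(encoded, byteorder='big', signed=True)

# sys.byteorder selects the host's native order ('little' or 'big').
native = (1).to_bytes(2, byteorder=sys.byteorder)

try:
    (256).to_bytes(1, byteorder='big')   # 256 does not fit in one byte
    overflowed = False
except OverflowError:
    overflowed = True
```
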
Additional Methods on Float
---------------------------

The float type implements the :class:`numbers.Real` :term:`abstract base
class`. float also has the following additional methods.

.. method:: float.as_integer_ratio()

   Return a pair of integers whose ratio is exactly equal to the
   original float and with a positive denominator. Raises
   :exc:`OverflowError` on infinities and a :exc:`ValueError` on
   NaNs.

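As an illustration, :meth:`float.as_integer_ratio` exposes the exact binary
value behind a float. The values below are arbitrary; the use of
:class:`fractions.Fraction` is only a convenient cross-check.

```python
from fractions import Fraction

exact = (0.75).as_integer_ratio()     # 0.75 is exactly representable
approx = (0.1).as_integer_ratio()     # 0.1 is not, so the ratio is large

# The ratio reproduces the stored float exactly.
round_trip = approx[0] / approx[1] == 0.1
as_fraction = Fraction(*approx) == Fraction(0.1)

try:
    float('inf').as_integer_ratio()
    inf_error = None
except OverflowError as exc:
    inf_error = type(exc).__name__
```
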
.. method:: float.is_integer()

   Return ``True`` if the float instance is finite with integral
   value, and ``False`` otherwise::

      >>> (-2.0).is_integer()
      True
      >>> (3.2).is_integer()
      False

Two methods support conversion to
and from hexadecimal strings. Since Python's floats are stored
internally as binary numbers, converting a float to or from a
*decimal* string usually involves a small rounding error. In
contrast, hexadecimal strings allow exact representation and
specification of floating-point numbers. This can be useful when
debugging, and in numerical work.


.. method:: float.hex()

   Return a representation of a floating-point number as a hexadecimal
   string. For finite floating-point numbers, this representation
   will always include a leading ``0x`` and a trailing ``p`` and
   exponent.


.. classmethod:: float.fromhex(s)

   Class method to return the float represented by a hexadecimal
   string *s*. The string *s* may have leading and trailing
   whitespace.


Note that :meth:`float.hex` is an instance method, while
:meth:`float.fromhex` is a class method.

A hexadecimal string takes the form::

   [sign] ['0x'] integer ['.' fraction] ['p' exponent]

where the optional ``sign`` may be either ``+`` or ``-``, ``integer``
and ``fraction`` are strings of hexadecimal digits, and ``exponent``
is a decimal integer with an optional leading sign. Case is not
significant, and there must be at least one hexadecimal digit in
either the integer or the fraction. This syntax is similar to the
syntax specified in section 6.4.4.2 of the C99 standard, and also to
the syntax used in Java 1.5 onwards. In particular, the output of
:meth:`float.hex` is usable as a hexadecimal floating-point literal in
C or Java code, and hexadecimal strings produced by C's ``%a`` format
character or Java's ``Double.toHexString`` are accepted by
:meth:`float.fromhex`.


Note that the exponent is written in decimal rather than hexadecimal,
and that it gives the power of 2 by which to multiply the coefficient.
For example, the hexadecimal string ``0x3.a7p10`` represents the
floating-point number ``(3 + 10./16 + 7./16**2) * 2.0**10``, or
``3740.0``::

   >>> float.fromhex('0x3.a7p10')
   3740.0


Applying the reverse conversion to ``3740.0`` gives a different
hexadecimal string representing the same number::

   >>> float.hex(3740.0)
   '0x1.d380000000000p+11'


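A short round-trip sketch ties the two methods together. The inputs are
arbitrary; the last line relies on the ``0x`` prefix being optional and on
surrounding whitespace being allowed, as stated above.

```python
value = 3740.0

as_hex = value.hex()                       # instance method
back = float.fromhex(as_hex)               # class method
from_short_form = float.fromhex('0x3.a7p10')

# The hexadecimal form spells out every bit of the stored value,
# so converting it back loses nothing.
tenth_hex = (0.1).hex()
tenth_round_trip = float.fromhex(tenth_hex) == 0.1

# Optional parts may be omitted; surrounding whitespace is allowed.
no_prefix = float.fromhex('  1.8p1 ')
```
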
.. _numeric-hash:

Hashing of numeric types
------------------------

For numbers ``x`` and ``y``, possibly of different types, it's a requirement
that ``hash(x) == hash(y)`` whenever ``x == y`` (see the :meth:`__hash__`
method documentation for more details). For ease of implementation and
efficiency across a variety of numeric types (including :class:`int`,
:class:`float`, :class:`decimal.Decimal` and :class:`fractions.Fraction`)
Python's hash for numeric types is based on a single mathematical function
that's defined for any rational number, and hence applies to all instances of
:class:`int` and :class:`fractions.Fraction`, and all finite instances of
:class:`float` and :class:`decimal.Decimal`. Essentially, this function is
given by reduction modulo ``P`` for a fixed prime ``P``. The value of ``P`` is
made available to Python as the :attr:`modulus` attribute of
:data:`sys.hash_info`.

.. impl-detail::

   Currently, the prime used is ``P = 2**31 - 1`` on machines with 32-bit C
   longs and ``P = 2**61 - 1`` on machines with 64-bit C longs.

Here are the rules in detail:

- If ``x = m / n`` is a nonnegative rational number and ``n`` is not divisible
  by ``P``, define ``hash(x)`` as ``m * invmod(n, P) % P``, where ``invmod(n,
  P)`` gives the inverse of ``n`` modulo ``P``.

- If ``x = m / n`` is a nonnegative rational number and ``n`` is
  divisible by ``P`` (but ``m`` is not) then ``n`` has no inverse
  modulo ``P`` and the rule above doesn't apply; in this case define
  ``hash(x)`` to be the constant value ``sys.hash_info.inf``.

- If ``x = m / n`` is a negative rational number define ``hash(x)``
  as ``-hash(-x)``. If the resulting hash is ``-1``, replace it with
  ``-2``.

- The particular values ``sys.hash_info.inf``, ``-sys.hash_info.inf``
  and ``sys.hash_info.nan`` are used as hash values for positive
  infinity, negative infinity, or nans (respectively). (All hashable
  nans have the same hash value.)

- For a :class:`complex` number ``z``, the hash values of the real
  and imaginary parts are combined by computing ``hash(z.real) +
  sys.hash_info.imag * hash(z.imag)``, reduced modulo
  ``2**sys.hash_info.width`` so that it lies in
  ``range(-2**(sys.hash_info.width - 1), 2**(sys.hash_info.width -
  1))``. Again, if the result is ``-1``, it's replaced with ``-2``.


675To clarify the above rules, here's some example Python code,
Nick Coghlan273069c2012-08-20 17:14:07 +1000676equivalent to the built-in hash, for computing the hash of a rational
Mark Dickinsondc787d22010-05-23 13:33:13 +0000677number, :class:`float`, or :class:`complex`::
678
679
680 import sys, math
681
682 def hash_fraction(m, n):
683 """Compute the hash of a rational number m / n.
684
685 Assumes m and n are integers, with n positive.
686 Equivalent to hash(fractions.Fraction(m, n)).
687
688 """
689 P = sys.hash_info.modulus
690 # Remove common factors of P. (Unnecessary if m and n already coprime.)
691 while m % P == n % P == 0:
692 m, n = m // P, n // P
693
694 if n % P == 0:
Berker Peksagaa46bd42016-07-25 04:55:51 +0300695 hash_value = sys.hash_info.inf
Mark Dickinsondc787d22010-05-23 13:33:13 +0000696 else:
697 # Fermat's Little Theorem: pow(n, P-1, P) is 1, so
698 # pow(n, P-2, P) gives the inverse of n modulo P.
Berker Peksagaa46bd42016-07-25 04:55:51 +0300699 hash_value = (abs(m) % P) * pow(n, P - 2, P) % P
Mark Dickinsondc787d22010-05-23 13:33:13 +0000700 if m < 0:
Berker Peksagaa46bd42016-07-25 04:55:51 +0300701 hash_value = -hash_value
702 if hash_value == -1:
703 hash_value = -2
704 return hash_value
Mark Dickinsondc787d22010-05-23 13:33:13 +0000705
706 def hash_float(x):
707 """Compute the hash of a float x."""
708
709 if math.isnan(x):
710 return sys.hash_info.nan
711 elif math.isinf(x):
712 return sys.hash_info.inf if x > 0 else -sys.hash_info.inf
713 else:
714 return hash_fraction(*x.as_integer_ratio())
715
716 def hash_complex(z):
717 """Compute the hash of a complex number z."""
718
Berker Peksagaa46bd42016-07-25 04:55:51 +0300719 hash_value = hash_float(z.real) + sys.hash_info.imag * hash_float(z.imag)
Mark Dickinsondc787d22010-05-23 13:33:13 +0000720 # do a signed reduction modulo 2**sys.hash_info.width
721 M = 2**(sys.hash_info.width - 1)
Berker Peksagaa46bd42016-07-25 04:55:51 +0300722 hash_value = (hash_value & (M - 1)) - (hash_value & M)
723 if hash_value == -1:
724 hash_value = -2
725 return hash_value
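
The invariant described in this section can be spot-checked against the
built-in ``hash``. The following is a minimal sketch that uses only the
standard library; the variable names are illustrative and not part of any API.

```python
import sys
from decimal import Decimal
from fractions import Fraction

P = sys.hash_info.modulus

# Equal numbers of different types must produce equal hashes.
one_half = [0.5, Fraction(1, 2), Decimal("0.5"), complex(0.5, 0)]
distinct_hashes = {hash(value) for value in one_half}

# hash(1/2) is the inverse of 2 modulo P, as the first rule states.
half_hash = hash(0.5)
expected_half_hash = pow(2, P - 2, P)

# Integers are reduced modulo P, so P itself hashes to 0.
modulus_hash = hash(P)

# -1 is never returned as a hash value; -2 is used instead.
minus_one_hash = hash(-1)

# Positive infinity uses the dedicated constant.
inf_hash = hash(float("inf"))
```

NaN values are left out on purpose, because their hashing is the case most
likely to differ between Python versions.
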

.. _typeiter:

Iterator Types
==============

.. index::
   single: iterator protocol
   single: protocol; iterator
   single: sequence; iteration
   single: container; iteration over

Python supports a concept of iteration over containers. This is implemented
using two distinct methods; these are used to allow user-defined classes to
support iteration. Sequences, described below in more detail, always support
the iteration methods.

One method needs to be defined for container objects to provide iteration
support:

.. XXX duplicated in reference/datamodel!

.. method:: container.__iter__()

   Return an iterator object. The object is required to support the iterator
   protocol described below. If a container supports different types of
   iteration, additional methods can be provided to specifically request
   iterators for those iteration types. (An example of an object supporting
   multiple forms of iteration would be a tree structure which supports both
   breadth-first and depth-first traversal.) This method corresponds to the
   :c:member:`~PyTypeObject.tp_iter` slot of the type structure for Python
   objects in the Python/C API.

The iterator objects themselves are required to support the following two
methods, which together form the :dfn:`iterator protocol`:


.. method:: iterator.__iter__()

   Return the iterator object itself. This is required to allow both containers
   and iterators to be used with the :keyword:`for` and :keyword:`in` statements.
   This method corresponds to the :c:member:`~PyTypeObject.tp_iter` slot of the
   type structure for Python objects in the Python/C API.


.. method:: iterator.__next__()

   Return the next item from the container. If there are no further items, raise
   the :exc:`StopIteration` exception. This method corresponds to the
   :c:member:`~PyTypeObject.tp_iternext` slot of the type structure for Python
   objects in the Python/C API.

Python defines several iterator objects to support iteration over general and
specific sequence types, dictionaries, and other more specialized forms. The
specific types are not important beyond their implementation of the iterator
protocol.

Once an iterator's :meth:`~iterator.__next__` method raises
:exc:`StopIteration`, it must continue to do so on subsequent calls.
Implementations that do not obey this property are deemed broken.
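
The protocol can be sketched with two small classes. ``Countdown`` and
``CountdownIterator`` are hypothetical names used only for illustration; the
sketch assumes a container that hands out a fresh iterator on every call to
``__iter__``.

```python
class CountdownIterator:
    """Iterator yielding start, start - 1, ..., 1 (hypothetical example)."""

    def __init__(self, start):
        self._current = start

    def __iter__(self):
        # An iterator returns itself, so it works wherever an iterable does.
        return self

    def __next__(self):
        if self._current <= 0:
            # Once exhausted, every later call raises StopIteration again.
            raise StopIteration
        value = self._current
        self._current -= 1
        return value


class Countdown:
    """Container whose __iter__ returns a new iterator each time."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


countdown = Countdown(3)
first_pass = list(countdown)     # [3, 2, 1]
second_pass = list(countdown)    # [3, 2, 1] again: the container is reusable

iterator = iter(countdown)
for _ in iterator:
    pass                         # exhaust the iterator
```

Keeping the iteration state in a separate iterator object is what allows the
container to be traversed more than once.
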


.. _generator-types:

Generator Types
---------------

Python's :term:`generator`\s provide a convenient way to implement the iterator
protocol. If a container object's :meth:`__iter__` method is implemented as a
generator, it will automatically return an iterator object (technically, a
generator object) supplying the :meth:`__iter__` and :meth:`~generator.__next__`
methods.
More information about generators can be found in :ref:`the documentation for
the yield expression <yieldexpr>`.
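
As an illustration, the sketch below implements ``__iter__`` as a generator on
a hypothetical ``Tree`` class (the class and its attributes are assumptions
made for this example). No separate iterator class is needed.

```python
class Tree:
    """Minimal binary tree node (hypothetical example class)."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

    def __iter__(self):
        # Because this method contains yield, calling it returns a generator
        # object, which already provides __iter__ and __next__.
        if self.left is not None:
            yield from self.left
        yield self.value
        if self.right is not None:
            yield from self.right


tree = Tree(2, Tree(1), Tree(3))
in_order = list(tree)            # [1, 2, 3]
generator = iter(tree)
first_item = next(generator)     # 1
```
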


.. _typesseq:

Sequence Types --- :class:`list`, :class:`tuple`, :class:`range`
================================================================

There are three basic sequence types: lists, tuples, and range objects.
Additional sequence types tailored for processing of
:ref:`binary data <binaryseq>` and :ref:`text strings <textseq>` are
described in dedicated sections.


.. _typesseq-common:

Common Sequence Operations
--------------------------

.. index:: object: sequence

The operations in the following table are supported by most sequence types,
both mutable and immutable. The :class:`collections.abc.Sequence` ABC is
provided to make it easier to correctly implement these operations on
custom sequence types.

This table lists the sequence operations sorted in ascending priority. In the
table, *s* and *t* are sequences of the same type, *n*, *i*, *j* and *k* are
integers and *x* is an arbitrary object that meets any type and value
restrictions imposed by *s*.

The ``in`` and ``not in`` operations have the same priorities as the
comparison operations. The ``+`` (concatenation) and ``*`` (repetition)
operations have the same priority as the corresponding numeric operations.

.. index::
   triple: operations on; sequence; types
   builtin: len
   builtin: min
   builtin: max
   pair: concatenation; operation
   pair: repetition; operation
   pair: subscript; operation
   pair: slice; operation
   operator: in
   operator: not in
   single: count() (sequence method)
   single: index() (sequence method)

+--------------------------+--------------------------------+----------+
| Operation                | Result                         | Notes    |
+==========================+================================+==========+
| ``x in s``               | ``True`` if an item of *s* is  | \(1)     |
|                          | equal to *x*, else ``False``   |          |
+--------------------------+--------------------------------+----------+
| ``x not in s``           | ``False`` if an item of *s* is | \(1)     |
|                          | equal to *x*, else ``True``    |          |
+--------------------------+--------------------------------+----------+
| ``s + t``                | the concatenation of *s* and   | (6)(7)   |
|                          | *t*                            |          |
+--------------------------+--------------------------------+----------+
| ``s * n`` or             | equivalent to adding *s* to    | (2)(7)   |
| ``n * s``                | itself *n* times               |          |
+--------------------------+--------------------------------+----------+
| ``s[i]``                 | *i*\ th item of *s*, origin 0  | \(3)     |
+--------------------------+--------------------------------+----------+
| ``s[i:j]``               | slice of *s* from *i* to *j*   | (3)(4)   |
+--------------------------+--------------------------------+----------+
| ``s[i:j:k]``             | slice of *s* from *i* to *j*   | (3)(5)   |
|                          | with step *k*                  |          |
+--------------------------+--------------------------------+----------+
| ``len(s)``               | length of *s*                  |          |
+--------------------------+--------------------------------+----------+
| ``min(s)``               | smallest item of *s*           |          |
+--------------------------+--------------------------------+----------+
| ``max(s)``               | largest item of *s*            |          |
+--------------------------+--------------------------------+----------+
| ``s.index(x[, i[, j]])`` | index of the first occurrence  | \(8)     |
|                          | of *x* in *s* (at or after     |          |
|                          | index *i* and before index *j*)|          |
+--------------------------+--------------------------------+----------+
| ``s.count(x)``           | total number of occurrences of |          |
|                          | *x* in *s*                     |          |
+--------------------------+--------------------------------+----------+

Sequences of the same type also support comparisons. In particular, tuples
and lists are compared lexicographically by comparing corresponding elements.
This means that to compare equal, every element must compare equal and the
two sequences must be of the same type and have the same length. (For full
details see :ref:`comparisons` in the language reference.)

Notes:

(1)
   While the ``in`` and ``not in`` operations are used only for simple
   containment testing in the general case, some specialised sequences
   (such as :class:`str`, :class:`bytes` and :class:`bytearray`) also use
   them for subsequence testing::

      >>> "gg" in "eggs"
      True

(2)
   Values of *n* less than ``0`` are treated as ``0`` (which yields an empty
   sequence of the same type as *s*). Note that items in the sequence *s*
   are not copied; they are referenced multiple times. This often haunts
   new Python programmers; consider::

      >>> lists = [[]] * 3
      >>> lists
      [[], [], []]
      >>> lists[0].append(3)
      >>> lists
      [[3], [3], [3]]

   What has happened is that ``[[]]`` is a one-element list containing an empty
   list, so all three elements of ``[[]] * 3`` are references to this single empty
   list. Modifying any of the elements of ``lists`` modifies this single list.
   You can create a list of different lists this way::

      >>> lists = [[] for i in range(3)]
      >>> lists[0].append(3)
      >>> lists[1].append(5)
      >>> lists[2].append(7)
      >>> lists
      [[3], [5], [7]]

   Further explanation is available in the FAQ entry
   :ref:`faq-multidimensional-list`.

(3)
   If *i* or *j* is negative, the index is relative to the end of sequence *s*:
   ``len(s) + i`` or ``len(s) + j`` is substituted. But note that ``-0`` is
   still ``0``.

(4)
   The slice of *s* from *i* to *j* is defined as the sequence of items with index
   *k* such that ``i <= k < j``. If *i* or *j* is greater than ``len(s)``, use
   ``len(s)``. If *i* is omitted or ``None``, use ``0``. If *j* is omitted or
   ``None``, use ``len(s)``. If *i* is greater than or equal to *j*, the slice is
   empty.

(5)
   The slice of *s* from *i* to *j* with step *k* is defined as the sequence of
   items with index ``x = i + n*k`` such that ``0 <= n < (j-i)/k``. In other words,
   the indices are ``i``, ``i+k``, ``i+2*k``, ``i+3*k`` and so on, stopping when
   *j* is reached (but never including *j*). When *k* is positive,
   *i* and *j* are reduced to ``len(s)`` if they are greater.
   When *k* is negative, *i* and *j* are reduced to ``len(s) - 1`` if
   they are greater. If *i* or *j* are omitted or ``None``, they become
   "end" values (which end depends on the sign of *k*). Note, *k* cannot be zero.
   If *k* is ``None``, it is treated like ``1``.

(6)
   Concatenating immutable sequences always results in a new object. This
   means that building up a sequence by repeated concatenation will have a
   quadratic runtime cost in the total sequence length. To get a linear
   runtime cost, you must switch to one of the alternatives below:

   * if concatenating :class:`str` objects, you can build a list and use
     :meth:`str.join` at the end or else write to an :class:`io.StringIO`
     instance and retrieve its value when complete

   * if concatenating :class:`bytes` objects, you can similarly use
     :meth:`bytes.join` or :class:`io.BytesIO`, or you can do in-place
     concatenation with a :class:`bytearray` object. :class:`bytearray`
     objects are mutable and have an efficient overallocation mechanism

   * if concatenating :class:`tuple` objects, extend a :class:`list` instead

   * for other types, investigate the relevant class documentation


(7)
   Some sequence types (such as :class:`range`) only support item sequences
   that follow specific patterns, and hence don't support sequence
   concatenation or repetition.

(8)
   ``index`` raises :exc:`ValueError` when *x* is not found in *s*.
   When supported, the additional arguments to the index method allow
   efficient searching of subsections of the sequence. Passing the extra
   arguments is roughly equivalent to using ``s[i:j].index(x)``, only
   without copying any data and with the returned index being relative to
   the start of the sequence rather than the start of the slice.
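
Several of the notes above are easier to follow with concrete values. The
sketch below works through notes 3, 4, 5, 6 and 8 on a small list and a short
string; all variable names are illustrative.

```python
import io

s = list(range(10))

# Note 3: a negative index i is replaced by len(s) + i.
last = s[-1]                        # same item as s[len(s) - 1]

# Note 4: bounds beyond len(s) are clipped; i >= j gives an empty slice.
tail = s[7:100]                     # [7, 8, 9]
nothing = s[5:2]                    # []

# Note 5: extended slices take every k-th item and never include index j.
every_third = s[1:8:3]              # [1, 4, 7]
backwards = s[100:6:-1]             # [9, 8, 7]; i is clipped to len(s) - 1

# Note 8: index() with bounds searches a subsection without copying it.
letters = "abcabc"
found = letters.index("b", 2, 6)    # 4, relative to the whole string
via_slice = 2 + letters[2:6].index("b")

# Note 6: build a string in linear time with str.join or io.StringIO.
words = ["spam", "eggs", "ham"]
joined = "".join(words)
buffer = io.StringIO()
for word in words:
    buffer.write(word)
text = buffer.getvalue()

# Note 6: a bytearray supports efficient in-place concatenation.
payload = bytearray()
for chunk in (b"ab", b"cd", b"ef"):
    payload += chunk
```
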


.. _typesseq-immutable:

Immutable Sequence Types
------------------------

.. index::
   triple: immutable; sequence; types
   object: tuple
   builtin: hash

The only operation that immutable sequence types generally implement that is
not also implemented by mutable sequence types is support for the :func:`hash`
built-in.

This support allows immutable sequences, such as :class:`tuple` instances, to
be used as :class:`dict` keys and stored in :class:`set` and :class:`frozenset`
instances.

Attempting to hash an immutable sequence that contains unhashable values will
result in :exc:`TypeError`.
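
A short sketch of both behaviours, using tuples as keys and set members and
then attempting to hash a tuple that holds a list. The names are illustrative.

```python
grid = {(0, 0): "origin", (1, 0): "east"}    # tuples as dict keys
visited = {(0, 0), (1, 0)}                   # tuples as set members
frozen = frozenset([(0, 0)])

try:
    hash((1, [2, 3]))        # the inner list is unhashable
    hash_error = None
except TypeError as exc:
    hash_error = exc
```
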


.. _typesseq-mutable:

Mutable Sequence Types
----------------------

.. index::
   triple: mutable; sequence; types
   object: list
   object: bytearray

The operations in the following table are defined on mutable sequence types.
The :class:`collections.abc.MutableSequence` ABC is provided to make it
easier to correctly implement these operations on custom sequence types.

In the table *s* is an instance of a mutable sequence type, *t* is any
iterable object and *x* is an arbitrary object that meets any type
and value restrictions imposed by *s* (for example, :class:`bytearray` only
accepts integers that meet the value restriction ``0 <= x <= 255``).


.. index::
   triple: operations on; sequence; types
   triple: operations on; list; type
   pair: subscript; assignment
   pair: slice; assignment
   statement: del
   single: append() (sequence method)
   single: clear() (sequence method)
   single: copy() (sequence method)
   single: extend() (sequence method)
   single: insert() (sequence method)
   single: pop() (sequence method)
   single: remove() (sequence method)
   single: reverse() (sequence method)

+------------------------------+--------------------------------+---------------------+
| Operation                    | Result                         | Notes               |
+==============================+================================+=====================+
| ``s[i] = x``                 | item *i* of *s* is replaced by |                     |
|                              | *x*                            |                     |
+------------------------------+--------------------------------+---------------------+
| ``s[i:j] = t``               | slice of *s* from *i* to *j*   |                     |
|                              | is replaced by the contents of |                     |
|                              | the iterable *t*               |                     |
+------------------------------+--------------------------------+---------------------+
| ``del s[i:j]``               | same as ``s[i:j] = []``        |                     |
+------------------------------+--------------------------------+---------------------+
| ``s[i:j:k] = t``             | the elements of ``s[i:j:k]``   | \(1)                |
|                              | are replaced by those of *t*   |                     |
+------------------------------+--------------------------------+---------------------+
| ``del s[i:j:k]``             | removes the elements of        |                     |
|                              | ``s[i:j:k]`` from the list     |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.append(x)``              | appends *x* to the end of the  |                     |
|                              | sequence (same as              |                     |
|                              | ``s[len(s):len(s)] = [x]``)    |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.clear()``                | removes all items from ``s``   | \(5)                |
|                              | (same as ``del s[:]``)         |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.copy()``                 | creates a shallow copy of ``s``| \(5)                |
|                              | (same as ``s[:]``)             |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.extend(t)`` or           | extends *s* with the           |                     |
| ``s += t``                   | contents of *t* (for the       |                     |
|                              | most part the same as          |                     |
|                              | ``s[len(s):len(s)] = t``)      |                     |
+------------------------------+--------------------------------+---------------------+
| ``s *= n``                   | updates *s* with its contents  | \(6)                |
|                              | repeated *n* times             |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.insert(i, x)``           | inserts *x* into *s* at the    |                     |
|                              | index given by *i*             |                     |
|                              | (same as ``s[i:i] = [x]``)     |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.pop([i])``               | retrieves the item at *i* and  | \(2)                |
|                              | also removes it from *s*       |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.remove(x)``              | removes the first item from *s*| \(3)                |
|                              | where ``s[i] == x``            |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.reverse()``              | reverses the items of *s* in   | \(4)                |
|                              | place                          |                     |
+------------------------------+--------------------------------+---------------------+


Notes:

(1)
   *t* must have the same length as the slice it is replacing.

(2)
   The optional argument *i* defaults to ``-1``, so that by default the last
   item is removed and returned.

(3)
   ``remove`` raises :exc:`ValueError` when *x* is not found in *s*.

(4)
   The :meth:`reverse` method modifies the sequence in place for economy of
   space when reversing a large sequence. To remind users that it operates by
   side effect, it does not return the reversed sequence.

(5)
   :meth:`clear` and :meth:`!copy` are included for consistency with the
   interfaces of mutable containers that don't support slicing operations
   (such as :class:`dict` and :class:`set`).

   .. versionadded:: 3.3
      :meth:`clear` and :meth:`!copy` methods.

(6)
   The value *n* is an integer, or an object implementing
   :meth:`~object.__index__`. Zero and negative values of *n* clear
   the sequence. Items in the sequence are not copied; they are referenced
   multiple times, as explained for ``s * n`` under :ref:`typesseq-common`.
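
The equivalences in the table and the restrictions in the notes can be tried
out on a small list and a :class:`bytearray`. This is a sketch with
illustrative names, not an exhaustive test of every operation.

```python
s = [1, 2, 3]

s[len(s):len(s)] = [4]       # same effect as s.append(4)
s[1:1] = [99]                # same effect as s.insert(1, 99)
after_inserts = list(s)      # [1, 99, 2, 3, 4]

# Note 1: an extended-slice assignment needs a replacement of equal length.
s[::2] = "abc"               # replaces the items at indices 0, 2 and 4
after_slice = list(s)        # ['a', 99, 'b', 3, 'c']
try:
    s[::2] = [0]             # one item for a three-item slice
    length_error = None
except ValueError as exc:
    length_error = exc

# Note 2: pop() removes and returns the last item by default.
popped = s.pop()             # 'c'

# Note 3: remove() raises ValueError when the item is absent.
try:
    s.remove("missing")
    remove_error = None
except ValueError as exc:
    remove_error = exc

# Note 6: zero or negative repetition clears the sequence.
s *= 0                       # s is now []

# bytearray only accepts integers x with 0 <= x <= 255.
data = bytearray(b"ab")
data.append(99)              # data is now bytearray(b'abc')
try:
    data.append(256)
    byte_error = None
except ValueError as exc:
    byte_error = exc
```
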


.. _typesseq-list:

Lists
-----

.. index:: object: list

Lists are mutable sequences, typically used to store collections of
homogeneous items (where the precise degree of similarity will vary by
application).

.. class:: list([iterable])

   Lists may be constructed in several ways:

   * Using a pair of square brackets to denote the empty list: ``[]``
   * Using square brackets, separating items with commas: ``[a]``, ``[a, b, c]``
   * Using a list comprehension: ``[x for x in iterable]``
   * Using the type constructor: ``list()`` or ``list(iterable)``

   The constructor builds a list whose items are the same and in the same
   order as *iterable*'s items. *iterable* may be either a sequence, a
   container that supports iteration, or an iterator object. If *iterable*
   is already a list, a copy is made and returned, similar to ``iterable[:]``.
   For example, ``list('abc')`` returns ``['a', 'b', 'c']`` and
   ``list( (1, 2, 3) )`` returns ``[1, 2, 3]``.
   If no argument is given, the constructor creates a new empty list, ``[]``.


   Many other operations also produce lists, including the :func:`sorted`
   built-in.

   Lists implement all of the :ref:`common <typesseq-common>` and
   :ref:`mutable <typesseq-mutable>` sequence operations. Lists also provide the
   following additional method:

   .. method:: list.sort(*, key=None, reverse=False)

      This method sorts the list in place, using only ``<`` comparisons
      between items. Exceptions are not suppressed; if any comparison operation
      fails, the entire sort operation will fail (and the list will likely be
      left in a partially modified state).

      :meth:`sort` accepts two arguments that can only be passed by keyword
      (:ref:`keyword-only arguments <keyword-only_parameter>`):

      *key* specifies a function of one argument that is used to extract a
      comparison key from each list element (for example, ``key=str.lower``).
      The key corresponding to each item in the list is calculated once and
      then used for the entire sorting process. The default value of ``None``
      means that list items are sorted directly without calculating a separate
      key value.

      The :func:`functools.cmp_to_key` utility is available to convert a 2.x
      style *cmp* function to a *key* function.

      *reverse* is a boolean value. If set to ``True``, then the list elements
      are sorted as if each comparison were reversed.

      This method modifies the sequence in place for economy of space when
      sorting a large sequence. To remind users that it operates by side
      effect, it does not return the sorted sequence (use :func:`sorted` to
      explicitly request a new sorted list instance).

      The :meth:`sort` method is guaranteed to be stable. A sort is stable if it
      guarantees not to change the relative order of elements that compare equal
      --- this is helpful for sorting in multiple passes (for example, sort by
      department, then by salary grade).

      .. impl-detail::

         While a list is being sorted, the effect of attempting to mutate, or even
         inspect, the list is undefined. The C implementation of Python makes the
         list appear empty for the duration, and raises :exc:`ValueError` if it can
         detect that the list has been mutated during a sort.
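
The following sketch exercises *key*, *reverse*, stability and
:func:`functools.cmp_to_key`. The sample records and the ``by_length``
comparison function are made up for the example.

```python
from functools import cmp_to_key

# key: compare the lower-cased form of each string.
names = ["bob", "Alice", "carol"]
names.sort(key=str.lower)                   # ['Alice', 'bob', 'carol']
result = names.sort(reverse=True)           # sorts in place and returns None

# Stability: sort by the secondary key first, then by the primary key.
# Records are (department, salary grade, name).
records = [("dev", 3, "Ann"), ("ops", 2, "Bob"),
           ("dev", 2, "Cy"), ("ops", 3, "Di")]
records.sort(key=lambda record: record[1])  # salary grade
records.sort(key=lambda record: record[0])  # department; grade order survives


def by_length(a, b):
    """Old-style comparison: negative, zero or positive result."""
    return len(a) - len(b)


words = ["ccc", "a", "bb"]
words.sort(key=cmp_to_key(by_length))       # ['a', 'bb', 'ccc']
```

Because the second pass is stable, records within each department stay ordered
by salary grade.
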


.. _typesseq-tuple:

Tuples
------

.. index:: object: tuple

Tuples are immutable sequences, typically used to store collections of
heterogeneous data (such as the 2-tuples produced by the :func:`enumerate`
built-in). Tuples are also used for cases where an immutable sequence of
homogeneous data is needed (such as allowing storage in a :class:`set` or
:class:`dict` instance).

.. class:: tuple([iterable])

   Tuples may be constructed in a number of ways:

   * Using a pair of parentheses to denote the empty tuple: ``()``
   * Using a trailing comma for a singleton tuple: ``a,`` or ``(a,)``
   * Separating items with commas: ``a, b, c`` or ``(a, b, c)``
   * Using the :func:`tuple` built-in: ``tuple()`` or ``tuple(iterable)``

   The constructor builds a tuple whose items are the same and in the same
   order as *iterable*'s items. *iterable* may be either a sequence, a
   container that supports iteration, or an iterator object. If *iterable*
   is already a tuple, it is returned unchanged. For example,
   ``tuple('abc')`` returns ``('a', 'b', 'c')`` and
   ``tuple( [1, 2, 3] )`` returns ``(1, 2, 3)``.
   If no argument is given, the constructor creates a new empty tuple, ``()``.

   Note that it is actually the comma which makes a tuple, not the parentheses.
   The parentheses are optional, except in the empty tuple case, or
   when they are needed to avoid syntactic ambiguity. For example,
   ``f(a, b, c)`` is a function call with three arguments, while
   ``f((a, b, c))`` is a function call with a 3-tuple as the sole argument.

   Tuples implement all of the :ref:`common <typesseq-common>` sequence
   operations.

For heterogeneous collections of data where access by name is clearer than
access by index, :func:`collections.namedtuple` may be a more appropriate
choice than a simple tuple object.
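
The points about commas, parentheses and named access can be seen in a few
lines. ``count_args`` and ``Point`` are hypothetical names introduced only for
this sketch.

```python
from collections import namedtuple

single = 1,                      # the trailing comma makes a one-item tuple
grouped = (1)                    # parentheses alone only group: the int 1
empty = ()                       # the empty tuple does need parentheses


def count_args(*args):
    """Hypothetical helper: report how many positional arguments arrived."""
    return len(args)


three = count_args(1, 2, 3)      # three separate arguments
one = count_args((1, 2, 3))      # one argument, a 3-tuple

from_list = tuple([1, 2, 3])     # (1, 2, 3)

# Named fields can be clearer than positional access.
Point = namedtuple("Point", ["x", "y"])
p = Point(x=1, y=2)
```

A ``Point`` is still a tuple, so ``p[0]`` and ``p.x`` refer to the same item.
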


.. _typesseq-range:

Ranges
------

.. index:: object: range

The :class:`range` type represents an immutable sequence of numbers and is
commonly used for looping a specific number of times in :keyword:`for`
loops.

.. class:: range(stop)
           range(start, stop[, step])

   The arguments to the range constructor must be integers (either built-in
   :class:`int` or any object that implements the ``__index__`` special
   method). If the *step* argument is omitted, it defaults to ``1``.
   If the *start* argument is omitted, it defaults to ``0``.
   If *step* is zero, :exc:`ValueError` is raised.

   For a positive *step*, the contents of a range ``r`` are determined by the
   formula ``r[i] = start + step*i`` where ``i >= 0`` and
   ``r[i] < stop``.

   For a negative *step*, the contents of the range are still determined by
   the formula ``r[i] = start + step*i``, but the constraints are ``i >= 0``
   and ``r[i] > stop``.

   A range object will be empty if ``r[0]`` does not meet the value
   constraint. Ranges do support negative indices, but these are interpreted
   as indexing from the end of the sequence determined by the positive
   indices.

   Ranges containing absolute values larger than :data:`sys.maxsize` are
   permitted but some features (such as :func:`len`) may raise
   :exc:`OverflowError`.

   Range examples::

      >>> list(range(10))
      [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
      >>> list(range(1, 11))
      [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
      >>> list(range(0, 30, 5))
      [0, 5, 10, 15, 20, 25]
      >>> list(range(0, 10, 3))
      [0, 3, 6, 9]
      >>> list(range(0, -10, -1))
      [0, -1, -2, -3, -4, -5, -6, -7, -8, -9]
      >>> list(range(0))
      []
      >>> list(range(1, 0))
      []

   Ranges implement all of the :ref:`common <typesseq-common>` sequence operations
   except concatenation and repetition (due to the fact that range objects can
   only represent sequences that follow a strict pattern and repetition and
   concatenation will usually violate that pattern).

   .. attribute:: start

      The value of the *start* parameter (or ``0`` if the parameter was
      not supplied)

   .. attribute:: stop

      The value of the *stop* parameter

   .. attribute:: step

      The value of the *step* parameter (or ``1`` if the parameter was
      not supplied)

The advantage of the :class:`range` type over a regular :class:`list` or
:class:`tuple` is that a :class:`range` object will always take the same
(small) amount of memory, no matter the size of the range it represents (as it
only stores the ``start``, ``stop`` and ``step`` values, calculating individual
items and subranges as needed).
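
The memory claim can be checked informally with :func:`sys.getsizeof`. This
is only a sketch: the reported sizes are a CPython implementation detail and
are assumed, not guaranteed, to match on other implementations.

```python
import sys

small = range(10)
large = range(0, 10**9, 3)

# Only start, stop and step are stored, so both objects report the same size
# (observed on CPython; treat this as an implementation detail).
print(sys.getsizeof(small) == sys.getsizeof(large))

# Items are computed on demand from the formula r[i] = start + step*i.
i = 10**6
print(large[i] == large.start + large.step * i)
```

By contrast, ``list(large)`` would have to materialize every item up front.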

Range objects implement the :class:`collections.abc.Sequence` ABC, and provide
features such as containment tests, element index lookup, slicing and
support for negative indices (see :ref:`typesseq`):

   >>> r = range(0, 20, 2)
   >>> r
   range(0, 20, 2)
   >>> 11 in r
   False
   >>> 10 in r
   True
   >>> r.index(10)
   5
   >>> r[5]
   10
   >>> r[:5]
   range(0, 10, 2)
   >>> r[-1]
   18

Testing range objects for equality with ``==`` and ``!=`` compares
them as sequences. That is, two range objects are considered equal if
they represent the same sequence of values. (Note that two range
objects that compare equal might have different :attr:`~range.start`,
:attr:`~range.stop` and :attr:`~range.step` attributes, for example
``range(0) == range(2, 1, 3)`` or ``range(0, 3, 2) == range(0, 4, 2)``.)
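
The two equalities mentioned above can be verified directly. The variable
names in this short sketch are illustrative only:

```python
a = range(0, 3, 2)
b = range(0, 4, 2)

print(list(a), list(b))          # both represent the sequence 0, 2
print(a == b)                    # equal as sequences
print(a.stop == b.stop)          # although the stop attributes differ

# Two empty ranges are equal regardless of their attributes.
print(range(0) == range(2, 1, 3))

# Equal ranges also hash equally, so they behave as one key in a set or dict.
print(len({a, b}))
```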

.. versionchanged:: 3.2
   Implement the Sequence ABC.
   Support slicing and negative indices.
   Test :class:`int` objects for membership in constant time instead of
   iterating through all items.

.. versionchanged:: 3.3
   Define '==' and '!=' to compare range objects based on the
   sequence of values they define (instead of comparing based on
   object identity).

.. versionadded:: 3.3
   The :attr:`~range.start`, :attr:`~range.stop` and :attr:`~range.step`
   attributes.

.. seealso::

   * The `linspace recipe <http://code.activestate.com/recipes/579000/>`_
     shows how to implement a lazy version of range that is suitable for
     floating point applications.

.. index::
   single: string; text sequence type
   single: str (built-in class); (see also string)
   object: string

.. _textseq:

Text Sequence Type --- :class:`str`
===================================

Textual data in Python is handled with :class:`str` objects, or :dfn:`strings`.
Strings are immutable :ref:`sequences <typesseq>` of Unicode code points.
String literals are written in a variety of ways:

* Single quotes: ``'allows embedded "double" quotes'``
* Double quotes: ``"allows embedded 'single' quotes"``
* Triple quoted: ``'''Three single quotes'''``, ``"""Three double quotes"""``

Triple quoted strings may span multiple lines; all associated whitespace will
be included in the string literal.

String literals that are part of a single expression and have only whitespace
between them will be implicitly converted to a single string literal. That
is, ``("spam " "eggs") == "spam eggs"``.

See :ref:`strings` for more about the various forms of string literal,
including supported escape sequences, and the ``r`` ("raw") prefix that
disables most escape sequence processing.

Strings may also be created from other objects using the :class:`str`
constructor.

Since there is no separate "character" type, indexing a string produces
strings of length 1. That is, for a non-empty string *s*, ``s[0] == s[0:1]``.

.. index::
   object: io.StringIO

There is also no mutable string type, but :meth:`str.join` or
:class:`io.StringIO` can be used to efficiently construct strings from
multiple fragments.
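
A minimal sketch of both approaches, building the same string from a list of
fragments (the fragment values are arbitrary example data):

```python
import io

fragments = ['spam', 'eggs', 'ham']

# str.join builds the result in one pass over the fragments.
joined = ', '.join(fragments)

# io.StringIO acts as an in-memory text buffer that can be written to
# incrementally, which suits output produced piece by piece.
buffer = io.StringIO()
for position, fragment in enumerate(fragments):
    if position:
        buffer.write(', ')
    buffer.write(fragment)
built = buffer.getvalue()

print(joined)
print(built == joined)
```

:meth:`str.join` is usually the simpler choice when all fragments are already
available in an iterable.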

.. versionchanged:: 3.3
   For backwards compatibility with the Python 2 series, the ``u`` prefix is
   once again permitted on string literals. It has no effect on the meaning
   of string literals and cannot be combined with the ``r`` prefix.


.. index::
   single: string; str (built-in class)

.. class:: str(object='')
           str(object=b'', encoding='utf-8', errors='strict')

   Return a :ref:`string <textseq>` version of *object*. If *object* is not
   provided, returns the empty string. Otherwise, the behavior of ``str()``
   depends on whether *encoding* or *errors* is given, as follows.

   If neither *encoding* nor *errors* is given, ``str(object)`` returns
   :meth:`object.__str__() <object.__str__>`, which is the "informal" or nicely
   printable string representation of *object*. For string objects, this is
   the string itself. If *object* does not have a :meth:`~object.__str__`
   method, then :func:`str` falls back to returning
   :func:`repr(object) <repr>`.

   .. index::
      single: buffer protocol; str (built-in class)
      single: bytes; str (built-in class)

   If at least one of *encoding* or *errors* is given, *object* should be a
   :term:`bytes-like object` (e.g. :class:`bytes` or :class:`bytearray`). In
   this case, if *object* is a :class:`bytes` (or :class:`bytearray`) object,
   then ``str(bytes, encoding, errors)`` is equivalent to
   :meth:`bytes.decode(encoding, errors) <bytes.decode>`. Otherwise, the bytes
   object underlying the buffer object is obtained before calling
   :meth:`bytes.decode`. See :ref:`binaryseq` and
   :ref:`bufferobjects` for information on buffer objects.
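
The decoding form can be illustrated as follows. This is a small sketch; the
byte values are arbitrary example data:

```python
data = b'caf\xc3\xa9'                 # UTF-8 encoding of 'caf\xe9'

text = str(data, 'utf-8')
same = data.decode('utf-8')           # the equivalent bytes.decode() call

# A bytearray is accepted in the same way.
from_array = str(bytearray(data), encoding='utf-8')

# With errors='replace', undecodable bytes become U+FFFD instead of raising.
lossy = str(b'caf\xe9', 'ascii', 'replace')

print(text == same == from_array)
print(ascii(lossy))
```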

   Passing a :class:`bytes` object to :func:`str` without the *encoding*
   or *errors* arguments falls under the first case of returning the informal
   string representation (see also the :option:`-b` command-line option to
   Python). For example::

      >>> str(b'Zoot!')
      "b'Zoot!'"

   For more information on the ``str`` class and its methods, see
   :ref:`textseq` and the :ref:`string-methods` section below. To output
   formatted strings, see the :ref:`f-strings` and :ref:`formatstrings`
   sections. In addition, see the :ref:`stringservices` section.


.. index::
   pair: string; methods

.. _string-methods:

String Methods
--------------

.. index::
   module: re

Strings implement all of the :ref:`common <typesseq-common>` sequence
operations, along with the additional methods described below.

Strings also support two styles of string formatting, one providing a large
degree of flexibility and customization (see :meth:`str.format`,
:ref:`formatstrings` and :ref:`string-formatting`) and the other based on C
``printf`` style formatting that handles a narrower range of types and is
slightly harder to use correctly, but is often faster for the cases it can
handle (:ref:`old-string-formatting`).

The :ref:`textservices` section of the standard library covers a number of
other modules that provide various text related utilities (including regular
expression support in the :mod:`re` module).

.. method:: str.capitalize()

   Return a copy of the string with its first character capitalized and the
   rest lowercased.


.. method:: str.casefold()

   Return a casefolded copy of the string. Casefolded strings may be used for
   caseless matching.

   Casefolding is similar to lowercasing but more aggressive because it is
   intended to remove all case distinctions in a string. For example, the German
   lowercase letter ``'ß'`` is equivalent to ``"ss"``. Since it is already
   lowercase, :meth:`lower` would do nothing to ``'ß'``; :meth:`casefold`
   converts it to ``"ss"``.

   The casefolding algorithm is described in section 3.13 of the Unicode
   Standard.
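
The difference between lowercasing and casefolding can be seen in a short
comparison. ``caseless_equal`` is a hypothetical helper written for this
sketch, not a library function:

```python
def caseless_equal(left, right):
    """Hypothetical helper: compare two strings while ignoring case."""
    return left.casefold() == right.casefold()


print('ß'.lower())                                # ß (already lowercase)
print('ß'.casefold())                             # ss
print('Straße'.lower() == 'STRASSE'.lower())      # False
print(caseless_equal('Straße', 'STRASSE'))        # True
```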

   .. versionadded:: 3.3


.. method:: str.center(width[, fillchar])

   Return the string centered in a string of length *width*. Padding is done
   using the specified *fillchar* (default is an ASCII space). The original
   string is returned if *width* is less than or equal to ``len(s)``.
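
For instance (a small sketch using an arbitrary three-character string):

```python
title = 'abc'

print(repr(title.center(7)))     # '  abc  '
print(title.center(7, '*'))      # **abc**
print(title.center(2))           # abc (width <= len(title), so unchanged)
```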


.. method:: str.count(sub[, start[, end]])

   Return the number of non-overlapping occurrences of substring *sub* in the
   slice ``s[start:end]``. Optional arguments *start* and *end* are
   interpreted as in slice notation.
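
A few calls showing the non-overlapping count and the slice-style bounds
(the sample strings are arbitrary):

```python
print('aaaa'.count('aa'))           # 2: matches do not overlap
print('banana'.count('an'))         # 2
print('banana'.count('an', 2))      # 1: only 'nana' is searched
print('banana'.count('a', 0, 3))    # 1: only 'ban' is searched
```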


.. method:: str.encode(encoding="utf-8", errors="strict")

   Return an encoded version of the string as a bytes object. Default encoding
   is ``'utf-8'``. *errors* may be given to set a different error handling scheme.
   The default for *errors* is ``'strict'``, meaning that encoding errors raise
   a :exc:`UnicodeError`. Other possible
   values are ``'ignore'``, ``'replace'``, ``'xmlcharrefreplace'``,
   ``'backslashreplace'`` and any other name registered via
   :func:`codecs.register_error`, see section :ref:`error-handlers`. For a
   list of possible encodings, see section :ref:`standard-encodings`.
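
The effect of the built-in error handlers can be compared on a string that
contains one character outside ASCII. This is an illustrative sketch with an
arbitrary sample string:

```python
text = 'caf\xe9'    # ends with U+00E9, LATIN SMALL LETTER E WITH ACUTE

print(text.encode())                                     # b'caf\xc3\xa9'
print(text.encode('ascii', errors='ignore'))             # b'caf'
print(text.encode('ascii', errors='replace'))            # b'caf?'
print(text.encode('ascii', errors='xmlcharrefreplace'))  # b'caf&#233;'
print(text.encode('ascii', errors='backslashreplace'))   # b'caf\\xe9'

# With the default 'strict' handler the same call raises a UnicodeError
# subclass instead.
try:
    text.encode('ascii')
except UnicodeError as exc:
    print(type(exc).__name__)                            # UnicodeEncodeError
```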

   .. versionchanged:: 3.1
      Support for keyword arguments added.


.. method:: str.endswith(suffix[, start[, end]])

   Return ``True`` if the string ends with the specified *suffix*, otherwise return
   ``False``. *suffix* can also be a tuple of suffixes to look for. With optional
   *start*, test beginning at that position. With optional *end*, stop comparing
   at that position.
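
A short sketch of the tuple form and of the optional bounds (the file name is
an arbitrary example):

```python
filename = 'report.txt'

print(filename.endswith('.txt'))              # True
print(filename.endswith(('.rst', '.txt')))    # True: any suffix in the tuple
print(filename.endswith('port', 0, 6))        # True: tests only 'report'
print(filename.endswith('.txt', 0, 6))        # False
```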


.. method:: str.expandtabs(tabsize=8)

   Return a copy of the string where all tab characters are replaced by one or
   more spaces, depending on the current column and the given tab size. Tab
   positions occur every *tabsize* characters (default is 8, giving tab
   positions at columns 0, 8, 16 and so on). To expand the string, the current
   column is set to zero and the string is examined character by character. If
   the character is a tab (``\t``), one or more space characters are inserted
   in the result until the current column is equal to the next tab position.
   (The tab character itself is not copied.) If the character is a newline
   (``\n``) or return (``\r``), it is copied and the current column is reset to
   zero. Any other character is copied unchanged and the current column is
   incremented by one regardless of how the character is represented when
   printed.

      >>> '01\t012\t0123\t01234'.expandtabs()
      '01      012     0123    01234'
      >>> '01\t012\t0123\t01234'.expandtabs(4)
      '01  012 0123    01234'
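
The column-tracking procedure described above can be sketched in pure Python.
``expand_tabs`` is a hypothetical re-implementation written for illustration
only; it is not how the built-in method is implemented, and it assumes a
positive *tabsize*:

```python
def expand_tabs(text, tabsize=8):
    """Illustrative re-implementation of the steps described above."""
    result = []
    column = 0
    for char in text:
        if char == '\t':
            # Pad with spaces up to the next tab position; the tab itself
            # is not copied.
            padding = tabsize - (column % tabsize)
            result.append(' ' * padding)
            column += padding
        elif char in '\n\r':
            # Line breaks are copied and reset the current column.
            result.append(char)
            column = 0
        else:
            # Any other character advances the column by exactly one.
            result.append(char)
            column += 1
    return ''.join(result)


sample = '01\t012\t0123\t01234'
print(expand_tabs(sample) == sample.expandtabs())
print(expand_tabs(sample, 4) == sample.expandtabs(4))
```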


.. method:: str.find(sub[, start[, end]])

   Return the lowest index in the string where substring *sub* is found within
   the slice ``s[start:end]``. Optional arguments *start* and *end* are
   interpreted as in slice notation. Return ``-1`` if *sub* is not found.

   .. note::

      The :meth:`~str.find` method should be used only if you need to know the
      position of *sub*. To check if *sub* is a substring or not, use the
      :keyword:`in` operator::

         >>> 'Py' in 'Python'
         True


.. method:: str.format(*args, **kwargs)

   Perform a string formatting operation. The string on which this method is
   called can contain literal text or replacement fields delimited by braces
   ``{}``. Each replacement field contains either the numeric index of a
   positional argument, or the name of a keyword argument. Returns a copy of
   the string where each replacement field is replaced with the string value of
   the corresponding argument.

      >>> "The sum of 1 + 2 is {0}".format(1+2)
      'The sum of 1 + 2 is 3'

   See :ref:`formatstrings` for a description of the various formatting options
   that can be specified in format strings.


.. method:: str.format_map(mapping)

   Similar to ``str.format(**mapping)``, except that ``mapping`` is
   used directly and not copied to a :class:`dict`. This is useful
   if for example ``mapping`` is a dict subclass:

   >>> class Default(dict):
   ...     def __missing__(self, key):
   ...         return key
   ...
   >>> '{name} was born in {country}'.format_map(Default(name='Guido'))
   'Guido was born in country'

   .. versionadded:: 3.2


.. method:: str.index(sub[, start[, end]])

   Like :meth:`~str.find`, but raise :exc:`ValueError` when the substring is
   not found.
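
The difference between the two methods shows up only when the substring is
missing, as in this short sketch:

```python
text = 'Python'

print(text.find('th'))       # 2
print(text.find('th', 3))    # -1: 'th' does not occur in text[3:]
print(text.find('xyz'))      # -1

# str.index reports the same positions but raises on a miss.
print(text.index('th'))      # 2
try:
    text.index('xyz')
except ValueError as exc:
    print('ValueError:', exc)
```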


.. method:: str.isalnum()

   Return true if all characters in the string are alphanumeric and there is at
   least one character, false otherwise. A character ``c`` is alphanumeric if one
   of the following returns ``True``: ``c.isalpha()``, ``c.isdecimal()``,
   ``c.isdigit()``, or ``c.isnumeric()``.


.. method:: str.isalpha()

   Return true if all characters in the string are alphabetic and there is at least
   one character, false otherwise. Alphabetic characters are those characters defined
   in the Unicode character database as "Letter", i.e., those with general category
   property being one of "Lm", "Lt", "Lu", "Ll", or "Lo". Note that this is different
   from the "Alphabetic" property defined in the Unicode Standard.


.. method:: str.isdecimal()

   Return true if all characters in the string are decimal
   characters and there is at least one character, false
   otherwise. Decimal characters are those that can be used to form
   numbers in base 10, e.g. U+0660, ARABIC-INDIC DIGIT
   ZERO. Formally a decimal character is a character in the Unicode
   General Category "Nd".


.. method:: str.isdigit()

   Return true if all characters in the string are digits and there is at least one
   character, false otherwise. Digits include decimal characters and digits that need
   special handling, such as the compatibility superscript digits.
   This covers digits which cannot be used to form numbers in base 10,
   like the Kharosthi numbers. Formally, a digit is a character that has the
   property value Numeric_Type=Digit or Numeric_Type=Decimal.


.. method:: str.isidentifier()

   Return true if the string is a valid identifier according to the language
   definition, section :ref:`identifiers`.

   Use :func:`keyword.iskeyword` to test for reserved identifiers such as
   :keyword:`def` and :keyword:`class`.
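
Because reserved words are themselves valid identifiers, the two checks are
often combined. ``is_usable_name`` is a hypothetical helper written for this
sketch:

```python
import keyword


def is_usable_name(name):
    """Hypothetical helper: a valid identifier that is not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)


print('def'.isidentifier())        # True: syntactically an identifier
print(keyword.iskeyword('def'))    # True: but reserved
print(is_usable_name('def'))       # False
print(is_usable_name('spam_1'))    # True
print(is_usable_name('1spam'))     # False: cannot start with a digit
```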

.. method:: str.islower()

   Return true if all cased characters [4]_ in the string are lowercase and
   there is at least one cased character, false otherwise.


.. method:: str.isnumeric()

   Return true if all characters in the string are numeric
   characters, and there is at least one character, false
   otherwise. Numeric characters include digit characters, and all characters
   that have the Unicode numeric value property, e.g. U+2155,
   VULGAR FRACTION ONE FIFTH. Formally, numeric characters are those with the property
   value Numeric_Type=Digit, Numeric_Type=Decimal or Numeric_Type=Numeric.
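
The three numeric predicates form nested sets: every decimal character is a
digit, and every digit is numeric. A small sketch over sample characters
(the selection is arbitrary) makes the distinction visible:

```python
samples = [
    ('DIGIT FIVE', '5'),
    ('ARABIC-INDIC DIGIT ZERO', '\u0660'),
    ('SUPERSCRIPT TWO', '\u00b2'),
    ('VULGAR FRACTION ONE FIFTH', '\u2155'),
]

# Columns: isdecimal, isdigit, isnumeric
for name, char in samples:
    print(name, char.isdecimal(), char.isdigit(), char.isnumeric())
```

The first two characters satisfy all three tests, the superscript is a digit
but not a decimal character, and the fraction is numeric only.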


.. method:: str.isprintable()

   Return true if all characters in the string are printable or the string is
   empty, false otherwise. Nonprintable characters are those characters defined
   in the Unicode character database as "Other" or "Separator", excepting the
   ASCII space (0x20) which is considered printable. (Note that printable
   characters in this context are those which should not be escaped when
   :func:`repr` is invoked on a string. It has no bearing on the handling of
   strings written to :data:`sys.stdout` or :data:`sys.stderr`.)


.. method:: str.isspace()

   Return true if there are only whitespace characters in the string and there is
   at least one character, false otherwise. A character is whitespace if, in
   the Unicode character database, either its general category is "Zs"
   ("Separator, space") or its bidirectional class is one of "WS", "B", or "S".

.. method:: str.istitle()

   Return true if the string is a titlecased string and there is at least one
   character, for example uppercase characters may only follow uncased characters
   and lowercase characters only cased ones. Return false otherwise.


.. method:: str.isupper()

   Return true if all cased characters [4]_ in the string are uppercase and
   there is at least one cased character, false otherwise.


.. method:: str.join(iterable)

   Return a string which is the concatenation of the strings in the
   :term:`iterable` *iterable*. A :exc:`TypeError` will be raised if there are
   any non-string values in *iterable*, including :class:`bytes` objects. The
   separator between elements is the string providing this method.
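
A short sketch of the separator role and of the :exc:`TypeError` raised for
non-string items (the sample values are arbitrary):

```python
print('-'.join(['a', 'b', 'c']))     # a-b-c
print(''.join('abc'))                # abc: a string is an iterable of strings

values = [1, 2, 3]
try:
    ', '.join(values)
except TypeError as exc:
    print('TypeError:', exc)

# Converting each item explicitly avoids the error.
print(', '.join(str(value) for value in values))    # 1, 2, 3
```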


.. method:: str.ljust(width[, fillchar])

   Return the string left justified in a string of length *width*. Padding is
   done using the specified *fillchar* (default is an ASCII space). The
   original string is returned if *width* is less than or equal to ``len(s)``.


.. method:: str.lower()

   Return a copy of the string with all the cased characters [4]_ converted to
   lowercase.

   The lowercasing algorithm used is described in section 3.13 of the Unicode
   Standard.


.. method:: str.lstrip([chars])

   Return a copy of the string with leading characters removed. The *chars*
   argument is a string specifying the set of characters to be removed. If omitted
   or ``None``, the *chars* argument defaults to removing whitespace. The *chars*
   argument is not a prefix; rather, all combinations of its values are stripped::

      >>> '   spacious   '.lstrip()
      'spacious   '
      >>> 'www.example.com'.lstrip('cmowz.')
      'example.com'


.. staticmethod:: str.maketrans(x[, y[, z]])

   This static method returns a translation table usable for :meth:`str.translate`.

   If there is only one argument, it must be a dictionary mapping Unicode
   ordinals (integers) or characters (strings of length 1) to Unicode ordinals,
   strings (of arbitrary lengths) or ``None``. Character keys will then be
   converted to ordinals.

   If there are two arguments, they must be strings of equal length, and in the
   resulting dictionary, each character in x will be mapped to the character at
   the same position in y. If there is a third argument, it must be a string,
   whose characters will be mapped to ``None`` in the result.
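
Both calling conventions can be tried out as follows. This is a small sketch
with arbitrary sample characters:

```python
# Two- and three-argument form: map a->x, b->y, c->z and delete 'd'.
table = str.maketrans('abc', 'xyz', 'd')
print(table == {97: 120, 98: 121, 99: 122, 100: None})
print('abcd-abcd'.translate(table))      # xyz-xyz

# One-argument form: keys may be characters or ordinals; character keys are
# converted to ordinals in the returned table.
mapping = str.maketrans({'a': '1', 'b': None, ord('c'): 'CC'})
print(mapping == {97: '1', 98: None, 99: 'CC'})
print('aabbcc'.translate(mapping))       # 11CCCC
```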


.. method:: str.partition(sep)

   Split the string at the first occurrence of *sep*, and return a 3-tuple
   containing the part before the separator, the separator itself, and the part
   after the separator. If the separator is not found, return a 3-tuple containing
   the string itself, followed by two empty strings.


.. method:: str.replace(old, new[, count])

   Return a copy of the string with all occurrences of substring *old* replaced by
   *new*. If the optional argument *count* is given, only the first *count*
   occurrences are replaced.


.. method:: str.rfind(sub[, start[, end]])

   Return the highest index in the string where substring *sub* is found, such
   that *sub* is contained within ``s[start:end]``. Optional arguments *start*
   and *end* are interpreted as in slice notation. Return ``-1`` on failure.


.. method:: str.rindex(sub[, start[, end]])

   Like :meth:`rfind` but raises :exc:`ValueError` when the substring *sub* is not
   found.


.. method:: str.rjust(width[, fillchar])

   Return the string right justified in a string of length *width*. Padding is
   done using the specified *fillchar* (default is an ASCII space). The
   original string is returned if *width* is less than or equal to ``len(s)``.


.. method:: str.rpartition(sep)

   Split the string at the last occurrence of *sep*, and return a 3-tuple
   containing the part before the separator, the separator itself, and the part
   after the separator. If the separator is not found, return a 3-tuple containing
   two empty strings, followed by the string itself.
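
Comparing :meth:`~str.partition` and :meth:`~str.rpartition` on the same input
shows where each one splits and what each returns when the separator is
missing. The sample string is arbitrary:

```python
line = 'key=value=more'

print(line.partition('='))     # ('key', '=', 'value=more')
print(line.rpartition('='))    # ('key=value', '=', 'more')

# When the separator is absent, the whole string lands at opposite ends.
print(line.partition(':'))     # ('key=value=more', '', '')
print(line.rpartition(':'))    # ('', '', 'key=value=more')
```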


.. method:: str.rsplit(sep=None, maxsplit=-1)

   Return a list of the words in the string, using *sep* as the delimiter string.
   If *maxsplit* is given, at most *maxsplit* splits are done, the *rightmost*
   ones. If *sep* is not specified or ``None``, any whitespace string is a
   separator. Except for splitting from the right, :meth:`rsplit` behaves like
   :meth:`split` which is described in detail below.
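
The direction only matters when *maxsplit* limits the number of splits, as
this sketch with an arbitrary path-like string suggests:

```python
path = 'usr/local/lib/python'

print(path.rsplit('/', 1))                   # ['usr/local/lib', 'python']
print(path.split('/', 1))                    # ['usr', 'local/lib/python']

# Without a limit, both methods produce the same list.
print(path.rsplit('/') == path.split('/'))   # True
```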
1822
Georg Brandl116aa622007-08-15 14:28:22 +00001823
1824.. method:: str.rstrip([chars])
1825
1826 Return a copy of the string with trailing characters removed. The *chars*
1827 argument is a string specifying the set of characters to be removed. If omitted
1828 or ``None``, the *chars* argument defaults to removing whitespace. The *chars*
Nick Coghlane4936b82014-08-09 16:14:04 +10001829 argument is not a suffix; rather, all combinations of its values are stripped::
Georg Brandl116aa622007-08-15 14:28:22 +00001830
1831 >>> ' spacious '.rstrip()
1832 ' spacious'
1833 >>> 'mississippi'.rstrip('ipz')
1834 'mississ'
1835
Georg Brandl116aa622007-08-15 14:28:22 +00001836
Ezio Melotticda6b6d2012-02-26 09:39:55 +02001837.. method:: str.split(sep=None, maxsplit=-1)
Georg Brandl116aa622007-08-15 14:28:22 +00001838
Georg Brandl226878c2007-08-31 10:15:37 +00001839 Return a list of the words in the string, using *sep* as the delimiter
1840 string. If *maxsplit* is given, at most *maxsplit* splits are done (thus,
1841 the list will have at most ``maxsplit+1`` elements). If *maxsplit* is not
Ezio Melottibf3165b2012-05-10 15:30:42 +03001842 specified or ``-1``, then there is no limit on the number of splits
1843 (all possible splits are made).
Georg Brandl9afde1c2007-11-01 20:32:30 +00001844
Guido van Rossum2cc30da2007-11-02 23:46:40 +00001845 If *sep* is given, consecutive delimiters are not grouped together and are
Georg Brandl226878c2007-08-31 10:15:37 +00001846 deemed to delimit empty strings (for example, ``'1,,2'.split(',')`` returns
1847 ``['1', '', '2']``). The *sep* argument may consist of multiple characters
Georg Brandl9afde1c2007-11-01 20:32:30 +00001848 (for example, ``'1<>2<>3'.split('<>')`` returns ``['1', '2', '3']``).
Georg Brandl226878c2007-08-31 10:15:37 +00001849 Splitting an empty string with a specified separator returns ``['']``.
Georg Brandl116aa622007-08-15 14:28:22 +00001850
Nick Coghlane4936b82014-08-09 16:14:04 +10001851 For example::
1852
1853 >>> '1,2,3'.split(',')
1854 ['1', '2', '3']
1855 >>> '1,2,3'.split(',', maxsplit=1)
Benjamin Petersoneb83ffe2014-09-22 22:43:50 -04001856 ['1', '2,3']
Nick Coghlane4936b82014-08-09 16:14:04 +10001857 >>> '1,2,,3,'.split(',')
1858 ['1', '2', '', '3', '']
1859
Georg Brandl116aa622007-08-15 14:28:22 +00001860 If *sep* is not specified or is ``None``, a different splitting algorithm is
Georg Brandl9afde1c2007-11-01 20:32:30 +00001861 applied: runs of consecutive whitespace are regarded as a single separator,
1862 and the result will contain no empty strings at the start or end if the
1863 string has leading or trailing whitespace. Consequently, splitting an empty
1864 string or a string consisting of just whitespace with a ``None`` separator
1865 returns ``[]``.
1866
Nick Coghlane4936b82014-08-09 16:14:04 +10001867 For example::
1868
1869 >>> '1 2 3'.split()
1870 ['1', '2', '3']
1871 >>> '1 2 3'.split(maxsplit=1)
1872 ['1', '2 3']
1873 >>> ' 1 2 3 '.split()
1874 ['1', '2', '3']
Georg Brandl116aa622007-08-15 14:28:22 +00001875
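The difference between the two :meth:`str.split` algorithms is easiest to see
when the same string is split both ways. The following is only a sketch; the
sample string ``text`` and the variable names are made up for illustration.

```python
# Illustrative input: leading, trailing and repeated spaces.
text = "  a  b "

# sep=None: runs of whitespace collapse and no empty strings appear at the ends.
by_whitespace = text.split()

# Explicit sep=" ": every single space delimits, so empty strings appear.
by_space = text.split(" ")

# maxsplit limits the number of splits; the remainder is kept unsplit.
limited = "a,b,c,d".split(",", maxsplit=2)

print(by_whitespace)  # ['a', 'b']
print(by_space)       # ['', '', 'a', '', 'b', '']
print(limited)        # ['a', 'b', 'c,d']
```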

.. index::
   single: universal newlines; str.splitlines method

.. method:: str.splitlines([keepends])

   Return a list of the lines in the string, breaking at line boundaries. Line
   breaks are not included in the resulting list unless *keepends* is given and
   true.

   This method splits on the following line boundaries. In particular, the
   boundaries are a superset of :term:`universal newlines`.

   +-----------------------+-----------------------------+
   | Representation        | Description                 |
   +=======================+=============================+
   | ``\n``                | Line Feed                   |
   +-----------------------+-----------------------------+
   | ``\r``                | Carriage Return             |
   +-----------------------+-----------------------------+
   | ``\r\n``              | Carriage Return + Line Feed |
   +-----------------------+-----------------------------+
   | ``\v`` or ``\x0b``    | Line Tabulation             |
   +-----------------------+-----------------------------+
   | ``\f`` or ``\x0c``    | Form Feed                   |
   +-----------------------+-----------------------------+
   | ``\x1c``              | File Separator              |
   +-----------------------+-----------------------------+
   | ``\x1d``              | Group Separator             |
   +-----------------------+-----------------------------+
   | ``\x1e``              | Record Separator            |
   +-----------------------+-----------------------------+
   | ``\x85``              | Next Line (C1 Control Code) |
   +-----------------------+-----------------------------+
   | ``\u2028``            | Line Separator              |
   +-----------------------+-----------------------------+
   | ``\u2029``            | Paragraph Separator         |
   +-----------------------+-----------------------------+

   .. versionchanged:: 3.2

      ``\v`` and ``\f`` added to list of line boundaries.

   For example::

      >>> 'ab c\n\nde fg\rkl\r\n'.splitlines()
      ['ab c', '', 'de fg', 'kl']
      >>> 'ab c\n\nde fg\rkl\r\n'.splitlines(keepends=True)
      ['ab c\n', '\n', 'de fg\r', 'kl\r\n']

   Unlike :meth:`~str.split` when a delimiter string *sep* is given, this
   method returns an empty list for the empty string, and a terminal line
   break does not result in an extra line::

      >>> "".splitlines()
      []
      >>> "One line\n".splitlines()
      ['One line']

   For comparison, ``split('\n')`` gives::

      >>> ''.split('\n')
      ['']
      >>> 'Two lines\n'.split('\n')
      ['Two lines', '']

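Because :meth:`str.splitlines` recognises more boundaries than ``'\n'``, the
two approaches can disagree on text that mixes line endings. A small sketch,
using a made-up sample string ``text``:

```python
# Illustrative text mixing three different line boundaries.
text = "one\r\ntwo\x0cthree\u2028four\n"

lines = text.splitlines()
lines_keep = text.splitlines(keepends=True)

# split("\n") only knows about "\n" and keeps a trailing empty string.
naive = text.split("\n")

print(lines)       # ['one', 'two', 'three', 'four']
print(lines_keep)  # ['one\r\n', 'two\x0c', 'three\u2028', 'four\n']
print(naive)       # ['one\r', 'two\x0cthree\u2028four', '']
```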

.. method:: str.startswith(prefix[, start[, end]])

   Return ``True`` if string starts with the *prefix*, otherwise return ``False``.
   *prefix* can also be a tuple of prefixes to look for. With optional *start*,
   test string beginning at that position. With optional *end*, stop comparing
   string at that position.


.. method:: str.strip([chars])

   Return a copy of the string with the leading and trailing characters removed.
   The *chars* argument is a string specifying the set of characters to be removed.
   If omitted or ``None``, the *chars* argument defaults to removing whitespace.
   The *chars* argument is not a prefix or suffix; rather, all combinations of its
   values are stripped::

      >>> '   spacious   '.strip()
      'spacious'
      >>> 'www.example.com'.strip('cmowz.')
      'example'

   The outermost leading and trailing *chars* argument values are stripped
   from the string. Characters are removed from the leading end until
   reaching a string character that is not contained in the set of
   characters in *chars*. A similar action takes place on the trailing end.
   For example::

      >>> comment_string = '#....... Section 3.2.1 Issue #32 .......'
      >>> comment_string.strip('.#! ')
      'Section 3.2.1 Issue #32'


.. method:: str.swapcase()

   Return a copy of the string with uppercase characters converted to lowercase and
   vice versa. Note that it is not necessarily true that
   ``s.swapcase().swapcase() == s``.


.. method:: str.title()

   Return a titlecased version of the string where words start with an uppercase
   character and the remaining characters are lowercase.

   For example::

      >>> 'Hello world'.title()
      'Hello World'

   The algorithm uses a simple language-independent definition of a word as
   groups of consecutive letters. The definition works in many contexts but
   it means that apostrophes in contractions and possessives form word
   boundaries, which may not be the desired result::

      >>> "they're bill's friends from the UK".title()
      "They'Re Bill'S Friends From The Uk"

   A workaround for apostrophes can be constructed using regular expressions::

      >>> import re
      >>> def titlecase(s):
      ...     return re.sub(r"[A-Za-z]+('[A-Za-z]+)?",
      ...                   lambda mo: mo.group(0)[0].upper() +
      ...                              mo.group(0)[1:].lower(),
      ...                   s)
      ...
      >>> titlecase("they're bill's friends.")
      "They're Bill's Friends."


.. method:: str.translate(table)

   Return a copy of the string in which each character has been mapped through
   the given translation table. The table must be an object that implements
   indexing via :meth:`__getitem__`, typically a :term:`mapping` or
   :term:`sequence`. When indexed by a Unicode ordinal (an integer), the
   table object can do any of the following: return a Unicode ordinal or a
   string, to map the character to one or more other characters; return
   ``None``, to delete the character from the return string; or raise a
   :exc:`LookupError` exception, to map the character to itself.

   You can use :meth:`str.maketrans` to create a translation map from
   character-to-character mappings in different formats.

   See also the :mod:`codecs` module for a more flexible approach to custom
   character mappings.

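The three table behaviours described for :meth:`str.translate` (map, delete,
leave unchanged) can be sketched as follows. The sample strings and the
``custom`` dictionary are illustrative only.

```python
# A table built with str.maketrans: map 'a' -> '4', 'e' -> '3', delete '!'.
table = str.maketrans("ae", "43", "!")
leet = "leet speak!".translate(table)

# A plain dict keyed by Unicode ordinals works too: a string or ordinal maps
# the character, None deletes it, and a missing key (KeyError, which is a
# LookupError) leaves the character unchanged.
custom = {ord("s"): "ss", ord("-"): None, ord("x"): ord("y")}
mapped = "s-x-z".translate(custom)

print(leet)    # l33t sp34k
print(mapped)  # ssyz
```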

.. method:: str.upper()

   Return a copy of the string with all the cased characters [4]_ converted to
   uppercase. Note that ``s.upper().isupper()`` might be ``False`` if ``s``
   contains uncased characters or if the Unicode category of the resulting
   character(s) is not "Lu" (Letter, uppercase), but e.g. "Lt" (Letter,
   titlecase).

   The uppercasing algorithm used is described in section 3.13 of the Unicode
   Standard.

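As a brief illustration of the caveat about :meth:`str.upper` and
:meth:`str.isupper`, the sketch below uses made-up sample strings. It also
shows that uppercasing can change the length of a string.

```python
# A string with no cased characters: upper() leaves it alone, and
# isupper() needs at least one cased character to return True.
digits_result = "123".upper().isupper()

# A string that does contain cased characters behaves as expected.
mixed_result = "abc123".upper().isupper()

# One character may uppercase to several: the sharp s becomes 'SS'.
sharp_s = "stra\u00dfe".upper()

print(digits_result)  # False
print(mixed_result)   # True
print(sharp_s)        # STRASSE
```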

.. method:: str.zfill(width)

   Return a copy of the string left filled with ASCII ``'0'`` digits to
   make a string of length *width*. A leading sign prefix (``'+'``/``'-'``)
   is handled by inserting the padding *after* the sign character rather
   than before. The original string is returned if *width* is less than
   or equal to ``len(s)``.

   For example::

      >>> "42".zfill(5)
      '00042'
      >>> "-42".zfill(5)
      '-0042'



.. _old-string-formatting:

``printf``-style String Formatting
----------------------------------

.. index::
   single: formatting, string (%)
   single: interpolation, string (%)
   single: string; formatting, printf
   single: string; interpolation, printf
   single: printf-style formatting
   single: sprintf-style formatting
   single: % formatting
   single: % interpolation

.. note::

   The formatting operations described here exhibit a variety of quirks that
   lead to a number of common errors (such as failing to display tuples and
   dictionaries correctly). Using the newer :ref:`formatted
   string literals <f-strings>` or the :meth:`str.format` interface
   helps avoid these errors. These alternatives also provide more powerful,
   flexible and extensible approaches to formatting text.

String objects have one unique built-in operation: the ``%`` operator (modulo).
This is also known as the string *formatting* or *interpolation* operator.
Given ``format % values`` (where *format* is a string), ``%`` conversion
specifications in *format* are replaced with zero or more elements of *values*.
The effect is similar to using :c:func:`sprintf` in the C language.

If *format* requires a single argument, *values* may be a single non-tuple
object. [5]_ Otherwise, *values* must be a tuple with exactly the number of
items specified by the format string, or a single mapping object (for example, a
dictionary).

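The rule about single values and tuples is the source of the tuple-display
quirk mentioned in the note above. A minimal sketch, where ``point`` is just an
illustrative value that happens to be a tuple:

```python
point = (1, 2)  # illustrative value that happens to be a tuple

# A single non-tuple value can be passed directly.
single = "value: %s" % 42

# A tuple is always unpacked, so wrap it in a one-element tuple to format
# it as a single object.
wrapped = "point: %s" % (point,)

# Passing the bare tuple supplies two values for one specifier and fails.
try:
    "point: %s" % point
    error = None
except TypeError as exc:
    error = exc

print(single)   # value: 42
print(wrapped)  # point: (1, 2)
print(type(error).__name__)  # TypeError
```
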
A conversion specifier contains two or more characters and has the following
components, which must occur in this order:

#. The ``'%'`` character, which marks the start of the specifier.

#. Mapping key (optional), consisting of a parenthesised sequence of characters
   (for example, ``(somename)``).

#. Conversion flags (optional), which affect the result of some conversion
   types.

#. Minimum field width (optional). If specified as an ``'*'`` (asterisk), the
   actual width is read from the next element of the tuple in *values*, and the
   object to convert comes after the minimum field width and optional precision.

#. Precision (optional), given as a ``'.'`` (dot) followed by the precision. If
   specified as ``'*'`` (an asterisk), the actual precision is read from the next
   element of the tuple in *values*, and the value to convert comes after the
   precision.

#. Length modifier (optional).

#. Conversion type.

When the right argument is a dictionary (or other mapping type), then the
formats in the string *must* include a parenthesised mapping key into that
dictionary inserted immediately after the ``'%'`` character. The mapping key
selects the value to be formatted from the mapping. For example:

   >>> print('%(language)s has %(number)03d quote types.' %
   ...       {'language': "Python", "number": 2})
   Python has 002 quote types.

In this case no ``*`` specifiers may occur in a format (since they require a
sequential parameter list).

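Mapping keys and ``*`` specifiers are two separate ways of supplying values.
The sketch below shows one example of each; the ``record`` dictionary and the
numbers are illustrative.

```python
# Mapping keys select values by name; a key may be used more than once.
record = {"name": "spam", "count": 3}
by_key = "%(name)s x %(count)02d (%(name)s)" % record

# With a tuple, each '*' reads the width or precision from the values that
# precede the object to convert: here width 8 and precision 2.
starred = "%*.*f" % (8, 2, 3.14159)

print(by_key)         # spam x 03 (spam)
print(repr(starred))  # '    3.14'
```
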
The conversion flag characters are:

+---------+---------------------------------------------------------------------+
| Flag    | Meaning                                                             |
+=========+=====================================================================+
| ``'#'`` | The value conversion will use the "alternate form" (where defined   |
|         | below).                                                             |
+---------+---------------------------------------------------------------------+
| ``'0'`` | The conversion will be zero padded for numeric values.              |
+---------+---------------------------------------------------------------------+
| ``'-'`` | The converted value is left adjusted (overrides the ``'0'``         |
|         | conversion if both are given).                                      |
+---------+---------------------------------------------------------------------+
| ``' '`` | (a space) A blank should be left before a positive number (or empty |
|         | string) produced by a signed conversion.                            |
+---------+---------------------------------------------------------------------+
| ``'+'`` | A sign character (``'+'`` or ``'-'``) will precede the conversion   |
|         | (overrides a "space" flag).                                         |
+---------+---------------------------------------------------------------------+

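Each flag in the table can be tried in isolation. The dictionary below is only
a sketch; its keys are made-up labels, and the trailing ``|`` makes the padding
visible.

```python
flags = {
    "alternate": "%#x" % 255,    # '0xff'
    "zero_pad": "%05d" % 42,     # '00042'
    "left": "%-5d|" % 42,        # '42   |'
    "left_wins": "%-05d|" % 42,  # '42   |'  ('-' overrides '0')
    "space": "% d" % 42,         # ' 42'
    "plus": "%+d" % 42,          # '+42'
    "plus_wins": "%+ d" % 42,    # '+42'     ('+' overrides ' ')
}

for label, text in flags.items():
    print(label, repr(text))
```
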
A length modifier (``h``, ``l``, or ``L``) may be present, but is ignored as it
is not necessary for Python -- so e.g. ``%ld`` is identical to ``%d``.

The conversion types are:

+------------+-----------------------------------------------------+-------+
| Conversion | Meaning                                             | Notes |
+============+=====================================================+=======+
| ``'d'``    | Signed integer decimal.                             |       |
+------------+-----------------------------------------------------+-------+
| ``'i'``    | Signed integer decimal.                             |       |
+------------+-----------------------------------------------------+-------+
| ``'o'``    | Signed octal value.                                 | \(1)  |
+------------+-----------------------------------------------------+-------+
| ``'u'``    | Obsolete type -- it is identical to ``'d'``.        | \(6)  |
+------------+-----------------------------------------------------+-------+
| ``'x'``    | Signed hexadecimal (lowercase).                     | \(2)  |
+------------+-----------------------------------------------------+-------+
| ``'X'``    | Signed hexadecimal (uppercase).                     | \(2)  |
+------------+-----------------------------------------------------+-------+
| ``'e'``    | Floating point exponential format (lowercase).      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'E'``    | Floating point exponential format (uppercase).      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'f'``    | Floating point decimal format.                      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'F'``    | Floating point decimal format.                      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'g'``    | Floating point format. Uses lowercase exponential   | \(4)  |
|            | format if exponent is less than -4 or not less than |       |
|            | precision, decimal format otherwise.                |       |
+------------+-----------------------------------------------------+-------+
| ``'G'``    | Floating point format. Uses uppercase exponential   | \(4)  |
|            | format if exponent is less than -4 or not less than |       |
|            | precision, decimal format otherwise.                |       |
+------------+-----------------------------------------------------+-------+
| ``'c'``    | Single character (accepts integer or single         |       |
|            | character string).                                  |       |
+------------+-----------------------------------------------------+-------+
| ``'r'``    | String (converts any Python object using            | \(5)  |
|            | :func:`repr`).                                      |       |
+------------+-----------------------------------------------------+-------+
| ``'s'``    | String (converts any Python object using            | \(5)  |
|            | :func:`str`).                                       |       |
+------------+-----------------------------------------------------+-------+
| ``'a'``    | String (converts any Python object using            | \(5)  |
|            | :func:`ascii`).                                     |       |
+------------+-----------------------------------------------------+-------+
| ``'%'``    | No argument is converted, results in a ``'%'``      |       |
|            | character in the result.                            |       |
+------------+-----------------------------------------------------+-------+

Notes:

(1)
   The alternate form causes a leading octal specifier (``'0o'``) to be
   inserted before the first digit.

(2)
   The alternate form causes a leading ``'0x'`` or ``'0X'`` (depending on whether
   the ``'x'`` or ``'X'`` format was used) to be inserted before the first digit.

(3)
   The alternate form causes the result to always contain a decimal point, even if
   no digits follow it.

   The precision determines the number of digits after the decimal point and
   defaults to 6.

(4)
   The alternate form causes the result to always contain a decimal point, and
   trailing zeroes are not removed as they would otherwise be.

   The precision determines the number of significant digits before and after the
   decimal point and defaults to 6.

(5)
   If precision is ``N``, the output is truncated to ``N`` characters.

(6)
   See :pep:`237`.

Since Python strings have an explicit length, ``%s`` conversions do not assume
that ``'\0'`` is the end of the string.

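A few of the conversion types and notes above, sketched with made-up sample
values. Each pair in ``samples`` is a format string and the value applied to
it.

```python
samples = [
    ("%d", 42),              # signed decimal
    ("%#o", 8),              # note 1: alternate form adds '0o'
    ("%#X", 255),            # note 2: alternate form adds '0X'
    ("%e", 1234.5),          # note 3: precision defaults to 6
    ("%.2f", 2.5),           # note 3: two digits after the point
    ("%g", 0.00001),         # exponent below -4: exponential format
    ("%g", 1234567.0),       # exponent not less than precision: exponential
    ("%#g", 1.0),            # note 4: trailing zeroes are kept
    ("%c", 65),              # integer converted to a single character
    ("%r", "hi"),            # repr() of the value
    ("%.3s", "abcdef"),      # note 5: truncated to 3 characters
    ("%a", "caf\u00e9"),     # ascii() escapes the non-ASCII character
    ("%d%%", 50),            # '%%' produces a literal percent sign
]
results = [fmt % value for fmt, value in samples]

for (fmt, value), text in zip(samples, results):
    print(fmt, repr(value), "->", repr(text))
```
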
.. XXX Examples?

.. versionchanged:: 3.1
   ``%f`` conversions for numbers whose absolute value is over 1e50 are no
   longer replaced by ``%g`` conversions.


.. index::
   single: buffer protocol; binary sequence types

.. _binaryseq:

Binary Sequence Types --- :class:`bytes`, :class:`bytearray`, :class:`memoryview`
=================================================================================

.. index::
   object: bytes
   object: bytearray
   object: memoryview
   module: array

The core built-in types for manipulating binary data are :class:`bytes` and
:class:`bytearray`. They are supported by :class:`memoryview` which uses
the :ref:`buffer protocol <bufferobjects>` to access the memory of other
binary objects without needing to make a copy.

The :mod:`array` module supports efficient storage of basic data types like
32-bit integers and IEEE 754 double-precision floating point values.

.. _typebytes:

Bytes
-----

.. index:: object: bytes

Bytes objects are immutable sequences of single bytes. Since many major
binary protocols are based on the ASCII text encoding, bytes objects offer
several methods that are only valid when working with ASCII compatible
data and are closely related to string objects in a variety of other ways.

Firstly, the syntax for bytes literals is largely the same as that for string
literals, except that a ``b`` prefix is added:

* Single quotes: ``b'still allows embedded "double" quotes'``
* Double quotes: ``b"still allows embedded 'single' quotes"``
* Triple quoted: ``b'''3 single quotes'''``, ``b"""3 double quotes"""``

Only ASCII characters are permitted in bytes literals (regardless of the
declared source code encoding). Any binary values over 127 must be entered
into bytes literals using the appropriate escape sequence.

As with string literals, bytes literals may also use an ``r`` prefix to disable
processing of escape sequences. See :ref:`strings` for more about the various
forms of bytes literal, including supported escape sequences.

While bytes literals and representations are based on ASCII text, bytes
objects actually behave like immutable sequences of integers, with each
value in the sequence restricted such that ``0 <= x < 256`` (attempts to
violate this restriction will trigger :exc:`ValueError`). This is done
deliberately to emphasise that while many binary formats include ASCII based
elements and can be usefully manipulated with some text-oriented algorithms,
this is not generally the case for arbitrary binary data (blindly applying
text processing algorithms to binary data formats that are not ASCII
compatible will usually lead to data corruption).

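The points above about escapes, the ``r`` prefix and the integer nature of
bytes objects can be sketched as follows. The sample data is illustrative.

```python
# Values above 127 need an escape sequence in a bytes literal.
data = b"caf\xc3\xa9"       # the UTF-8 encoding of 'café', written with escapes

# Elements are integers in range(256), not length-1 bytes objects.
first = data[0]             # 99, the code for ASCII 'c'
last_two = list(data[-2:])  # [195, 169]

# With an r prefix the backslash is not processed, so this holds two bytes.
raw = rb"\n"

# Values outside 0 <= x < 256 are rejected.
try:
    bytes([256])
    error = None
except ValueError as exc:
    error = exc

print(first, last_two, len(raw), type(error).__name__)
```
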
In addition to the literal forms, bytes objects can be created in a number of
other ways:

* A zero-filled bytes object of a specified length: ``bytes(10)``
* From an iterable of integers: ``bytes(range(20))``
* Copying existing binary data via the buffer protocol: ``bytes(obj)``

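The three construction forms listed above, sketched with small made-up inputs
so the results are easy to read:

```python
zeros = bytes(3)                   # zero-filled, length 3
from_ints = bytes(range(65, 68))   # from an iterable of integers
source = bytearray(b"Hi!")         # any object supporting the buffer protocol
copied = bytes(source)             # an independent, immutable copy

source[0] = ord("h")               # changing the source leaves the copy alone

print(zeros)      # b'\x00\x00\x00'
print(from_ints)  # b'ABC'
print(copied)     # b'Hi!'
```
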
Also see the :ref:`bytes <func-bytes>` built-in.

Since 2 hexadecimal digits correspond precisely to a single byte, hexadecimal
numbers are a commonly used format for describing binary data. Accordingly,
the bytes type has an additional class method to read data in that format:

.. classmethod:: bytes.fromhex(string)

   This :class:`bytes` class method returns a bytes object, decoding the
   given string object. The string must contain two hexadecimal digits per
   byte, with ASCII spaces being ignored.

   >>> bytes.fromhex('2Ef0 F1f2 ')
   b'.\xf0\xf1\xf2'

A reverse conversion function exists to transform a bytes object into its
hexadecimal representation.

.. method:: bytes.hex()

   Return a string object containing two hexadecimal digits for each
   byte in the instance.

   >>> b'\xf0\xf1\xf2'.hex()
   'f0f1f2'

   .. versionadded:: 3.5

Since bytes objects are sequences of integers (akin to a tuple), for a bytes
object *b*, ``b[0]`` will be an integer, while ``b[0:1]`` will be a bytes
object of length 1. (This contrasts with text strings, where both indexing
and slicing will produce a string of length 1.)

The representation of bytes objects uses the literal format (``b'...'``)
since it is often more useful than e.g. ``bytes([46, 46, 46])``. You can
always convert a bytes object into a list of integers using ``list(b)``.

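A short sketch tying together the hexadecimal round trip and the difference
between indexing and slicing. The sample value is illustrative, and
:meth:`bytes.hex` requires Python 3.5 or later.

```python
data = bytes.fromhex("2Ef0 F1f2")  # spaces between bytes are ignored

hex_text = data.hex()              # lowercase hexadecimal digits
item = data[0]                     # indexing gives an int
piece = data[0:1]                  # slicing gives a bytes object
as_list = list(data)               # every element as an int
text_item = "abc"[0]               # for str, indexing still gives a str

print(hex_text)   # 2ef0f1f2
print(item)       # 46
print(piece)      # b'.'
print(as_list)    # [46, 240, 241, 242]
print(text_item)  # a
```
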
.. note::
   For Python 2.x users: In the Python 2.x series, a variety of implicit
   conversions between 8-bit strings (the closest thing 2.x offers to a
   built-in binary data type) and Unicode strings were permitted. This was a
   backwards compatibility workaround to account for the fact that Python
   originally only supported 8-bit text, and Unicode text was a later
   addition. In Python 3.x, those implicit conversions are gone: conversions
   between 8-bit binary data and Unicode text must be explicit, and bytes and
   string objects will always compare unequal.


.. _typebytearray:

Bytearray Objects
-----------------

.. index:: object: bytearray

:class:`bytearray` objects are a mutable counterpart to :class:`bytes`
objects. There is no dedicated literal syntax for bytearray objects; instead,
they are always created by calling the constructor:

* Creating an empty instance: ``bytearray()``
* Creating a zero-filled instance with a given length: ``bytearray(10)``
* From an iterable of integers: ``bytearray(range(20))``
* Copying existing binary data via the buffer protocol: ``bytearray(b'Hi!')``

As bytearray objects are mutable, they support the
:ref:`mutable <typesseq-mutable>` sequence operations in addition to the
common bytes and bytearray operations described in :ref:`bytes-methods`.

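A minimal sketch of the mutable sequence operations on a bytearray. The
buffer contents are illustrative.

```python
buf = bytearray(b"Hi!")

buf[0] = ord("h")    # item assignment takes an integer
buf.append(33)       # append another '!' (code 33)
buf.extend(b"??")    # extend from any iterable of integers
del buf[2:4]         # slices can be deleted in place

frozen = bytes(buf)  # take an immutable snapshot when done

print(buf)     # bytearray(b'hi??')
print(frozen)  # b'hi??'
```
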
Also see the :ref:`bytearray <func-bytearray>` built-in.

Since 2 hexadecimal digits correspond precisely to a single byte, hexadecimal
numbers are a commonly used format for describing binary data. Accordingly,
the bytearray type has an additional class method to read data in that format:

.. classmethod:: bytearray.fromhex(string)

   This :class:`bytearray` class method returns a bytearray object, decoding
   the given string object. The string must contain two hexadecimal digits
   per byte, with ASCII spaces being ignored.

   >>> bytearray.fromhex('2Ef0 F1f2 ')
   bytearray(b'.\xf0\xf1\xf2')

A reverse conversion function exists to transform a bytearray object into its
hexadecimal representation.

.. method:: bytearray.hex()

   Return a string object containing two hexadecimal digits for each
   byte in the instance.

   >>> bytearray(b'\xf0\xf1\xf2').hex()
   'f0f1f2'

   .. versionadded:: 3.5

Since bytearray objects are sequences of integers (akin to a list), for a
bytearray object *b*, ``b[0]`` will be an integer, while ``b[0:1]`` will be
a bytearray object of length 1. (This contrasts with text strings, where
both indexing and slicing will produce a string of length 1.)

The representation of bytearray objects uses the bytes literal format
(``bytearray(b'...')``) since it is often more useful than e.g.
``bytearray([46, 46, 46])``. You can always convert a bytearray object into
a list of integers using ``list(b)``.


.. _bytes-methods:

Bytes and Bytearray Operations
------------------------------

.. index:: pair: bytes; methods
           pair: bytearray; methods

Both bytes and bytearray objects support the :ref:`common <typesseq-common>`
sequence operations. They interoperate not just with operands of the same
type, but with any :term:`bytes-like object`. Due to this flexibility, they can be
freely mixed in operations without causing errors. However, the return type
of the result may depend on the order of operands.

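The dependence of the result type on operand order can be seen with simple
concatenation. A sketch with made-up operands:

```python
left = b"ab" + bytearray(b"cd")   # the left operand is bytes
right = bytearray(b"ab") + b"cd"  # the left operand is a bytearray

same = left == right              # contents compare equal across the two types

# Methods accept any bytes-like object, for example a memoryview.
found = bytearray(b"abcd").find(memoryview(b"cd"))

print(type(left).__name__)   # bytes
print(type(right).__name__)  # bytearray
print(same, found)           # True 2
```
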
.. note::

   The methods on bytes and bytearray objects don't accept strings as their
   arguments, just as the methods on strings don't accept bytes as their
   arguments. For example, you have to write::

      a = "abc"
      b = a.replace("a", "f")

   and::

      a = b"abc"
      b = a.replace(b"a", b"f")
Georg Brandl226878c2007-08-31 10:15:37 +00002443
Some bytes and bytearray operations assume the use of ASCII-compatible
binary formats, and hence should be avoided when working with arbitrary
binary data. These restrictions are covered below.

.. note::
   Using these ASCII-based operations to manipulate binary data that is not
   stored in an ASCII-based format may lead to data corruption.
2451
Nick Coghlane4936b82014-08-09 16:14:04 +10002452The following methods on bytes and bytearray objects can be used with
2453arbitrary binary data.
Nick Coghlan273069c2012-08-20 17:14:07 +10002454
Nick Coghlane4936b82014-08-09 16:14:04 +10002455.. method:: bytes.count(sub[, start[, end]])
2456 bytearray.count(sub[, start[, end]])
Nick Coghlan273069c2012-08-20 17:14:07 +10002457
   Return the number of non-overlapping occurrences of subsequence *sub* in
   the slice ``s[start:end]``. Optional arguments *start* and *end* are
   interpreted as in slice notation.
Nick Coghlan273069c2012-08-20 17:14:07 +10002461
Nick Coghlane4936b82014-08-09 16:14:04 +10002462 The subsequence to search for may be any :term:`bytes-like object` or an
2463 integer in the range 0 to 255.
2464
2465 .. versionchanged:: 3.3
2466 Also accept an integer in the range 0 to 255 as the subsequence.
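
A brief sketch of counting with a subsequence, an integer, and slice bounds
(the sample data is made up for illustration):

```python
data = b'abababa'

pairs = data.count(b'aba')          # non-overlapping matches only
letter_a = data.count(97)           # 97 is the byte value of ASCII 'a'
in_slice = data.count(b'a', 1, 5)   # counts within data[1:5] == b'baba'
```

``pairs`` is ``2`` because the matches at offsets 0 and 4 do not overlap,
``letter_a`` is ``4``, and ``in_slice`` is ``2``.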
2467
Georg Brandl226878c2007-08-31 10:15:37 +00002468
Victor Stinnere14e2122010-11-07 18:41:46 +00002469.. method:: bytes.decode(encoding="utf-8", errors="strict")
2470 bytearray.decode(encoding="utf-8", errors="strict")
Georg Brandl4f5f98d2009-05-04 21:01:20 +00002471
   Return a string decoded from the given bytes. Default encoding is
   ``'utf-8'``. *errors* may be given to set a different
   error handling scheme. The default for *errors* is ``'strict'``, meaning
   that decoding errors raise a :exc:`UnicodeError`. Other possible values are
   ``'ignore'``, ``'replace'`` and any other name registered via
   :func:`codecs.register_error`; see section :ref:`error-handlers`. For a
   list of possible encodings, see section :ref:`standard-encodings`.
2479
Nick Coghlane4936b82014-08-09 16:14:04 +10002480 .. note::
2481
2482 Passing the *encoding* argument to :class:`str` allows decoding any
2483 :term:`bytes-like object` directly, without needing to make a temporary
2484 bytes or bytearray object.
2485
Benjamin Peterson308d6372009-09-18 21:42:35 +00002486 .. versionchanged:: 3.1
2487 Added support for keyword arguments.
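
The following sketch shows the default decoding, a non-default error handler,
and decoding a bytes-like object through :class:`str`. The sample byte strings
are chosen for illustration:

```python
raw = b'caf\xc3\xa9'                 # UTF-8 encoding of 'café'
text = raw.decode()                  # uses the default 'utf-8' / 'strict'

broken = b'caf\xe9'                  # Latin-1 bytes, not valid UTF-8
try:
    broken.decode('utf-8')
    strict_failed = False
except UnicodeDecodeError:           # a subclass of UnicodeError
    strict_failed = True

replaced = broken.decode('utf-8', errors='replace')

# str() can decode any bytes-like object directly.
from_view = str(memoryview(raw), encoding='utf-8')
```

With ``errors='replace'`` the undecodable byte becomes U+FFFD, the Unicode
replacement character.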
2488
Georg Brandl226878c2007-08-31 10:15:37 +00002489
Nick Coghlane4936b82014-08-09 16:14:04 +10002490.. method:: bytes.endswith(suffix[, start[, end]])
2491 bytearray.endswith(suffix[, start[, end]])
Georg Brandl226878c2007-08-31 10:15:37 +00002492
Nick Coghlane4936b82014-08-09 16:14:04 +10002493 Return ``True`` if the binary data ends with the specified *suffix*,
2494 otherwise return ``False``. *suffix* can also be a tuple of suffixes to
2495 look for. With optional *start*, test beginning at that position. With
2496 optional *end*, stop comparing at that position.
Georg Brandl226878c2007-08-31 10:15:37 +00002497
Nick Coghlane4936b82014-08-09 16:14:04 +10002498 The suffix(es) to search for may be any :term:`bytes-like object`.
Georg Brandl226878c2007-08-31 10:15:37 +00002499
Georg Brandlabc38772009-04-12 15:51:51 +00002500
Nick Coghlane4936b82014-08-09 16:14:04 +10002501.. method:: bytes.find(sub[, start[, end]])
2502 bytearray.find(sub[, start[, end]])
2503
2504 Return the lowest index in the data where the subsequence *sub* is found,
2505 such that *sub* is contained in the slice ``s[start:end]``. Optional
2506 arguments *start* and *end* are interpreted as in slice notation. Return
2507 ``-1`` if *sub* is not found.
2508
2509 The subsequence to search for may be any :term:`bytes-like object` or an
2510 integer in the range 0 to 255.
2511
2512 .. note::
2513
2514 The :meth:`~bytes.find` method should be used only if you need to know the
2515 position of *sub*. To check if *sub* is a substring or not, use the
2516 :keyword:`in` operator::
2517
2518 >>> b'Py' in b'Python'
2519 True
2520
2521 .. versionchanged:: 3.3
2522 Also accept an integer in the range 0 to 255 as the subsequence.
2523
2524
2525.. method:: bytes.index(sub[, start[, end]])
2526 bytearray.index(sub[, start[, end]])
2527
2528 Like :meth:`~bytes.find`, but raise :exc:`ValueError` when the
2529 subsequence is not found.
2530
2531 The subsequence to search for may be any :term:`bytes-like object` or an
2532 integer in the range 0 to 255.
2533
2534 .. versionchanged:: 3.3
2535 Also accept an integer in the range 0 to 255 as the subsequence.
2536
2537
2538.. method:: bytes.join(iterable)
2539 bytearray.join(iterable)
2540
2541 Return a bytes or bytearray object which is the concatenation of the
2542 binary data sequences in the :term:`iterable` *iterable*. A
2543 :exc:`TypeError` will be raised if there are any values in *iterable*
R David Murray0e8168c2015-05-17 10:16:37 -04002544 that are not :term:`bytes-like objects <bytes-like object>`, including
Nick Coghlane4936b82014-08-09 16:14:04 +10002545 :class:`str` objects. The separator between elements is the contents
2546 of the bytes or bytearray object providing this method.
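
A short sketch of joining mixed bytes-like objects; the result type follows
the separator object, and a :class:`str` item is rejected. The names are
illustrative:

```python
parts = [b'usr', bytearray(b'local'), memoryview(b'bin')]

as_bytes = b'/'.join(parts)                   # separator is bytes
as_bytearray = bytearray(b'/').join(parts)    # separator is bytearray

try:
    b'/'.join([b'usr', 'local'])              # 'local' is a str
    rejected_str = False
except TypeError:
    rejected_str = True
```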
2547
2548
2549.. staticmethod:: bytes.maketrans(from, to)
2550 bytearray.maketrans(from, to)
2551
2552 This static method returns a translation table usable for
2553 :meth:`bytes.translate` that will map each character in *from* into the
2554 character at the same position in *to*; *from* and *to* must both be
2555 :term:`bytes-like objects <bytes-like object>` and have the same length.
2556
2557 .. versionadded:: 3.1
2558
2559
2560.. method:: bytes.partition(sep)
2561 bytearray.partition(sep)
2562
2563 Split the sequence at the first occurrence of *sep*, and return a 3-tuple
2564 containing the part before the separator, the separator, and the part
2565 after the separator. If the separator is not found, return a 3-tuple
2566 containing a copy of the original sequence, followed by two empty bytes or
2567 bytearray objects.
2568
2569 The separator to search for may be any :term:`bytes-like object`.
2570
2571
2572.. method:: bytes.replace(old, new[, count])
2573 bytearray.replace(old, new[, count])
2574
2575 Return a copy of the sequence with all occurrences of subsequence *old*
2576 replaced by *new*. If the optional argument *count* is given, only the
2577 first *count* occurrences are replaced.
2578
2579 The subsequence to search for and its replacement may be any
2580 :term:`bytes-like object`.
2581
2582 .. note::
2583
2584 The bytearray version of this method does *not* operate in place - it
2585 always produces a new object, even if no changes were made.
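
To illustrate that the bytearray version returns a new object, here is a
minimal sketch with made-up data:

```python
original = bytearray(b'spam spam spam')

changed = original.replace(b'spam', b'eggs', 2)   # only the first two matches
untouched = original.replace(b'xyz', b'abc')      # nothing to replace

# In both cases the original bytearray is left as it was.
still_same = original == bytearray(b'spam spam spam')
is_new_object = untouched is not original
```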
2586
2587
2588.. method:: bytes.rfind(sub[, start[, end]])
2589 bytearray.rfind(sub[, start[, end]])
2590
2591 Return the highest index in the sequence where the subsequence *sub* is
2592 found, such that *sub* is contained within ``s[start:end]``. Optional
2593 arguments *start* and *end* are interpreted as in slice notation. Return
2594 ``-1`` on failure.
2595
2596 The subsequence to search for may be any :term:`bytes-like object` or an
2597 integer in the range 0 to 255.
2598
2599 .. versionchanged:: 3.3
2600 Also accept an integer in the range 0 to 255 as the subsequence.
2601
2602
2603.. method:: bytes.rindex(sub[, start[, end]])
2604 bytearray.rindex(sub[, start[, end]])
2605
2606 Like :meth:`~bytes.rfind` but raises :exc:`ValueError` when the
2607 subsequence *sub* is not found.
2608
2609 The subsequence to search for may be any :term:`bytes-like object` or an
2610 integer in the range 0 to 255.
2611
2612 .. versionchanged:: 3.3
2613 Also accept an integer in the range 0 to 255 as the subsequence.
2614
2615
2616.. method:: bytes.rpartition(sep)
2617 bytearray.rpartition(sep)
2618
   Split the sequence at the last occurrence of *sep*, and return a 3-tuple
   containing the part before the separator, the separator, and the part
   after the separator. If the separator is not found, return a 3-tuple
   containing two empty bytes or bytearray objects, followed by a copy of the
   original sequence.
2624
2625 The separator to search for may be any :term:`bytes-like object`.
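
The two partitioning methods differ in which occurrence they split on and in
where the original data lands when the separator is absent. A small sketch
with illustrative data:

```python
header = b'Content-Type: text/plain'
name, sep, value = header.partition(b': ')

first_split = b'a.b.c'.partition(b'.')     # splits at the first '.'
last_split = b'a.b.c'.rpartition(b'.')     # splits at the last '.'

# Separator not found: note where the original data ends up.
missing_left = b'abc'.partition(b'.')
missing_right = b'abc'.rpartition(b'.')
```

``missing_left`` is ``(b'abc', b'', b'')`` while ``missing_right`` is
``(b'', b'', b'abc')``.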
2626
2627
2628.. method:: bytes.startswith(prefix[, start[, end]])
2629 bytearray.startswith(prefix[, start[, end]])
2630
2631 Return ``True`` if the binary data starts with the specified *prefix*,
2632 otherwise return ``False``. *prefix* can also be a tuple of prefixes to
2633 look for. With optional *start*, test beginning at that position. With
2634 optional *end*, stop comparing at that position.
2635
2636 The prefix(es) to search for may be any :term:`bytes-like object`.
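
As a sketch of testing against several candidates at once, a tuple can be
passed to :meth:`~bytes.startswith` or :meth:`~bytes.endswith`. The sample
data below is illustrative:

```python
data = b'GIF89a\x01\x00'

is_gif = data.startswith((b'GIF87a', b'GIF89a'))   # any prefix may match
version_at_3 = data.startswith(b'89a', 3)          # test begins at index 3
ends_in_nul = data.endswith(b'\x00')
header_end = data.endswith(b'89a', 0, 6)           # compares within data[0:6]
```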
2637
Georg Brandl48310cd2009-01-03 21:18:54 +00002638
Martin Panter1b6c6da2016-08-27 08:35:02 +00002639.. method:: bytes.translate(table, delete=b'')
2640 bytearray.translate(table, delete=b'')
Georg Brandl226878c2007-08-31 10:15:37 +00002641
   Return a copy of the bytes or bytearray object where all bytes occurring in
   the optional argument *delete* are removed, and the remaining bytes have
   been mapped through the given translation table, which must be a bytes
   object of length 256.

   You can use the :meth:`bytes.maketrans` method to create a translation
   table.
Georg Brandl226878c2007-08-31 10:15:37 +00002649
Georg Brandl454636f2008-12-27 23:33:20 +00002650 Set the *table* argument to ``None`` for translations that only delete
2651 characters::
Georg Brandl226878c2007-08-31 10:15:37 +00002652
Georg Brandl454636f2008-12-27 23:33:20 +00002653 >>> b'read this short text'.translate(None, b'aeiou')
2654 b'rd ths shrt txt'
Georg Brandl226878c2007-08-31 10:15:37 +00002655
Martin Panter1b6c6da2016-08-27 08:35:02 +00002656 .. versionchanged:: 3.6
2657 *delete* is now supported as a keyword argument.
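
A minimal sketch combining :meth:`bytes.maketrans` with
:meth:`bytes.translate`. The keyword form of *delete* assumes Python 3.6 or
later, and the data is illustrative:

```python
table = bytes.maketrans(b'abc', b'xyz')    # maps a->x, b->y, c->z

mapped = b'aabbcc-abc'.translate(table)
mapped_no_dash = b'aabbcc-abc'.translate(table, delete=b'-')

# The table is an ordinary bytes object with one entry per byte value.
table_size = len(table)
```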
2658
Georg Brandl226878c2007-08-31 10:15:37 +00002659
Nick Coghlane4936b82014-08-09 16:14:04 +10002660The following methods on bytes and bytearray objects have default behaviours
2661that assume the use of ASCII compatible binary formats, but can still be used
2662with arbitrary binary data by passing appropriate arguments. Note that all of
2663the bytearray methods in this section do *not* operate in place, and instead
2664produce new objects.
Georg Brandlabc38772009-04-12 15:51:51 +00002665
Nick Coghlane4936b82014-08-09 16:14:04 +10002666.. method:: bytes.center(width[, fillbyte])
2667 bytearray.center(width[, fillbyte])
Georg Brandlabc38772009-04-12 15:51:51 +00002668
Nick Coghlane4936b82014-08-09 16:14:04 +10002669 Return a copy of the object centered in a sequence of length *width*.
2670 Padding is done using the specified *fillbyte* (default is an ASCII
2671 space). For :class:`bytes` objects, the original sequence is returned if
2672 *width* is less than or equal to ``len(s)``.
2673
2674 .. note::
2675
2676 The bytearray version of this method does *not* operate in place -
2677 it always produces a new object, even if no changes were made.
2678
2679
2680.. method:: bytes.ljust(width[, fillbyte])
2681 bytearray.ljust(width[, fillbyte])
2682
2683 Return a copy of the object left justified in a sequence of length *width*.
2684 Padding is done using the specified *fillbyte* (default is an ASCII
2685 space). For :class:`bytes` objects, the original sequence is returned if
2686 *width* is less than or equal to ``len(s)``.
2687
2688 .. note::
2689
2690 The bytearray version of this method does *not* operate in place -
2691 it always produces a new object, even if no changes were made.
2692
2693
2694.. method:: bytes.lstrip([chars])
2695 bytearray.lstrip([chars])
2696
2697 Return a copy of the sequence with specified leading bytes removed. The
2698 *chars* argument is a binary sequence specifying the set of byte values to
2699 be removed - the name refers to the fact this method is usually used with
2700 ASCII characters. If omitted or ``None``, the *chars* argument defaults
2701 to removing ASCII whitespace. The *chars* argument is not a prefix;
2702 rather, all combinations of its values are stripped::
2703
2704 >>> b' spacious '.lstrip()
2705 b'spacious '
2706 >>> b'www.example.com'.lstrip(b'cmowz.')
2707 b'example.com'
2708
2709 The binary sequence of byte values to remove may be any
2710 :term:`bytes-like object`.
2711
2712 .. note::
2713
2714 The bytearray version of this method does *not* operate in place -
2715 it always produces a new object, even if no changes were made.
2716
2717
2718.. method:: bytes.rjust(width[, fillbyte])
2719 bytearray.rjust(width[, fillbyte])
2720
2721 Return a copy of the object right justified in a sequence of length *width*.
2722 Padding is done using the specified *fillbyte* (default is an ASCII
2723 space). For :class:`bytes` objects, the original sequence is returned if
2724 *width* is less than or equal to ``len(s)``.
2725
2726 .. note::
2727
2728 The bytearray version of this method does *not* operate in place -
2729 it always produces a new object, even if no changes were made.
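
The three padding methods can be compared side by side. This sketch uses
non-default fill bytes to show that the padding need not be an ASCII space:

```python
centered = b'ab'.center(6, b'*')
left = b'ab'.ljust(6, b'\x00')      # pad with NUL bytes on the right
right = b'ab'.rjust(6, b'.')

# A width no larger than the data leaves the content unchanged.
too_narrow = b'ab'.center(1)
```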
2730
2731
2732.. method:: bytes.rsplit(sep=None, maxsplit=-1)
2733 bytearray.rsplit(sep=None, maxsplit=-1)
2734
2735 Split the binary sequence into subsequences of the same type, using *sep*
2736 as the delimiter string. If *maxsplit* is given, at most *maxsplit* splits
2737 are done, the *rightmost* ones. If *sep* is not specified or ``None``,
2738 any subsequence consisting solely of ASCII whitespace is a separator.
2739 Except for splitting from the right, :meth:`rsplit` behaves like
2740 :meth:`split` which is described in detail below.
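
The effect of splitting from the right is only visible when *maxsplit* limits
the number of splits, as this sketch with illustrative data shows:

```python
from_right = b'a,b,c'.rsplit(b',', 1)
from_left = b'a,b,c'.split(b',', 1)

# Without a limit, both directions give the same list.
unlimited = b'a,b,c'.rsplit(b',')

# A common use: separate the last path component.
parent, leaf = b'/usr/local/bin'.rsplit(b'/', 1)
```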
2741
2742
2743.. method:: bytes.rstrip([chars])
2744 bytearray.rstrip([chars])
2745
2746 Return a copy of the sequence with specified trailing bytes removed. The
2747 *chars* argument is a binary sequence specifying the set of byte values to
2748 be removed - the name refers to the fact this method is usually used with
2749 ASCII characters. If omitted or ``None``, the *chars* argument defaults to
2750 removing ASCII whitespace. The *chars* argument is not a suffix; rather,
2751 all combinations of its values are stripped::
2752
2753 >>> b' spacious '.rstrip()
2754 b' spacious'
2755 >>> b'mississippi'.rstrip(b'ipz')
2756 b'mississ'
2757
2758 The binary sequence of byte values to remove may be any
2759 :term:`bytes-like object`.
2760
2761 .. note::
2762
2763 The bytearray version of this method does *not* operate in place -
2764 it always produces a new object, even if no changes were made.
2765
2766
2767.. method:: bytes.split(sep=None, maxsplit=-1)
2768 bytearray.split(sep=None, maxsplit=-1)
2769
2770 Split the binary sequence into subsequences of the same type, using *sep*
2771 as the delimiter string. If *maxsplit* is given and non-negative, at most
2772 *maxsplit* splits are done (thus, the list will have at most ``maxsplit+1``
2773 elements). If *maxsplit* is not specified or is ``-1``, then there is no
2774 limit on the number of splits (all possible splits are made).
2775
2776 If *sep* is given, consecutive delimiters are not grouped together and are
2777 deemed to delimit empty subsequences (for example, ``b'1,,2'.split(b',')``
2778 returns ``[b'1', b'', b'2']``). The *sep* argument may consist of a
2779 multibyte sequence (for example, ``b'1<>2<>3'.split(b'<>')`` returns
2780 ``[b'1', b'2', b'3']``). Splitting an empty sequence with a specified
2781 separator returns ``[b'']`` or ``[bytearray(b'')]`` depending on the type
2782 of object being split. The *sep* argument may be any
2783 :term:`bytes-like object`.
2784
2785 For example::
2786
2787 >>> b'1,2,3'.split(b',')
2788 [b'1', b'2', b'3']
2789 >>> b'1,2,3'.split(b',', maxsplit=1)
Benjamin Petersoneb83ffe2014-09-22 22:43:50 -04002790 [b'1', b'2,3']
Nick Coghlane4936b82014-08-09 16:14:04 +10002791 >>> b'1,2,,3,'.split(b',')
2792 [b'1', b'2', b'', b'3', b'']
2793
   If *sep* is not specified or is ``None``, a different splitting algorithm
   is applied: runs of consecutive ASCII whitespace are regarded as a single
   separator, and the result will contain no empty subsequences at the start
   or end if the sequence has leading or trailing whitespace. Consequently,
   splitting an empty sequence or a sequence consisting solely of ASCII
   whitespace without a specified separator returns ``[]``.
2800
2801 For example::
2802
2803
2804 >>> b'1 2 3'.split()
2805 [b'1', b'2', b'3']
2806 >>> b'1 2 3'.split(maxsplit=1)
2807 [b'1', b'2 3']
2808 >>> b' 1 2 3 '.split()
2809 [b'1', b'2', b'3']
2810
2811
2812.. method:: bytes.strip([chars])
2813 bytearray.strip([chars])
2814
2815 Return a copy of the sequence with specified leading and trailing bytes
2816 removed. The *chars* argument is a binary sequence specifying the set of
2817 byte values to be removed - the name refers to the fact this method is
2818 usually used with ASCII characters. If omitted or ``None``, the *chars*
2819 argument defaults to removing ASCII whitespace. The *chars* argument is
2820 not a prefix or suffix; rather, all combinations of its values are
2821 stripped::
2822
2823 >>> b' spacious '.strip()
2824 b'spacious'
2825 >>> b'www.example.com'.strip(b'cmowz.')
2826 b'example'
2827
2828 The binary sequence of byte values to remove may be any
2829 :term:`bytes-like object`.
2830
2831 .. note::
2832
2833 The bytearray version of this method does *not* operate in place -
2834 it always produces a new object, even if no changes were made.
2835
2836
2837The following methods on bytes and bytearray objects assume the use of ASCII
2838compatible binary formats and should not be applied to arbitrary binary data.
2839Note that all of the bytearray methods in this section do *not* operate in
2840place, and instead produce new objects.
2841
2842.. method:: bytes.capitalize()
2843 bytearray.capitalize()
2844
2845 Return a copy of the sequence with each byte interpreted as an ASCII
2846 character, and the first byte capitalized and the rest lowercased.
2847 Non-ASCII byte values are passed through unchanged.
2848
2849 .. note::
2850
2851 The bytearray version of this method does *not* operate in place - it
2852 always produces a new object, even if no changes were made.
2853
2854
2855.. method:: bytes.expandtabs(tabsize=8)
2856 bytearray.expandtabs(tabsize=8)
2857
2858 Return a copy of the sequence where all ASCII tab characters are replaced
2859 by one or more ASCII spaces, depending on the current column and the given
2860 tab size. Tab positions occur every *tabsize* bytes (default is 8,
2861 giving tab positions at columns 0, 8, 16 and so on). To expand the
2862 sequence, the current column is set to zero and the sequence is examined
2863 byte by byte. If the byte is an ASCII tab character (``b'\t'``), one or
2864 more space characters are inserted in the result until the current column
2865 is equal to the next tab position. (The tab character itself is not
2866 copied.) If the current byte is an ASCII newline (``b'\n'``) or
2867 carriage return (``b'\r'``), it is copied and the current column is reset
2868 to zero. Any other byte value is copied unchanged and the current column
2869 is incremented by one regardless of how the byte value is represented when
2870 printed::
2871
      >>> b'01\t012\t0123\t01234'.expandtabs()
      b'01      012     0123    01234'
      >>> b'01\t012\t0123\t01234'.expandtabs(4)
      b'01  012 0123    01234'
2876
2877 .. note::
2878
2879 The bytearray version of this method does *not* operate in place - it
2880 always produces a new object, even if no changes were made.
2881
2882
2883.. method:: bytes.isalnum()
2884 bytearray.isalnum()
2885
   Return true if all bytes in the sequence are alphabetic ASCII characters
   or ASCII decimal digits and the sequence is not empty, false otherwise.
   Alphabetic ASCII characters are those byte values in the sequence
   ``b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'``. ASCII decimal
   digits are those byte values in the sequence ``b'0123456789'``.
2891
2892 For example::
2893
2894 >>> b'ABCabc1'.isalnum()
2895 True
2896 >>> b'ABC abc1'.isalnum()
2897 False
2898
2899
2900.. method:: bytes.isalpha()
2901 bytearray.isalpha()
2902
2903 Return true if all bytes in the sequence are alphabetic ASCII characters
2904 and the sequence is not empty, false otherwise. Alphabetic ASCII
2905 characters are those byte values in the sequence
2906 ``b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'``.
2907
2908 For example::
2909
2910 >>> b'ABCabc'.isalpha()
2911 True
2912 >>> b'ABCabc1'.isalpha()
2913 False
2914
2915
2916.. method:: bytes.isdigit()
2917 bytearray.isdigit()
2918
2919 Return true if all bytes in the sequence are ASCII decimal digits
2920 and the sequence is not empty, false otherwise. ASCII decimal digits are
2921 those byte values in the sequence ``b'0123456789'``.
2922
2923 For example::
2924
2925 >>> b'1234'.isdigit()
2926 True
2927 >>> b'1.23'.isdigit()
2928 False
2929
2930
2931.. method:: bytes.islower()
2932 bytearray.islower()
2933
2934 Return true if there is at least one lowercase ASCII character
2935 in the sequence and no uppercase ASCII characters, false otherwise.
2936
2937 For example::
2938
2939 >>> b'hello world'.islower()
2940 True
2941 >>> b'Hello world'.islower()
2942 False
2943
2944 Lowercase ASCII characters are those byte values in the sequence
2945 ``b'abcdefghijklmnopqrstuvwxyz'``. Uppercase ASCII characters
2946 are those byte values in the sequence ``b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.
2947
2948
2949.. method:: bytes.isspace()
2950 bytearray.isspace()
2951
2952 Return true if all bytes in the sequence are ASCII whitespace and the
2953 sequence is not empty, false otherwise. ASCII whitespace characters are
Serhiy Storchakabf7b9ed2015-11-23 16:43:05 +02002954 those byte values in the sequence ``b' \t\n\r\x0b\f'`` (space, tab, newline,
Nick Coghlane4936b82014-08-09 16:14:04 +10002955 carriage return, vertical tab, form feed).
2956
2957
2958.. method:: bytes.istitle()
2959 bytearray.istitle()
2960
2961 Return true if the sequence is ASCII titlecase and the sequence is not
2962 empty, false otherwise. See :meth:`bytes.title` for more details on the
2963 definition of "titlecase".
2964
2965 For example::
2966
2967 >>> b'Hello World'.istitle()
2968 True
2969 >>> b'Hello world'.istitle()
2970 False
2971
2972
2973.. method:: bytes.isupper()
2974 bytearray.isupper()
2975
Zachary Ware0b496372015-02-27 01:40:22 -06002976 Return true if there is at least one uppercase alphabetic ASCII character
2977 in the sequence and no lowercase ASCII characters, false otherwise.
Nick Coghlane4936b82014-08-09 16:14:04 +10002978
2979 For example::
2980
2981 >>> b'HELLO WORLD'.isupper()
2982 True
2983 >>> b'Hello world'.isupper()
2984 False
2985
2986 Lowercase ASCII characters are those byte values in the sequence
2987 ``b'abcdefghijklmnopqrstuvwxyz'``. Uppercase ASCII characters
2988 are those byte values in the sequence ``b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.
2989
2990
2991.. method:: bytes.lower()
2992 bytearray.lower()
2993
2994 Return a copy of the sequence with all the uppercase ASCII characters
2995 converted to their corresponding lowercase counterpart.
2996
2997 For example::
2998
2999 >>> b'Hello World'.lower()
3000 b'hello world'
3001
3002 Lowercase ASCII characters are those byte values in the sequence
3003 ``b'abcdefghijklmnopqrstuvwxyz'``. Uppercase ASCII characters
3004 are those byte values in the sequence ``b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.
3005
3006 .. note::
3007
3008 The bytearray version of this method does *not* operate in place - it
3009 always produces a new object, even if no changes were made.
3010
3011
3012.. index::
3013 single: universal newlines; bytes.splitlines method
3014 single: universal newlines; bytearray.splitlines method
3015
3016.. method:: bytes.splitlines(keepends=False)
3017 bytearray.splitlines(keepends=False)
3018
3019 Return a list of the lines in the binary sequence, breaking at ASCII
3020 line boundaries. This method uses the :term:`universal newlines` approach
3021 to splitting lines. Line breaks are not included in the resulting list
3022 unless *keepends* is given and true.
3023
3024 For example::
3025
3026 >>> b'ab c\n\nde fg\rkl\r\n'.splitlines()
Larry Hastingsc6256e52014-10-05 19:03:48 -07003027 [b'ab c', b'', b'de fg', b'kl']
Nick Coghlane4936b82014-08-09 16:14:04 +10003028 >>> b'ab c\n\nde fg\rkl\r\n'.splitlines(keepends=True)
3029 [b'ab c\n', b'\n', b'de fg\r', b'kl\r\n']
3030
3031 Unlike :meth:`~bytes.split` when a delimiter string *sep* is given, this
3032 method returns an empty list for the empty string, and a terminal line
3033 break does not result in an extra line::
3034
3035 >>> b"".split(b'\n'), b"Two lines\n".split(b'\n')
3036 ([b''], [b'Two lines', b''])
3037 >>> b"".splitlines(), b"One line\n".splitlines()
3038 ([], [b'One line'])
3039
3040
3041.. method:: bytes.swapcase()
3042 bytearray.swapcase()
3043
3044 Return a copy of the sequence with all the lowercase ASCII characters
3045 converted to their corresponding uppercase counterpart and vice-versa.
3046
3047 For example::
3048
3049 >>> b'Hello World'.swapcase()
3050 b'hELLO wORLD'
3051
3052 Lowercase ASCII characters are those byte values in the sequence
3053 ``b'abcdefghijklmnopqrstuvwxyz'``. Uppercase ASCII characters
3054 are those byte values in the sequence ``b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.
3055
   Unlike :meth:`str.swapcase`, it is always the case that
   ``bin.swapcase().swapcase() == bin`` for the binary versions. Case
   conversions are symmetrical in ASCII, even though that is not generally
   true for arbitrary Unicode code points.
3060
3061 .. note::
3062
3063 The bytearray version of this method does *not* operate in place - it
3064 always produces a new object, even if no changes were made.
3065
3066
3067.. method:: bytes.title()
3068 bytearray.title()
3069
3070 Return a titlecased version of the binary sequence where words start with
3071 an uppercase ASCII character and the remaining characters are lowercase.
3072 Uncased byte values are left unmodified.
3073
3074 For example::
3075
3076 >>> b'Hello world'.title()
3077 b'Hello World'
3078
3079 Lowercase ASCII characters are those byte values in the sequence
3080 ``b'abcdefghijklmnopqrstuvwxyz'``. Uppercase ASCII characters
3081 are those byte values in the sequence ``b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.
3082 All other byte values are uncased.
3083
3084 The algorithm uses a simple language-independent definition of a word as
3085 groups of consecutive letters. The definition works in many contexts but
3086 it means that apostrophes in contractions and possessives form word
3087 boundaries, which may not be the desired result::
3088
3089 >>> b"they're bill's friends from the UK".title()
3090 b"They'Re Bill'S Friends From The Uk"
3091
3092 A workaround for apostrophes can be constructed using regular expressions::
3093
      >>> import re
      >>> def titlecase(s):
      ...     return re.sub(rb"[A-Za-z]+('[A-Za-z]+)?",
      ...                   lambda mo: mo.group(0)[0:1].upper() +
      ...                              mo.group(0)[1:].lower(),
      ...                   s)
      ...
      >>> titlecase(b"they're bill's friends.")
      b"They're Bill's Friends."
3103
3104 .. note::
3105
3106 The bytearray version of this method does *not* operate in place - it
3107 always produces a new object, even if no changes were made.
3108
3109
3110.. method:: bytes.upper()
3111 bytearray.upper()
3112
3113 Return a copy of the sequence with all the lowercase ASCII characters
3114 converted to their corresponding uppercase counterpart.
3115
3116 For example::
3117
3118 >>> b'Hello World'.upper()
3119 b'HELLO WORLD'
3120
3121 Lowercase ASCII characters are those byte values in the sequence
3122 ``b'abcdefghijklmnopqrstuvwxyz'``. Uppercase ASCII characters
3123 are those byte values in the sequence ``b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.
3124
3125 .. note::
3126
3127 The bytearray version of this method does *not* operate in place - it
3128 always produces a new object, even if no changes were made.
3129
3130
3131.. method:: bytes.zfill(width)
3132 bytearray.zfill(width)
3133
   Return a copy of the sequence left filled with ASCII ``b'0'`` digits to
   make a sequence of length *width*. A leading sign prefix (``b'+'``/
   ``b'-'``) is handled by inserting the padding *after* the sign character
   rather than before. For :class:`bytes` objects, the original sequence is
   returned if *width* is less than or equal to ``len(seq)``.
3139
3140 For example::
3141
3142 >>> b"42".zfill(5)
3143 b'00042'
3144 >>> b"-42".zfill(5)
3145 b'-0042'
3146
3147 .. note::
3148
3149 The bytearray version of this method does *not* operate in place - it
3150 always produces a new object, even if no changes were made.
Georg Brandlabc38772009-04-12 15:51:51 +00003151
3152
Ethan Furmanb95b5612015-01-23 20:05:18 -08003153.. _bytes-formatting:
3154
3155``printf``-style Bytes Formatting
3156----------------------------------
3157
3158.. index::
3159 single: formatting, bytes (%)
3160 single: formatting, bytearray (%)
3161 single: interpolation, bytes (%)
3162 single: interpolation, bytearray (%)
3163 single: bytes; formatting
3164 single: bytearray; formatting
3165 single: bytes; interpolation
3166 single: bytearray; interpolation
3167 single: printf-style formatting
3168 single: sprintf-style formatting
3169 single: % formatting
3170 single: % interpolation
3171
3172.. note::
3173
3174 The formatting operations described here exhibit a variety of quirks that
3175 lead to a number of common errors (such as failing to display tuples and
3176 dictionaries correctly). If the value being printed may be a tuple or
3177 dictionary, wrap it in a tuple.
3178
Bytes objects (``bytes``/``bytearray``) have one unique built-in operation:
the ``%`` operator (modulo).
This is also known as the bytes *formatting* or *interpolation* operator.
Given ``format % values`` (where *format* is a bytes object), ``%`` conversion
specifications in *format* are replaced with zero or more elements of *values*.
The effect is similar to using :c:func:`sprintf` in the C language.
3185
3186If *format* requires a single argument, *values* may be a single non-tuple
3187object. [5]_ Otherwise, *values* must be a tuple with exactly the number of
3188items specified by the format bytes object, or a single mapping object (for
3189example, a dictionary).
3190
3191A conversion specifier contains two or more characters and has the following
3192components, which must occur in this order:
3193
3194#. The ``'%'`` character, which marks the start of the specifier.
3195
3196#. Mapping key (optional), consisting of a parenthesised sequence of characters
3197 (for example, ``(somename)``).
3198
3199#. Conversion flags (optional), which affect the result of some conversion
3200 types.
3201
3202#. Minimum field width (optional). If specified as an ``'*'`` (asterisk), the
3203 actual width is read from the next element of the tuple in *values*, and the
3204 object to convert comes after the minimum field width and optional precision.
3205
3206#. Precision (optional), given as a ``'.'`` (dot) followed by the precision. If
3207 specified as ``'*'`` (an asterisk), the actual precision is read from the next
3208 element of the tuple in *values*, and the value to convert comes after the
3209 precision.
3210
3211#. Length modifier (optional).
3212
3213#. Conversion type.
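
As an illustrative sketch of the component order listed above (the values and
variable names are made up for this example), a ``*`` width and precision are
read from the tuple before the value being converted:

```python
# Width and precision written literally in the format.
literal = b'%8.2f' % 3.14159

# The same width and precision supplied through '*': the tuple provides
# the width (8), then the precision (2), then the value to convert.
starred = b'%*.*f' % (8, 2, 3.14159)

# Flags come before the width: '-' left-adjusts within the field.
left_adjusted = b'%-*d|' % (5, 42)
```

Both spellings should produce the same right-aligned field, eight bytes wide
with two digits after the decimal point.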

When the right argument is a dictionary (or other mapping type), then the
formats in the bytes object *must* include a parenthesised mapping key into that
dictionary inserted immediately after the ``'%'`` character.  The mapping key
selects the value to be formatted from the mapping.  For example:

   >>> print(b'%(language)s has %(number)03d quote types.' %
   ...       {b'language': b"Python", b"number": 2})
   b'Python has 002 quote types.'

In this case no ``*`` specifiers may occur in a format (since they require a
sequential parameter list).
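
The three accepted shapes of *values* can be compared side by side.  This is
a minimal sketch; the header names and paths are invented for illustration:

```python
# A single non-tuple value can be passed directly.
single = b'%d bytes' % 1024

# Several values must be packed into a tuple of exactly the right length.
header = b'%s: %d' % (b'Content-Length', 42)

# A tuple that is itself the value has to be wrapped in a one-element tuple.
pair = (1, 2)
shown = b'%a' % (pair,)

# With a mapping, every specifier names its key; the keys are bytes objects.
named = b'%(method)s %(path)s' % {b'method': b'GET', b'path': b'/index'}
```

Wrapping ``pair`` in ``(pair,)`` is the workaround mentioned in the note at the
top of this section; without it, the two items of ``pair`` would be treated as
two separate arguments.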

The conversion flag characters are:

+---------+---------------------------------------------------------------------+
| Flag    | Meaning                                                             |
+=========+=====================================================================+
| ``'#'`` | The value conversion will use the "alternate form" (where defined   |
|         | below).                                                             |
+---------+---------------------------------------------------------------------+
| ``'0'`` | The conversion will be zero padded for numeric values.              |
+---------+---------------------------------------------------------------------+
| ``'-'`` | The converted value is left adjusted (overrides the ``'0'``         |
|         | conversion if both are given).                                      |
+---------+---------------------------------------------------------------------+
| ``' '`` | (a space) A blank should be left before a positive number (or empty |
|         | string) produced by a signed conversion.                            |
+---------+---------------------------------------------------------------------+
| ``'+'`` | A sign character (``'+'`` or ``'-'``) will precede the conversion   |
|         | (overrides a "space" flag).                                         |
+---------+---------------------------------------------------------------------+

A length modifier (``h``, ``l``, or ``L``) may be present, but is ignored as it
is not necessary for Python -- so e.g. ``%ld`` is identical to ``%d``.
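
The flags can be tried one at a time.  The snippet below is only a sketch
with arbitrary sample values:

```python
alternate = b'%#x' % 255       # alternate form adds the '0x' prefix
zero_pad = b'%05d' % 42        # zero padding up to the field width
left = b'%-5d|' % 42           # left adjusted; padding goes on the right
left_wins = b'%-05d|' % 42     # '-' overrides '0'
space = b'% d' % 42            # blank before a positive number
plus = b'%+d' % 42             # explicit sign
plus_wins = b'%+ d' % 42       # '+' overrides the space flag

# The length modifier is accepted but has no effect.
same = (b'%ld' % 7) == (b'%d' % 7)
```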

The conversion types are:

+------------+-----------------------------------------------------+-------+
| Conversion | Meaning                                             | Notes |
+============+=====================================================+=======+
| ``'d'``    | Signed integer decimal.                             |       |
+------------+-----------------------------------------------------+-------+
| ``'i'``    | Signed integer decimal.                             |       |
+------------+-----------------------------------------------------+-------+
| ``'o'``    | Signed octal value.                                 | \(1)  |
+------------+-----------------------------------------------------+-------+
| ``'u'``    | Obsolete type -- it is identical to ``'d'``.        | \(8)  |
+------------+-----------------------------------------------------+-------+
| ``'x'``    | Signed hexadecimal (lowercase).                     | \(2)  |
+------------+-----------------------------------------------------+-------+
| ``'X'``    | Signed hexadecimal (uppercase).                     | \(2)  |
+------------+-----------------------------------------------------+-------+
| ``'e'``    | Floating point exponential format (lowercase).      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'E'``    | Floating point exponential format (uppercase).      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'f'``    | Floating point decimal format.                      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'F'``    | Floating point decimal format.                      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'g'``    | Floating point format. Uses lowercase exponential   | \(4)  |
|            | format if exponent is less than -4 or not less than |       |
|            | precision, decimal format otherwise.                |       |
+------------+-----------------------------------------------------+-------+
| ``'G'``    | Floating point format. Uses uppercase exponential   | \(4)  |
|            | format if exponent is less than -4 or not less than |       |
|            | precision, decimal format otherwise.                |       |
+------------+-----------------------------------------------------+-------+
| ``'c'``    | Single byte (accepts integer or single              |       |
|            | byte objects).                                      |       |
+------------+-----------------------------------------------------+-------+
| ``'b'``    | Bytes (any object that follows the                  | \(5)  |
|            | :ref:`buffer protocol <bufferobjects>` or has       |       |
|            | :meth:`__bytes__`).                                 |       |
+------------+-----------------------------------------------------+-------+
| ``'s'``    | ``'s'`` is an alias for ``'b'`` and should only     | \(6)  |
|            | be used for Python2/3 code bases.                   |       |
+------------+-----------------------------------------------------+-------+
| ``'a'``    | Bytes (converts any Python object using             | \(5)  |
|            | ``repr(obj).encode('ascii','backslashreplace')``).  |       |
+------------+-----------------------------------------------------+-------+
| ``'r'``    | ``'r'`` is an alias for ``'a'`` and should only     | \(7)  |
|            | be used for Python2/3 code bases.                   |       |
+------------+-----------------------------------------------------+-------+
| ``'%'``    | No argument is converted, results in a ``'%'``      |       |
|            | character in the result.                            |       |
+------------+-----------------------------------------------------+-------+

Notes:

(1)
   The alternate form causes a leading octal specifier (``'0o'``) to be
   inserted before the first digit.

(2)
   The alternate form causes a leading ``'0x'`` or ``'0X'`` (depending on whether
   the ``'x'`` or ``'X'`` format was used) to be inserted before the first digit.

(3)
   The alternate form causes the result to always contain a decimal point, even if
   no digits follow it.

   The precision determines the number of digits after the decimal point and
   defaults to 6.

(4)
   The alternate form causes the result to always contain a decimal point, and
   trailing zeroes are not removed as they would otherwise be.

   The precision determines the number of significant digits before and after the
   decimal point and defaults to 6.

(5)
   If precision is ``N``, the output is truncated to ``N`` characters.

(6)
   ``b'%s'`` is deprecated, but will not be removed during the 3.x series.

(7)
   ``b'%r'`` is deprecated, but will not be removed during the 3.x series.

(8)
   See :pep:`237`.
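
A few of the conversion types and notes above, sketched with sample values
chosen only for illustration:

```python
from_int = b'%c' % 65                # an integer in range(256)
from_byte = b'%c' % b'A'             # or a bytes object of length one
raw = b'%b' % bytearray(b'xyz')      # any object supporting the buffer protocol
truncated = b'%.2b' % b'abcdef'      # note 5: precision truncates the output
ascii_repr = b'%a' % 'caf\xe9'       # repr() encoded with backslashreplace
octal = b'%#o' % 8                   # note 1: alternate form adds '0o'
point = b'%#.0f' % 3                 # note 3: the decimal point is kept
general = b'%.3g' % 1234.5           # note 4: three significant digits
```

Because ``'%a'`` goes through ``repr()``, the result for a string includes the
surrounding quotes, and the non-ASCII character is replaced by a backslash
escape.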

.. note::

   The bytearray version of this method does *not* operate in place - it
   always produces a new object, even if no changes were made.
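
A small sketch of that behaviour (the variable names are arbitrary):

```python
template = bytearray(b'%d items')
result = template % 3            # a new bytearray; template is untouched

unchanged = bytearray(b'no specifiers')
copy = unchanged % ()            # still a new object, although nothing changed
```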

.. seealso:: :pep:`461`.

.. versionadded:: 3.5

.. _typememoryview:

Memory Views
------------

:class:`memoryview` objects allow Python code to access the internal data
of an object that supports the :ref:`buffer protocol <bufferobjects>` without
copying.

.. class:: memoryview(obj)

   Create a :class:`memoryview` that references *obj*.  *obj* must support the
   buffer protocol.  Built-in objects that support the buffer protocol include
   :class:`bytes` and :class:`bytearray`.

   A :class:`memoryview` has the notion of an *element*, which is the
   atomic memory unit handled by the originating object *obj*.  For many
   simple types such as :class:`bytes` and :class:`bytearray`, an element
   is a single byte, but other types such as :class:`array.array` may have
   bigger elements.

   ``len(view)`` is equal to the length of :meth:`~memoryview.tolist`.
   If ``view.ndim == 0``, the length is 1.  If ``view.ndim == 1``, the length
   is equal to the number of elements in the view.  For higher dimensions,
   the length is equal to the length of the nested list representation of
   the view.  The :attr:`~memoryview.itemsize` attribute will give you the
   number of bytes in a single element.

   A :class:`memoryview` supports slicing and indexing to expose its data.
   One-dimensional slicing will result in a subview::

      >>> v = memoryview(b'abcefg')
      >>> v[1]
      98
      >>> v[-1]
      103
      >>> v[1:4]
      <memory at 0x7f3ddc9f4350>
      >>> bytes(v[1:4])
      b'bce'

   If :attr:`~memoryview.format` is one of the native format specifiers
   from the :mod:`struct` module, indexing with an integer or a tuple of
   integers is also supported and returns a single *element* with
   the correct type.  One-dimensional memoryviews can be indexed
   with an integer or a one-integer tuple.  Multi-dimensional memoryviews
   can be indexed with tuples of exactly *ndim* integers where *ndim* is
   the number of dimensions.  Zero-dimensional memoryviews can be indexed
   with the empty tuple.
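
Tuple indexing can be sketched as follows.  The two-dimensional view is
obtained with ``cast()`` (described further down); the data is just the bytes
0 to 5:

```python
# A 2x3 view over six bytes: [[0, 1, 2], [3, 4, 5]].
grid = memoryview(bytes(range(6))).cast('B', shape=[2, 3])
corner = grid[1, 2]              # row 1, column 2

# A one-dimensional view accepts an integer or a one-integer tuple.
flat = memoryview(b'abc')
same_item = flat[1] == flat[(1,)]
```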

   Here is an example with a non-byte format::

      >>> import array
      >>> a = array.array('l', [-11111111, 22222222, -33333333, 44444444])
      >>> m = memoryview(a)
      >>> m[0]
      -11111111
      >>> m[-1]
      44444444
      >>> m[::2].tolist()
      [-11111111, -33333333]

   If the underlying object is writable, the memoryview supports
   one-dimensional slice assignment.  Resizing is not allowed::

      >>> data = bytearray(b'abcefg')
      >>> v = memoryview(data)
      >>> v.readonly
      False
      >>> v[0] = ord(b'z')
      >>> data
      bytearray(b'zbcefg')
      >>> v[1:4] = b'123'
      >>> data
      bytearray(b'z123fg')
      >>> v[2:3] = b'spam'
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      ValueError: memoryview assignment: lvalue and rvalue have different structures
      >>> v[2:6] = b'spam'
      >>> data
      bytearray(b'z1spam')
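
For contrast, a view over an immutable object is read-only.  The following
sketch shows the assignment being rejected (the flag name is made up for the
example):

```python
view = memoryview(b'abc')        # bytes is immutable, so the view is read-only
write_rejected = False
try:
    view[0] = ord(b'z')
except TypeError:
    # Item assignment on a read-only view raises TypeError.
    write_rejected = True
```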

   One-dimensional memoryviews of hashable (read-only) types with formats
   'B', 'b' or 'c' are also hashable.  The hash is defined as
   ``hash(m) == hash(m.tobytes())``::

      >>> v = memoryview(b'abcefg')
      >>> hash(v) == hash(b'abcefg')
      True
      >>> hash(v[2:4]) == hash(b'ce')
      True
      >>> hash(v[::-2]) == hash(b'abcefg'[::-2])
      True

   .. versionchanged:: 3.3
      One-dimensional memoryviews can now be sliced.
      One-dimensional memoryviews with formats 'B', 'b' or 'c' are now hashable.

   .. versionchanged:: 3.4
      memoryview is now registered automatically with
      :class:`collections.abc.Sequence`.

   .. versionchanged:: 3.5
      memoryviews can now be indexed with a tuple of integers.

   :class:`memoryview` has several methods:

   .. method:: __eq__(exporter)

      A memoryview and a :pep:`3118` exporter are equal if their shapes are
      equivalent and if all corresponding values are equal when the operands'
      respective format codes are interpreted using :mod:`struct` syntax.

      For the subset of :mod:`struct` format strings currently supported by
      :meth:`tolist`, ``v`` and ``w`` are equal if ``v.tolist() == w.tolist()``::

         >>> import array
         >>> a = array.array('I', [1, 2, 3, 4, 5])
         >>> b = array.array('d', [1.0, 2.0, 3.0, 4.0, 5.0])
         >>> c = array.array('b', [5, 3, 1])
         >>> x = memoryview(a)
         >>> y = memoryview(b)
         >>> x == a == y == b
         True
         >>> x.tolist() == a.tolist() == y.tolist() == b.tolist()
         True
         >>> z = y[::-2]
         >>> z == c
         True
         >>> z.tolist() == c.tolist()
         True

      If either format string is not supported by the :mod:`struct` module,
      then the objects will always compare as unequal (even if the format
      strings and buffer contents are identical)::

         >>> from ctypes import BigEndianStructure, c_long
         >>> class BEPoint(BigEndianStructure):
         ...     _fields_ = [("x", c_long), ("y", c_long)]
         ...
         >>> point = BEPoint(100, 200)
         >>> a = memoryview(point)
         >>> b = memoryview(point)
         >>> a == point
         False
         >>> a == b
         False

      Note that, as with floating point numbers, ``v is w`` does *not* imply
      ``v == w`` for memoryview objects.

      .. versionchanged:: 3.3
         Previous versions compared the raw memory disregarding the item format
         and the logical array structure.

   .. method:: tobytes()

      Return the data in the buffer as a bytestring.  This is equivalent to
      calling the :class:`bytes` constructor on the memoryview. ::

         >>> m = memoryview(b"abc")
         >>> m.tobytes()
         b'abc'
         >>> bytes(m)
         b'abc'

      For non-contiguous arrays the result is equal to the flattened list
      representation with all elements converted to bytes.  :meth:`tobytes`
      supports all format strings, including those that are not in
      :mod:`struct` module syntax.
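
The non-contiguous case can be sketched with a strided slice (names chosen
for the example only):

```python
view = memoryview(b'abcdef')
every_other = view[::2]          # step of 2: the view skips every second byte
flattened = every_other.tobytes()
```

Here ``every_other`` is not contiguous, yet ``tobytes()`` still returns a
compact bytes object holding only the selected elements.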

   .. method:: hex()

      Return a string object containing two hexadecimal digits for each
      byte in the buffer. ::

         >>> m = memoryview(b"abc")
         >>> m.hex()
         '616263'

      .. versionadded:: 3.5

   .. method:: tolist()

      Return the data in the buffer as a list of elements. ::

         >>> memoryview(b'abc').tolist()
         [97, 98, 99]
         >>> import array
         >>> a = array.array('d', [1.1, 2.2, 3.3])
         >>> m = memoryview(a)
         >>> m.tolist()
         [1.1, 2.2, 3.3]

      .. versionchanged:: 3.3
         :meth:`tolist` now supports all single character native formats in
         :mod:`struct` module syntax as well as multi-dimensional
         representations.

   .. method:: release()

      Release the underlying buffer exposed by the memoryview object.  Many
      objects take special actions when a view is held on them (for example,
      a :class:`bytearray` would temporarily forbid resizing); therefore,
      calling :meth:`release` is handy to remove these restrictions (and free
      any dangling resources) as soon as possible.

      After this method has been called, any further operation on the view
      raises a :exc:`ValueError` (except :meth:`release()` itself which can
      be called multiple times)::

         >>> m = memoryview(b'abc')
         >>> m.release()
         >>> m[0]
         Traceback (most recent call last):
           File "<stdin>", line 1, in <module>
         ValueError: operation forbidden on released memoryview object

      The context management protocol can be used for a similar effect,
      using the ``with`` statement::

         >>> with memoryview(b'abc') as m:
         ...     m[0]
         ...
         97
         >>> m[0]
         Traceback (most recent call last):
           File "<stdin>", line 1, in <module>
         ValueError: operation forbidden on released memoryview object

      .. versionadded:: 3.2
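
The resizing restriction mentioned for :class:`bytearray` can be observed
directly.  A minimal sketch, with hypothetical variable names:

```python
data = bytearray(b'abc')
view = memoryview(data)

try:
    data.extend(b'def')          # resizing while a view is held is refused
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

view.release()                   # drop the export ...
data.extend(b'def')              # ... and resizing works again
```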

   .. method:: cast(format[, shape])

      Cast a memoryview to a new format or shape.  *shape* defaults to
      ``[byte_length//new_itemsize]``, which means that the result view
      will be one-dimensional.  The return value is a new memoryview, but
      the buffer itself is not copied.  Supported casts are 1D -> C-:term:`contiguous`
      and C-contiguous -> 1D.

      The destination format is restricted to a single element native format in
      :mod:`struct` syntax.  One of the formats must be a byte format
      ('B', 'b' or 'c').  The byte length of the result must be the same
      as the original length.

      Cast 1D/long to 1D/unsigned bytes::

         >>> import array
         >>> a = array.array('l', [1,2,3])
         >>> x = memoryview(a)
         >>> x.format
         'l'
         >>> x.itemsize
         8
         >>> len(x)
         3
         >>> x.nbytes
         24
         >>> y = x.cast('B')
         >>> y.format
         'B'
         >>> y.itemsize
         1
         >>> len(y)
         24
         >>> y.nbytes
         24

      Cast 1D/unsigned bytes to 1D/char::

         >>> b = bytearray(b'zyz')
         >>> x = memoryview(b)
         >>> x[0] = b'a'
         Traceback (most recent call last):
           File "<stdin>", line 1, in <module>
         TypeError: memoryview: invalid type for format 'B'
         >>> y = x.cast('c')
         >>> y[0] = b'a'
         >>> b
         bytearray(b'ayz')

      Cast 1D/bytes to 3D/ints to 1D/signed char::

         >>> import struct
         >>> buf = struct.pack("i"*12, *list(range(12)))
         >>> x = memoryview(buf)
         >>> y = x.cast('i', shape=[2,2,3])
         >>> y.tolist()
         [[[0, 1, 2], [3, 4, 5]], [[6, 7, 8], [9, 10, 11]]]
         >>> y.format
         'i'
         >>> y.itemsize
         4
         >>> len(y)
         2
         >>> y.nbytes
         48
         >>> z = y.cast('b')
         >>> z.format
         'b'
         >>> z.itemsize
         1
         >>> len(z)
         48
         >>> z.nbytes
         48

      Cast 1D/unsigned char to 2D/unsigned long::

         >>> buf = struct.pack("L"*6, *list(range(6)))
         >>> x = memoryview(buf)
         >>> y = x.cast('L', shape=[2,3])
         >>> len(y)
         2
         >>> y.nbytes
         48
         >>> y.tolist()
         [[0, 1, 2], [3, 4, 5]]

      .. versionadded:: 3.3

      .. versionchanged:: 3.5
         The source format is no longer restricted when casting to a byte view.
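
The rule that one of the two formats must be a byte format means that a cast
between two non-byte formats has to go through a byte view.  The sketch below
assumes that ``'i'`` and ``'f'`` have the same item size on the platform,
which is typical but not guaranteed:

```python
import array

ints = memoryview(array.array('i', [1, 2, 3]))

try:
    ints.cast('f')               # 'i' -> 'f': neither side is a byte format
    direct_cast_ok = True
except TypeError:
    direct_cast_ok = False

as_bytes = ints.cast('B')        # first to unsigned bytes ...
as_floats = as_bytes.cast('f')   # ... then to the target format
```

No data is copied at any step; all three views share the buffer of the
original array, and the float view merely reinterprets the same bytes.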

   There are also several readonly attributes available:

   .. attribute:: obj

      The underlying object of the memoryview::

         >>> b = bytearray(b'xyz')
         >>> m = memoryview(b)
         >>> m.obj is b
         True

      .. versionadded:: 3.3

   .. attribute:: nbytes

      ``nbytes == product(shape) * itemsize == len(m.tobytes())``.  This is
      the amount of space in bytes that the array would use in a contiguous
      representation.  It is not necessarily equal to ``len(m)``::

         >>> import array
         >>> a = array.array('i', [1,2,3,4,5])
         >>> m = memoryview(a)
         >>> len(m)
         5
         >>> m.nbytes
         20
         >>> y = m[::2]
         >>> len(y)
         3
         >>> y.nbytes
         12
         >>> len(y.tobytes())
         12

      Multi-dimensional arrays::

         >>> import struct
         >>> buf = struct.pack("d"*12, *[1.5*x for x in range(12)])
         >>> x = memoryview(buf)
         >>> y = x.cast('d', shape=[3,4])
         >>> y.tolist()
         [[0.0, 1.5, 3.0, 4.5], [6.0, 7.5, 9.0, 10.5], [12.0, 13.5, 15.0, 16.5]]
         >>> len(y)
         3
         >>> y.nbytes
         96

      .. versionadded:: 3.3

   .. attribute:: readonly

      A bool indicating whether the memory is read only.

   .. attribute:: format

      A string containing the format (in :mod:`struct` module style) for each
      element in the view.  A memoryview can be created from exporters with
      arbitrary format strings, but some methods (e.g. :meth:`tolist`) are
      restricted to native single element formats.

      .. versionchanged:: 3.3
         format ``'B'`` is now handled according to the struct module syntax.
         This means that ``memoryview(b'abc')[0] == b'abc'[0] == 97``.

   .. attribute:: itemsize

      The size in bytes of each element of the memoryview::

         >>> import array, struct
         >>> m = memoryview(array.array('H', [32000, 32001, 32002]))
         >>> m.itemsize
         2
         >>> m[0]
         32000
         >>> struct.calcsize('H') == m.itemsize
         True

   .. attribute:: ndim

      An integer indicating how many dimensions of a multi-dimensional array the
      memory represents.

   .. attribute:: shape

      A tuple of integers the length of :attr:`ndim` giving the shape of the
      memory as an N-dimensional array.

      .. versionchanged:: 3.3
         An empty tuple instead of ``None`` when ndim = 0.

   .. attribute:: strides

      A tuple of integers the length of :attr:`ndim` giving the size in bytes to
      access each element for each dimension of the array.

      .. versionchanged:: 3.3
         An empty tuple instead of ``None`` when ndim = 0.
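
How ``ndim``, ``shape`` and ``strides`` relate can be sketched with a small
buffer.  The 2x3 layout and the names are arbitrary choices for this example:

```python
flat = memoryview(bytes(12))
table = flat.cast('H', shape=[2, 3])   # two rows of three unsigned shorts
backwards = flat[::-1]                 # same bytes, walked in reverse
```

For the C-contiguous ``table``, moving to the next row skips three items and
moving to the next column skips one, so the strides are expected to be
``(3 * itemsize, itemsize)``.  The reversed view has a negative stride.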

   .. attribute:: suboffsets

      Used internally for PIL-style arrays.  The value is informational only.

   .. attribute:: c_contiguous

      A bool indicating whether the memory is C-:term:`contiguous`.

      .. versionadded:: 3.3

   .. attribute:: f_contiguous

      A bool indicating whether the memory is Fortran :term:`contiguous`.

      .. versionadded:: 3.3

   .. attribute:: contiguous

      A bool indicating whether the memory is :term:`contiguous`.

      .. versionadded:: 3.3


.. _types-set:

Set Types --- :class:`set`, :class:`frozenset`
==============================================

.. index:: object: set

A :dfn:`set` object is an unordered collection of distinct :term:`hashable` objects.
Common uses include membership testing, removing duplicates from a sequence, and
computing mathematical operations such as intersection, union, difference, and
symmetric difference.
(For other containers see the built-in :class:`dict`, :class:`list`,
and :class:`tuple` classes, and the :mod:`collections` module.)

Like other collections, sets support ``x in set``, ``len(set)``, and ``for x in
set``.  Being an unordered collection, sets do not record element position or
order of insertion.  Accordingly, sets do not support indexing, slicing, or
other sequence-like behavior.

There are currently two built-in set types, :class:`set` and :class:`frozenset`.
The :class:`set` type is mutable --- the contents can be changed using methods
like :meth:`~set.add` and :meth:`~set.remove`.  Since it is mutable, it has no
hash value and cannot be used as either a dictionary key or as an element of
another set.  The :class:`frozenset` type is immutable and :term:`hashable` ---
its contents cannot be altered after it is created; it can therefore be used as
a dictionary key or as an element of another set.
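
The practical difference can be sketched as follows (the names are
illustrative only):

```python
inner = frozenset({1, 2})
nested = {inner, frozenset({3})}     # a set whose elements are frozensets
index = {inner: 'pair'}              # a frozenset used as a dictionary key

try:
    {set([1, 2])}                    # a mutable set is not hashable
    mutable_ok = True
except TypeError:
    mutable_ok = False
```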

Non-empty sets (not frozensets) can be created by placing a comma-separated list
of elements within braces, for example: ``{'jack', 'sjoerd'}``, in addition to the
:class:`set` constructor.

The constructors for both classes work the same:

.. class:: set([iterable])
           frozenset([iterable])

   Return a new set or frozenset object whose elements are taken from
   *iterable*.  The elements of a set must be :term:`hashable`.  To
   represent sets of sets, the inner sets must be :class:`frozenset`
   objects.  If *iterable* is not specified, a new empty set is
   returned.

   Instances of :class:`set` and :class:`frozenset` provide the following
   operations:

   .. describe:: len(s)

      Return the number of elements in set *s* (cardinality of *s*).

   .. describe:: x in s

      Test *x* for membership in *s*.

   .. describe:: x not in s

      Test *x* for non-membership in *s*.

   .. method:: isdisjoint(other)

      Return ``True`` if the set has no elements in common with *other*.  Sets are
      disjoint if and only if their intersection is the empty set.

   .. method:: issubset(other)
               set <= other

      Test whether every element in the set is in *other*.

   .. method:: set < other

      Test whether the set is a proper subset of *other*, that is,
      ``set <= other and set != other``.

   .. method:: issuperset(other)
               set >= other

      Test whether every element in *other* is in the set.

   .. method:: set > other

      Test whether the set is a proper superset of *other*, that is, ``set >=
      other and set != other``.

   .. method:: union(*others)
               set | other | ...

      Return a new set with elements from the set and all others.

   .. method:: intersection(*others)
               set & other & ...

      Return a new set with elements common to the set and all others.

   .. method:: difference(*others)
               set - other - ...

      Return a new set with elements in the set that are not in the others.

   .. method:: symmetric_difference(other)
               set ^ other

      Return a new set with elements in either the set or *other* but not both.

   .. method:: copy()

      Return a shallow copy of the set.


   Note, the non-operator versions of the :meth:`union`, :meth:`intersection`,
   :meth:`difference`, :meth:`symmetric_difference`, :meth:`issubset`, and
   :meth:`issuperset` methods will accept any iterable as an argument.  In
   contrast, their operator based counterparts require their arguments to be
   sets.  This precludes error-prone constructions like ``set('abc') & 'cbs'``
   in favor of the more readable ``set('abc').intersection('cbs')``.
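
The contrast between the methods and the operators, sketched with small sample
values:

```python
# Methods accept any iterable, and several of them at once where allowed.
common = set('abc').intersection('cbs')
is_sub = {1, 2}.issubset(range(5))
merged = {1}.union([2, 3], (4,), 'x')

# The operator form insists on a set (or frozenset) on both sides.
try:
    set('abc') & 'cbs'
    operator_ok = True
except TypeError:
    operator_ok = False
```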

   Both :class:`set` and :class:`frozenset` support set to set comparisons.  Two
   sets are equal if and only if every element of each set is contained in the
   other (each is a subset of the other).  A set is less than another set if and
   only if the first set is a proper subset of the second set (is a subset, but
   is not equal).  A set is greater than another set if and only if the first set
   is a proper superset of the second set (is a superset, but is not equal).

   Instances of :class:`set` are compared to instances of :class:`frozenset`
   based on their members.  For example, ``set('abc') == frozenset('abc')``
   returns ``True`` and so does ``set('abc') in set([frozenset('abc')])``.

   The subset and equality comparisons do not generalize to a total ordering
   function.  For example, any two nonempty disjoint sets are not equal and are not
   subsets of each other, so *all* of the following return ``False``: ``a<b``,
   ``a==b``, or ``a>b``.
Georg Brandl116aa622007-08-15 14:28:22 +00003922
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003923 Since sets only define partial ordering (subset relationships), the output of
3924 the :meth:`list.sort` method is undefined for lists of sets.
Georg Brandl116aa622007-08-15 14:28:22 +00003925
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003926 Set elements, like dictionary keys, must be :term:`hashable`.
Georg Brandl116aa622007-08-15 14:28:22 +00003927
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003928 Binary operations that mix :class:`set` instances with :class:`frozenset`
3929 return the type of the first operand. For example: ``frozenset('ab') |
3930 set('bc')`` returns an instance of :class:`frozenset`.
Georg Brandl116aa622007-08-15 14:28:22 +00003931
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003932 The following table lists operations available for :class:`set` that do not
3933 apply to immutable instances of :class:`frozenset`:
Georg Brandl116aa622007-08-15 14:28:22 +00003934
Raymond Hettingera33e9f72016-09-12 23:38:50 -07003935 .. method:: update(*others)
Georg Brandlc28e1fa2008-06-10 19:20:26 +00003936 set |= other | ...
Georg Brandl116aa622007-08-15 14:28:22 +00003937
Georg Brandla6053b42009-09-01 08:11:14 +00003938 Update the set, adding elements from all others.
Georg Brandl116aa622007-08-15 14:28:22 +00003939
Raymond Hettingera33e9f72016-09-12 23:38:50 -07003940 .. method:: intersection_update(*others)
Georg Brandlc28e1fa2008-06-10 19:20:26 +00003941 set &= other & ...
Georg Brandl116aa622007-08-15 14:28:22 +00003942
Georg Brandla6053b42009-09-01 08:11:14 +00003943 Update the set, keeping only elements found in it and all others.
Georg Brandl116aa622007-08-15 14:28:22 +00003944
Raymond Hettingera33e9f72016-09-12 23:38:50 -07003945 .. method:: difference_update(*others)
Amaury Forgeot d'Arcfdfe62d2008-06-17 20:36:03 +00003946 set -= other | ...
Georg Brandl116aa622007-08-15 14:28:22 +00003947
Amaury Forgeot d'Arcfdfe62d2008-06-17 20:36:03 +00003948 Update the set, removing elements found in others.
3949
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003950 .. method:: symmetric_difference_update(other)
3951 set ^= other
Georg Brandl116aa622007-08-15 14:28:22 +00003952
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003953 Update the set, keeping only elements found in either set, but not in both.
Georg Brandl116aa622007-08-15 14:28:22 +00003954
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003955 .. method:: add(elem)
Georg Brandl116aa622007-08-15 14:28:22 +00003956
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003957 Add element *elem* to the set.
Georg Brandl116aa622007-08-15 14:28:22 +00003958
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003959 .. method:: remove(elem)
Georg Brandl116aa622007-08-15 14:28:22 +00003960
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003961 Remove element *elem* from the set. Raises :exc:`KeyError` if *elem* is
3962 not contained in the set.
Georg Brandl116aa622007-08-15 14:28:22 +00003963
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003964 .. method:: discard(elem)
Georg Brandl116aa622007-08-15 14:28:22 +00003965
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003966 Remove element *elem* from the set if it is present.
Georg Brandl116aa622007-08-15 14:28:22 +00003967
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003968 .. method:: pop()
Georg Brandl116aa622007-08-15 14:28:22 +00003969
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003970 Remove and return an arbitrary element from the set. Raises
3971 :exc:`KeyError` if the set is empty.
Georg Brandl116aa622007-08-15 14:28:22 +00003972
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003973 .. method:: clear()
Georg Brandl116aa622007-08-15 14:28:22 +00003974
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003975 Remove all elements from the set.
Georg Brandl116aa622007-08-15 14:28:22 +00003976
3977
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003978 Note, the non-operator versions of the :meth:`update`,
3979 :meth:`intersection_update`, :meth:`difference_update`, and
3980 :meth:`symmetric_difference_update` methods will accept any iterable as an
3981 argument.
Georg Brandl116aa622007-08-15 14:28:22 +00003982
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003983 Note, the *elem* argument to the :meth:`__contains__`, :meth:`remove`, and
3984 :meth:`discard` methods may be a set. To support searching for an equivalent
3985 frozenset, the *elem* set is temporarily mutated during the search and then
3986 restored. During the search, the *elem* set should not be read or mutated
3987 since it does not have a meaningful value.
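
The rules above for operators, comparisons and in-place methods can be
sketched as follows. This is only an illustration; the variable names are
made up for the example and are not part of any API.

```python
# Operators require sets on both sides; the named methods accept any iterable.
vowels = set('aeiou')
common = vowels.intersection('audio')        # a str is accepted here
try:
    vowels & 'audio'                          # the operator form rejects a str
    operator_accepts_str = True
except TypeError:
    operator_accepts_str = False

# Comparisons test subset relationships, so they give only a partial order.
a, b = {1, 2}, {3, 4}                         # nonempty and disjoint
disjoint_results = (a < b, a == b, a > b)     # all three are False

# set and frozenset compare by members; a mixed binary operation returns
# the type of its first operand.
same_members = set('abc') == frozenset('abc')
mixed = frozenset('ab') | set('bc')

# In-place methods mutate the set. discard() ignores a missing element,
# whereas remove() raises KeyError.
s = {1, 2, 3}
s.update([3, 4], (5,))
s.discard(99)
try:
    s.remove(99)
    remove_raised = False
except KeyError:
    remove_raised = True
```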


.. _typesmapping:

Mapping Types --- :class:`dict`
===============================

.. index::
   object: mapping
   object: dictionary
   triple: operations on; mapping; types
   triple: operations on; dictionary; type
   statement: del
   builtin: len

A :term:`mapping` object maps :term:`hashable` values to arbitrary objects.
Mappings are mutable objects. There is currently only one standard mapping
type, the :dfn:`dictionary`. (For other containers see the built-in
:class:`list`, :class:`set`, and :class:`tuple` classes, and the
:mod:`collections` module.)

A dictionary's keys are *almost* arbitrary values. Values that are not
:term:`hashable`, that is, values containing lists, dictionaries or other
mutable types (that are compared by value rather than by object identity) may
not be used as keys. Numeric types used for keys obey the normal rules for
numeric comparison: if two numbers compare equal (such as ``1`` and ``1.0``)
then they can be used interchangeably to index the same dictionary entry. (Note
however, that since computers store floating-point numbers as approximations it
is usually unwise to use them as dictionary keys.)
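
As a small illustration of the key rules just described (the names ``d``
and ``tuple_key`` are arbitrary), equal numbers address the same entry,
while an unhashable object such as a list is rejected as a key:

```python
d = {1: 'one'}
same_entry = d[1.0]           # 1 == 1.0, so both index the same entry
d[1.0] = 'uno'                # overwrites that entry instead of adding one

try:
    d[[1, 2]] = 'list key'    # lists are mutable and therefore unhashable
    list_key_ok = True
except TypeError:
    list_key_ok = False

tuple_key = {(1, 2): 'tuple key'}   # a tuple of hashable items is hashable
```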

Dictionaries can be created by placing a comma-separated list of ``key: value``
pairs within braces, for example: ``{'jack': 4098, 'sjoerd': 4127}`` or ``{4098:
'jack', 4127: 'sjoerd'}``, or by the :class:`dict` constructor.

.. class:: dict(**kwarg)
           dict(mapping, **kwarg)
           dict(iterable, **kwarg)

   Return a new dictionary initialized from an optional positional argument
   and a possibly empty set of keyword arguments.

   If no positional argument is given, an empty dictionary is created.
   If a positional argument is given and it is a mapping object, a dictionary
   is created with the same key-value pairs as the mapping object. Otherwise,
   the positional argument must be an :term:`iterable` object. Each item in
   the iterable must itself be an iterable with exactly two objects. The
   first object of each item becomes a key in the new dictionary, and the
   second object the corresponding value. If a key occurs more than once, the
   last value for that key becomes the corresponding value in the new
   dictionary.

   If keyword arguments are given, the keyword arguments and their values are
   added to the dictionary created from the positional argument. If a key
   being added is already present, the value from the keyword argument
   replaces the value from the positional argument.

   To illustrate, the following examples all return a dictionary equal to
   ``{"one": 1, "two": 2, "three": 3}``::

      >>> a = dict(one=1, two=2, three=3)
      >>> b = {'one': 1, 'two': 2, 'three': 3}
      >>> c = dict(zip(['one', 'two', 'three'], [1, 2, 3]))
      >>> d = dict([('two', 2), ('one', 1), ('three', 3)])
      >>> e = dict({'three': 3, 'one': 1, 'two': 2})
      >>> a == b == c == d == e
      True

   Providing keyword arguments as in the first example only works for keys that
   are valid Python identifiers. Otherwise, any valid keys can be used.


   These are the operations that dictionaries support (and therefore, custom
   mapping types should support too):

   .. describe:: len(d)

      Return the number of items in the dictionary *d*.

   .. describe:: d[key]

      Return the item of *d* with key *key*. Raises a :exc:`KeyError` if *key* is
      not in the map.

      .. index:: __missing__()

      If a subclass of dict defines a method :meth:`__missing__` and *key*
      is not present, the ``d[key]`` operation calls that method with the key *key*
      as argument. The ``d[key]`` operation then returns or raises whatever is
      returned or raised by the ``__missing__(key)`` call.
      No other operations or methods invoke :meth:`__missing__`. If
      :meth:`__missing__` is not defined, :exc:`KeyError` is raised.
      :meth:`__missing__` must be a method; it cannot be an instance variable::

          >>> class Counter(dict):
          ...     def __missing__(self, key):
          ...         return 0
          ...
          >>> c = Counter()
          >>> c['red']
          0
          >>> c['red'] += 1
          >>> c['red']
          1

      The example above shows part of the implementation of
      :class:`collections.Counter`. A different ``__missing__`` method is used
      by :class:`collections.defaultdict`.

   .. describe:: d[key] = value

      Set ``d[key]`` to *value*.

   .. describe:: del d[key]

      Remove ``d[key]`` from *d*. Raises a :exc:`KeyError` if *key* is not in the
      map.

   .. describe:: key in d

      Return ``True`` if *d* has a key *key*, else ``False``.

   .. describe:: key not in d

      Equivalent to ``not key in d``.

   .. describe:: iter(d)

      Return an iterator over the keys of the dictionary. This is a shortcut
      for ``iter(d.keys())``.

   .. method:: clear()

      Remove all items from the dictionary.

   .. method:: copy()

      Return a shallow copy of the dictionary.

   .. classmethod:: fromkeys(seq[, value])

      Create a new dictionary with keys from *seq* and values set to *value*.

      :meth:`fromkeys` is a class method that returns a new dictionary. *value*
      defaults to ``None``.

   .. method:: get(key[, default])

      Return the value for *key* if *key* is in the dictionary, else *default*.
      If *default* is not given, it defaults to ``None``, so that this method
      never raises a :exc:`KeyError`.

   .. method:: items()

      Return a new view of the dictionary's items (``(key, value)`` pairs).
      See the :ref:`documentation of view objects <dict-views>`.

   .. method:: keys()

      Return a new view of the dictionary's keys. See the :ref:`documentation
      of view objects <dict-views>`.

   .. method:: pop(key[, default])

      If *key* is in the dictionary, remove it and return its value, else return
      *default*. If *default* is not given and *key* is not in the dictionary,
      a :exc:`KeyError` is raised.

   .. method:: popitem()

      Remove and return an arbitrary ``(key, value)`` pair from the dictionary.

      :meth:`popitem` is useful to destructively iterate over a dictionary, as
      often used in set algorithms. If the dictionary is empty, calling
      :meth:`popitem` raises a :exc:`KeyError`.

   .. method:: setdefault(key[, default])

      If *key* is in the dictionary, return its value. If not, insert *key*
      with a value of *default* and return *default*. *default* defaults to
      ``None``.

   .. method:: update([other])

      Update the dictionary with the key/value pairs from *other*, overwriting
      existing keys. Return ``None``.

      :meth:`update` accepts either another dictionary object or an iterable of
      key/value pairs (as tuples or other iterables of length two). If keyword
      arguments are specified, the dictionary is then updated with those
      key/value pairs: ``d.update(red=1, blue=2)``.

   .. method:: values()

      Return a new view of the dictionary's values. See the
      :ref:`documentation of view objects <dict-views>`.

   Dictionaries compare equal if and only if they have the same ``(key,
   value)`` pairs. Order comparisons ('<', '<=', '>=', '>') raise
   :exc:`TypeError`.

.. seealso::
   :class:`types.MappingProxyType` can be used to create a read-only view
   of a :class:`dict`.
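
The lookup and update methods described above differ mainly in how they
treat a missing key. The following sketch, using a made-up ``stock``
dictionary, shows those differences side by side:

```python
stock = {'apple': 3}

missing = stock.get('pear')                # None, and no KeyError
fallback = stock.get('pear', 0)            # explicit default; nothing is stored
existing = stock.setdefault('apple', 10)   # key present: value left unchanged
inserted = stock.setdefault('pear', 0)     # key absent: stored, then returned

# update() accepts a mapping or an iterable of pairs, plus keyword arguments.
stock.update({'apple': 4}, kiwi=7)
stock.update([('fig', 1)])

removed = stock.pop('fig')                 # returns the removed value
absent = stock.pop('fig', None)            # the default avoids KeyError
try:
    stock.pop('fig')                       # no default and no such key
    pop_raised = False
except KeyError:
    pop_raised = True
```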


.. _dict-views:

Dictionary view objects
-----------------------

The objects returned by :meth:`dict.keys`, :meth:`dict.values` and
:meth:`dict.items` are *view objects*. They provide a dynamic view on the
dictionary's entries, which means that when the dictionary changes, the view
reflects these changes.

Dictionary views can be iterated over to yield their respective data, and
support membership tests:

.. describe:: len(dictview)

   Return the number of entries in the dictionary.

.. describe:: iter(dictview)

   Return an iterator over the keys, values or items (represented as tuples of
   ``(key, value)``) in the dictionary.

   Keys and values are iterated over in an arbitrary order which is non-random,
   varies across Python implementations, and depends on the dictionary's history
   of insertions and deletions. If keys, values and items views are iterated
   over with no intervening modifications to the dictionary, the order of items
   will directly correspond. This allows the creation of ``(value, key)`` pairs
   using :func:`zip`: ``pairs = zip(d.values(), d.keys())``. Another way to
   create the same pairs, as a list, is
   ``pairs = [(v, k) for (k, v) in d.items()]``.

   Iterating views while adding or deleting entries in the dictionary may raise
   a :exc:`RuntimeError` or fail to iterate over all entries.

.. describe:: x in dictview

   Return ``True`` if *x* is in the underlying dictionary's keys, values or
   items (in the latter case, *x* should be a ``(key, value)`` tuple).


Keys views are set-like since their entries are unique and hashable. If all
values are hashable, so that ``(key, value)`` pairs are unique and hashable,
then the items view is also set-like. (Values views are not treated as set-like
since the entries are generally not unique.) For set-like views, all of the
operations defined for the abstract base class :class:`collections.abc.Set` are
available (for example, ``==``, ``<``, or ``^``).

An example of dictionary view usage::

   >>> dishes = {'eggs': 2, 'sausage': 1, 'bacon': 1, 'spam': 500}
   >>> keys = dishes.keys()
   >>> values = dishes.values()

   >>> # iteration
   >>> n = 0
   >>> for val in values:
   ...     n += val
   >>> print(n)
   504

   >>> # keys and values are iterated over in the same order
   >>> list(keys)
   ['eggs', 'bacon', 'sausage', 'spam']
   >>> list(values)
   [2, 1, 1, 500]

   >>> # view objects are dynamic and reflect dict changes
   >>> del dishes['eggs']
   >>> del dishes['sausage']
   >>> list(keys)
   ['bacon', 'spam']

   >>> # set operations
   >>> keys & {'eggs', 'bacon', 'salad'}
   {'bacon'}
   >>> keys ^ {'sausage', 'juice'}
   {'juice', 'sausage', 'bacon', 'spam'}
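
Items views support the same set operations when every value is hashable.
As a hedged sketch, two hypothetical configuration dictionaries, ``old``
and ``new``, can be compared without building intermediate sets by hand:

```python
old = {'host': 'localhost', 'port': 8080, 'debug': True}
new = {'host': 'localhost', 'port': 9090, 'tls': True}

# Items views are set-like here because every value is hashable.
unchanged = dict(old.items() & new.items())
added_or_changed = dict(new.items() - old.items())
removed_keys = old.keys() - new.keys()     # an ordinary set

# A view is dynamic: it reflects later changes to its dictionary.
view = old.keys()
old['tls'] = False
tls_visible = 'tls' in view
```

The results are built with :class:`dict` and compared as sets, so the
sketch does not depend on iteration order.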


.. _typecontextmanager:

Context Manager Types
=====================

.. index::
   single: context manager
   single: context management protocol
   single: protocol; context management

Python's :keyword:`with` statement supports the concept of a runtime context
defined by a context manager. This is implemented using a pair of methods
that allow user-defined classes to define a runtime context that is entered
before the statement body is executed and exited when the statement ends:


.. method:: contextmanager.__enter__()

   Enter the runtime context and return either this object or another object
   related to the runtime context. The value returned by this method is bound to
   the identifier in the :keyword:`as` clause of :keyword:`with` statements using
   this context manager.

   An example of a context manager that returns itself is a :term:`file object`.
   File objects return themselves from :meth:`__enter__` to allow :func:`open` to
   be used as the context expression in a :keyword:`with` statement.

   An example of a context manager that returns a related object is the one
   returned by :func:`decimal.localcontext`. These managers set the active
   decimal context to a copy of the original decimal context and then return the
   copy. This allows changes to be made to the current decimal context in the body
   of the :keyword:`with` statement without affecting code outside the
   :keyword:`with` statement.


.. method:: contextmanager.__exit__(exc_type, exc_val, exc_tb)

   Exit the runtime context and return a Boolean flag indicating if any exception
   that occurred should be suppressed. If an exception occurred while executing the
   body of the :keyword:`with` statement, the arguments contain the exception type,
   value and traceback information. Otherwise, all three arguments are ``None``.

   Returning a true value from this method will cause the :keyword:`with` statement
   to suppress the exception and continue execution with the statement immediately
   following the :keyword:`with` statement. Otherwise the exception continues
   propagating after this method has finished executing. Exceptions that occur
   during execution of this method will replace any exception that occurred in the
   body of the :keyword:`with` statement.

   The exception passed in should never be reraised explicitly; instead, this
   method should return a false value to indicate that the method completed
   successfully and does not want to suppress the raised exception. This allows
   context management code to easily detect whether or not an :meth:`__exit__`
   method has actually failed.
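
A minimal sketch of the two methods working together is shown below. The
``Suppress`` class is hypothetical and written only for this illustration;
the standard library offers :func:`contextlib.suppress` for real use.

```python
class Suppress:
    """Hypothetical context manager that swallows one exception type."""

    def __init__(self, exc_type):
        self.exc_type = exc_type
        self.caught = None

    def __enter__(self):
        # The return value is bound to the name after ``as``.
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # All three arguments are None when the body finished normally.
        if exc_type is not None and issubclass(exc_type, self.exc_type):
            self.caught = exc_val
            return True       # a true value suppresses the exception
        return False          # a false value lets any exception propagate


with Suppress(ZeroDivisionError) as guard:
    1 / 0                     # suppressed; execution resumes after the block

propagated = False
try:
    with Suppress(ZeroDivisionError):
        raise ValueError('not suppressed')
except ValueError:
    propagated = True
```

Note that ``__exit__`` never re-raises the exception it receives; it only
reports, through its return value, whether the exception should be
suppressed.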

Python defines several context managers to support easy thread synchronisation,
prompt closure of files or other objects, and simpler manipulation of the active
decimal arithmetic context. The specific types are not treated specially beyond
their implementation of the context management protocol. See the
:mod:`contextlib` module for some examples.

Python's :term:`generator`\s and the :class:`contextlib.contextmanager` decorator
provide a convenient way to implement these protocols. If a generator function is
decorated with the :class:`contextlib.contextmanager` decorator, it will return a
context manager implementing the necessary :meth:`__enter__` and
:meth:`__exit__` methods, rather than the iterator produced by an undecorated
generator function.
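
For instance, a generator-based context manager might look like the
following sketch. The ``tracked`` function and the ``events`` list are
illustrative names only.

```python
from contextlib import contextmanager

events = []

@contextmanager
def tracked(name):
    # Code before ``yield`` plays the role of __enter__ ...
    events.append('enter ' + name)
    try:
        yield name             # ... the yielded value is bound by ``as`` ...
    finally:
        # ... and code after it plays the role of __exit__.
        events.append('exit ' + name)

with tracked('job') as label:
    events.append('body ' + label)

manager = tracked('other')     # a context manager, not a plain iterator
```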

Note that there is no specific slot for any of these methods in the type
structure for Python objects in the Python/C API. Extension types wanting to
define these methods must provide them as a normal Python accessible method.
Compared to the overhead of setting up the runtime context, the overhead of a
single class dictionary lookup is negligible.


.. _typesother:

Other Built-in Types
====================

The interpreter supports several other kinds of objects. Most of these support
only one or two operations.


.. _typesmodules:

Modules
-------

The only special operation on a module is attribute access: ``m.name``, where
*m* is a module and *name* accesses a name defined in *m*'s symbol table.
Module attributes can be assigned to. (Note that the :keyword:`import`
statement is not, strictly speaking, an operation on a module object; ``import
foo`` does not require a module object named *foo* to exist, rather it requires
an (external) *definition* for a module named *foo* somewhere.)

A special attribute of every module is :attr:`~object.__dict__`. This is the
dictionary containing the module's symbol table. Modifying this dictionary will
actually change the module's symbol table, but direct assignment to the
:attr:`~object.__dict__` attribute is not possible (you can write
``m.__dict__['a'] = 1``, which defines ``m.a`` to be ``1``, but you can't write
``m.__dict__ = {}``). Modifying :attr:`~object.__dict__` directly is
not recommended.
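
The behaviour can be observed on a throwaway module object. In this sketch
the module ``demo`` is created with :class:`types.ModuleType` purely for
illustration, and the failed assignment is assumed to raise
:exc:`AttributeError`, as it does in CPython:

```python
import types

m = types.ModuleType('demo')      # hypothetical, empty module object
m.__dict__['a'] = 1               # modifies the module's symbol table
value = m.a

try:
    m.__dict__ = {}               # rebinding the attribute itself is rejected
    rebind_ok = True
except AttributeError:
    rebind_ok = False

m.b = 2                           # ordinary attribute assignment is preferred
```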

Modules built into the interpreter are written like this: ``<module 'sys'
(built-in)>``. If loaded from a file, they are written as ``<module 'os' from
'/usr/local/lib/pythonX.Y/os.pyc'>``.


.. _typesobjects:

Classes and Class Instances
---------------------------

See :ref:`objects` and :ref:`class` for these.


.. _typesfunctions:

Functions
---------

Function objects are created by function definitions. The only operation on a
function object is to call it: ``func(argument-list)``.

There are really two flavors of function objects: built-in functions and
user-defined functions. Both support the same operation (to call the function),
but the implementation is different, hence the different object types.
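
The two flavors can be told apart with the names provided by the
:mod:`types` module. A brief sketch, in which ``user_defined`` is an
arbitrary example function:

```python
import types

def user_defined():
    return 'called'

result = user_defined()                        # the one operation: calling
is_python_function = isinstance(user_defined, types.FunctionType)
is_builtin_function = isinstance(len, types.BuiltinFunctionType)
same_type = type(len) is type(user_defined)    # False: different object types
```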

See :ref:`function` for more information.


.. _typesmethods:

Methods
-------

.. index:: object: method

Methods are functions that are called using the attribute notation. There are
two flavors: built-in methods (such as :meth:`append` on lists) and class
instance methods. Built-in methods are described with the types that support
them.

If you access a method (a function defined in a class namespace) through an
instance, you get a special object: a :dfn:`bound method` (also called
:dfn:`instance method`) object. When called, it will add the ``self`` argument
to the argument list. Bound methods have two special read-only attributes:
``m.__self__`` is the object on which the method operates, and ``m.__func__`` is
the function implementing the method. Calling ``m(arg-1, arg-2, ..., arg-n)``
is completely equivalent to calling ``m.__func__(m.__self__, arg-1, arg-2, ...,
arg-n)``.
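
The equivalence can be checked directly. In this sketch ``Greeter`` is a
made-up class used only to obtain a bound method:

```python
class Greeter:
    def greet(self, name):
        return 'hello, ' + name

g = Greeter()
m = g.greet                                   # a bound method object

direct = m('world')
explicit = m.__func__(m.__self__, 'world')    # the same call spelled out
```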

Like function objects, bound method objects support getting arbitrary
attributes. However, since method attributes are actually stored on the
underlying function object (``meth.__func__``), setting method attributes on
bound methods is disallowed. Attempting to set an attribute on a method
results in an :exc:`AttributeError` being raised. In order to set a method
attribute, you need to explicitly set it on the underlying function object::

   >>> class C:
   ...     def method(self):
   ...         pass
   ...
   >>> c = C()
   >>> c.method.whoami = 'my name is method'  # can't set on the method
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   AttributeError: 'method' object has no attribute 'whoami'
   >>> c.method.__func__.whoami = 'my name is method'
   >>> c.method.whoami
   'my name is method'

See :ref:`types` for more information.


.. index:: object; code, code object

.. _bltin-code-objects:

Code Objects
------------

.. index::
   builtin: compile
   single: __code__ (function object attribute)

Code objects are used by the implementation to represent "pseudo-compiled"
executable Python code such as a function body. They differ from function
objects because they don't contain a reference to their global execution
environment. Code objects are returned by the built-in :func:`compile` function
and can be extracted from function objects through their :attr:`__code__`
attribute. See also the :mod:`code` module.

.. index::
   builtin: exec
   builtin: eval

A code object can be executed or evaluated by passing it (instead of a source
string) to the :func:`exec` or :func:`eval` built-in functions.

See :ref:`types` for more information.


.. _bltin-type-objects:

Type Objects
------------

.. index::
   builtin: type
   module: types

Type objects represent the various object types. An object's type is accessed
by the built-in function :func:`type`. There are no special operations on
types. The standard module :mod:`types` defines names for all standard built-in
types.

Types are written like this: ``<class 'int'>``.
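
For instance, a short sketch (the function ``f`` is a placeholder defined only
for this example)::

   import types

   assert type(3) is int                  # type() returns the type object
   assert repr(int) == "<class 'int'>"    # how a type object is written
   assert type(int) is type               # type objects are themselves objects

   def f():
       pass

   # The types module provides names for types that have no built-in name.
   assert type(f) is types.FunctionType
   assert isinstance(len, types.BuiltinFunctionType)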


.. _bltin-null-object:

The Null Object
---------------

This object is returned by functions that don't explicitly return a value. It
supports no special operations. There is exactly one null object, named
``None`` (a built-in name). ``type(None)()`` produces the same singleton.

It is written as ``None``.
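
A small sketch of these statements (``no_return`` is a hypothetical function
used only here)::

   def no_return():
       pass  # no explicit return statement

   result = no_return()

   assert result is None            # the implicit return value
   assert type(None)() is None      # calling the type yields the same singleton
   assert repr(None) == "None"      # how the object is written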


.. _bltin-ellipsis-object:

The Ellipsis Object
-------------------

This object is commonly used by slicing (see :ref:`slicings`). It supports no
special operations. There is exactly one ellipsis object, named
:const:`Ellipsis` (a built-in name). ``type(Ellipsis)()`` produces the
:const:`Ellipsis` singleton.

It is written as ``Ellipsis`` or ``...``.
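
As an illustration, assuming a made-up class ``Show`` whose ``__getitem__``
simply returns the subscript it receives::

   class Show:
       def __getitem__(self, key):
           return key  # hand back whatever was written between the brackets

   assert ... is Ellipsis                   # both spellings name one object
   assert type(Ellipsis)() is Ellipsis      # calling the type yields the singleton
   assert Show()[..., 0] == (Ellipsis, 0)   # Ellipsis inside a subscript
   assert repr(...) == "Ellipsis"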


.. _bltin-notimplemented-object:

The NotImplemented Object
-------------------------

This object is returned from comparisons and binary operations when they are
asked to operate on types they don't support. See :ref:`comparisons` for more
information. There is exactly one ``NotImplemented`` object.
``type(NotImplemented)()`` produces the singleton instance.

It is written as ``NotImplemented``.
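
A sketch of a comparison method that declines unsupported operands; the class
``Metres`` is hypothetical and exists only for this example::

   class Metres:
       def __init__(self, value):
           self.value = value

       def __eq__(self, other):
           if not isinstance(other, Metres):
               return NotImplemented  # signal "this operand type is unsupported"
           return self.value == other.value

   assert Metres(1) == Metres(1)
   # Called directly, the method hands back the singleton itself.
   assert Metres(1).__eq__("1 m") is NotImplemented
   # Through the operator, the interpreter tries the other operand and then
   # falls back to an identity comparison, so the result is False.
   assert (Metres(1) == "1 m") is False
   assert type(NotImplemented)() is NotImplemented

Returning ``NotImplemented`` rather than ``False`` gives the other operand a
chance to handle the comparison.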


.. _bltin-boolean-values:

Boolean Values
--------------

Boolean values are the two constant objects ``False`` and ``True``. They are
used to represent truth values (although other values can also be considered
false or true). In numeric contexts (for example when used as the argument to
an arithmetic operator), they behave like the integers 0 and 1, respectively.
The built-in function :func:`bool` can be used to convert any value to a
Boolean, if the value can be interpreted as a truth value (see section
:ref:`truth` above).

.. index::
   single: False
   single: True
   pair: Boolean; values

They are written as ``False`` and ``True``, respectively.
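
A few checks that sketch the numeric behaviour and the conversion performed by
:func:`bool`; the variable ``flags`` is an arbitrary name for this example::

   flags = [True, False, True]

   assert True + True == 2         # True behaves like the integer 1
   assert False * 10 == 0          # False behaves like the integer 0
   assert sum(flags) == 2          # so summing counts the true entries
   assert bool([]) is False        # an empty sequence is considered false
   assert bool("text") is True     # a non-empty string is considered true
   assert isinstance(True, int)    # bool is a subclass of int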


.. _typesinternal:

Internal Objects
----------------

See :ref:`types` for this information. It describes stack frame objects,
traceback objects, and slice objects.


.. _specialattrs:

Special Attributes
==================

The implementation adds a few special read-only attributes to several object
types, where they are relevant. Some of these are not reported by the
:func:`dir` built-in function.


.. attribute:: object.__dict__

   A dictionary or other mapping object used to store an object's (writable)
   attributes.


.. attribute:: instance.__class__

   The class to which a class instance belongs.


.. attribute:: class.__bases__

   The tuple of base classes of a class object.
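
The three attributes above can be seen together in a small sketch; ``Base``,
``Child`` and ``colour`` are names invented for this illustration::

   class Base:
       pass

   class Child(Base):
       def __init__(self):
           self.colour = "red"  # stored in the instance's __dict__

   c = Child()

   assert c.__dict__ == {"colour": "red"}
   assert c.__class__ is Child
   assert Child.__bases__ == (Base,)
   assert Base.__bases__ == (object,)   # classes inherit from object by default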


.. attribute:: definition.__name__

   The name of the class, function, method, descriptor, or
   generator instance.


.. attribute:: definition.__qualname__

   The :term:`qualified name` of the class, function, method, descriptor,
   or generator instance.

   .. versionadded:: 3.3
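
The difference between the two names shows up for nested definitions. In this
sketch ``Outer``, ``Inner``, ``factory`` and ``local`` are hypothetical names,
and the values in the comments assume the code runs at module level::

   class Outer:
       class Inner:
           def method(self):
               pass

   def factory():
       def local():
           pass
       return local

   # __name__ is the bare name; __qualname__ is the dotted path to the definition.
   inner_name = Outer.Inner.__name__                   # 'Inner'
   inner_qualname = Outer.Inner.__qualname__           # 'Outer.Inner'
   method_qualname = Outer.Inner.method.__qualname__   # 'Outer.Inner.method'
   local_qualname = factory().__qualname__             # 'factory.<locals>.local'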


.. attribute:: class.__mro__

   This attribute is a tuple of classes that are considered when looking for
   base classes during method resolution.


.. method:: class.mro()

   This method can be overridden by a metaclass to customize the method
   resolution order for its instances. It is called at class instantiation, and
   its result is stored in :attr:`~class.__mro__`.
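
A sketch with a diamond-shaped hierarchy, followed by a metaclass that
overrides ``mro()`` only to record that it was called. All class names, and
the ``calls`` list, are invented for this example::

   class A: pass
   class B(A): pass
   class C(A): pass
   class D(B, C): pass

   assert D.__mro__ == (D, B, C, A, object)   # the stored tuple
   assert D.mro() == [D, B, C, A, object]     # mro() computes a list

   class Recording(type):
       calls = []

       def mro(cls):
           order = super().mro()               # keep the default order
           Recording.calls.append(cls.__name__)
           return order

   class E(metaclass=Recording):
       pass

   assert "E" in Recording.calls              # invoked while E was being created
   assert E.__mro__ == (E, object)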


.. method:: class.__subclasses__()

   Each class keeps a list of weak references to its immediate subclasses. This
   method returns a list of all those references still alive.
   Example::

      >>> int.__subclasses__()
      [<class 'bool'>]


.. rubric:: Footnotes

.. [1] Additional information on these special methods may be found in the Python
   Reference Manual (:ref:`customization`).

.. [2] As a consequence, the list ``[1, 2]`` is considered equal to ``[1.0, 2.0]``, and
   similarly for tuples.

.. [3] They must have since the parser can't tell the type of the operands.

.. [4] Cased characters are those with general category property being one of
   "Lu" (Letter, uppercase), "Ll" (Letter, lowercase), or "Lt" (Letter, titlecase).

.. [5] To format only a tuple you should therefore provide a singleton tuple whose only
   element is the tuple to be formatted.