.. XXX: reference/datamodel and this have quite a few overlaps!


.. _bltin-types:

**************
Built-in Types
**************

The following sections describe the standard types that are built into the
interpreter.

.. index:: pair: built-in; types

The principal built-in types are numerics, sequences, mappings, classes,
instances and exceptions.

Some collection classes are mutable. The methods that add, subtract, or
rearrange their members in place, and don't return a specific item, never return
the collection instance itself but ``None``.

Some operations are supported by several object types; in particular,
practically all objects can be compared, tested for truth value, and converted
to a string (with the :func:`repr` function or the slightly different
:func:`str` function). The latter function is implicitly used when an object is
written by the :func:`print` function.


.. _truth:

Truth Value Testing
===================

.. index::
   statement: if
   statement: while
   pair: truth; value
   pair: Boolean; operations
   single: false

Any object can be tested for truth value, for use in an :keyword:`if` or
:keyword:`while` condition or as operand of the Boolean operations below. The
following values are considered false:

  .. index:: single: None (Built-in object)

* ``None``

  .. index:: single: False (Built-in object)

* ``False``

* zero of any numeric type, for example, ``0``, ``0.0``, ``0j``.

* any empty sequence, for example, ``''``, ``()``, ``[]``.

* any empty mapping, for example, ``{}``.

* instances of user-defined classes, if the class defines a :meth:`__bool__` or
  :meth:`__len__` method, when that method returns the integer zero or
  :class:`bool` value ``False``. [1]_

.. index:: single: true

All other values are considered true --- so objects of many types are always
true.

.. index::
   operator: or
   operator: and
   single: False
   single: True

Operations and built-in functions that have a Boolean result always return ``0``
or ``False`` for false and ``1`` or ``True`` for true, unless otherwise stated.
(Important exception: the Boolean operations ``or`` and ``and`` always return
one of their operands.)
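
As a minimal sketch of the rules above, the following applies :func:`bool` to
one value from each category. The ``Basket`` class is a hypothetical example
introduced only for illustration::

   class Basket:
       """Hypothetical container whose truth value follows its length."""

       def __init__(self, items):
           self.items = list(items)

       def __len__(self):
           return len(self.items)

   falsy = [None, False, 0, 0.0, 0j, '', (), [], {}, Basket([])]
   truthy = [1, -0.5, 'a', (0,), [None], {'k': 0}, Basket(['x']), object()]

   # Every value in the first list tests false, every value in the second true.
   all_false = not any(bool(value) for value in falsy)
   all_true = all(bool(value) for value in truthy)

Note that ``(0,)`` and ``[None]`` are true: only the emptiness of the
container matters, not the truth value of its members.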


.. _boolean:

Boolean Operations --- :keyword:`and`, :keyword:`or`, :keyword:`not`
====================================================================

.. index:: pair: Boolean; operations

These are the Boolean operations, ordered by ascending priority:

+-------------+---------------------------------+-------+
| Operation   | Result                          | Notes |
+=============+=================================+=======+
| ``x or y``  | if *x* is false, then *y*, else | \(1)  |
|             | *x*                             |       |
+-------------+---------------------------------+-------+
| ``x and y`` | if *x* is false, then *x*, else | \(2)  |
|             | *y*                             |       |
+-------------+---------------------------------+-------+
| ``not x``   | if *x* is false, then ``True``, | \(3)  |
|             | else ``False``                  |       |
+-------------+---------------------------------+-------+

.. index::
   operator: and
   operator: or
   operator: not

Notes:

(1)
   This is a short-circuit operator, so it only evaluates the second
   argument if the first one is false.

(2)
   This is a short-circuit operator, so it only evaluates the second
   argument if the first one is true.

(3)
   ``not`` has a lower priority than non-Boolean operators, so ``not a == b`` is
   interpreted as ``not (a == b)``, and ``a == not b`` is a syntax error.
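
The short-circuit behavior described in the notes can be observed with a small
helper that records which operands were evaluated. This is only an
illustrative sketch; ``probe`` and ``calls`` are hypothetical names::

   calls = []

   def probe(name, value):
       """Record that an operand was evaluated, then return it unchanged."""
       calls.append(name)
       return value

   # The left operand is false, so ``or`` evaluates and returns the right one.
   first = probe('a', 0) or probe('b', 'fallback')

   # The left operand is false, so ``and`` returns it without evaluating 'd'.
   second = probe('c', []) and probe('d', 'unused')

   # Parsed as ``not (1 == 2)``.
   third = not 1 == 2

After this runs, ``calls`` holds ``'a'``, ``'b'`` and ``'c'`` only, and both
``first`` and ``second`` are operands rather than ``True`` or ``False``.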


.. _stdcomparisons:

Comparisons
===========

.. index::
   pair: chaining; comparisons
   pair: operator; comparison
   operator: ==
   operator: <
   operator: <=
   operator: >
   operator: >=
   operator: !=
   operator: is
   operator: is not

There are eight comparison operations in Python. They all have the same
priority (which is higher than that of the Boolean operations). Comparisons can
be chained arbitrarily; for example, ``x < y <= z`` is equivalent to ``x < y and
y <= z``, except that *y* is evaluated only once (but in both cases *z* is not
evaluated at all when ``x < y`` is found to be false).
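
The evaluation order of a chained comparison can be made visible with a helper
that logs each operand. This is a sketch for illustration; ``tap`` and
``evaluated`` are hypothetical names::

   evaluated = []

   def tap(label, value):
       """Record the label of each operand as it is evaluated."""
       evaluated.append(label)
       return value

   # 1 < 5 is true, so the third operand is evaluated and 5 <= 3 decides.
   first = tap('x', 1) < tap('y', 5) <= tap('z', 3)

   # 9 < 2 is false, so the operand labelled 'c' is never evaluated.
   second = tap('a', 9) < tap('b', 2) <= tap('c', 7)

Each middle operand appears once in ``evaluated``, even though it takes part in
two comparisons.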

This table summarizes the comparison operations:

+------------+-------------------------+
| Operation  | Meaning                 |
+============+=========================+
| ``<``      | strictly less than      |
+------------+-------------------------+
| ``<=``     | less than or equal      |
+------------+-------------------------+
| ``>``      | strictly greater than   |
+------------+-------------------------+
| ``>=``     | greater than or equal   |
+------------+-------------------------+
| ``==``     | equal                   |
+------------+-------------------------+
| ``!=``     | not equal               |
+------------+-------------------------+
| ``is``     | object identity         |
+------------+-------------------------+
| ``is not`` | negated object identity |
+------------+-------------------------+

.. index::
   pair: object; numeric
   pair: objects; comparing

Objects of different types, except different numeric types, never compare equal.
Furthermore, some types (for example, function objects) support only a degenerate
notion of comparison where any two objects of that type are unequal. The ``<``,
``<=``, ``>`` and ``>=`` operators will raise a :exc:`TypeError` exception when
comparing a complex number with another built-in numeric type, when the objects
are of different types that cannot be compared, or in other cases where there is
no defined ordering.

.. index::
   single: __eq__() (instance method)
   single: __ne__() (instance method)
   single: __lt__() (instance method)
   single: __le__() (instance method)
   single: __gt__() (instance method)
   single: __ge__() (instance method)

Non-identical instances of a class normally compare as non-equal unless the
class defines the :meth:`__eq__` method.

Instances of a class cannot be ordered with respect to other instances of the
same class, or other types of object, unless the class defines enough of the
methods :meth:`__lt__`, :meth:`__le__`, :meth:`__gt__`, and :meth:`__ge__` (in
general, :meth:`__lt__` and :meth:`__eq__` are sufficient, if you want the
conventional meanings of the comparison operators).
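
One way to obtain all six comparisons from :meth:`__eq__` and :meth:`__lt__`
alone is :func:`functools.total_ordering`, which derives the remaining
ordering methods. The sketch below shows that approach; it is one option
rather than the only one, and the ``Version`` class is hypothetical::

   import functools

   @functools.total_ordering
   class Version:
       """Hypothetical class ordered by its (major, minor) pair."""

       def __init__(self, major, minor):
           self.key = (major, minor)

       def __eq__(self, other):
           if not isinstance(other, Version):
               return NotImplemented
           return self.key == other.key

       def __lt__(self, other):
           if not isinstance(other, Version):
               return NotImplemented
           return self.key < other.key

   old, new = Version(3, 4), Version(3, 5)
   results = (old < new, old <= new, old > new, old >= new, old == Version(3, 4))

   # Ordering against an unrelated type has no defined meaning.
   try:
       old < 1
   except TypeError:
       mixed_raises = True
   else:
       mixed_raises = False

Returning ``NotImplemented`` for foreign types lets Python try the reflected
operation and raise :exc:`TypeError` if neither side can handle the
comparison.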

The behavior of the :keyword:`is` and :keyword:`is not` operators cannot be
customized; also they can be applied to any two objects and never raise an
exception.

.. index::
   operator: in
   operator: not in

Two more operations with the same syntactic priority, :keyword:`in` and
:keyword:`not in`, are supported by types that are iterable or that implement
the :meth:`__contains__` method.


.. _typesnumeric:

Numeric Types --- :class:`int`, :class:`float`, :class:`complex`
================================================================

.. index::
   object: numeric
   object: Boolean
   object: integer
   object: floating point
   object: complex number
   pair: C; language

There are three distinct numeric types: :dfn:`integers`, :dfn:`floating
point numbers`, and :dfn:`complex numbers`. In addition, Booleans are a
subtype of integers. Integers have unlimited precision. Floating point
numbers are usually implemented using :c:type:`double` in C; information
about the precision and internal representation of floating point
numbers for the machine on which your program is running is available
in :data:`sys.float_info`. Complex numbers have a real and imaginary
part, which are each a floating point number. To extract these parts
from a complex number *z*, use ``z.real`` and ``z.imag``. (The standard
library includes additional numeric types: :class:`fractions.Fraction`, which
holds rationals, and :class:`decimal.Decimal`, which holds floating-point
numbers with user-definable precision.)

.. index::
   pair: numeric; literals
   pair: integer; literals
   pair: floating point; literals
   pair: complex number; literals
   pair: hexadecimal; literals
   pair: octal; literals
   pair: binary; literals

Numbers are created by numeric literals or as the result of built-in functions
and operators. Unadorned integer literals (including hex, octal and binary
numbers) yield integers. Numeric literals containing a decimal point or an
exponent sign yield floating point numbers. Appending ``'j'`` or ``'J'`` to a
numeric literal yields an imaginary number (a complex number with a zero real
part) which you can add to an integer or float to get a complex number with real
and imaginary parts.
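
The literal forms mentioned above can be compared side by side. The variable
names in this short sketch are illustrative only::

   # Decimal, hexadecimal, octal and binary spellings of the same integer.
   integers = (255, 0xff, 0o377, 0b11111111)

   # A decimal point or an exponent makes the literal a float.
   floats = (1.0, 1e3, .5)

   # An imaginary literal has a zero real part; adding an int gives both parts.
   imaginary = 2j
   combined = 3 + 2j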

.. index::
   single: arithmetic
   builtin: int
   builtin: float
   builtin: complex
   operator: +
   operator: -
   operator: *
   operator: /
   operator: //
   operator: %
   operator: **

Python fully supports mixed arithmetic: when a binary arithmetic operator has
operands of different numeric types, the operand with the "narrower" type is
widened to that of the other, where integer is narrower than floating point,
which is narrower than complex. Comparisons between numbers of mixed type use
the same rule. [2]_ The constructors :func:`int`, :func:`float`, and
:func:`complex` can be used to produce numbers of a specific type.
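
A brief sketch of the widening rule follows; the variable names are chosen
here for illustration::

   total = 1 + 2.5        # the int is widened, so the sum is a float
   shifted = 1.5 + 2j     # the float is widened, so the sum is a complex
   kinds = (type(total), type(shifted))

   # Mixed-type comparisons follow the same rule.
   same = 1 == 1.0 == (1 + 0j)

   # The constructors produce a number of the requested type.
   converted = (int(2.9), float(3), complex(1, 2))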

All numeric types (except complex) support the following operations, sorted by
ascending priority (all numeric operations have a higher priority than
comparison operations):

+---------------------+---------------------------------+---------+--------------------+
| Operation           | Result                          | Notes   | Full documentation |
+=====================+=================================+=========+====================+
| ``x + y``           | sum of *x* and *y*              |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``x - y``           | difference of *x* and *y*       |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``x * y``           | product of *x* and *y*          |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``x / y``           | quotient of *x* and *y*         |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``x // y``          | floored quotient of *x* and     | \(1)    |                    |
|                     | *y*                             |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``x % y``           | remainder of ``x / y``          | \(2)    |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``-x``              | *x* negated                     |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``+x``              | *x* unchanged                   |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``abs(x)``          | absolute value or magnitude of  |         | :func:`abs`        |
|                     | *x*                             |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``int(x)``          | *x* converted to integer        | \(3)\(6)| :func:`int`        |
+---------------------+---------------------------------+---------+--------------------+
| ``float(x)``        | *x* converted to floating point | \(4)\(6)| :func:`float`      |
+---------------------+---------------------------------+---------+--------------------+
| ``complex(re, im)`` | a complex number with real part | \(6)    | :func:`complex`    |
|                     | *re*, imaginary part *im*.      |         |                    |
|                     | *im* defaults to zero.          |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``c.conjugate()``   | conjugate of the complex number |         |                    |
|                     | *c*                             |         |                    |
+---------------------+---------------------------------+---------+--------------------+
| ``divmod(x, y)``    | the pair ``(x // y, x % y)``    | \(2)    | :func:`divmod`     |
+---------------------+---------------------------------+---------+--------------------+
| ``pow(x, y)``       | *x* to the power *y*            | \(5)    | :func:`pow`        |
+---------------------+---------------------------------+---------+--------------------+
| ``x ** y``          | *x* to the power *y*            | \(5)    |                    |
+---------------------+---------------------------------+---------+--------------------+

.. index::
   triple: operations on; numeric; types
   single: conjugate() (complex number method)

Notes:

(1)
   Also referred to as integer division. The resultant value is a whole
   integer, though the result's type is not necessarily int. The result is
   always rounded towards minus infinity: ``1//2`` is ``0``, ``(-1)//2`` is
   ``-1``, ``1//(-2)`` is ``-1``, and ``(-1)//(-2)`` is ``0``.

(2)
   Not for complex numbers. Instead convert to floats using :func:`abs` if
   appropriate.

(3)
   .. index::
      module: math
      single: floor() (in module math)
      single: ceil() (in module math)
      single: trunc() (in module math)
      pair: numeric; conversions
      pair: C; language

   Conversion from floating point to integer may round or truncate
   as in C; see functions :func:`math.floor` and :func:`math.ceil` for
   well-defined conversions.

(4)
   float also accepts the strings "nan" and "inf" with an optional prefix "+"
   or "-" for Not a Number (NaN) and positive or negative infinity.

(5)
   Python defines ``pow(0, 0)`` and ``0 ** 0`` to be ``1``, as is common for
   programming languages.

(6)
   The numeric literals accepted include the digits ``0`` to ``9`` or any
   Unicode equivalent (code points with the ``Nd`` property).

   See http://www.unicode.org/Public/8.0.0/ucd/extracted/DerivedNumericType.txt
   for a complete list of code points with the ``Nd`` property.
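
Several of the notes above can be checked with a few expressions. This is a
minimal sketch with illustrative variable names; the Arabic-Indic digits are
just one example of code points with the ``Nd`` property::

   # Note (1): floor division rounds towards minus infinity.
   quotients = [1 // 2, (-1) // 2, 1 // (-2), (-1) // (-2)]

   # The result is a whole number, but a float operand gives a float result.
   float_floor = 7.5 // 2

   # divmod() returns the pair (x // y, x % y); the pair recovers x.
   q, r = divmod(-7, 2)
   reconstructed = q * 2 + r

   # Note (5): both spellings of zero to the power zero give 1.
   zero_power = (pow(0, 0), 0 ** 0)

   # Note (6): int() accepts Unicode decimal digits, here ARABIC-INDIC
   # DIGIT THREE followed by ARABIC-INDIC DIGIT FOUR.
   from_unicode = int('\u0663\u0664')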


All :class:`numbers.Real` types (:class:`int` and :class:`float`) also include
the following operations:

+--------------------+------------------------------------+--------+
| Operation          | Result                             | Notes  |
+====================+====================================+========+
| ``math.trunc(x)``  | *x* truncated to Integral          |        |
+--------------------+------------------------------------+--------+
| ``round(x[, n])``  | *x* rounded to n digits,           |        |
|                    | rounding half to even. If n is     |        |
|                    | omitted, it defaults to 0.         |        |
+--------------------+------------------------------------+--------+
| ``math.floor(x)``  | the greatest Integral <= *x*       |        |
+--------------------+------------------------------------+--------+
| ``math.ceil(x)``   | the least Integral >= *x*          |        |
+--------------------+------------------------------------+--------+

For additional numeric operations see the :mod:`math` and :mod:`cmath`
modules.
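
The difference between the rounding operations in the table shows up most
clearly on negative values and on exact halves. A short sketch, with variable
names chosen for illustration::

   import math

   truncated = (math.trunc(2.7), math.trunc(-2.7))   # towards zero
   floored = (math.floor(2.7), math.floor(-2.7))     # towards minus infinity
   ceiled = (math.ceil(2.2), math.ceil(-2.2))        # towards plus infinity

   # Exact halves go to the nearest even integer.
   half_even = (round(0.5), round(1.5), round(2.5))

   # With a second argument, round() keeps that many decimal digits.
   two_places = round(3.14159, 2)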

.. XXXJH exceptions: overflow (when? what operations?) zerodivision


.. _bitstring-ops:

Bitwise Operations on Integer Types
--------------------------------------

.. index::
   triple: operations on; integer; types
   pair: bitwise; operations
   pair: shifting; operations
   pair: masking; operations
   operator: ^
   operator: &
   operator: <<
   operator: >>

Bitwise operations only make sense for integers. Negative numbers are treated
as their 2's complement value (this assumes a sufficiently large number of bits
that no overflow occurs during the operation).

The priorities of the binary bitwise operations are all lower than the numeric
operations and higher than the comparisons; the unary operation ``~`` has the
same priority as the other unary numeric operations (``+`` and ``-``).

This table lists the bitwise operations sorted in ascending priority:

+------------+--------------------------------+----------+
| Operation  | Result                         | Notes    |
+============+================================+==========+
| ``x | y``  | bitwise :dfn:`or` of *x* and   |          |
|            | *y*                            |          |
+------------+--------------------------------+----------+
| ``x ^ y``  | bitwise :dfn:`exclusive or` of |          |
|            | *x* and *y*                    |          |
+------------+--------------------------------+----------+
| ``x & y``  | bitwise :dfn:`and` of *x* and  |          |
|            | *y*                            |          |
+------------+--------------------------------+----------+
| ``x << n`` | *x* shifted left by *n* bits   | (1)(2)   |
+------------+--------------------------------+----------+
| ``x >> n`` | *x* shifted right by *n* bits  | (1)(3)   |
+------------+--------------------------------+----------+
| ``~x``     | the bits of *x* inverted       |          |
+------------+--------------------------------+----------+

Notes:

(1)
   Negative shift counts are illegal and cause a :exc:`ValueError` to be raised.

(2)
   A left shift by *n* bits is equivalent to multiplication by ``pow(2, n)``
   without overflow check.

(3)
   A right shift by *n* bits is equivalent to floor division by ``pow(2, n)``
   without overflow check.
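
The equivalences in the notes can be verified on a negative operand, where the
two's complement treatment matters. This is a small sketch with illustrative
variable names::

   x, n = -37, 3

   # Notes (2) and (3): shifts match multiplication and floor division.
   left_matches = (x << n) == x * pow(2, n)
   right_matches = (x >> n) == x // pow(2, n)

   # Inverting all bits of a two's complement value gives -x - 1.
   inverted = ~x

   # or, exclusive or, and and, applied to the same pair of bit patterns.
   combined = (0b1100 | 0b1010, 0b1100 ^ 0b1010, 0b1100 & 0b1010)

   # Note (1): a negative shift count is rejected.
   try:
       1 << -1
   except ValueError:
       negative_shift_rejected = True
   else:
       negative_shift_rejected = False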


Additional Methods on Integer Types
-----------------------------------

The int type implements the :class:`numbers.Integral` :term:`abstract base
class`. In addition, it provides a few more methods:

.. method:: int.bit_length()

    Return the number of bits necessary to represent an integer in binary,
    excluding the sign and leading zeros::

        >>> n = -37
        >>> bin(n)
        '-0b100101'
        >>> n.bit_length()
        6

    More precisely, if ``x`` is nonzero, then ``x.bit_length()`` is the
    unique positive integer ``k`` such that ``2**(k-1) <= abs(x) < 2**k``.
    Equivalently, when ``abs(x)`` is small enough to have a correctly
    rounded logarithm, then ``k = 1 + int(log(abs(x), 2))``.
    If ``x`` is zero, then ``x.bit_length()`` returns ``0``.

    Equivalent to::

        def bit_length(self):
            s = bin(self)       # binary representation: bin(-37) --> '-0b100101'
            s = s.lstrip('-0b') # remove leading zeros and minus sign
            return len(s)       # len('100101') --> 6

    .. versionadded:: 3.1

.. method:: int.to_bytes(length, byteorder, \*, signed=False)

    Return an array of bytes representing an integer.

        >>> (1024).to_bytes(2, byteorder='big')
        b'\x04\x00'
        >>> (1024).to_bytes(10, byteorder='big')
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00'
        >>> (-1024).to_bytes(10, byteorder='big', signed=True)
        b'\xff\xff\xff\xff\xff\xff\xff\xff\xfc\x00'
        >>> x = 1000
        >>> x.to_bytes((x.bit_length() + 7) // 8, byteorder='little')
        b'\xe8\x03'

    The integer is represented using *length* bytes. An :exc:`OverflowError`
    is raised if the integer is not representable with the given number of
    bytes.

    The *byteorder* argument determines the byte order used to represent the
    integer. If *byteorder* is ``"big"``, the most significant byte is at the
    beginning of the byte array. If *byteorder* is ``"little"``, the most
    significant byte is at the end of the byte array. To request the native
    byte order of the host system, use :data:`sys.byteorder` as the byte order
    value.

    The *signed* argument determines whether two's complement is used to
    represent the integer. If *signed* is ``False`` and a negative integer is
    given, an :exc:`OverflowError` is raised. The default value for *signed*
    is ``False``.

    .. versionadded:: 3.2

.. classmethod:: int.from_bytes(bytes, byteorder, \*, signed=False)

    Return the integer represented by the given array of bytes.

        >>> int.from_bytes(b'\x00\x10', byteorder='big')
        16
        >>> int.from_bytes(b'\x00\x10', byteorder='little')
        4096
        >>> int.from_bytes(b'\xfc\x00', byteorder='big', signed=True)
        -1024
        >>> int.from_bytes(b'\xfc\x00', byteorder='big', signed=False)
        64512
        >>> int.from_bytes([255, 0, 0], byteorder='big')
        16711680

    The argument *bytes* must either be a :term:`bytes-like object` or an
    iterable producing bytes.

    The *byteorder* argument determines the byte order used to represent the
    integer. If *byteorder* is ``"big"``, the most significant byte is at the
    beginning of the byte array. If *byteorder* is ``"little"``, the most
    significant byte is at the end of the byte array. To request the native
    byte order of the host system, use :data:`sys.byteorder` as the byte order
    value.

    The *signed* argument indicates whether two's complement is used to
    represent the integer.

    .. versionadded:: 3.2
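
The two methods undo each other when called with the same *byteorder* and
*signed* values. A minimal sketch follows; the formula used for ``size`` is an
assumption made here to leave room for the sign bit, not something the methods
prescribe::

   import sys

   value = -1024

   # Assumed sizing rule: one extra bit for the sign, rounded up to whole bytes.
   size = (value.bit_length() + 8) // 8

   packed = value.to_bytes(size, byteorder='big', signed=True)
   restored = int.from_bytes(packed, byteorder='big', signed=True)

   # The native byte order of the host is available as sys.byteorder.
   native = (1).to_bytes(2, byteorder=sys.byteorder)

   # 256 needs two bytes, so asking for one raises OverflowError.
   try:
       (256).to_bytes(1, byteorder='big')
   except OverflowError:
       too_small = True
   else:
       too_small = False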


Additional Methods on Float
---------------------------

The float type implements the :class:`numbers.Real` :term:`abstract base
class`. float also has the following additional methods.

.. method:: float.as_integer_ratio()

   Return a pair of integers whose ratio is exactly equal to the
   original float and with a positive denominator. Raises
   :exc:`OverflowError` on infinities and a :exc:`ValueError` on
   NaNs.

.. method:: float.is_integer()

   Return ``True`` if the float instance is finite with integral
   value, and ``False`` otherwise::

      >>> (-2.0).is_integer()
      True
      >>> (3.2).is_integer()
      False
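
Because the ratio returned by :meth:`float.as_integer_ratio` is exact, it also
shows which value a float really stores. The following sketch uses
:class:`fractions.Fraction` for the comparison; the variable names are
illustrative::

   from fractions import Fraction

   # 0.75 is exactly representable in binary, so the ratio is 3/4.
   num, den = (0.75).as_integer_ratio()

   # 0.1 is not, so the stored value differs slightly from 1/10.
   tenth = Fraction(*(0.1).as_integer_ratio())
   is_exact_tenth = tenth == Fraction(1, 10)

   # Infinities are not finite, so is_integer() is false for them.
   checks = ((-2.0).is_integer(), (3.2).is_integer(), float('inf').is_integer())

   # Infinities have no integer ratio.
   try:
       float('inf').as_integer_ratio()
   except OverflowError:
       infinity_rejected = True
   else:
       infinity_rejected = False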

Two methods support conversion to
and from hexadecimal strings. Since Python's floats are stored
internally as binary numbers, converting a float to or from a
*decimal* string usually involves a small rounding error. In
contrast, hexadecimal strings allow exact representation and
specification of floating-point numbers. This can be useful when
debugging, and in numerical work.


.. method:: float.hex()

   Return a representation of a floating-point number as a hexadecimal
   string. For finite floating-point numbers, this representation
   will always include a leading ``0x`` and a trailing ``p`` and
   exponent.


.. classmethod:: float.fromhex(s)

   Class method to return the float represented by a hexadecimal
   string *s*. The string *s* may have leading and trailing
   whitespace.


Note that :meth:`float.hex` is an instance method, while
:meth:`float.fromhex` is a class method.

A hexadecimal string takes the form::

   [sign] ['0x'] integer ['.' fraction] ['p' exponent]

where the optional ``sign`` may be either ``+`` or ``-``, ``integer``
and ``fraction`` are strings of hexadecimal digits, and ``exponent``
is a decimal integer with an optional leading sign. Case is not
significant, and there must be at least one hexadecimal digit in
either the integer or the fraction. This syntax is similar to the
syntax specified in section 6.4.4.2 of the C99 standard, and also to
the syntax used in Java 1.5 onwards. In particular, the output of
:meth:`float.hex` is usable as a hexadecimal floating-point literal in
C or Java code, and hexadecimal strings produced by C's ``%a`` format
character or Java's ``Double.toHexString`` are accepted by
:meth:`float.fromhex`.


Note that the exponent is written in decimal rather than hexadecimal,
and that it gives the power of 2 by which to multiply the coefficient.
For example, the hexadecimal string ``0x3.a7p10`` represents the
floating-point number ``(3 + 10./16 + 7./16**2) * 2.0**10``, or
``3740.0``::

   >>> float.fromhex('0x3.a7p10')
   3740.0


Applying the reverse conversion to ``3740.0`` gives a different
hexadecimal string representing the same number::

   >>> float.hex(3740.0)
   '0x1.d380000000000p+11'
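
Since the hexadecimal form is exact, a float survives a round trip through
:meth:`float.hex` and :meth:`float.fromhex` unchanged. A brief sketch, with
variable names chosen for illustration::

   x = 0.1
   text = x.hex()
   back = float.fromhex(text)
   round_trip_exact = back == x

   # The worked example above, computed digit by digit.
   manual = (3 + 10 / 16 + 7 / 16 ** 2) * 2.0 ** 10

   # Case is not significant and surrounding whitespace is allowed.
   parsed = float.fromhex('  0X3.A7P10 ')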


.. _numeric-hash:

Hashing of numeric types
------------------------

For numbers ``x`` and ``y``, possibly of different types, it's a requirement
that ``hash(x) == hash(y)`` whenever ``x == y`` (see the :meth:`__hash__`
method documentation for more details). For ease of implementation and
efficiency across a variety of numeric types (including :class:`int`,
:class:`float`, :class:`decimal.Decimal` and :class:`fractions.Fraction`)
Python's hash for numeric types is based on a single mathematical function
that's defined for any rational number, and hence applies to all instances of
:class:`int` and :class:`fractions.Fraction`, and all finite instances of
:class:`float` and :class:`decimal.Decimal`. Essentially, this function is
given by reduction modulo ``P`` for a fixed prime ``P``. The value of ``P`` is
made available to Python as the :attr:`modulus` attribute of
:data:`sys.hash_info`.

.. impl-detail::

   Currently, the prime used is ``P = 2**31 - 1`` on machines with 32-bit C
   longs and ``P = 2**61 - 1`` on machines with 64-bit C longs.

Here are the rules in detail:

- If ``x = m / n`` is a nonnegative rational number and ``n`` is not divisible
  by ``P``, define ``hash(x)`` as ``m * invmod(n, P) % P``, where ``invmod(n,
  P)`` gives the inverse of ``n`` modulo ``P``.

- If ``x = m / n`` is a nonnegative rational number and ``n`` is
  divisible by ``P`` (but ``m`` is not) then ``n`` has no inverse
  modulo ``P`` and the rule above doesn't apply; in this case define
  ``hash(x)`` to be the constant value ``sys.hash_info.inf``.

- If ``x = m / n`` is a negative rational number define ``hash(x)``
  as ``-hash(-x)``. If the resulting hash is ``-1``, replace it with
  ``-2``.

- The particular values ``sys.hash_info.inf``, ``-sys.hash_info.inf``
  and ``sys.hash_info.nan`` are used as hash values for positive
  infinity, negative infinity, or nans (respectively). (All hashable
  nans have the same hash value.)

- For a :class:`complex` number ``z``, the hash values of the real
  and imaginary parts are combined by computing ``hash(z.real) +
  sys.hash_info.imag * hash(z.imag)``, reduced modulo
  ``2**sys.hash_info.width`` so that it lies in
  ``range(-2**(sys.hash_info.width - 1), 2**(sys.hash_info.width -
  1))``. Again, if the result is ``-1``, it's replaced with ``-2``.
Mark Dickinsondc787d22010-05-23 13:33:13 +0000670
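As an informal check of these rules, the following sketch uses only the
built-in :func:`hash` and the standard library to confirm that numerically
equal values of different types share one hash value. The variable names are
chosen here for illustration only::

   import sys
   from decimal import Decimal
   from fractions import Fraction

   P = sys.hash_info.modulus

   # Numerically equal values hash equal, whatever their type.
   values = [3, 3.0, Fraction(3, 1), Decimal(3)]
   hashes = {hash(v) for v in values}
   assert len(hashes) == 1

   # A nonnegative integer hashes to its value reduced modulo P.
   assert hash(P + 5) == 5
   assert hash(P) == 0

   # hash(1/2) is the inverse of 2 modulo P, for float and Fraction alike.
   half = pow(2, P - 2, P)
   assert hash(0.5) == hash(Fraction(1, 2)) == half

   # -1 is never returned as a hash value; it is replaced by -2.
   assert hash(-1) == -2

Because the checks are written in terms of ``sys.hash_info.modulus``, they
should hold for either value of ``P``.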

To clarify the above rules, here's some example Python code,
equivalent to the built-in hash, for computing the hash of a rational
number, :class:`float`, or :class:`complex`::

   import sys, math

   def hash_fraction(m, n):
       """Compute the hash of a rational number m / n.

       Assumes m and n are integers, with n positive.
       Equivalent to hash(fractions.Fraction(m, n)).

       """
       P = sys.hash_info.modulus
       # Remove common factors of P.  (Unnecessary if m and n already coprime.)
       while m % P == n % P == 0:
           m, n = m // P, n // P

       if n % P == 0:
           hash_ = sys.hash_info.inf
       else:
           # Fermat's Little Theorem: pow(n, P-1, P) is 1, so
           # pow(n, P-2, P) gives the inverse of n modulo P.
           hash_ = (abs(m) % P) * pow(n, P - 2, P) % P
       if m < 0:
           hash_ = -hash_
       if hash_ == -1:
           hash_ = -2
       return hash_

   def hash_float(x):
       """Compute the hash of a float x."""

       if math.isnan(x):
           return sys.hash_info.nan
       elif math.isinf(x):
           return sys.hash_info.inf if x > 0 else -sys.hash_info.inf
       else:
           return hash_fraction(*x.as_integer_ratio())

   def hash_complex(z):
       """Compute the hash of a complex number z."""

       hash_ = hash_float(z.real) + sys.hash_info.imag * hash_float(z.imag)
       # do a signed reduction modulo 2**sys.hash_info.width
       M = 2**(sys.hash_info.width - 1)
       hash_ = (hash_ & (M - 1)) - (hash_ & M)
       if hash_ == -1:
           hash_ = -2
       return hash_

.. _typeiter:

Iterator Types
==============

.. index::
   single: iterator protocol
   single: protocol; iterator
   single: sequence; iteration
   single: container; iteration over

Python supports a concept of iteration over containers. This is implemented
using two distinct methods; these are used to allow user-defined classes to
support iteration. Sequences, described below in more detail, always support
the iteration methods.

One method needs to be defined for container objects to provide iteration
support:

.. XXX duplicated in reference/datamodel!

.. method:: container.__iter__()

   Return an iterator object. The object is required to support the iterator
   protocol described below. If a container supports different types of
   iteration, additional methods can be provided to specifically request
   iterators for those iteration types. (An example of an object supporting
   multiple forms of iteration would be a tree structure which supports both
   breadth-first and depth-first traversal.) This method corresponds to the
   :c:member:`~PyTypeObject.tp_iter` slot of the type structure for Python
   objects in the Python/C API.

The iterator objects themselves are required to support the following two
methods, which together form the :dfn:`iterator protocol`:


.. method:: iterator.__iter__()

   Return the iterator object itself. This is required to allow both containers
   and iterators to be used with the :keyword:`for` and :keyword:`in` statements.
   This method corresponds to the :c:member:`~PyTypeObject.tp_iter` slot of the
   type structure for Python objects in the Python/C API.


.. method:: iterator.__next__()

   Return the next item from the container. If there are no further items, raise
   the :exc:`StopIteration` exception. This method corresponds to the
   :c:member:`~PyTypeObject.tp_iternext` slot of the type structure for Python
   objects in the Python/C API.

Python defines several iterator objects to support iteration over general and
specific sequence types, dictionaries, and other more specialized forms. The
specific types are not important beyond their implementation of the iterator
protocol.

Once an iterator's :meth:`~iterator.__next__` method raises
:exc:`StopIteration`, it must continue to do so on subsequent calls.
Implementations that do not obey this property are deemed broken.

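The protocol can be sketched with a pair of small classes. ``Countdown`` and
``CountdownIterator`` are hypothetical names used only for this illustration;
the container returns a fresh iterator from :meth:`__iter__`, while the
iterator returns itself and keeps raising :exc:`StopIteration` once
exhausted::

   class Countdown:
       """Hypothetical container that yields n, n-1, ..., 1."""

       def __init__(self, n):
           self.n = n

       def __iter__(self):
           # Return a new iterator on every call so that the container
           # can be iterated over more than once.
           return CountdownIterator(self.n)


   class CountdownIterator:
       """Iterator over a Countdown; implements the iterator protocol."""

       def __init__(self, n):
           self.remaining = n

       def __iter__(self):
           # An iterator returns itself.
           return self

       def __next__(self):
           if self.remaining <= 0:
               # Once exhausted, every later call raises StopIteration too.
               raise StopIteration
           value = self.remaining
           self.remaining -= 1
           return value


   c = Countdown(3)
   first_pass = list(c)     # [3, 2, 1]
   second_pass = list(c)    # [3, 2, 1] again, from a fresh iterator
   it = iter(c)

Separating the container from its iterator is a design choice, not a
requirement; it is what allows ``c`` to be traversed repeatedly.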

.. _generator-types:

Generator Types
---------------

Python's :term:`generator`\s provide a convenient way to implement the iterator
protocol. If a container object's :meth:`__iter__` method is implemented as a
generator, it will automatically return an iterator object (technically, a
generator object) supplying the :meth:`__iter__` and :meth:`~generator.__next__`
methods.
More information about generators can be found in :ref:`the documentation for
the yield expression <yieldexpr>`.

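A minimal sketch of this approach, assuming a hypothetical ``Tree`` container
whose :meth:`__iter__` method is written as a generator performing a
depth-first traversal::

   class Tree:
       """Hypothetical container whose __iter__ is a generator."""

       def __init__(self, value, children=()):
           self.value = value
           self.children = list(children)

       def __iter__(self):
           # Calling this method returns a generator object, which
           # supplies __iter__ and __next__ automatically.
           yield self.value
           for child in self.children:
               yield from child


   tree = Tree(1, [Tree(2, [Tree(3)]), Tree(4)])
   values = list(tree)    # [1, 2, 3, 4]

No separate iterator class is needed here; the generator object plays that
role.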

.. _typesseq:

Sequence Types --- :class:`list`, :class:`tuple`, :class:`range`
================================================================

There are three basic sequence types: lists, tuples, and range objects.
Additional sequence types tailored for processing of
:ref:`binary data <binaryseq>` and :ref:`text strings <textseq>` are
described in dedicated sections.


.. _typesseq-common:

Common Sequence Operations
--------------------------

.. index:: object: sequence

The operations in the following table are supported by most sequence types,
both mutable and immutable. The :class:`collections.abc.Sequence` ABC is
provided to make it easier to correctly implement these operations on
custom sequence types.

This table lists the sequence operations sorted in ascending priority. In the
table, *s* and *t* are sequences of the same type, *n*, *i*, *j* and *k* are
integers and *x* is an arbitrary object that meets any type and value
restrictions imposed by *s*.

The ``in`` and ``not in`` operations have the same priorities as the
comparison operations. The ``+`` (concatenation) and ``*`` (repetition)
operations have the same priority as the corresponding numeric operations.

.. index::
   triple: operations on; sequence; types
   builtin: len
   builtin: min
   builtin: max
   pair: concatenation; operation
   pair: repetition; operation
   pair: subscript; operation
   pair: slice; operation
   operator: in
   operator: not in
   single: count() (sequence method)
   single: index() (sequence method)

+--------------------------+--------------------------------+----------+
| Operation                | Result                         | Notes    |
+==========================+================================+==========+
| ``x in s``               | ``True`` if an item of *s* is  | \(1)     |
|                          | equal to *x*, else ``False``   |          |
+--------------------------+--------------------------------+----------+
| ``x not in s``           | ``False`` if an item of *s* is | \(1)     |
|                          | equal to *x*, else ``True``    |          |
+--------------------------+--------------------------------+----------+
| ``s + t``                | the concatenation of *s* and   | (6)(7)   |
|                          | *t*                            |          |
+--------------------------+--------------------------------+----------+
| ``s * n`` or             | equivalent to adding *s* to    | (2)(7)   |
| ``n * s``                | itself *n* times               |          |
+--------------------------+--------------------------------+----------+
| ``s[i]``                 | *i*\ th item of *s*, origin 0  | \(3)     |
+--------------------------+--------------------------------+----------+
| ``s[i:j]``               | slice of *s* from *i* to *j*   | (3)(4)   |
+--------------------------+--------------------------------+----------+
| ``s[i:j:k]``             | slice of *s* from *i* to *j*   | (3)(5)   |
|                          | with step *k*                  |          |
+--------------------------+--------------------------------+----------+
| ``len(s)``               | length of *s*                  |          |
+--------------------------+--------------------------------+----------+
| ``min(s)``               | smallest item of *s*           |          |
+--------------------------+--------------------------------+----------+
| ``max(s)``               | largest item of *s*            |          |
+--------------------------+--------------------------------+----------+
| ``s.index(x[, i[, j]])`` | index of the first occurrence  | \(8)     |
|                          | of *x* in *s* (at or after     |          |
|                          | index *i* and before index *j*)|          |
+--------------------------+--------------------------------+----------+
| ``s.count(x)``           | total number of occurrences of |          |
|                          | *x* in *s*                     |          |
+--------------------------+--------------------------------+----------+

Sequences of the same type also support comparisons. In particular, tuples
and lists are compared lexicographically by comparing corresponding elements.
This means that to compare equal, every element must compare equal and the
two sequences must be of the same type and have the same length. (For full
details see :ref:`comparisons` in the language reference.)

Notes:

(1)
   While the ``in`` and ``not in`` operations are used only for simple
   containment testing in the general case, some specialised sequences
   (such as :class:`str`, :class:`bytes` and :class:`bytearray`) also use
   them for subsequence testing::

      >>> "gg" in "eggs"
      True

(2)
   Values of *n* less than ``0`` are treated as ``0`` (which yields an empty
   sequence of the same type as *s*). Note that items in the sequence *s*
   are not copied; they are referenced multiple times. This often haunts
   new Python programmers; consider::

      >>> lists = [[]] * 3
      >>> lists
      [[], [], []]
      >>> lists[0].append(3)
      >>> lists
      [[3], [3], [3]]

   What has happened is that ``[[]]`` is a one-element list containing an empty
   list, so all three elements of ``[[]] * 3`` are references to this single empty
   list. Modifying any of the elements of ``lists`` modifies this single list.
   You can create a list of different lists this way::

      >>> lists = [[] for i in range(3)]
      >>> lists[0].append(3)
      >>> lists[1].append(5)
      >>> lists[2].append(7)
      >>> lists
      [[3], [5], [7]]

   Further explanation is available in the FAQ entry
   :ref:`faq-multidimensional-list`.

(3)
   If *i* or *j* is negative, the index is relative to the end of sequence *s*:
   ``len(s) + i`` or ``len(s) + j`` is substituted. But note that ``-0`` is
   still ``0``.

(4)
   The slice of *s* from *i* to *j* is defined as the sequence of items with index
   *k* such that ``i <= k < j``. If *i* or *j* is greater than ``len(s)``, use
   ``len(s)``. If *i* is omitted or ``None``, use ``0``. If *j* is omitted or
   ``None``, use ``len(s)``. If *i* is greater than or equal to *j*, the slice is
   empty.

(5)
   The slice of *s* from *i* to *j* with step *k* is defined as the sequence of
   items with index ``x = i + n*k`` such that ``0 <= n < (j-i)/k``. In other words,
   the indices are ``i``, ``i+k``, ``i+2*k``, ``i+3*k`` and so on, stopping when
   *j* is reached (but never including *j*). If *i* or *j* is greater than
   ``len(s)``, use ``len(s)``. If *i* or *j* are omitted or ``None``, they become
   "end" values (which end depends on the sign of *k*). Note, *k* cannot be zero.
   If *k* is ``None``, it is treated like ``1``.

(6)
   Concatenating immutable sequences always results in a new object. This
   means that building up a sequence by repeated concatenation will have a
   quadratic runtime cost in the total sequence length. To get a linear
   runtime cost, you must switch to one of the alternatives below:

   * if concatenating :class:`str` objects, you can build a list and use
     :meth:`str.join` at the end or else write to an :class:`io.StringIO`
     instance and retrieve its value when complete

   * if concatenating :class:`bytes` objects, you can similarly use
     :meth:`bytes.join` or :class:`io.BytesIO`, or you can do in-place
     concatenation with a :class:`bytearray` object. :class:`bytearray`
     objects are mutable and have an efficient overallocation mechanism

   * if concatenating :class:`tuple` objects, extend a :class:`list` instead

   * for other types, investigate the relevant class documentation


(7)
   Some sequence types (such as :class:`range`) only support item sequences
   that follow specific patterns, and hence don't support sequence
   concatenation or repetition.

(8)
   ``index`` raises :exc:`ValueError` when *x* is not found in *s*.
   When supported, the additional arguments to the index method allow
   efficient searching of subsections of the sequence. Passing the extra
   arguments is roughly equivalent to using ``s[i:j].index(x)``, only
   without copying any data and with the returned index being relative to
   the start of the sequence rather than the start of the slice.

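The indexing and slicing rules in notes (3) to (5), and the extra ``index``
arguments in note (8), can be illustrated with a short sketch. The sample
values are arbitrary and chosen only for this example::

   s = list("abcdefgh")

   # Negative indices count from the end: s[-1] is s[len(s) - 1].
   assert s[-1] == s[len(s) - 1] == "h"

   # Out-of-range slice bounds are clipped to len(s) instead of raising.
   assert s[5:100] == ["f", "g", "h"]
   # The slice is empty when i >= j.
   assert s[4:2] == []

   # Extended slices visit i, i+k, i+2*k, ... and stop before reaching j.
   assert s[1:7:2] == ["b", "d", "f"]
   # With a negative step, omitted bounds become the "end" values for
   # that direction, so this reverses the sequence.
   assert s[::-1] == list("hgfedcba")

   # index(x, i, j) searches between i and j without copying, and the
   # result is relative to the start of the sequence, not of the slice.
   t = [1, 2, 1, 2, 1]
   assert t.index(1, 1) == 2
   assert t[1:].index(1) == 1
   assert t.index(2, 2, 5) == 3

Note that only slices clip out-of-range bounds; a plain index such as
``s[100]`` raises :exc:`IndexError`.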

.. _typesseq-immutable:

Immutable Sequence Types
------------------------

.. index::
   triple: immutable; sequence; types
   object: tuple
   builtin: hash

The only operation that immutable sequence types generally implement that is
not also implemented by mutable sequence types is support for the :func:`hash`
built-in.

This support allows immutable sequences, such as :class:`tuple` instances, to
be used as :class:`dict` keys and stored in :class:`set` and :class:`frozenset`
instances.

Attempting to hash an immutable sequence that contains unhashable values will
result in :exc:`TypeError`.

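A brief sketch of both points, using made-up coordinate tuples as keys (the
names ``grid``, ``seen`` and ``frozen`` are illustrative only)::

   # Tuples of hashable items can be dictionary keys and set members.
   grid = {(0, 0): "origin", (1, 2): "marker"}
   seen = {(0, 0), (1, 2)}
   frozen = frozenset(seen)

   assert grid[(1, 2)] == "marker"
   assert (0, 0) in seen and (0, 0) in frozen

   # A tuple holding an unhashable item (here a list) cannot be hashed.
   try:
       hash((1, [2, 3]))
   except TypeError:
       hashable = False
   else:
       hashable = True

   assert hashable is False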

.. _typesseq-mutable:

Mutable Sequence Types
----------------------

.. index::
   triple: mutable; sequence; types
   object: list
   object: bytearray

The operations in the following table are defined on mutable sequence types.
The :class:`collections.abc.MutableSequence` ABC is provided to make it
easier to correctly implement these operations on custom sequence types.

In the table *s* is an instance of a mutable sequence type, *t* is any
iterable object and *x* is an arbitrary object that meets any type
and value restrictions imposed by *s* (for example, :class:`bytearray` only
accepts integers that meet the value restriction ``0 <= x <= 255``).


.. index::
   triple: operations on; sequence; types
   triple: operations on; list; type
   pair: subscript; assignment
   pair: slice; assignment
   statement: del
   single: append() (sequence method)
   single: clear() (sequence method)
   single: copy() (sequence method)
   single: extend() (sequence method)
   single: insert() (sequence method)
   single: pop() (sequence method)
   single: remove() (sequence method)
   single: reverse() (sequence method)

+------------------------------+--------------------------------+---------------------+
| Operation                    | Result                         | Notes               |
+==============================+================================+=====================+
| ``s[i] = x``                 | item *i* of *s* is replaced by |                     |
|                              | *x*                            |                     |
+------------------------------+--------------------------------+---------------------+
| ``s[i:j] = t``               | slice of *s* from *i* to *j*   |                     |
|                              | is replaced by the contents of |                     |
|                              | the iterable *t*               |                     |
+------------------------------+--------------------------------+---------------------+
| ``del s[i:j]``               | same as ``s[i:j] = []``        |                     |
+------------------------------+--------------------------------+---------------------+
| ``s[i:j:k] = t``             | the elements of ``s[i:j:k]``   | \(1)                |
|                              | are replaced by those of *t*   |                     |
+------------------------------+--------------------------------+---------------------+
| ``del s[i:j:k]``             | removes the elements of        |                     |
|                              | ``s[i:j:k]`` from the list     |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.append(x)``              | appends *x* to the end of the  |                     |
|                              | sequence (same as              |                     |
|                              | ``s[len(s):len(s)] = [x]``)    |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.clear()``                | removes all items from ``s``   | \(5)                |
|                              | (same as ``del s[:]``)         |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.copy()``                 | creates a shallow copy of ``s``| \(5)                |
|                              | (same as ``s[:]``)             |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.extend(t)`` or           | extends *s* with the           |                     |
| ``s += t``                   | contents of *t* (for the       |                     |
|                              | most part the same as          |                     |
|                              | ``s[len(s):len(s)] = t``)      |                     |
+------------------------------+--------------------------------+---------------------+
| ``s *= n``                   | updates *s* with its contents  | \(6)                |
|                              | repeated *n* times             |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.insert(i, x)``           | inserts *x* into *s* at the    |                     |
|                              | index given by *i*             |                     |
|                              | (same as ``s[i:i] = [x]``)     |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.pop([i])``               | retrieves the item at *i* and  | \(2)                |
|                              | also removes it from *s*       |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.remove(x)``              | removes the first item from *s*| \(3)                |
|                              | where ``s[i] == x``            |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.reverse()``              | reverses the items of *s* in   | \(4)                |
|                              | place                          |                     |
+------------------------------+--------------------------------+---------------------+


Notes:

(1)
   *t* must have the same length as the slice it is replacing.

(2)
   The optional argument *i* defaults to ``-1``, so that by default the last
   item is removed and returned.

(3)
   ``remove`` raises :exc:`ValueError` when *x* is not found in *s*.

(4)
   The :meth:`reverse` method modifies the sequence in place for economy of
   space when reversing a large sequence. To remind users that it operates by
   side effect, it does not return the reversed sequence.

(5)
   :meth:`clear` and :meth:`!copy` are included for consistency with the
   interfaces of mutable containers that don't support slicing operations
   (such as :class:`dict` and :class:`set`).

   .. versionadded:: 3.3
      :meth:`clear` and :meth:`!copy` methods.

(6)
   The value *n* is an integer, or an object implementing
   :meth:`~object.__index__`. Zero and negative values of *n* clear
   the sequence. Items in the sequence are not copied; they are referenced
   multiple times, as explained for ``s * n`` under :ref:`typesseq-common`.

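Several of the operations in the table, together with the restrictions from
the notes, are shown in the following sketch. It uses a plain :class:`list`
with arbitrary sample values::

   s = [0, 1, 2, 3, 4, 5]

   # Plain slice assignment may change the length of the sequence.
   s[1:3] = ["a", "b", "c"]
   assert s == [0, "a", "b", "c", 3, 4, 5]

   # Extended slice assignment needs a replacement of equal length.
   s[::2] = [10, 20, 30, 40]
   assert s == [10, "a", 20, "c", 30, 4, 40]
   try:
       s[::2] = [1, 2]
   except ValueError:
       pass    # the length mismatch is rejected

   # pop() defaults to the last item; remove() deletes the first match.
   assert s.pop() == 40
   s.remove("a")
   assert s == [10, 20, "c", 30, 4]

   # In-place methods such as reverse() return None.
   assert s.reverse() is None
   assert s == [4, 30, "c", 20, 10]

   # Zero or negative values of n clear the sequence.
   s *= 0
   assert s == []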

.. _typesseq-list:

Lists
-----

.. index:: object: list

Lists are mutable sequences, typically used to store collections of
homogeneous items (where the precise degree of similarity will vary by
application).

.. class:: list([iterable])

   Lists may be constructed in several ways:

   * Using a pair of square brackets to denote the empty list: ``[]``
   * Using square brackets, separating items with commas: ``[a]``, ``[a, b, c]``
   * Using a list comprehension: ``[x for x in iterable]``
   * Using the type constructor: ``list()`` or ``list(iterable)``

   The constructor builds a list whose items are the same and in the same
   order as *iterable*'s items. *iterable* may be either a sequence, a
   container that supports iteration, or an iterator object. If *iterable*
   is already a list, a copy is made and returned, similar to ``iterable[:]``.
   For example, ``list('abc')`` returns ``['a', 'b', 'c']`` and
   ``list( (1, 2, 3) )`` returns ``[1, 2, 3]``.
   If no argument is given, the constructor creates a new empty list, ``[]``.


   Many other operations also produce lists, including the :func:`sorted`
   built-in.

   Lists implement all of the :ref:`common <typesseq-common>` and
   :ref:`mutable <typesseq-mutable>` sequence operations. Lists also provide the
   following additional method:

   .. method:: list.sort(*, key=None, reverse=False)

      This method sorts the list in place, using only ``<`` comparisons
      between items. Exceptions are not suppressed: if any comparison operations
      fail, the entire sort operation will fail (and the list will likely be left
      in a partially modified state).

      :meth:`sort` accepts two arguments that can only be passed by keyword
      (:ref:`keyword-only arguments <keyword-only_parameter>`):

      *key* specifies a function of one argument that is used to extract a
      comparison key from each list element (for example, ``key=str.lower``).
      The key corresponding to each item in the list is calculated once and
      then used for the entire sorting process. The default value of ``None``
      means that list items are sorted directly without calculating a separate
      key value.

      The :func:`functools.cmp_to_key` utility is available to convert a 2.x
      style *cmp* function to a *key* function.

      *reverse* is a boolean value. If set to ``True``, then the list elements
      are sorted as if each comparison were reversed.

      This method modifies the sequence in place for economy of space when
      sorting a large sequence. To remind users that it operates by side
      effect, it does not return the sorted sequence (use :func:`sorted` to
      explicitly request a new sorted list instance).

      The :meth:`sort` method is guaranteed to be stable. A sort is stable if it
      guarantees not to change the relative order of elements that compare equal
      --- this is helpful for sorting in multiple passes (for example, sort by
      department, then by salary grade).

      .. impl-detail::

         While a list is being sorted, the effect of attempting to mutate, or even
         inspect, the list is undefined. The C implementation of Python makes the
         list appear empty for the duration, and raises :exc:`ValueError` if it can
         detect that the list has been mutated during a sort.

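The behaviour of :meth:`list.sort` described above can be sketched as follows.
The ``records`` data and the ``compare_length`` function are invented for this
illustration::

   from functools import cmp_to_key

   records = [("bob", 2), ("Alice", 1), ("carol", 2), ("Dave", 1)]

   # key is called once per item; names are compared case-insensitively.
   by_name = sorted(records, key=lambda r: r[0].lower())

   # Stability allows sorting in multiple passes: sort by name first,
   # then by grade.  Items with equal grades keep their name order.
   staff = list(by_name)
   staff.sort(key=lambda r: r[1])

   # reverse=True sorts as if each comparison were reversed, so equal
   # items still keep their original relative order.
   staff_desc = list(by_name)
   staff_desc.sort(key=lambda r: r[1], reverse=True)

   # A 2.x style cmp function can be adapted with functools.cmp_to_key.
   def compare_length(a, b):
       return len(a) - len(b)

   words = ["pear", "fig", "banana"]
   words.sort(key=cmp_to_key(compare_length))
   by_length = list(words)

   # sort() works in place and returns None.
   result = words.sort()

Here ``staff`` lists the grade 1 records before the grade 2 records, each
group still in name order, which is the multi-pass pattern mentioned in the
description of stability.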

.. _typesseq-tuple:

Tuples
------

.. index:: object: tuple

Tuples are immutable sequences, typically used to store collections of
heterogeneous data (such as the 2-tuples produced by the :func:`enumerate`
built-in). Tuples are also used for cases where an immutable sequence of
homogeneous data is needed (such as allowing storage in a :class:`set` or
:class:`dict` instance).

.. class:: tuple([iterable])

   Tuples may be constructed in a number of ways:

   * Using a pair of parentheses to denote the empty tuple: ``()``
   * Using a trailing comma for a singleton tuple: ``a,`` or ``(a,)``
   * Separating items with commas: ``a, b, c`` or ``(a, b, c)``
   * Using the :func:`tuple` built-in: ``tuple()`` or ``tuple(iterable)``

   The constructor builds a tuple whose items are the same and in the same
   order as *iterable*'s items. *iterable* may be either a sequence, a
   container that supports iteration, or an iterator object. If *iterable*
   is already a tuple, it is returned unchanged. For example,
   ``tuple('abc')`` returns ``('a', 'b', 'c')`` and
   ``tuple( [1, 2, 3] )`` returns ``(1, 2, 3)``.
   If no argument is given, the constructor creates a new empty tuple, ``()``.

   Note that it is actually the comma which makes a tuple, not the parentheses.
   The parentheses are optional, except in the empty tuple case, or
   when they are needed to avoid syntactic ambiguity. For example,
   ``f(a, b, c)`` is a function call with three arguments, while
   ``f((a, b, c))`` is a function call with a 3-tuple as the sole argument.

   Tuples implement all of the :ref:`common <typesseq-common>` sequence
   operations.

For heterogeneous collections of data where access by name is clearer than
access by index, :func:`collections.namedtuple` may be a more appropriate
choice than a simple tuple object.

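The role of the comma, and the named alternative mentioned above, can be seen
in a small sketch. ``Point`` is a hypothetical record type defined only for
this example::

   from collections import namedtuple

   # The comma, not the parentheses, creates the tuple.
   singleton = 1,
   not_a_tuple = (1)
   empty = ()

   assert isinstance(singleton, tuple) and len(singleton) == 1
   assert not_a_tuple == 1
   assert len(empty) == 0

   # A named tuple is still a tuple, but its fields can be read by name.
   Point = namedtuple("Point", ["x", "y"])
   p = Point(x=1, y=2)

   assert isinstance(p, tuple)
   assert p.x == p[0] == 1
   assert p.y == p[1] == 2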
1241
1242.. _typesseq-range:
1243
1244Ranges
1245------
1246
1247.. index:: object: range
1248
1249The :class:`range` type represents an immutable sequence of numbers and is
Nick Coghlan83c0ae52012-08-21 17:42:52 +10001250commonly used for looping a specific number of times in :keyword:`for`
1251loops.
Nick Coghlan273069c2012-08-20 17:14:07 +10001252
Ezio Melotti8429b672012-09-14 06:35:09 +03001253.. class:: range(stop)
1254 range(start, stop[, step])
Nick Coghlan273069c2012-08-20 17:14:07 +10001255
Nick Coghlan83c0ae52012-08-21 17:42:52 +10001256 The arguments to the range constructor must be integers (either built-in
1257 :class:`int` or any object that implements the ``__index__`` special
1258 method). If the *step* argument is omitted, it defaults to ``1``.
1259 If the *start* argument is omitted, it defaults to ``0``.
1260 If *step* is zero, :exc:`ValueError` is raised.
1261
1262 For a positive *step*, the contents of a range ``r`` are determined by the
1263 formula ``r[i] = start + step*i`` where ``i >= 0`` and
1264 ``r[i] < stop``.
1265
1266 For a negative *step*, the contents of the range are still determined by
1267 the formula ``r[i] = start + step*i``, but the constraints are ``i >= 0``
1268 and ``r[i] > stop``.
1269
Sandro Tosi4c1b9f42013-01-27 00:33:04 +01001270 A range object will be empty if ``r[0]`` does not meet the value
Nick Coghlan83c0ae52012-08-21 17:42:52 +10001271 constraint. Ranges do support negative indices, but these are interpreted
1272 as indexing from the end of the sequence determined by the positive
1273 indices.
1274
1275 Ranges containing absolute values larger than :data:`sys.maxsize` are
1276 permitted but some features (such as :func:`len`) may raise
1277 :exc:`OverflowError`.
1278
1279 Range examples::
1280
1281 >>> list(range(10))
1282 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
1283 >>> list(range(1, 11))
1284 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
1285 >>> list(range(0, 30, 5))
1286 [0, 5, 10, 15, 20, 25]
1287 >>> list(range(0, 10, 3))
1288 [0, 3, 6, 9]
1289 >>> list(range(0, -10, -1))
1290 [0, -1, -2, -3, -4, -5, -6, -7, -8, -9]
1291 >>> list(range(0))
1292 []
1293 >>> list(range(1, 0))
1294 []
1295
1296 Ranges implement all of the :ref:`common <typesseq-common>` sequence operations
1297 except concatenation and repetition (due to the fact that range objects can
1298 only represent sequences that follow a strict pattern and repetition and
1299 concatenation will usually violate that pattern).
1300
Georg Brandl8c16cb92016-02-25 20:17:45 +01001301 .. attribute:: start
Nick Coghlan83c0ae52012-08-21 17:42:52 +10001302
1303 The value of the *start* parameter (or ``0`` if the parameter was
1304 not supplied)
1305
Georg Brandl8c16cb92016-02-25 20:17:45 +01001306 .. attribute:: stop
Nick Coghlan83c0ae52012-08-21 17:42:52 +10001307
1308 The value of the *stop* parameter
1309
Georg Brandl8c16cb92016-02-25 20:17:45 +01001310 .. attribute:: step
Nick Coghlan83c0ae52012-08-21 17:42:52 +10001311
1312 The value of the *step* parameter (or ``1`` if the parameter was
1313 not supplied)
Nick Coghlan273069c2012-08-20 17:14:07 +10001314
1315The advantage of the :class:`range` type over a regular :class:`list` or
1316:class:`tuple` is that a :class:`range` object will always take the same
1317(small) amount of memory, no matter the size of the range it represents (as it
1318only stores the ``start``, ``stop`` and ``step`` values, calculating individual
1319items and subranges as needed).
1320
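As a rough illustration of the content formula and of the constant memory
footprint, the sketch below rebuilds the items of a range by hand and compares
object sizes. The helper name ``range_contents`` is hypothetical and not part of
the library, and the byte counts reported by ``sys.getsizeof`` are an
implementation detail that may differ between interpreters.

```python
import sys


def range_contents(start, stop, step=1):
    """Hypothetical helper: list the items r[i] = start + step*i of a range."""
    if step == 0:
        raise ValueError("step must not be zero")
    items = []
    i = 0
    while True:
        value = start + step * i
        # A positive step requires value < stop; a negative step requires value > stop.
        if (step > 0 and value >= stop) or (step < 0 and value <= stop):
            break
        items.append(value)
        i += 1
    return items


matches_positive = range_contents(0, 30, 5) == list(range(0, 30, 5))
matches_negative = range_contents(0, -10, -1) == list(range(0, -10, -1))
empty = range_contents(1, 0)

# The range object itself does not grow with the number of items it represents.
same_size = sys.getsizeof(range(10)) == sys.getsizeof(range(10 ** 6))
```

The hand-written loop materializes every item, which is exactly what a range
object avoids doing.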
Range objects implement the :class:`collections.abc.Sequence` ABC, and provide
features such as containment tests, element index lookup, slicing and
support for negative indices (see :ref:`typesseq`):

   >>> r = range(0, 20, 2)
   >>> r
   range(0, 20, 2)
   >>> 11 in r
   False
   >>> 10 in r
   True
   >>> r.index(10)
   5
   >>> r[5]
   10
   >>> r[:5]
   range(0, 10, 2)
   >>> r[-1]
   18

Testing range objects for equality with ``==`` and ``!=`` compares
them as sequences. That is, two range objects are considered equal if
they represent the same sequence of values. (Note that two range
objects that compare equal might have different :attr:`~range.start`,
:attr:`~range.stop` and :attr:`~range.step` attributes, for example
``range(0) == range(2, 1, 3)`` or ``range(0, 3, 2) == range(0, 4, 2)``.)

.. versionchanged:: 3.2
   Implement the Sequence ABC.
   Support slicing and negative indices.
   Test :class:`int` objects for membership in constant time instead of
   iterating through all items.

.. versionchanged:: 3.3
   Define '==' and '!=' to compare range objects based on the
   sequence of values they define (instead of comparing based on
   object identity).

.. versionadded:: 3.3
   The :attr:`~range.start`, :attr:`~range.stop` and :attr:`~range.step`
   attributes.


.. index::
   single: string; text sequence type
   single: str (built-in class); (see also string)
   object: string

.. _textseq:

Text Sequence Type --- :class:`str`
===================================

Textual data in Python is handled with :class:`str` objects, or :dfn:`strings`.
Strings are immutable
:ref:`sequences <typesseq>` of Unicode code points. String literals are
written in a variety of ways:

* Single quotes: ``'allows embedded "double" quotes'``
* Double quotes: ``"allows embedded 'single' quotes"``.
* Triple quoted: ``'''Three single quotes'''``, ``"""Three double quotes"""``

Triple quoted strings may span multiple lines - all associated whitespace will
be included in the string literal.

String literals that are part of a single expression and have only whitespace
between them will be implicitly converted to a single string literal. That
is, ``("spam " "eggs") == "spam eggs"``.

See :ref:`strings` for more about the various forms of string literal,
including supported escape sequences, and the ``r`` ("raw") prefix that
disables most escape sequence processing.

Strings may also be created from other objects using the :class:`str`
constructor.

Since there is no separate "character" type, indexing a string produces
strings of length 1. That is, for a non-empty string *s*, ``s[0] == s[0:1]``.

.. index::
   object: io.StringIO

There is also no mutable string type, but :meth:`str.join` or
:class:`io.StringIO` can be used to efficiently construct strings from
multiple fragments.

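A minimal sketch of both approaches to building a string from fragments; the
fragment values are made up for illustration.

```python
import io

fragments = ["spam", "eggs", "ham"]

# str.join builds the result in one pass over the fragments.
joined = ", ".join(fragments)

# io.StringIO acts as an in-memory text buffer that is written to piecewise.
buffer = io.StringIO()
for position, fragment in enumerate(fragments):
    if position:
        buffer.write(", ")
    buffer.write(fragment)
buffered = buffer.getvalue()

print(joined)    # spam, eggs, ham
print(buffered)  # spam, eggs, ham
```

``str.join`` is usually the simpler choice when all fragments are already
available; a ``StringIO`` buffer can be more convenient when they are produced
incrementally.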
.. versionchanged:: 3.3
   For backwards compatibility with the Python 2 series, the ``u`` prefix is
   once again permitted on string literals. It has no effect on the meaning
   of string literals and cannot be combined with the ``r`` prefix.


.. index::
   single: string; str (built-in class)

.. class:: str(object='')
           str(object=b'', encoding='utf-8', errors='strict')

   Return a :ref:`string <textseq>` version of *object*. If *object* is not
   provided, returns the empty string. Otherwise, the behavior of ``str()``
   depends on whether *encoding* or *errors* is given, as follows.

   If neither *encoding* nor *errors* is given, ``str(object)`` returns
   :meth:`object.__str__() <object.__str__>`, which is the "informal" or nicely
   printable string representation of *object*. For string objects, this is
   the string itself. If *object* does not have a :meth:`~object.__str__`
   method, then :func:`str` falls back to returning
   :meth:`repr(object) <repr>`.

   .. index::
      single: buffer protocol; str (built-in class)
      single: bytes; str (built-in class)

   If at least one of *encoding* or *errors* is given, *object* should be a
   :term:`bytes-like object` (e.g. :class:`bytes` or :class:`bytearray`). In
   this case, if *object* is a :class:`bytes` (or :class:`bytearray`) object,
   then ``str(bytes, encoding, errors)`` is equivalent to
   :meth:`bytes.decode(encoding, errors) <bytes.decode>`. Otherwise, the bytes
   object underlying the buffer object is obtained before calling
   :meth:`bytes.decode`. See :ref:`binaryseq` and
   :ref:`bufferobjects` for information on buffer objects.

   Passing a :class:`bytes` object to :func:`str` without the *encoding*
   or *errors* arguments falls under the first case of returning the informal
   string representation (see also the :option:`-b` command-line option to
   Python). For example::

      >>> str(b'Zoot!')
      "b'Zoot!'"

   For more information on the ``str`` class and its methods, see
   :ref:`textseq` and the :ref:`string-methods` section below. To output
   formatted strings, see the :ref:`formatstrings` section. In addition,
   see the :ref:`stringservices` section.


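The two calling conventions of ``str()`` can be contrasted with a short sketch.
The sample bytes are chosen for illustration only.

```python
data = b"Zo\xc3\xb6t!"  # the UTF-8 encoding of 'Zoöt!'

# With an encoding, str() decodes the bytes, like bytes.decode().
decoded = str(data, encoding="utf-8")
same_as_decode = decoded == data.decode("utf-8")

# The errors argument selects how undecodable bytes are handled.
lenient = str(data, encoding="ascii", errors="replace")

# Without encoding or errors, the informal representation is returned instead.
informal = str(data)

print(decoded)   # Zoöt!
print(informal)  # b'Zo\xc3\xb6t!'
```

Here ``lenient`` contains one U+FFFD replacement character for each of the two
bytes that are not valid ASCII.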
.. index::
   pair: string; methods

.. _string-methods:

String Methods
--------------

.. index::
   module: re

Strings implement all of the :ref:`common <typesseq-common>` sequence
operations, along with the additional methods described below.

Strings also support two styles of string formatting, one providing a large
degree of flexibility and customization (see :meth:`str.format`,
:ref:`formatstrings` and :ref:`string-formatting`) and the other based on C
``printf`` style formatting that handles a narrower range of types and is
slightly harder to use correctly, but is often faster for the cases it can
handle (:ref:`old-string-formatting`).

The :ref:`textservices` section of the standard library covers a number of
other modules that provide various text related utilities (including regular
expression support in the :mod:`re` module).

.. method:: str.capitalize()

   Return a copy of the string with its first character capitalized and the
   rest lowercased.


.. method:: str.casefold()

   Return a casefolded copy of the string. Casefolded strings may be used for
   caseless matching.

   Casefolding is similar to lowercasing but more aggressive because it is
   intended to remove all case distinctions in a string. For example, the German
   lowercase letter ``'ß'`` is equivalent to ``"ss"``. Since it is already
   lowercase, :meth:`lower` would do nothing to ``'ß'``; :meth:`casefold`
   converts it to ``"ss"``.

   The casefolding algorithm is described in section 3.13 of the Unicode
   Standard.

   .. versionadded:: 3.3


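The difference between lowercasing and casefolding can be seen with the German
example above. The sample words are illustrative.

```python
upper_form = "STRASSE"
sharp_form = "straße"

# lower() leaves the already-lowercase 'ß' untouched ...
lowered = sharp_form.lower()

# ... while casefold() expands it to 'ss'.
folded = sharp_form.casefold()

# Comparing casefolded copies gives a caseless match.
caseless_equal = upper_form.casefold() == sharp_form.casefold()

print(lowered)         # straße
print(folded)          # strasse
print(caseless_equal)  # True
```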
.. method:: str.center(width[, fillchar])

   Return the string centered in a string of length *width*. Padding is done
   using the specified *fillchar* (default is an ASCII space). The original
   string is returned if *width* is less than or equal to ``len(s)``.



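A brief sketch of centering with the default and with an explicit fill
character. The widths are chosen so that the padding divides evenly.

```python
word = "spam"

centered = word.center(10)      # three spaces on each side
starred = word.center(10, "*")  # custom fill character
unchanged = word.center(2)      # width <= len(word): returned as is

print(repr(centered))  # '   spam   '
print(starred)         # ***spam***
print(unchanged)       # spam
```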
.. method:: str.count(sub[, start[, end]])

   Return the number of non-overlapping occurrences of substring *sub* in the
   range [*start*, *end*]. Optional arguments *start* and *end* are
   interpreted as in slice notation.


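The following sketch shows what "non-overlapping" means in practice and how
*start* and *end* restrict the search. Because they are interpreted as in slice
notation, the character at index *end* is not part of the searched range.

```python
# 'aa' fits into 'aaaa' at indices 0, 1 and 2, but only two matches do not overlap.
overlapping_case = "aaaa".count("aa")

whole = "banana".count("an")           # matches at indices 1 and 3
from_two = "banana".count("an", 2)     # searches 'nana'
first_three = "banana".count("a", 0, 3)  # searches 'ban'

print(overlapping_case, whole, from_two, first_three)  # 2 2 1 1
```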
.. method:: str.encode(encoding="utf-8", errors="strict")

   Return an encoded version of the string as a bytes object. Default encoding
   is ``'utf-8'``. *errors* may be given to set a different error handling scheme.
   The default for *errors* is ``'strict'``, meaning that encoding errors raise
   a :exc:`UnicodeError`. Other possible
   values are ``'ignore'``, ``'replace'``, ``'xmlcharrefreplace'``,
   ``'backslashreplace'`` and any other name registered via
   :func:`codecs.register_error`, see section :ref:`error-handlers`. For a
   list of possible encodings, see section :ref:`standard-encodings`.

   .. versionchanged:: 3.1
      Support for keyword arguments added.


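To make the error handling schemes concrete, this sketch encodes a string that
contains two non-ASCII characters under each of the handlers named above. The
sample text is arbitrary.

```python
text = "caf\u00e9 \u20ac"  # 'café €'

utf8 = text.encode()  # default encoding is UTF-8
ignored = text.encode("ascii", errors="ignore")
replaced = text.encode("ascii", errors="replace")
xml = text.encode("ascii", errors="xmlcharrefreplace")
escaped = text.encode("ascii", errors="backslashreplace")

# With the default 'strict' handler the same call raises a UnicodeError subclass.
try:
    text.encode("ascii")
    strict_failed = False
except UnicodeError:
    strict_failed = True

print(ignored)   # b'caf '
print(replaced)  # b'caf? ?'
print(xml)       # b'caf&#233; &#8364;'
```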
.. method:: str.endswith(suffix[, start[, end]])

   Return ``True`` if the string ends with the specified *suffix*, otherwise return
   ``False``. *suffix* can also be a tuple of suffixes to look for. With optional
   *start*, test beginning at that position. With optional *end*, stop comparing
   at that position.


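A short sketch of the tuple form and of the optional *end* argument, using a
made-up file name.

```python
filename = "archive.tar.gz"

single = filename.endswith(".gz")
any_of = filename.endswith((".zip", ".gz"))  # True if any suffix matches
not_tar = filename.endswith(".tar")

# With end=11 only 'archive.tar' is examined.
tar_before_gz = filename.endswith(".tar", 0, 11)

print(single, any_of, not_tar, tar_before_gz)  # True True False True
```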
.. method:: str.expandtabs(tabsize=8)

   Return a copy of the string where all tab characters are replaced by one or
   more spaces, depending on the current column and the given tab size. Tab
   positions occur every *tabsize* characters (default is 8, giving tab
   positions at columns 0, 8, 16 and so on). To expand the string, the current
   column is set to zero and the string is examined character by character. If
   the character is a tab (``\t``), one or more space characters are inserted
   in the result until the current column is equal to the next tab position.
   (The tab character itself is not copied.) If the character is a newline
   (``\n``) or return (``\r``), it is copied and the current column is reset to
   zero. Any other character is copied unchanged and the current column is
   incremented by one regardless of how the character is represented when
   printed.

      >>> '01\t012\t0123\t01234'.expandtabs()
      '01      012     0123    01234'
      >>> '01\t012\t0123\t01234'.expandtabs(4)
      '01  012 0123    01234'


.. method:: str.find(sub[, start[, end]])

   Return the lowest index in the string where substring *sub* is found within
   the slice ``s[start:end]``. Optional arguments *start* and *end* are
   interpreted as in slice notation. Return ``-1`` if *sub* is not found.

   .. note::

      The :meth:`~str.find` method should be used only if you need to know the
      position of *sub*. To check if *sub* is a substring or not, use the
      :keyword:`in` operator::

         >>> 'Py' in 'Python'
         True


.. method:: str.format(*args, **kwargs)

   Perform a string formatting operation. The string on which this method is
   called can contain literal text or replacement fields delimited by braces
   ``{}``. Each replacement field contains either the numeric index of a
   positional argument, or the name of a keyword argument. Returns a copy of
   the string where each replacement field is replaced with the string value of
   the corresponding argument.

      >>> "The sum of 1 + 2 is {0}".format(1+2)
      'The sum of 1 + 2 is 3'

   See :ref:`formatstrings` for a description of the various formatting options
   that can be specified in format strings.


.. method:: str.format_map(mapping)

   Similar to ``str.format(**mapping)``, except that ``mapping`` is
   used directly and not copied to a :class:`dict`. This is useful
   if for example ``mapping`` is a dict subclass:

   >>> class Default(dict):
   ...     def __missing__(self, key):
   ...         return key
   ...
   >>> '{name} was born in {country}'.format_map(Default(name='Guido'))
   'Guido was born in country'

   .. versionadded:: 3.2


.. method:: str.index(sub[, start[, end]])

   Like :meth:`~str.find`, but raise :exc:`ValueError` when the substring is
   not found.


.. method:: str.isalnum()

   Return true if all characters in the string are alphanumeric and there is at
   least one character, false otherwise. A character ``c`` is alphanumeric if one
   of the following returns ``True``: ``c.isalpha()``, ``c.isdecimal()``,
   ``c.isdigit()``, or ``c.isnumeric()``.


.. method:: str.isalpha()

   Return true if all characters in the string are alphabetic and there is at least
   one character, false otherwise. Alphabetic characters are those characters defined
   in the Unicode character database as "Letter", i.e., those with general category
   property being one of "Lm", "Lt", "Lu", "Ll", or "Lo". Note that this is different
   from the "Alphabetic" property defined in the Unicode Standard.


.. method:: str.isdecimal()

   Return true if all characters in the string are decimal
   characters and there is at least one character, false
   otherwise. Decimal characters are those from general category "Nd". This category
   includes digit characters, and all characters
   that can be used to form decimal-radix numbers, e.g. U+0660,
   ARABIC-INDIC DIGIT ZERO.


.. method:: str.isdigit()

   Return true if all characters in the string are digits and there is at least one
   character, false otherwise. Digits include decimal characters and digits that need
   special handling, such as the compatibility superscript digits. Formally, a digit
   is a character that has the property value Numeric_Type=Digit or Numeric_Type=Decimal.


.. method:: str.isidentifier()

   Return true if the string is a valid identifier according to the language
   definition, section :ref:`identifiers`.

   Use :func:`keyword.iskeyword` to test for reserved identifiers such as
   :keyword:`def` and :keyword:`class`.

.. method:: str.islower()

   Return true if all cased characters [4]_ in the string are lowercase and
   there is at least one cased character, false otherwise.


.. method:: str.isnumeric()

   Return true if all characters in the string are numeric
   characters, and there is at least one character, false
   otherwise. Numeric characters include digit characters, and all characters
   that have the Unicode numeric value property, e.g. U+2155,
   VULGAR FRACTION ONE FIFTH. Formally, numeric characters are those with the property
   value Numeric_Type=Digit, Numeric_Type=Decimal or Numeric_Type=Numeric.


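The three numeric predicates form a chain: every decimal character is a digit,
and every digit is numeric. The sketch below checks one sample character from
each level. The dictionary keys are descriptive labels only.

```python
samples = {
    "ASCII five": "5",
    "Arabic-Indic zero": "\u0660",
    "superscript two": "\u00b2",
    "one fifth": "\u2155",
}

# Map each label to (isdecimal, isdigit, isnumeric).
classification = {
    label: (char.isdecimal(), char.isdigit(), char.isnumeric())
    for label, char in samples.items()
}

for label, flags in classification.items():
    print(label, flags)
# ASCII five (True, True, True)
# Arabic-Indic zero (True, True, True)
# superscript two (False, True, True)
# one fifth (False, False, True)
```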
.. method:: str.isprintable()

   Return true if all characters in the string are printable or the string is
   empty, false otherwise. Nonprintable characters are those characters defined
   in the Unicode character database as "Other" or "Separator", excepting the
   ASCII space (0x20) which is considered printable. (Note that printable
   characters in this context are those which should not be escaped when
   :func:`repr` is invoked on a string. It has no bearing on the handling of
   strings written to :data:`sys.stdout` or :data:`sys.stderr`.)


.. method:: str.isspace()

   Return true if there are only whitespace characters in the string and there is
   at least one character, false otherwise. Whitespace characters are those
   characters defined in the Unicode character database as "Other" or "Separator"
   and those with bidirectional property being one of "WS", "B", or "S".

.. method:: str.istitle()

   Return true if the string is a titlecased string and there is at least one
   character, for example uppercase characters may only follow uncased characters
   and lowercase characters only cased ones. Return false otherwise.


.. method:: str.isupper()

   Return true if all cased characters [4]_ in the string are uppercase and
   there is at least one cased character, false otherwise.


.. method:: str.join(iterable)

   Return a string which is the concatenation of the strings in the
   :term:`iterable` *iterable*. A :exc:`TypeError` will be raised if there are
   any non-string values in *iterable*, including :class:`bytes` objects. The
   separator between elements is the string providing this method.


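A small sketch of joining with different separators, and of the
:exc:`TypeError` raised for non-string items. The sample values are
illustrative.

```python
words = ["a", "b", "c"]

dashed = "-".join(words)
glued = "".join(words)  # an empty separator simply concatenates

# Non-string items must be converted explicitly before joining.
numbers = ", ".join(str(n) for n in range(3))

rejected = []
for bad_items in (["a", 1], ["a", b"b"]):
    try:
        "-".join(bad_items)
    except TypeError:
        rejected.append(True)

print(dashed, glued, numbers)  # a-b-c abc 0, 1, 2
```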
.. method:: str.ljust(width[, fillchar])

   Return the string left justified in a string of length *width*. Padding is
   done using the specified *fillchar* (default is an ASCII space). The
   original string is returned if *width* is less than or equal to ``len(s)``.


.. method:: str.lower()

   Return a copy of the string with all the cased characters [4]_ converted to
   lowercase.

   The lowercasing algorithm used is described in section 3.13 of the Unicode
   Standard.


.. method:: str.lstrip([chars])

   Return a copy of the string with leading characters removed. The *chars*
   argument is a string specifying the set of characters to be removed. If omitted
   or ``None``, the *chars* argument defaults to removing whitespace. The *chars*
   argument is not a prefix; rather, all combinations of its values are stripped::

      >>> ' spacious '.lstrip()
      'spacious '
      >>> 'www.example.com'.lstrip('cmowz.')
      'example.com'


.. staticmethod:: str.maketrans(x[, y[, z]])

   This static method returns a translation table usable for :meth:`str.translate`.

   If there is only one argument, it must be a dictionary mapping Unicode
   ordinals (integers) or characters (strings of length 1) to Unicode ordinals,
   strings (of arbitrary lengths) or ``None``. Character keys will then be
   converted to ordinals.

   If there are two arguments, they must be strings of equal length, and in the
   resulting dictionary, each character in *x* will be mapped to the character at
   the same position in *y*. If there is a third argument, it must be a string,
   whose characters will be mapped to ``None`` in the result.


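Both calling forms of ``str.maketrans`` can be sketched as follows. The
sample mappings are arbitrary, and the tables are applied with
:meth:`str.translate`, which deletes characters mapped to ``None``.

```python
# One-argument form: a dict whose keys are characters or ordinals.
from_dict = str.maketrans({"a": "4", "e": None, ord("o"): "0"})
leet = "tea room".translate(from_dict)

# Two strings of equal length, plus a third string of characters to delete.
from_strings = str.maketrans("abc", "xyz", "!")
shifted = "cab!".translate(from_strings)

print(from_dict)     # {97: '4', 101: None, 111: '0'}
print(from_strings)  # {97: 120, 98: 121, 99: 122, 33: None}
print(leet)          # t4 r00m
print(shifted)       # zxy
```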
.. method:: str.partition(sep)

   Split the string at the first occurrence of *sep*, and return a 3-tuple
   containing the part before the separator, the separator itself, and the part
   after the separator. If the separator is not found, return a 3-tuple containing
   the string itself, followed by two empty strings.


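The sketch below contrasts ``partition`` with its right-hand counterpart
:meth:`str.rpartition`, including the case where the separator is missing. The
sample strings are illustrative.

```python
setting = "key=value=more"

first = setting.partition("=")   # splits at the first '='
last = setting.rpartition("=")   # splits at the last '='

# When the separator is absent, the whole string lands at opposite ends.
missing_left = "novalue".partition("=")
missing_right = "novalue".rpartition("=")

print(first)          # ('key', '=', 'value=more')
print(last)           # ('key=value', '=', 'more')
print(missing_left)   # ('novalue', '', '')
print(missing_right)  # ('', '', 'novalue')
```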
.. method:: str.replace(old, new[, count])

   Return a copy of the string with all occurrences of substring *old* replaced by
   *new*. If the optional argument *count* is given, only the first *count*
   occurrences are replaced.


.. method:: str.rfind(sub[, start[, end]])

   Return the highest index in the string where substring *sub* is found, such
   that *sub* is contained within ``s[start:end]``. Optional arguments *start*
   and *end* are interpreted as in slice notation. Return ``-1`` on failure.


.. method:: str.rindex(sub[, start[, end]])

   Like :meth:`rfind` but raises :exc:`ValueError` when the substring *sub* is not
   found.


.. method:: str.rjust(width[, fillchar])

   Return the string right justified in a string of length *width*. Padding is
   done using the specified *fillchar* (default is an ASCII space). The
   original string is returned if *width* is less than or equal to ``len(s)``.


.. method:: str.rpartition(sep)

   Split the string at the last occurrence of *sep*, and return a 3-tuple
   containing the part before the separator, the separator itself, and the part
   after the separator. If the separator is not found, return a 3-tuple containing
   two empty strings, followed by the string itself.


.. method:: str.rsplit(sep=None, maxsplit=-1)

   Return a list of the words in the string, using *sep* as the delimiter string.
   If *maxsplit* is given, at most *maxsplit* splits are done, the *rightmost*
   ones. If *sep* is not specified or ``None``, any whitespace string is a
   separator. Except for splitting from the right, :meth:`rsplit` behaves like
   :meth:`split` which is described in detail below.


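The effect of splitting from the right only becomes visible once *maxsplit*
limits the number of splits, as this sketch with a made-up path shows.

```python
path = "a/b/c/d"

from_left = path.split("/", 1)    # one split, leftmost separator
from_right = path.rsplit("/", 1)  # one split, rightmost separator

# Without a limit both methods produce the same list.
same_without_limit = path.rsplit("/") == path.split("/")

print(from_left)   # ['a', 'b/c/d']
print(from_right)  # ['a/b/c', 'd']
```

Splitting off the last component of a delimited name is a typical use of
``rsplit`` with ``maxsplit=1``.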
.. method:: str.rstrip([chars])

   Return a copy of the string with trailing characters removed. The *chars*
   argument is a string specifying the set of characters to be removed. If omitted
   or ``None``, the *chars* argument defaults to removing whitespace. The *chars*
   argument is not a suffix; rather, all combinations of its values are stripped::

      >>> ' spacious '.rstrip()
      ' spacious'
      >>> 'mississippi'.rstrip('ipz')
      'mississ'


.. method:: str.split(sep=None, maxsplit=-1)

   Return a list of the words in the string, using *sep* as the delimiter
   string. If *maxsplit* is given, at most *maxsplit* splits are done (thus,
   the list will have at most ``maxsplit+1`` elements). If *maxsplit* is not
   specified or ``-1``, then there is no limit on the number of splits
   (all possible splits are made).

   If *sep* is given, consecutive delimiters are not grouped together and are
   deemed to delimit empty strings (for example, ``'1,,2'.split(',')`` returns
   ``['1', '', '2']``). The *sep* argument may consist of multiple characters
   (for example, ``'1<>2<>3'.split('<>')`` returns ``['1', '2', '3']``).
   Splitting an empty string with a specified separator returns ``['']``.

   For example::

      >>> '1,2,3'.split(',')
      ['1', '2', '3']
      >>> '1,2,3'.split(',', maxsplit=1)
      ['1', '2,3']
      >>> '1,2,,3,'.split(',')
      ['1', '2', '', '3', '']

   If *sep* is not specified or is ``None``, a different splitting algorithm is
   applied: runs of consecutive whitespace are regarded as a single separator,
   and the result will contain no empty strings at the start or end if the
   string has leading or trailing whitespace. Consequently, splitting an empty
   string or a string consisting of just whitespace with a ``None`` separator
   returns ``[]``.

   For example::

      >>> '1 2 3'.split()
      ['1', '2', '3']
      >>> '1 2 3'.split(maxsplit=1)
      ['1', '2 3']
      >>> ' 1 2 3 '.split()
      ['1', '2', '3']


.. index::
   single: universal newlines; str.splitlines method

.. method:: str.splitlines([keepends])

   Return a list of the lines in the string, breaking at line boundaries. Line
   breaks are not included in the resulting list unless *keepends* is given and
   true.

   This method splits on the following line boundaries. In particular, the
   boundaries are a superset of :term:`universal newlines`.

   +-----------------------+-----------------------------+
   | Representation        | Description                 |
   +=======================+=============================+
   | ``\n``                | Line Feed                   |
   +-----------------------+-----------------------------+
   | ``\r``                | Carriage Return             |
   +-----------------------+-----------------------------+
   | ``\r\n``              | Carriage Return + Line Feed |
   +-----------------------+-----------------------------+
   | ``\v`` or ``\x0b``    | Line Tabulation             |
   +-----------------------+-----------------------------+
   | ``\f`` or ``\x0c``    | Form Feed                   |
   +-----------------------+-----------------------------+
   | ``\x1c``              | File Separator              |
   +-----------------------+-----------------------------+
   | ``\x1d``              | Group Separator             |
   +-----------------------+-----------------------------+
   | ``\x1e``              | Record Separator            |
   +-----------------------+-----------------------------+
   | ``\x85``              | Next Line (C1 Control Code) |
   +-----------------------+-----------------------------+
   | ``\u2028``            | Line Separator              |
   +-----------------------+-----------------------------+
   | ``\u2029``            | Paragraph Separator         |
   +-----------------------+-----------------------------+

   .. versionchanged:: 3.2

      ``\v`` and ``\f`` added to list of line boundaries.

   For example::

      >>> 'ab c\n\nde fg\rkl\r\n'.splitlines()
      ['ab c', '', 'de fg', 'kl']
      >>> 'ab c\n\nde fg\rkl\r\n'.splitlines(keepends=True)
      ['ab c\n', '\n', 'de fg\r', 'kl\r\n']

   Unlike :meth:`~str.split` when a delimiter string *sep* is given, this
   method returns an empty list for the empty string, and a terminal line
Nick Coghlane4936b82014-08-09 16:14:04 +10001916 break does not result in an extra line::
1917
1918 >>> "".splitlines()
1919 []
1920 >>> "One line\n".splitlines()
1921 ['One line']
1922
1923 For comparison, ``split('\n')`` gives::
1924
1925 >>> ''.split('\n')
1926 ['']
1927 >>> 'Two lines\n'.split('\n')
1928 ['Two lines', '']
R David Murray05c35a62012-08-06 16:08:09 -04001929
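As a further illustration of the boundary table above, the following minimal sketch (the variable names are illustrative only) contrasts :meth:`~str.splitlines` with ``split('\n')`` on a string that contains a form feed, a file separator and a line separator:

```python
# Form feed (\x0c), file separator (\x1c) and line separator (\u2028)
# are line boundaries for splitlines(), but not for split('\n').
text = 'a\x0cb\x1cc\u2028d\ne'

lines = text.splitlines()
by_newline = text.split('\n')

print(lines)            # ['a', 'b', 'c', 'd', 'e']
print(len(by_newline))  # 2
```
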
Georg Brandl116aa622007-08-15 14:28:22 +00001930
1931.. method:: str.startswith(prefix[, start[, end]])
1932
1933 Return ``True`` if string starts with the *prefix*, otherwise return ``False``.
1934 *prefix* can also be a tuple of prefixes to look for. With optional *start*,
1935 test string beginning at that position. With optional *end*, stop comparing
1936 string at that position.
1937
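A short sketch of the tuple form and of the optional *start* and *end* arguments; the URL and variable names are made up for illustration:

```python
url = 'https://www.example.com'

# A tuple of prefixes matches if any one of them matches.
is_web = url.startswith(('http://', 'https://'))

# start (and end) restrict the comparison to url[start:end].
has_www = url.startswith('www.', 8)
too_short = url.startswith('www.', 8, 10)

print(is_web, has_www, too_short)  # True True False
```
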
Georg Brandl116aa622007-08-15 14:28:22 +00001938
1939.. method:: str.strip([chars])
1940
1941 Return a copy of the string with the leading and trailing characters removed.
1942 The *chars* argument is a string specifying the set of characters to be removed.
1943 If omitted or ``None``, the *chars* argument defaults to removing whitespace.
1944 The *chars* argument is not a prefix or suffix; rather, all combinations of its
Nick Coghlane4936b82014-08-09 16:14:04 +10001945 values are stripped::
Georg Brandl116aa622007-08-15 14:28:22 +00001946
1947 >>> ' spacious '.strip()
1948 'spacious'
1949 >>> 'www.example.com'.strip('cmowz.')
1950 'example'
1951
Raymond Hettinger19cfb572015-05-23 09:11:55 -07001952 The outermost leading and trailing *chars* argument values are stripped
1953 from the string. Characters are removed from the leading end until
1954 reaching a string character that is not contained in the set of
1955 characters in *chars*. A similar action takes place on the trailing end.
1956 For example::
1957
1958 >>> comment_string = '#....... Section 3.2.1 Issue #32 .......'
1959 >>> comment_string.strip('.#! ')
1960 'Section 3.2.1 Issue #32'
1961
Georg Brandl116aa622007-08-15 14:28:22 +00001962
1963.. method:: str.swapcase()
1964
1965 Return a copy of the string with uppercase characters converted to lowercase and
Benjamin Petersonb2bf01d2012-01-11 18:17:06 -05001966 vice versa. Note that it is not necessarily true that
1967 ``s.swapcase().swapcase() == s``.
Georg Brandl116aa622007-08-15 14:28:22 +00001968
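One case where the round trip fails, shown as a small sketch: the German sharp s has no single-character uppercase form, so swapping case twice does not restore the original string.

```python
s = 'stra\xdfe'          # 'straße'

once = s.swapcase()      # the sharp s becomes 'SS'
twice = once.swapcase()

print(once)        # STRASSE
print(twice)       # strasse
print(twice == s)  # False
```
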
Georg Brandl116aa622007-08-15 14:28:22 +00001969
1970.. method:: str.title()
1971
Raymond Hettingerb8b0ba12009-09-29 06:22:28 +00001972 Return a titlecased version of the string where words start with an uppercase
1973 character and the remaining characters are lowercase.
1974
Nick Coghlane4936b82014-08-09 16:14:04 +10001975 For example::
1976
1977 >>> 'Hello world'.title()
1978 'Hello World'
1979
Raymond Hettingerb8b0ba12009-09-29 06:22:28 +00001980 The algorithm uses a simple language-independent definition of a word as
1981 groups of consecutive letters. The definition works in many contexts but
1982 it means that apostrophes in contractions and possessives form word
1983 boundaries, which may not be the desired result::
1984
1985 >>> "they're bill's friends from the UK".title()
1986 "They'Re Bill'S Friends From The Uk"
1987
1988 A workaround for apostrophes can be constructed using regular expressions::
1989
1990 >>> import re
1991 >>> def titlecase(s):
Andrew Svetlov5c904362012-11-08 17:26:53 +02001992 ... return re.sub(r"[A-Za-z]+('[A-Za-z]+)?",
1993 ... lambda mo: mo.group(0)[0].upper() +
1994 ... mo.group(0)[1:].lower(),
1995 ... s)
1996 ...
Raymond Hettingerb8b0ba12009-09-29 06:22:28 +00001997 >>> titlecase("they're bill's friends.")
1998 "They're Bill's Friends."
Georg Brandl116aa622007-08-15 14:28:22 +00001999
Georg Brandl116aa622007-08-15 14:28:22 +00002000
Zachary Ware79b98df2015-08-05 23:54:15 -05002001.. method:: str.translate(table)
Georg Brandl116aa622007-08-15 14:28:22 +00002002
Zachary Ware79b98df2015-08-05 23:54:15 -05002003 Return a copy of the string in which each character has been mapped through
2004 the given translation table. The table must be an object that implements
2005 indexing via :meth:`__getitem__`, typically a :term:`mapping` or
2006 :term:`sequence`. When indexed by a Unicode ordinal (an integer), the
2007 table object can do any of the following: return a Unicode ordinal or a
2008 string, to map the character to one or more other characters; return
2009 ``None``, to delete the character from the return string; or raise a
2010 :exc:`LookupError` exception, to map the character to itself.
Georg Brandlceee0772007-11-27 23:48:05 +00002011
Georg Brandl454636f2008-12-27 23:33:20 +00002012 You can use :meth:`str.maketrans` to create a translation map from
2013 character-to-character mappings in different formats.
Christian Heimesfe337bf2008-03-23 21:54:12 +00002014
Zachary Ware79b98df2015-08-05 23:54:15 -05002015 See also the :mod:`codecs` module for a more flexible approach to custom
2016 character mappings.
Georg Brandl116aa622007-08-15 14:28:22 +00002017
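The three table behaviours described above can be sketched with a plain dictionary keyed by Unicode ordinals. The table contents here are arbitrary and chosen only for illustration:

```python
# A dict keyed by Unicode ordinals works as a translation table.
table = {
    ord('a'): 'A',        # map to a string
    ord('b'): ord('B'),   # map to another ordinal
    ord('c'): None,       # delete the character
}
# 'd' is not in the table; the resulting KeyError (a LookupError)
# leaves the character unchanged.
translated = 'abcd'.translate(table)

# str.maketrans() builds a table from strings; its third argument
# lists characters to delete.
vowels_removed = 'translation'.translate(str.maketrans('', '', 'aeiou'))

print(translated)      # ABd
print(vowels_removed)  # trnsltn
```
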
2018
2019.. method:: str.upper()
2020
   Return a copy of the string with all the cased characters [4]_ converted to
   uppercase. Note that ``s.upper().isupper()`` might be ``False`` if ``s``
   contains uncased characters or if the Unicode category of the resulting
   character(s) is not "Lu" (Letter, uppercase), but e.g. "Lt" (Letter,
   titlecase).
2026
2027 The uppercasing algorithm used is described in section 3.13 of the Unicode
2028 Standard.
Georg Brandl116aa622007-08-15 14:28:22 +00002029
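A minimal sketch of the simplest case in which the result of ``upper()`` is not reported as uppercase, namely a string with no cased characters at all (the sample strings are arbitrary):

```python
mixed = 'abc123'
digits = '123'

mixed_is_upper = mixed.upper().isupper()
digits_is_upper = digits.upper().isupper()

print(mixed.upper())    # ABC123
print(mixed_is_upper)   # True
print(digits_is_upper)  # False: there are no cased characters
```
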
Georg Brandl116aa622007-08-15 14:28:22 +00002030
2031.. method:: str.zfill(width)
2032
Nick Coghlane4936b82014-08-09 16:14:04 +10002033 Return a copy of the string left filled with ASCII ``'0'`` digits to
Tim Golden42c235e2015-04-06 11:04:49 +01002034 make a string of length *width*. A leading sign prefix (``'+'``/``'-'``)
Nick Coghlane4936b82014-08-09 16:14:04 +10002035 is handled by inserting the padding *after* the sign character rather
2036 than before. The original string is returned if *width* is less than
2037 or equal to ``len(s)``.
2038
2039 For example::
2040
2041 >>> "42".zfill(5)
2042 '00042'
2043 >>> "-42".zfill(5)
2044 '-0042'
Christian Heimesb186d002008-03-18 15:15:01 +00002045
2046
Georg Brandl116aa622007-08-15 14:28:22 +00002047
Georg Brandl4b491312007-08-31 09:22:56 +00002048.. _old-string-formatting:
Georg Brandl116aa622007-08-15 14:28:22 +00002049
``printf``-style String Formatting
----------------------------------
Georg Brandl116aa622007-08-15 14:28:22 +00002052
2053.. index::
2054 single: formatting, string (%)
2055 single: interpolation, string (%)
2056 single: string; formatting
2057 single: string; interpolation
2058 single: printf-style formatting
2059 single: sprintf-style formatting
2060 single: % formatting
2061 single: % interpolation
2062
Georg Brandl4b491312007-08-31 09:22:56 +00002063.. note::
2064
Nick Coghlan273069c2012-08-20 17:14:07 +10002065 The formatting operations described here exhibit a variety of quirks that
2066 lead to a number of common errors (such as failing to display tuples and
2067 dictionaries correctly). Using the newer :meth:`str.format` interface
2068 helps avoid these errors, and also provides a generally more powerful,
2069 flexible and extensible approach to formatting text.
Georg Brandl4b491312007-08-31 09:22:56 +00002070
String objects have one unique built-in operation: the ``%`` operator (modulo).
This is also known as the string *formatting* or *interpolation* operator.
Given ``format % values`` (where *format* is a string), ``%`` conversion
specifications in *format* are replaced with zero or more elements of *values*.
The effect is similar to using :c:func:`sprintf` in the C language.
Georg Brandl116aa622007-08-15 14:28:22 +00002076
2077If *format* requires a single argument, *values* may be a single non-tuple
Ezio Melotti0656a562011-08-15 14:27:19 +03002078object. [5]_ Otherwise, *values* must be a tuple with exactly the number of
Georg Brandl116aa622007-08-15 14:28:22 +00002079items specified by the format string, or a single mapping object (for example, a
2080dictionary).
2081
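The rule about single values and tuples can be sketched as follows. The names and sample values are illustrative; the last part shows the common pitfall of formatting a value that is itself a tuple:

```python
single = 'Hello, %s!' % 'world'             # one non-tuple value
pair = '%s is %d years old' % ('Ann', 30)   # tuple with exactly two items

# A tuple that is itself the value to format must be wrapped in a
# one-element tuple; otherwise its items are taken as separate values.
point = (1, 2)
wrapped = 'point: %s' % (point,)

raised = False
try:
    'point: %s' % point    # two values for one conversion
except TypeError:
    raised = True

print(single)   # Hello, world!
print(pair)     # Ann is 30 years old
print(wrapped)  # point: (1, 2)
print(raised)   # True
```
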
2082A conversion specifier contains two or more characters and has the following
2083components, which must occur in this order:
2084
2085#. The ``'%'`` character, which marks the start of the specifier.
2086
2087#. Mapping key (optional), consisting of a parenthesised sequence of characters
2088 (for example, ``(somename)``).
2089
2090#. Conversion flags (optional), which affect the result of some conversion
2091 types.
2092
2093#. Minimum field width (optional). If specified as an ``'*'`` (asterisk), the
2094 actual width is read from the next element of the tuple in *values*, and the
2095 object to convert comes after the minimum field width and optional precision.
2096
2097#. Precision (optional), given as a ``'.'`` (dot) followed by the precision. If
Eli Benderskyef4902a2011-07-29 09:30:42 +03002098 specified as ``'*'`` (an asterisk), the actual precision is read from the next
Georg Brandl116aa622007-08-15 14:28:22 +00002099 element of the tuple in *values*, and the value to convert comes after the
2100 precision.
2101
2102#. Length modifier (optional).
2103
2104#. Conversion type.
2105
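As a sketch of the ``'*'`` forms for width and precision described in the list above (the numbers are arbitrary), both are read from the tuple ahead of the value to convert:

```python
value = 3.14159

# '*' takes the width and then the precision from the tuple,
# followed by the value to convert.
dynamic = '%*.*f' % (8, 2, value)
fixed = '%8.2f' % value

print(repr(dynamic))     # '    3.14'
print(dynamic == fixed)  # True
```
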
2106When the right argument is a dictionary (or other mapping type), then the
2107formats in the string *must* include a parenthesised mapping key into that
2108dictionary inserted immediately after the ``'%'`` character. The mapping key
Christian Heimesfe337bf2008-03-23 21:54:12 +00002109selects the value to be formatted from the mapping. For example:
Georg Brandl116aa622007-08-15 14:28:22 +00002110
Georg Brandledc9e7f2010-10-17 09:19:03 +00002111 >>> print('%(language)s has %(number)03d quote types.' %
2112 ... {'language': "Python", "number": 2})
Georg Brandl116aa622007-08-15 14:28:22 +00002113 Python has 002 quote types.
2114
2115In this case no ``*`` specifiers may occur in a format (since they require a
2116sequential parameter list).
2117
2118The conversion flag characters are:
2119
2120+---------+---------------------------------------------------------------------+
2121| Flag | Meaning |
2122+=========+=====================================================================+
2123| ``'#'`` | The value conversion will use the "alternate form" (where defined |
2124| | below). |
2125+---------+---------------------------------------------------------------------+
2126| ``'0'`` | The conversion will be zero padded for numeric values. |
2127+---------+---------------------------------------------------------------------+
2128| ``'-'`` | The converted value is left adjusted (overrides the ``'0'`` |
2129| | conversion if both are given). |
2130+---------+---------------------------------------------------------------------+
2131| ``' '`` | (a space) A blank should be left before a positive number (or empty |
2132| | string) produced by a signed conversion. |
2133+---------+---------------------------------------------------------------------+
2134| ``'+'`` | A sign character (``'+'`` or ``'-'``) will precede the conversion |
2135| | (overrides a "space" flag). |
2136+---------+---------------------------------------------------------------------+
2137
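The effect of each flag in the table above can be seen in a short sketch; the values are arbitrary and the variable names are only for illustration:

```python
alternate = '%#x' % 255       # alternate form adds the '0x' prefix
zero_pad = '%05d' % 42        # zero padded to width 5
left = '%-5d|' % 42           # left adjusted within width 5
left_wins = '%-05d|' % 42     # '-' overrides '0'
space = '% d' % 42            # blank where the sign would go
plus = '%+d' % 42             # sign is always shown
plus_wins = '%+ d' % 42       # '+' overrides the space flag

print(alternate)   # 0xff
print(zero_pad)    # 00042
print(left)        # 42   |
print(plus)        # +42
```
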
2138A length modifier (``h``, ``l``, or ``L``) may be present, but is ignored as it
Alexandre Vassalotti5f8ced22008-05-16 00:03:33 +00002139is not necessary for Python -- so e.g. ``%ld`` is identical to ``%d``.
Georg Brandl116aa622007-08-15 14:28:22 +00002140
2141The conversion types are:
2142
2143+------------+-----------------------------------------------------+-------+
2144| Conversion | Meaning | Notes |
2145+============+=====================================================+=======+
2146| ``'d'`` | Signed integer decimal. | |
2147+------------+-----------------------------------------------------+-------+
2148| ``'i'`` | Signed integer decimal. | |
2149+------------+-----------------------------------------------------+-------+
Alexandre Vassalotti5f8ced22008-05-16 00:03:33 +00002150| ``'o'`` | Signed octal value. | \(1) |
Georg Brandl116aa622007-08-15 14:28:22 +00002151+------------+-----------------------------------------------------+-------+
Benjamin Petersone0124bd2009-03-09 21:04:33 +00002152| ``'u'`` | Obsolete type -- it is identical to ``'d'``. | \(7) |
Georg Brandl116aa622007-08-15 14:28:22 +00002153+------------+-----------------------------------------------------+-------+
Alexandre Vassalotti5f8ced22008-05-16 00:03:33 +00002154| ``'x'`` | Signed hexadecimal (lowercase). | \(2) |
Georg Brandl116aa622007-08-15 14:28:22 +00002155+------------+-----------------------------------------------------+-------+
Alexandre Vassalotti5f8ced22008-05-16 00:03:33 +00002156| ``'X'`` | Signed hexadecimal (uppercase). | \(2) |
Georg Brandl116aa622007-08-15 14:28:22 +00002157+------------+-----------------------------------------------------+-------+
2158| ``'e'`` | Floating point exponential format (lowercase). | \(3) |
2159+------------+-----------------------------------------------------+-------+
2160| ``'E'`` | Floating point exponential format (uppercase). | \(3) |
2161+------------+-----------------------------------------------------+-------+
Eric Smith22b85b32008-07-17 19:18:29 +00002162| ``'f'`` | Floating point decimal format. | \(3) |
Georg Brandl116aa622007-08-15 14:28:22 +00002163+------------+-----------------------------------------------------+-------+
Eric Smith22b85b32008-07-17 19:18:29 +00002164| ``'F'`` | Floating point decimal format. | \(3) |
Georg Brandl116aa622007-08-15 14:28:22 +00002165+------------+-----------------------------------------------------+-------+
Christian Heimes8dc226f2008-05-06 23:45:46 +00002166| ``'g'`` | Floating point format. Uses lowercase exponential | \(4) |
2167| | format if exponent is less than -4 or not less than | |
2168| | precision, decimal format otherwise. | |
Georg Brandl116aa622007-08-15 14:28:22 +00002169+------------+-----------------------------------------------------+-------+
Christian Heimes8dc226f2008-05-06 23:45:46 +00002170| ``'G'`` | Floating point format. Uses uppercase exponential | \(4) |
2171| | format if exponent is less than -4 or not less than | |
2172| | precision, decimal format otherwise. | |
Georg Brandl116aa622007-08-15 14:28:22 +00002173+------------+-----------------------------------------------------+-------+
2174| ``'c'`` | Single character (accepts integer or single | |
2175| | character string). | |
2176+------------+-----------------------------------------------------+-------+
Ezio Melotti0639d5a2009-12-19 23:26:38 +00002177| ``'r'`` | String (converts any Python object using | \(5) |
Georg Brandl116aa622007-08-15 14:28:22 +00002178| | :func:`repr`). | |
2179+------------+-----------------------------------------------------+-------+
Eli Benderskyef4902a2011-07-29 09:30:42 +03002180| ``'s'`` | String (converts any Python object using | \(5) |
Georg Brandl116aa622007-08-15 14:28:22 +00002181| | :func:`str`). | |
2182+------------+-----------------------------------------------------+-------+
Eli Benderskyef4902a2011-07-29 09:30:42 +03002183| ``'a'`` | String (converts any Python object using | \(5) |
2184| | :func:`ascii`). | |
2185+------------+-----------------------------------------------------+-------+
Georg Brandl116aa622007-08-15 14:28:22 +00002186| ``'%'`` | No argument is converted, results in a ``'%'`` | |
2187| | character in the result. | |
2188+------------+-----------------------------------------------------+-------+
2189
2190Notes:
2191
(1)
   The alternate form causes a leading octal specifier (``'0o'``) to be
   inserted before the first digit.

(2)
   The alternate form causes a leading ``'0x'`` or ``'0X'`` (depending on whether
   the ``'x'`` or ``'X'`` format was used) to be inserted before the first digit.
2202
2203(3)
2204 The alternate form causes the result to always contain a decimal point, even if
2205 no digits follow it.
2206
2207 The precision determines the number of digits after the decimal point and
2208 defaults to 6.
2209
2210(4)
2211 The alternate form causes the result to always contain a decimal point, and
2212 trailing zeroes are not removed as they would otherwise be.
2213
2214 The precision determines the number of significant digits before and after the
2215 decimal point and defaults to 6.
2216
2217(5)
Eli Benderskyef4902a2011-07-29 09:30:42 +03002218 If precision is ``N``, the output is truncated to ``N`` characters.
Georg Brandl116aa622007-08-15 14:28:22 +00002219
Georg Brandl116aa622007-08-15 14:28:22 +00002220
Alexandre Vassalotti5f8ced22008-05-16 00:03:33 +00002221(7)
2222 See :pep:`237`.
2223
Georg Brandl116aa622007-08-15 14:28:22 +00002224Since Python strings have an explicit length, ``%s`` conversions do not assume
2225that ``'\0'`` is the end of the string.
2226
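The following sketch runs one sample value through most of the conversion types listed above. The inputs are arbitrary, and the dictionary is used only to keep the results together:

```python
results = {
    'd': '%d' % 42,
    'o': '%o' % 8,
    'x': '%x' % 255,
    'X': '%X' % 255,
    'e': '%e' % 12345.678,
    'f': '%.2f' % 3.14159,
    'g': '%g' % 0.00001234,    # exponent below -4: exponential format
    'c': '%c' % 65,            # integer converted to a character
    'r': '%r' % 'hi',
    's': '%.2s' % 'hello',     # precision truncates the string
    'a': '%a' % '\xe9',        # non-ASCII characters are escaped
    '%': '100%%' % (),
}

# An embedded NUL does not end the string.
with_nul = '%s' % 'a\0b'

for conversion, text in results.items():
    print(conversion, text)
print(len(with_nul))  # 3
```
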
Christian Heimes5b5e81c2007-12-31 16:14:33 +00002227.. XXX Examples?
2228
Mark Dickinson33841c32009-05-01 15:37:04 +00002229.. versionchanged:: 3.1
2230 ``%f`` conversions for numbers whose absolute value is over 1e50 are no
2231 longer replaced by ``%g`` conversions.
Georg Brandl116aa622007-08-15 14:28:22 +00002232
Georg Brandl116aa622007-08-15 14:28:22 +00002233
Chris Jerdonek5fae0e52012-11-20 17:45:51 -08002234.. index::
2235 single: buffer protocol; binary sequence types
2236
Nick Coghlan273069c2012-08-20 17:14:07 +10002237.. _binaryseq:
Georg Brandl116aa622007-08-15 14:28:22 +00002238
Binary Sequence Types --- :class:`bytes`, :class:`bytearray`, :class:`memoryview`
=================================================================================
Georg Brandl116aa622007-08-15 14:28:22 +00002241
2242.. index::
Nick Coghlan273069c2012-08-20 17:14:07 +10002243 object: bytes
Georg Brandl95414632007-11-22 11:00:28 +00002244 object: bytearray
Nick Coghlan273069c2012-08-20 17:14:07 +10002245 object: memoryview
2246 module: array
Georg Brandl116aa622007-08-15 14:28:22 +00002247
Nick Coghlan273069c2012-08-20 17:14:07 +10002248The core built-in types for manipulating binary data are :class:`bytes` and
2249:class:`bytearray`. They are supported by :class:`memoryview` which uses
Chris Jerdonek5fae0e52012-11-20 17:45:51 -08002250the :ref:`buffer protocol <bufferobjects>` to access the memory of other
2251binary objects without needing to make a copy.
Georg Brandl226878c2007-08-31 10:15:37 +00002252
The :mod:`array` module supports efficient storage of basic data types like
32-bit integers and IEEE754 double-precision floating-point values.
Georg Brandl116aa622007-08-15 14:28:22 +00002255
Nick Coghlan273069c2012-08-20 17:14:07 +10002256.. _typebytes:
Senthil Kumaran7cafd262010-10-02 03:16:04 +00002257
Bytes
-----
2260
2261.. index:: object: bytes
2262
2263Bytes objects are immutable sequences of single bytes. Since many major
2264binary protocols are based on the ASCII text encoding, bytes objects offer
2265several methods that are only valid when working with ASCII compatible
2266data and are closely related to string objects in a variety of other ways.
2267
2268Firstly, the syntax for bytes literals is largely the same as that for string
2269literals, except that a ``b`` prefix is added:
2270
* Single quotes: ``b'still allows embedded "double" quotes'``
* Double quotes: ``b"still allows embedded 'single' quotes"``
* Triple quoted: ``b'''3 single quotes'''``, ``b"""3 double quotes"""``
2274
2275Only ASCII characters are permitted in bytes literals (regardless of the
2276declared source code encoding). Any binary values over 127 must be entered
2277into bytes literals using the appropriate escape sequence.
2278
2279As with string literals, bytes literals may also use a ``r`` prefix to disable
2280processing of escape sequences. See :ref:`strings` for more about the various
2281forms of bytes literal, including supported escape sequences.
2282
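A brief sketch of escape sequences and the ``r`` prefix in bytes literals; the sample data is arbitrary:

```python
escaped = b'caf\xc3\xa9'   # values over 127 are written as escapes
raw = rb'\n'               # raw literal: a backslash and an 'n'
cooked = b'\n'             # a single byte with value 10

print(len(escaped))  # 5
print(list(raw))     # [92, 110]
print(list(cooked))  # [10]
```
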
While bytes literals and representations are based on ASCII text, bytes
objects actually behave like immutable sequences of integers, with each
value in the sequence restricted such that ``0 <= x < 256`` (attempts to
violate this restriction will trigger :exc:`ValueError`). This is done
deliberately to emphasise that while many binary formats include ASCII based
elements and can be usefully manipulated with some text-oriented algorithms,
this is not generally the case for arbitrary binary data (blindly applying
text processing algorithms to binary data formats that are not ASCII
compatible will usually lead to data corruption).
2292
2293In addition to the literal forms, bytes objects can be created in a number of
2294other ways:
2295
2296* A zero-filled bytes object of a specified length: ``bytes(10)``
2297* From an iterable of integers: ``bytes(range(20))``
2298* Copying existing binary data via the buffer protocol: ``bytes(obj)``
2299
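The construction forms listed above, together with the ``0 <= x < 256`` restriction, can be sketched as follows (the variable names are illustrative):

```python
zeros = bytes(3)                      # zero-filled, length 3
from_ints = bytes(range(65, 70))      # from an iterable of integers
copied = bytes(bytearray(b'data'))    # copy via the buffer protocol

# Each element must satisfy 0 <= x < 256.
try:
    bytes([256])
except ValueError:
    out_of_range = True
else:
    out_of_range = False

print(zeros)         # b'\x00\x00\x00'
print(from_ints)     # b'ABCDE'
print(copied)        # b'data'
print(out_of_range)  # True
```
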
Nick Coghlan83c0ae52012-08-21 17:42:52 +10002300Also see the :ref:`bytes <func-bytes>` built-in.
2301
Nick Coghlane4936b82014-08-09 16:14:04 +10002302Since 2 hexadecimal digits correspond precisely to a single byte, hexadecimal
2303numbers are a commonly used format for describing binary data. Accordingly,
2304the bytes type has an additional class method to read data in that format:
2305
2306.. classmethod:: bytes.fromhex(string)
2307
2308 This :class:`bytes` class method returns a bytes object, decoding the
2309 given string object. The string must contain two hexadecimal digits per
2310 byte, with ASCII spaces being ignored.
2311
2312 >>> bytes.fromhex('2Ef0 F1f2 ')
2313 b'.\xf0\xf1\xf2'
2314
Gregory P. Smith8cb65692015-04-25 23:22:26 +00002315A reverse conversion function exists to transform a bytes object into its
2316hexadecimal representation.
2317
2318.. method:: bytes.hex()
2319
2320 Return a string object containing two hexadecimal digits for each
2321 byte in the instance.
2322
2323 >>> b'\xf0\xf1\xf2'.hex()
2324 'f0f1f2'
2325
2326 .. versionadded:: 3.5
2327
Since bytes objects are sequences of integers (akin to a tuple), for a bytes
object *b*, ``b[0]`` will be an integer, while ``b[0:1]`` will be a bytes
object of length 1. (This contrasts with text strings, where both indexing
and slicing will produce a string of length 1.)
Nick Coghlan273069c2012-08-20 17:14:07 +10002332
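The difference between indexing and slicing can be sketched side by side for a bytes object and a text string:

```python
data = b'abc'
text = 'abc'

first = data[0]            # an integer
first_slice = data[0:1]    # a bytes object of length 1
as_list = list(data)       # the underlying integers

print(first)               # 97
print(first_slice)         # b'a'
print(as_list)             # [97, 98, 99]
print(text[0], text[0:1])  # a a
```
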
2333The representation of bytes objects uses the literal format (``b'...'``)
2334since it is often more useful than e.g. ``bytes([46, 46, 46])``. You can
2335always convert a bytes object into a list of integers using ``list(b)``.
Georg Brandl116aa622007-08-15 14:28:22 +00002336
Nick Coghlan273069c2012-08-20 17:14:07 +10002337.. note::
2338 For Python 2.x users: In the Python 2.x series, a variety of implicit
2339 conversions between 8-bit strings (the closest thing 2.x offers to a
2340 built-in binary data type) and Unicode strings were permitted. This was a
2341 backwards compatibility workaround to account for the fact that Python
2342 originally only supported 8-bit text, and Unicode text was a later
2343 addition. In Python 3.x, those implicit conversions are gone - conversions
2344 between 8-bit binary data and Unicode text must be explicit, and bytes and
2345 string objects will always compare unequal.
Raymond Hettingerc50846a2010-04-05 18:56:31 +00002346
Georg Brandl116aa622007-08-15 14:28:22 +00002347
Nick Coghlan273069c2012-08-20 17:14:07 +10002348.. _typebytearray:
Georg Brandl116aa622007-08-15 14:28:22 +00002349
Bytearray Objects
-----------------
Georg Brandl116aa622007-08-15 14:28:22 +00002352
Nick Coghlan273069c2012-08-20 17:14:07 +10002353.. index:: object: bytearray
Georg Brandl495f7b52009-10-27 15:28:25 +00002354
:class:`bytearray` objects are a mutable counterpart to :class:`bytes`
objects. There is no dedicated literal syntax for bytearray objects; instead,
they are always created by calling the constructor:
Georg Brandl116aa622007-08-15 14:28:22 +00002358
Nick Coghlan273069c2012-08-20 17:14:07 +10002359* Creating an empty instance: ``bytearray()``
2360* Creating a zero-filled instance with a given length: ``bytearray(10)``
2361* From an iterable of integers: ``bytearray(range(20))``
Ezio Melotti971ba4c2012-10-27 23:25:18 +03002362* Copying existing binary data via the buffer protocol: ``bytearray(b'Hi!')``
Eli Benderskycbbaa962011-02-25 05:47:53 +00002363
Nick Coghlan273069c2012-08-20 17:14:07 +10002364As bytearray objects are mutable, they support the
2365:ref:`mutable <typesseq-mutable>` sequence operations in addition to the
2366common bytes and bytearray operations described in :ref:`bytes-methods`.
Georg Brandl116aa622007-08-15 14:28:22 +00002367
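A minimal sketch of a few mutable sequence operations applied to a bytearray; the contents are arbitrary:

```python
buf = bytearray(b'Hi!')

buf[0] = ord('h')     # item assignment takes an integer
buf.append(0x3F)      # append a single byte ('?')
buf.extend(b' ok')    # extend from an iterable of integers
del buf[2]            # remove the '!' at index 2

print(buf)       # bytearray(b'hi? ok')
print(len(buf))  # 6
```
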
Nick Coghlan83c0ae52012-08-21 17:42:52 +10002368Also see the :ref:`bytearray <func-bytearray>` built-in.
2369
Nick Coghlane4936b82014-08-09 16:14:04 +10002370Since 2 hexadecimal digits correspond precisely to a single byte, hexadecimal
2371numbers are a commonly used format for describing binary data. Accordingly,
2372the bytearray type has an additional class method to read data in that format:
2373
2374.. classmethod:: bytearray.fromhex(string)
2375
   This :class:`bytearray` class method returns a bytearray object, decoding
   the given string object. The string must contain two hexadecimal digits
   per byte, with ASCII spaces being ignored.
2379
2380 >>> bytearray.fromhex('2Ef0 F1f2 ')
2381 bytearray(b'.\xf0\xf1\xf2')
2382
Gregory P. Smith8cb65692015-04-25 23:22:26 +00002383A reverse conversion function exists to transform a bytearray object into its
2384hexadecimal representation.
2385
2386.. method:: bytearray.hex()
2387
2388 Return a string object containing two hexadecimal digits for each
2389 byte in the instance.
2390
2391 >>> bytearray(b'\xf0\xf1\xf2').hex()
2392 'f0f1f2'
2393
2394 .. versionadded:: 3.5
2395
Since bytearray objects are sequences of integers (akin to a list), for a
bytearray object *b*, ``b[0]`` will be an integer, while ``b[0:1]`` will be
a bytearray object of length 1. (This contrasts with text strings, where
both indexing and slicing will produce a string of length 1.)
2400
2401The representation of bytearray objects uses the bytes literal format
2402(``bytearray(b'...')``) since it is often more useful than e.g.
2403``bytearray([46, 46, 46])``. You can always convert a bytearray object into
2404a list of integers using ``list(b)``.
2405
Georg Brandl495f7b52009-10-27 15:28:25 +00002406
Georg Brandl226878c2007-08-31 10:15:37 +00002407.. _bytes-methods:
2408
Bytes and Bytearray Operations
------------------------------
Georg Brandl226878c2007-08-31 10:15:37 +00002411
2412.. index:: pair: bytes; methods
Georg Brandl95414632007-11-22 11:00:28 +00002413 pair: bytearray; methods
Georg Brandl226878c2007-08-31 10:15:37 +00002414
Nick Coghlan273069c2012-08-20 17:14:07 +10002415Both bytes and bytearray objects support the :ref:`common <typesseq-common>`
2416sequence operations. They interoperate not just with operands of the same
Nick Coghlane4936b82014-08-09 16:14:04 +10002417type, but with any :term:`bytes-like object`. Due to this flexibility, they can be
Nick Coghlan273069c2012-08-20 17:14:07 +10002418freely mixed in operations without causing errors. However, the return type
2419of the result may depend on the order of operands.
Guido van Rossum98297ee2007-11-06 21:34:58 +00002420
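How the result type follows the order of the operands can be sketched with concatenation, where the left operand determines the type:

```python
left_bytes = b'ab' + bytearray(b'cd')
left_bytearray = bytearray(b'ab') + b'cd'

print(type(left_bytes).__name__)      # bytes
print(type(left_bytearray).__name__)  # bytearray
print(left_bytes == left_bytearray)   # True
```
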
Georg Brandl7c676132007-10-23 18:17:00 +00002421.. note::
Georg Brandl226878c2007-08-31 10:15:37 +00002422
Georg Brandl95414632007-11-22 11:00:28 +00002423 The methods on bytes and bytearray objects don't accept strings as their
Georg Brandl7c676132007-10-23 18:17:00 +00002424 arguments, just as the methods on strings don't accept bytes as their
Nick Coghlan273069c2012-08-20 17:14:07 +10002425 arguments. For example, you have to write::
Georg Brandl226878c2007-08-31 10:15:37 +00002426
Georg Brandl7c676132007-10-23 18:17:00 +00002427 a = "abc"
2428 b = a.replace("a", "f")
2429
Nick Coghlan273069c2012-08-20 17:14:07 +10002430 and::
Georg Brandl7c676132007-10-23 18:17:00 +00002431
2432 a = b"abc"
2433 b = a.replace(b"a", b"f")
Georg Brandl226878c2007-08-31 10:15:37 +00002434
Nick Coghlane4936b82014-08-09 16:14:04 +10002435Some bytes and bytearray operations assume the use of ASCII compatible
2436binary formats, and hence should be avoided when working with arbitrary
2437binary data. These restrictions are covered below.
Nick Coghlan273069c2012-08-20 17:14:07 +10002438
2439.. note::
Nick Coghlane4936b82014-08-09 16:14:04 +10002440 Using these ASCII based operations to manipulate binary data that is not
Nick Coghlan273069c2012-08-20 17:14:07 +10002441 stored in an ASCII based format may lead to data corruption.
2442
Nick Coghlane4936b82014-08-09 16:14:04 +10002443The following methods on bytes and bytearray objects can be used with
2444arbitrary binary data.
Nick Coghlan273069c2012-08-20 17:14:07 +10002445
Nick Coghlane4936b82014-08-09 16:14:04 +10002446.. method:: bytes.count(sub[, start[, end]])
2447 bytearray.count(sub[, start[, end]])
Nick Coghlan273069c2012-08-20 17:14:07 +10002448
Nick Coghlane4936b82014-08-09 16:14:04 +10002449 Return the number of non-overlapping occurrences of subsequence *sub* in
2450 the range [*start*, *end*]. Optional arguments *start* and *end* are
2451 interpreted as in slice notation.
Nick Coghlan273069c2012-08-20 17:14:07 +10002452
Nick Coghlane4936b82014-08-09 16:14:04 +10002453 The subsequence to search for may be any :term:`bytes-like object` or an
2454 integer in the range 0 to 255.
2455
2456 .. versionchanged:: 3.3
2457 Also accept an integer in the range 0 to 255 as the subsequence.
2458
Georg Brandl226878c2007-08-31 10:15:37 +00002459
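A short sketch of ``count()`` with a subsequence, an integer byte value, and
slice-style bounds; the sample data is made up for illustration:

```python
data = b"abracadabra"

n_sub = data.count(b"abra")        # 2 non-overlapping matches
n_int = data.count(ord("a"))       # 5: an integer in range(256) also works
n_slice = data.count(b"a", 1, 8)   # 3: counts within data[1:8]
n_overlap = b"aaaa".count(b"aa")   # 2, not 3: matches do not overlap
```
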
Victor Stinnere14e2122010-11-07 18:41:46 +00002460.. method:: bytes.decode(encoding="utf-8", errors="strict")
2461 bytearray.decode(encoding="utf-8", errors="strict")
Georg Brandl4f5f98d2009-05-04 21:01:20 +00002462
   Return a string decoded from the given bytes.  Default encoding is
   ``'utf-8'``.  *errors* may be given to set a different
   error handling scheme.  The default for *errors* is ``'strict'``, meaning
   that decoding errors raise a :exc:`UnicodeError`.  Other possible values are
   ``'ignore'``, ``'replace'`` and any other name registered via
   :func:`codecs.register_error`; see section :ref:`error-handlers`.  For a
   list of possible encodings, see section :ref:`standard-encodings`.
2470
Nick Coghlane4936b82014-08-09 16:14:04 +10002471 .. note::
2472
2473 Passing the *encoding* argument to :class:`str` allows decoding any
2474 :term:`bytes-like object` directly, without needing to make a temporary
2475 bytes or bytearray object.
2476
Benjamin Peterson308d6372009-09-18 21:42:35 +00002477 .. versionchanged:: 3.1
2478 Added support for keyword arguments.
2479
Georg Brandl226878c2007-08-31 10:15:37 +00002480
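The following sketch contrasts the ``'strict'``, ``'ignore'`` and
``'replace'`` error handlers on deliberately malformed input, and shows
:class:`str` decoding a :class:`memoryview` directly. The sample bytes are
an assumption chosen for illustration:

```python
raw = b"caf\xc3\xa9 \xff"   # valid UTF-8 followed by one invalid byte

try:
    raw.decode()            # defaults: encoding="utf-8", errors="strict"
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc      # UnicodeDecodeError is a subclass of UnicodeError

ignored = raw.decode("utf-8", errors="ignore")     # invalid byte dropped
replaced = raw.decode("utf-8", errors="replace")   # invalid byte -> U+FFFD

# str() with an encoding accepts any bytes-like object, no copy to bytes needed.
view = memoryview(b"caf\xc3\xa9")
from_view = str(view, "utf-8")
```
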
Nick Coghlane4936b82014-08-09 16:14:04 +10002481.. method:: bytes.endswith(suffix[, start[, end]])
2482 bytearray.endswith(suffix[, start[, end]])
Georg Brandl226878c2007-08-31 10:15:37 +00002483
Nick Coghlane4936b82014-08-09 16:14:04 +10002484 Return ``True`` if the binary data ends with the specified *suffix*,
2485 otherwise return ``False``. *suffix* can also be a tuple of suffixes to
2486 look for. With optional *start*, test beginning at that position. With
2487 optional *end*, stop comparing at that position.
Georg Brandl226878c2007-08-31 10:15:37 +00002488
Nick Coghlane4936b82014-08-09 16:14:04 +10002489 The suffix(es) to search for may be any :term:`bytes-like object`.
Georg Brandl226878c2007-08-31 10:15:37 +00002490
Georg Brandlabc38772009-04-12 15:51:51 +00002491
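For illustration, ``endswith()`` with a single suffix, a tuple of suffixes,
and explicit bounds (the file name is a made-up example):

```python
name = b"archive.tar.gz"

is_gz = name.endswith(b".gz")                    # True
is_known = name.endswith((b".zip", b".gz"))      # True if any suffix matches
is_tar = name.endswith(b".tar", 0, 11)           # True: tests name[0:11]
is_not_zip = name.endswith(b".zip")              # False
```
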
Nick Coghlane4936b82014-08-09 16:14:04 +10002492.. method:: bytes.find(sub[, start[, end]])
2493 bytearray.find(sub[, start[, end]])
2494
2495 Return the lowest index in the data where the subsequence *sub* is found,
2496 such that *sub* is contained in the slice ``s[start:end]``. Optional
2497 arguments *start* and *end* are interpreted as in slice notation. Return
2498 ``-1`` if *sub* is not found.
2499
2500 The subsequence to search for may be any :term:`bytes-like object` or an
2501 integer in the range 0 to 255.
2502
2503 .. note::
2504
2505 The :meth:`~bytes.find` method should be used only if you need to know the
2506 position of *sub*. To check if *sub* is a substring or not, use the
2507 :keyword:`in` operator::
2508
2509 >>> b'Py' in b'Python'
2510 True
2511
2512 .. versionchanged:: 3.3
2513 Also accept an integer in the range 0 to 255 as the subsequence.
2514
2515
2516.. method:: bytes.index(sub[, start[, end]])
2517 bytearray.index(sub[, start[, end]])
2518
2519 Like :meth:`~bytes.find`, but raise :exc:`ValueError` when the
2520 subsequence is not found.
2521
2522 The subsequence to search for may be any :term:`bytes-like object` or an
2523 integer in the range 0 to 255.
2524
2525 .. versionchanged:: 3.3
2526 Also accept an integer in the range 0 to 255 as the subsequence.
2527
2528
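A brief sketch of how ``find()`` and ``index()`` differ when the subsequence
is missing; the sample data is illustrative only:

```python
data = b"spam, eggs, spam"

first = data.find(b"spam")       # 0
later = data.find(b"spam", 1)    # 12: search starts at index 1
comma = data.find(0x2C)          # 4: integer byte value of b","
missing = data.find(b"ham")      # -1 signals "not found"

# index() reports the same condition with an exception instead of -1.
try:
    data.index(b"ham")
    index_raised = False
except ValueError:
    index_raised = True
```

Checking the result of ``find()`` against ``-1`` matters because ``-1`` is
also a valid index (the last byte) when used for subscripting.
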
2529.. method:: bytes.join(iterable)
2530 bytearray.join(iterable)
2531
2532 Return a bytes or bytearray object which is the concatenation of the
2533 binary data sequences in the :term:`iterable` *iterable*. A
2534 :exc:`TypeError` will be raised if there are any values in *iterable*
R David Murray0e8168c2015-05-17 10:16:37 -04002535 that are not :term:`bytes-like objects <bytes-like object>`, including
Nick Coghlane4936b82014-08-09 16:14:04 +10002536 :class:`str` objects. The separator between elements is the contents
2537 of the bytes or bytearray object providing this method.
2538
2539
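As a sketch, the type of the separator decides the type of the result, and
any :class:`str` element is rejected (the path components are made up):

```python
parts = [b"usr", bytearray(b"local"), memoryview(b"bin")]

as_bytes = b"/".join(parts)              # b'usr/local/bin'
as_array = bytearray(b"/").join(parts)   # bytearray(b'usr/local/bin')

try:
    b"/".join([b"usr", "local"])         # a str element is not bytes-like
    join_raised = False
except TypeError:
    join_raised = True
```
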
2540.. staticmethod:: bytes.maketrans(from, to)
2541 bytearray.maketrans(from, to)
2542
2543 This static method returns a translation table usable for
2544 :meth:`bytes.translate` that will map each character in *from* into the
2545 character at the same position in *to*; *from* and *to* must both be
2546 :term:`bytes-like objects <bytes-like object>` and have the same length.
2547
2548 .. versionadded:: 3.1
2549
2550
2551.. method:: bytes.partition(sep)
2552 bytearray.partition(sep)
2553
2554 Split the sequence at the first occurrence of *sep*, and return a 3-tuple
2555 containing the part before the separator, the separator, and the part
2556 after the separator. If the separator is not found, return a 3-tuple
2557 containing a copy of the original sequence, followed by two empty bytes or
2558 bytearray objects.
2559
2560 The separator to search for may be any :term:`bytes-like object`.
2561
2562
2563.. method:: bytes.replace(old, new[, count])
2564 bytearray.replace(old, new[, count])
2565
2566 Return a copy of the sequence with all occurrences of subsequence *old*
2567 replaced by *new*. If the optional argument *count* is given, only the
2568 first *count* occurrences are replaced.
2569
2570 The subsequence to search for and its replacement may be any
2571 :term:`bytes-like object`.
2572
2573 .. note::
2574
2575 The bytearray version of this method does *not* operate in place - it
2576 always produces a new object, even if no changes were made.
2577
2578
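A minimal sketch of ``replace()`` on a :class:`bytearray`, showing the
*count* limit and that the original object is left untouched:

```python
buf = bytearray(b"a-b-c")

once = buf.replace(b"-", b"+", 1)    # bytearray(b'a+b-c'): first match only
every = buf.replace(b"-", b"+")      # bytearray(b'a+b+c')
same = buf.replace(b"x", b"y")       # nothing to replace, still a new object

print(buf)                           # bytearray(b'a-b-c'), unchanged
```
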
2579.. method:: bytes.rfind(sub[, start[, end]])
2580 bytearray.rfind(sub[, start[, end]])
2581
2582 Return the highest index in the sequence where the subsequence *sub* is
2583 found, such that *sub* is contained within ``s[start:end]``. Optional
2584 arguments *start* and *end* are interpreted as in slice notation. Return
2585 ``-1`` on failure.
2586
2587 The subsequence to search for may be any :term:`bytes-like object` or an
2588 integer in the range 0 to 255.
2589
2590 .. versionchanged:: 3.3
2591 Also accept an integer in the range 0 to 255 as the subsequence.
2592
2593
2594.. method:: bytes.rindex(sub[, start[, end]])
2595 bytearray.rindex(sub[, start[, end]])
2596
   Like :meth:`~bytes.rfind`, but raise :exc:`ValueError` when the
   subsequence *sub* is not found.
2599
2600 The subsequence to search for may be any :term:`bytes-like object` or an
2601 integer in the range 0 to 255.
2602
2603 .. versionchanged:: 3.3
2604 Also accept an integer in the range 0 to 255 as the subsequence.
2605
2606
2607.. method:: bytes.rpartition(sep)
2608 bytearray.rpartition(sep)
2609
   Split the sequence at the last occurrence of *sep*, and return a 3-tuple
   containing the part before the separator, the separator, and the part
   after the separator.  If the separator is not found, return a 3-tuple
   containing two empty bytes or bytearray objects, followed by a copy of the
   original sequence.
2615
2616 The separator to search for may be any :term:`bytes-like object`.
2617
2618
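The sketch below contrasts ``partition()`` and ``rpartition()`` on the same
input, including where the unsplit data ends up when the separator is
absent. The dotted name is an arbitrary example:

```python
path = b"pkg.module.attr"

head = path.partition(b".")     # (b'pkg', b'.', b'module.attr')
tail = path.rpartition(b".")    # (b'pkg.module', b'.', b'attr')

# Without a match, partition() keeps the data in the first slot,
# while rpartition() keeps it in the last slot.
miss_first = b"plain".partition(b".")    # (b'plain', b'', b'')
miss_last = b"plain".rpartition(b".")    # (b'', b'', b'plain')
```
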
2619.. method:: bytes.startswith(prefix[, start[, end]])
2620 bytearray.startswith(prefix[, start[, end]])
2621
2622 Return ``True`` if the binary data starts with the specified *prefix*,
2623 otherwise return ``False``. *prefix* can also be a tuple of prefixes to
2624 look for. With optional *start*, test beginning at that position. With
2625 optional *end*, stop comparing at that position.
2626
2627 The prefix(es) to search for may be any :term:`bytes-like object`.
2628
Georg Brandl48310cd2009-01-03 21:18:54 +00002629
Georg Brandl454636f2008-12-27 23:33:20 +00002630.. method:: bytes.translate(table[, delete])
Georg Brandl751771b2009-05-31 21:38:37 +00002631 bytearray.translate(table[, delete])
Georg Brandl226878c2007-08-31 10:15:37 +00002632
Georg Brandl454636f2008-12-27 23:33:20 +00002633 Return a copy of the bytes or bytearray object where all bytes occurring in
Nick Coghlane4936b82014-08-09 16:14:04 +10002634 the optional argument *delete* are removed, and the remaining bytes have
2635 been mapped through the given translation table, which must be a bytes
2636 object of length 256.
Georg Brandl226878c2007-08-31 10:15:37 +00002637
   You can use the :meth:`bytes.maketrans` method to create a translation
   table.
Georg Brandl226878c2007-08-31 10:15:37 +00002640
Georg Brandl454636f2008-12-27 23:33:20 +00002641 Set the *table* argument to ``None`` for translations that only delete
2642 characters::
Georg Brandl226878c2007-08-31 10:15:37 +00002643
Georg Brandl454636f2008-12-27 23:33:20 +00002644 >>> b'read this short text'.translate(None, b'aeiou')
2645 b'rd ths shrt txt'
Georg Brandl226878c2007-08-31 10:15:37 +00002646
2647
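As a sketch of how ``maketrans()`` and ``translate()`` fit together: the
table is a 256-byte object, and bytes listed in *delete* are removed before
the remaining bytes are mapped. The sample mapping is illustrative:

```python
table = bytes.maketrans(b"abc", b"xyz")   # a -> x, b -> y, c -> z
table_len = len(table)                    # 256: one entry per byte value

mapped = b"aabbcc-abc".translate(table)            # b'xxyyzz-xyz'
mapped_no_dash = b"aabbcc-abc".translate(table, b"-")   # b'xxyyzzxyz'
```
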
The following methods on bytes and bytearray objects have default behaviours
that assume the use of ASCII compatible binary formats, but can still be used
with arbitrary binary data by passing appropriate arguments. Note that all of
the bytearray methods in this section do *not* operate in place, and instead
produce new objects.
Georg Brandlabc38772009-04-12 15:51:51 +00002653
Nick Coghlane4936b82014-08-09 16:14:04 +10002654.. method:: bytes.center(width[, fillbyte])
2655 bytearray.center(width[, fillbyte])
Georg Brandlabc38772009-04-12 15:51:51 +00002656
Nick Coghlane4936b82014-08-09 16:14:04 +10002657 Return a copy of the object centered in a sequence of length *width*.
2658 Padding is done using the specified *fillbyte* (default is an ASCII
2659 space). For :class:`bytes` objects, the original sequence is returned if
2660 *width* is less than or equal to ``len(s)``.
2661
2662 .. note::
2663
2664 The bytearray version of this method does *not* operate in place -
2665 it always produces a new object, even if no changes were made.
2666
2667
2668.. method:: bytes.ljust(width[, fillbyte])
2669 bytearray.ljust(width[, fillbyte])
2670
2671 Return a copy of the object left justified in a sequence of length *width*.
2672 Padding is done using the specified *fillbyte* (default is an ASCII
2673 space). For :class:`bytes` objects, the original sequence is returned if
2674 *width* is less than or equal to ``len(s)``.
2675
2676 .. note::
2677
2678 The bytearray version of this method does *not* operate in place -
2679 it always produces a new object, even if no changes were made.
2680
2681
2682.. method:: bytes.lstrip([chars])
2683 bytearray.lstrip([chars])
2684
2685 Return a copy of the sequence with specified leading bytes removed. The
2686 *chars* argument is a binary sequence specifying the set of byte values to
2687 be removed - the name refers to the fact this method is usually used with
2688 ASCII characters. If omitted or ``None``, the *chars* argument defaults
2689 to removing ASCII whitespace. The *chars* argument is not a prefix;
2690 rather, all combinations of its values are stripped::
2691
2692 >>> b' spacious '.lstrip()
2693 b'spacious '
2694 >>> b'www.example.com'.lstrip(b'cmowz.')
2695 b'example.com'
2696
2697 The binary sequence of byte values to remove may be any
2698 :term:`bytes-like object`.
2699
2700 .. note::
2701
2702 The bytearray version of this method does *not* operate in place -
2703 it always produces a new object, even if no changes were made.
2704
2705
2706.. method:: bytes.rjust(width[, fillbyte])
2707 bytearray.rjust(width[, fillbyte])
2708
2709 Return a copy of the object right justified in a sequence of length *width*.
2710 Padding is done using the specified *fillbyte* (default is an ASCII
2711 space). For :class:`bytes` objects, the original sequence is returned if
2712 *width* is less than or equal to ``len(s)``.
2713
2714 .. note::
2715
2716 The bytearray version of this method does *not* operate in place -
2717 it always produces a new object, even if no changes were made.
2718
2719
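A short sketch of the three padding methods with an explicit *fillbyte*.
Passing a fill byte such as ``b"\x00"`` is one way to use them on data that
is not ASCII text; the values are illustrative:

```python
label = b"id"

centered = label.center(6, b"*")    # b'**id**'
left = label.ljust(6, b".")         # b'id....'
right = label.rjust(6, b"0")        # b'0000id'
default_fill = label.rjust(4)       # b'  id': ASCII space by default

# For a binary record, an explicit fill byte avoids injecting ASCII spaces.
record = b"\x01\x02".ljust(4, b"\x00")   # b'\x01\x02\x00\x00'

# A width no larger than the length leaves the content unchanged.
unchanged = label.center(1)         # b'id'
```
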
2720.. method:: bytes.rsplit(sep=None, maxsplit=-1)
2721 bytearray.rsplit(sep=None, maxsplit=-1)
2722
2723 Split the binary sequence into subsequences of the same type, using *sep*
2724 as the delimiter string. If *maxsplit* is given, at most *maxsplit* splits
2725 are done, the *rightmost* ones. If *sep* is not specified or ``None``,
2726 any subsequence consisting solely of ASCII whitespace is a separator.
2727 Except for splitting from the right, :meth:`rsplit` behaves like
2728 :meth:`split` which is described in detail below.
2729
2730
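For illustration, the same *maxsplit* applied from each end of a sequence
(the sample path is arbitrary):

```python
path = b"a/b/c/d"

from_right = path.rsplit(b"/", 1)   # [b'a/b/c', b'd']
from_left = path.split(b"/", 1)     # [b'a', b'b/c/d']

# With no separator, runs of ASCII whitespace delimit the pieces.
words = b"one two  three".rsplit(None, 1)   # [b'one two', b'three']
```
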
2731.. method:: bytes.rstrip([chars])
2732 bytearray.rstrip([chars])
2733
2734 Return a copy of the sequence with specified trailing bytes removed. The
2735 *chars* argument is a binary sequence specifying the set of byte values to
2736 be removed - the name refers to the fact this method is usually used with
2737 ASCII characters. If omitted or ``None``, the *chars* argument defaults to
2738 removing ASCII whitespace. The *chars* argument is not a suffix; rather,
2739 all combinations of its values are stripped::
2740
2741 >>> b' spacious '.rstrip()
2742 b' spacious'
2743 >>> b'mississippi'.rstrip(b'ipz')
2744 b'mississ'
2745
2746 The binary sequence of byte values to remove may be any
2747 :term:`bytes-like object`.
2748
2749 .. note::
2750
2751 The bytearray version of this method does *not* operate in place -
2752 it always produces a new object, even if no changes were made.
2753
2754
2755.. method:: bytes.split(sep=None, maxsplit=-1)
2756 bytearray.split(sep=None, maxsplit=-1)
2757
2758 Split the binary sequence into subsequences of the same type, using *sep*
2759 as the delimiter string. If *maxsplit* is given and non-negative, at most
2760 *maxsplit* splits are done (thus, the list will have at most ``maxsplit+1``
2761 elements). If *maxsplit* is not specified or is ``-1``, then there is no
2762 limit on the number of splits (all possible splits are made).
2763
2764 If *sep* is given, consecutive delimiters are not grouped together and are
2765 deemed to delimit empty subsequences (for example, ``b'1,,2'.split(b',')``
2766 returns ``[b'1', b'', b'2']``). The *sep* argument may consist of a
2767 multibyte sequence (for example, ``b'1<>2<>3'.split(b'<>')`` returns
2768 ``[b'1', b'2', b'3']``). Splitting an empty sequence with a specified
2769 separator returns ``[b'']`` or ``[bytearray(b'')]`` depending on the type
2770 of object being split. The *sep* argument may be any
2771 :term:`bytes-like object`.
2772
2773 For example::
2774
2775 >>> b'1,2,3'.split(b',')
2776 [b'1', b'2', b'3']
2777 >>> b'1,2,3'.split(b',', maxsplit=1)
Benjamin Petersoneb83ffe2014-09-22 22:43:50 -04002778 [b'1', b'2,3']
Nick Coghlane4936b82014-08-09 16:14:04 +10002779 >>> b'1,2,,3,'.split(b',')
2780 [b'1', b'2', b'', b'3', b'']
2781
2782 If *sep* is not specified or is ``None``, a different splitting algorithm
2783 is applied: runs of consecutive ASCII whitespace are regarded as a single
2784 separator, and the result will contain no empty strings at the start or
2785 end if the sequence has leading or trailing whitespace. Consequently,
2786 splitting an empty sequence or a sequence consisting solely of ASCII
2787 whitespace without a specified separator returns ``[]``.
2788
2789 For example::
2790
2791
2792 >>> b'1 2 3'.split()
2793 [b'1', b'2', b'3']
2794 >>> b'1 2 3'.split(maxsplit=1)
2795 [b'1', b'2 3']
2796 >>> b' 1 2 3 '.split()
2797 [b'1', b'2', b'3']
2798
2799
2800.. method:: bytes.strip([chars])
2801 bytearray.strip([chars])
2802
2803 Return a copy of the sequence with specified leading and trailing bytes
2804 removed. The *chars* argument is a binary sequence specifying the set of
2805 byte values to be removed - the name refers to the fact this method is
2806 usually used with ASCII characters. If omitted or ``None``, the *chars*
2807 argument defaults to removing ASCII whitespace. The *chars* argument is
2808 not a prefix or suffix; rather, all combinations of its values are
2809 stripped::
2810
2811 >>> b' spacious '.strip()
2812 b'spacious'
2813 >>> b'www.example.com'.strip(b'cmowz.')
2814 b'example'
2815
2816 The binary sequence of byte values to remove may be any
2817 :term:`bytes-like object`.
2818
2819 .. note::
2820
2821 The bytearray version of this method does *not* operate in place -
2822 it always produces a new object, even if no changes were made.
2823
2824
The following methods on bytes and bytearray objects assume the use of ASCII
compatible binary formats and should not be applied to arbitrary binary data.
Note that all of the bytearray methods in this section do *not* operate in
place, and instead produce new objects.
2829
2830.. method:: bytes.capitalize()
2831 bytearray.capitalize()
2832
2833 Return a copy of the sequence with each byte interpreted as an ASCII
2834 character, and the first byte capitalized and the rest lowercased.
2835 Non-ASCII byte values are passed through unchanged.
2836
2837 .. note::
2838
2839 The bytearray version of this method does *not* operate in place - it
2840 always produces a new object, even if no changes were made.
2841
2842
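A minimal sketch of ``capitalize()``, including a non-ASCII leading byte
that is passed through unchanged. The byte ``0xE9`` is an arbitrary choice
for illustration:

```python
word = bytearray(b"hello WORLD")
result = word.capitalize()          # bytearray(b'Hello world')
# word itself still holds b'hello WORLD'

# 0xE9 is not an ASCII letter, so only the ASCII bytes after it change case.
mixed = b"\xe9COLE".capitalize()    # b'\xe9cole'
```
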
2843.. method:: bytes.expandtabs(tabsize=8)
2844 bytearray.expandtabs(tabsize=8)
2845
2846 Return a copy of the sequence where all ASCII tab characters are replaced
2847 by one or more ASCII spaces, depending on the current column and the given
2848 tab size. Tab positions occur every *tabsize* bytes (default is 8,
2849 giving tab positions at columns 0, 8, 16 and so on). To expand the
2850 sequence, the current column is set to zero and the sequence is examined
2851 byte by byte. If the byte is an ASCII tab character (``b'\t'``), one or
2852 more space characters are inserted in the result until the current column
2853 is equal to the next tab position. (The tab character itself is not
2854 copied.) If the current byte is an ASCII newline (``b'\n'``) or
2855 carriage return (``b'\r'``), it is copied and the current column is reset
2856 to zero. Any other byte value is copied unchanged and the current column
2857 is incremented by one regardless of how the byte value is represented when
2858 printed::
2859
      >>> b'01\t012\t0123\t01234'.expandtabs()
      b'01      012     0123    01234'
      >>> b'01\t012\t0123\t01234'.expandtabs(4)
      b'01  012 0123    01234'
2864
2865 .. note::
2866
2867 The bytearray version of this method does *not* operate in place - it
2868 always produces a new object, even if no changes were made.
2869
2870
2871.. method:: bytes.isalnum()
2872 bytearray.isalnum()
2873
   Return true if all bytes in the sequence are alphabetic ASCII characters
   or ASCII decimal digits and the sequence is not empty, false otherwise.
2876 Alphabetic ASCII characters are those byte values in the sequence
2877 ``b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'``. ASCII decimal
2878 digits are those byte values in the sequence ``b'0123456789'``.
2879
2880 For example::
2881
2882 >>> b'ABCabc1'.isalnum()
2883 True
2884 >>> b'ABC abc1'.isalnum()
2885 False
2886
2887
2888.. method:: bytes.isalpha()
2889 bytearray.isalpha()
2890
2891 Return true if all bytes in the sequence are alphabetic ASCII characters
2892 and the sequence is not empty, false otherwise. Alphabetic ASCII
2893 characters are those byte values in the sequence
2894 ``b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'``.
2895
2896 For example::
2897
2898 >>> b'ABCabc'.isalpha()
2899 True
2900 >>> b'ABCabc1'.isalpha()
2901 False
2902
2903
2904.. method:: bytes.isdigit()
2905 bytearray.isdigit()
2906
2907 Return true if all bytes in the sequence are ASCII decimal digits
2908 and the sequence is not empty, false otherwise. ASCII decimal digits are
2909 those byte values in the sequence ``b'0123456789'``.
2910
2911 For example::
2912
2913 >>> b'1234'.isdigit()
2914 True
2915 >>> b'1.23'.isdigit()
2916 False
2917
2918
2919.. method:: bytes.islower()
2920 bytearray.islower()
2921
2922 Return true if there is at least one lowercase ASCII character
2923 in the sequence and no uppercase ASCII characters, false otherwise.
2924
2925 For example::
2926
2927 >>> b'hello world'.islower()
2928 True
2929 >>> b'Hello world'.islower()
2930 False
2931
2932 Lowercase ASCII characters are those byte values in the sequence
2933 ``b'abcdefghijklmnopqrstuvwxyz'``. Uppercase ASCII characters
2934 are those byte values in the sequence ``b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.
2935
2936
2937.. method:: bytes.isspace()
2938 bytearray.isspace()
2939
2940 Return true if all bytes in the sequence are ASCII whitespace and the
2941 sequence is not empty, false otherwise. ASCII whitespace characters are
Serhiy Storchakabf7b9ed2015-11-23 16:43:05 +02002942 those byte values in the sequence ``b' \t\n\r\x0b\f'`` (space, tab, newline,
Nick Coghlane4936b82014-08-09 16:14:04 +10002943 carriage return, vertical tab, form feed).
2944
2945
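As a sketch, ``isspace()`` on bytes recognises only the six ASCII whitespace
values. The comparison with :meth:`str.isspace` uses the Latin-1
no-break space (``0xA0``) as an example:

```python
all_ascii_ws = b" \t\n\r\x0b\f".isspace()   # True
empty = b"".isspace()                       # False: empty sequence
mixed = b" a ".isspace()                    # False: contains a letter

# 0xA0 is whitespace as a Unicode character, but not as a raw byte.
nbsp_bytes = b"\xa0".isspace()              # False
nbsp_str = "\xa0".isspace()                 # True
```
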
2946.. method:: bytes.istitle()
2947 bytearray.istitle()
2948
2949 Return true if the sequence is ASCII titlecase and the sequence is not
2950 empty, false otherwise. See :meth:`bytes.title` for more details on the
2951 definition of "titlecase".
2952
2953 For example::
2954
2955 >>> b'Hello World'.istitle()
2956 True
2957 >>> b'Hello world'.istitle()
2958 False
2959
2960
2961.. method:: bytes.isupper()
2962 bytearray.isupper()
2963
Zachary Ware0b496372015-02-27 01:40:22 -06002964 Return true if there is at least one uppercase alphabetic ASCII character
2965 in the sequence and no lowercase ASCII characters, false otherwise.
Nick Coghlane4936b82014-08-09 16:14:04 +10002966
2967 For example::
2968
2969 >>> b'HELLO WORLD'.isupper()
2970 True
2971 >>> b'Hello world'.isupper()
2972 False
2973
2974 Lowercase ASCII characters are those byte values in the sequence
2975 ``b'abcdefghijklmnopqrstuvwxyz'``. Uppercase ASCII characters
2976 are those byte values in the sequence ``b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.
2977
2978
2979.. method:: bytes.lower()
2980 bytearray.lower()
2981
2982 Return a copy of the sequence with all the uppercase ASCII characters
2983 converted to their corresponding lowercase counterpart.
2984
2985 For example::
2986
2987 >>> b'Hello World'.lower()
2988 b'hello world'
2989
2990 Lowercase ASCII characters are those byte values in the sequence
2991 ``b'abcdefghijklmnopqrstuvwxyz'``. Uppercase ASCII characters
2992 are those byte values in the sequence ``b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.
2993
2994 .. note::
2995
2996 The bytearray version of this method does *not* operate in place - it
2997 always produces a new object, even if no changes were made.
2998
2999
3000.. index::
3001 single: universal newlines; bytes.splitlines method
3002 single: universal newlines; bytearray.splitlines method
3003
3004.. method:: bytes.splitlines(keepends=False)
3005 bytearray.splitlines(keepends=False)
3006
3007 Return a list of the lines in the binary sequence, breaking at ASCII
3008 line boundaries. This method uses the :term:`universal newlines` approach
3009 to splitting lines. Line breaks are not included in the resulting list
3010 unless *keepends* is given and true.
3011
3012 For example::
3013
3014 >>> b'ab c\n\nde fg\rkl\r\n'.splitlines()
Larry Hastingsc6256e52014-10-05 19:03:48 -07003015 [b'ab c', b'', b'de fg', b'kl']
Nick Coghlane4936b82014-08-09 16:14:04 +10003016 >>> b'ab c\n\nde fg\rkl\r\n'.splitlines(keepends=True)
3017 [b'ab c\n', b'\n', b'de fg\r', b'kl\r\n']
3018
3019 Unlike :meth:`~bytes.split` when a delimiter string *sep* is given, this
3020 method returns an empty list for the empty string, and a terminal line
3021 break does not result in an extra line::
3022
3023 >>> b"".split(b'\n'), b"Two lines\n".split(b'\n')
3024 ([b''], [b'Two lines', b''])
3025 >>> b"".splitlines(), b"One line\n".splitlines()
3026 ([], [b'One line'])
3027
3028
3029.. method:: bytes.swapcase()
3030 bytearray.swapcase()
3031
3032 Return a copy of the sequence with all the lowercase ASCII characters
3033 converted to their corresponding uppercase counterpart and vice-versa.
3034
3035 For example::
3036
3037 >>> b'Hello World'.swapcase()
3038 b'hELLO wORLD'
3039
3040 Lowercase ASCII characters are those byte values in the sequence
3041 ``b'abcdefghijklmnopqrstuvwxyz'``. Uppercase ASCII characters
3042 are those byte values in the sequence ``b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.
3043
   Unlike :meth:`str.swapcase`, it is always the case that
3045 ``bin.swapcase().swapcase() == bin`` for the binary versions. Case
3046 conversions are symmetrical in ASCII, even though that is not generally
3047 true for arbitrary Unicode code points.
3048
3049 .. note::
3050
3051 The bytearray version of this method does *not* operate in place - it
3052 always produces a new object, even if no changes were made.
3053
3054
3055.. method:: bytes.title()
3056 bytearray.title()
3057
3058 Return a titlecased version of the binary sequence where words start with
3059 an uppercase ASCII character and the remaining characters are lowercase.
3060 Uncased byte values are left unmodified.
3061
3062 For example::
3063
3064 >>> b'Hello world'.title()
3065 b'Hello World'
3066
3067 Lowercase ASCII characters are those byte values in the sequence
3068 ``b'abcdefghijklmnopqrstuvwxyz'``. Uppercase ASCII characters
3069 are those byte values in the sequence ``b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.
3070 All other byte values are uncased.
3071
3072 The algorithm uses a simple language-independent definition of a word as
3073 groups of consecutive letters. The definition works in many contexts but
3074 it means that apostrophes in contractions and possessives form word
3075 boundaries, which may not be the desired result::
3076
3077 >>> b"they're bill's friends from the UK".title()
3078 b"They'Re Bill'S Friends From The Uk"
3079
3080 A workaround for apostrophes can be constructed using regular expressions::
3081
      >>> import re
      >>> def titlecase(s):
      ...     return re.sub(rb"[A-Za-z]+('[A-Za-z]+)?",
      ...                   lambda mo: mo.group(0)[0:1].upper() +
      ...                              mo.group(0)[1:].lower(),
      ...                   s)
      ...
      >>> titlecase(b"they're bill's friends.")
      b"They're Bill's Friends."
3091
3092 .. note::
3093
3094 The bytearray version of this method does *not* operate in place - it
3095 always produces a new object, even if no changes were made.
3096
3097
3098.. method:: bytes.upper()
3099 bytearray.upper()
3100
3101 Return a copy of the sequence with all the lowercase ASCII characters
3102 converted to their corresponding uppercase counterpart.
3103
3104 For example::
3105
3106 >>> b'Hello World'.upper()
3107 b'HELLO WORLD'
3108
3109 Lowercase ASCII characters are those byte values in the sequence
3110 ``b'abcdefghijklmnopqrstuvwxyz'``. Uppercase ASCII characters
3111 are those byte values in the sequence ``b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.
3112
3113 .. note::
3114
3115 The bytearray version of this method does *not* operate in place - it
3116 always produces a new object, even if no changes were made.
3117
3118
3119.. method:: bytes.zfill(width)
3120 bytearray.zfill(width)
3121
   Return a copy of the sequence left filled with ASCII ``b'0'`` digits to
   make a sequence of length *width*.  A leading sign prefix (``b'+'`` or
   ``b'-'``) is handled by inserting the padding *after* the sign character
   rather than before.  For :class:`bytes` objects, the original sequence is
   returned if *width* is less than or equal to ``len(seq)``.
3127
3128 For example::
3129
3130 >>> b"42".zfill(5)
3131 b'00042'
3132 >>> b"-42".zfill(5)
3133 b'-0042'
3134
3135 .. note::
3136
3137 The bytearray version of this method does *not* operate in place - it
3138 always produces a new object, even if no changes were made.
Georg Brandlabc38772009-04-12 15:51:51 +00003139
3140
Ethan Furmanb95b5612015-01-23 20:05:18 -08003141.. _bytes-formatting:
3142
3143``printf``-style Bytes Formatting
3144----------------------------------
3145
3146.. index::
3147 single: formatting, bytes (%)
3148 single: formatting, bytearray (%)
3149 single: interpolation, bytes (%)
3150 single: interpolation, bytearray (%)
3151 single: bytes; formatting
3152 single: bytearray; formatting
3153 single: bytes; interpolation
3154 single: bytearray; interpolation
3155 single: printf-style formatting
3156 single: sprintf-style formatting
3157 single: % formatting
3158 single: % interpolation
3159
3160.. note::
3161
3162 The formatting operations described here exhibit a variety of quirks that
3163 lead to a number of common errors (such as failing to display tuples and
3164 dictionaries correctly). If the value being printed may be a tuple or
3165 dictionary, wrap it in a tuple.
3166
Bytes objects (``bytes``/``bytearray``) have one unique built-in operation:
the ``%`` operator (modulo).
This is also known as the bytes *formatting* or *interpolation* operator.
Given ``format % values`` (where *format* is a bytes object), ``%`` conversion
specifications in *format* are replaced with zero or more elements of *values*.
The effect is similar to using :c:func:`sprintf` in the C language.

If *format* requires a single argument, *values* may be a single non-tuple
object. [5]_ Otherwise, *values* must be a tuple with exactly the number of
items specified by the format bytes object, or a single mapping object (for
example, a dictionary).
3178
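The following sketch shows the accepted shapes of *values*, and why a tuple
that is itself the value to format has to be wrapped in another tuple. It
uses the ``%a`` conversion, and the names are illustrative:

```python
single = b"%s!" % b"spam"        # one argument, no tuple needed: b'spam!'
pair = b"%s=%d" % (b"n", 3)      # two conversions, tuple of two: b'n=3'

point = (1, 2)
wrapped = b"%a" % (point,)       # the tuple is the single value: b'(1, 2)'

try:
    b"%a" % point                # read as two arguments for one conversion
    unwrapped_ok = True
except TypeError:
    unwrapped_ok = False
```
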
3179A conversion specifier contains two or more characters and has the following
3180components, which must occur in this order:
3181
3182#. The ``'%'`` character, which marks the start of the specifier.
3183
3184#. Mapping key (optional), consisting of a parenthesised sequence of characters
3185 (for example, ``(somename)``).
3186
3187#. Conversion flags (optional), which affect the result of some conversion
3188 types.
3189
3190#. Minimum field width (optional). If specified as an ``'*'`` (asterisk), the
3191 actual width is read from the next element of the tuple in *values*, and the
3192 object to convert comes after the minimum field width and optional precision.
3193
3194#. Precision (optional), given as a ``'.'`` (dot) followed by the precision. If
3195 specified as ``'*'`` (an asterisk), the actual precision is read from the next
3196 element of the tuple in *values*, and the value to convert comes after the
3197 precision.
3198
3199#. Length modifier (optional).
3200
3201#. Conversion type.
3202
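The width and precision components listed above can be sketched as follows.
With ``*``, the width or precision is taken from the tuple just before the
value it applies to (the numbers are arbitrary):

```python
fixed_width = b"%6d" % 42                     # b'    42'
width_from_args = b"%*d" % (6, 42)            # same result, width read from tuple
precision_from_args = b"%.*f" % (2, 3.14159)  # b'3.14'

# Flag, width and precision together: '-' left-adjusts within 8 columns.
combined = b"%-*.*f|" % (8, 3, 2.5)           # b'2.500   |'
```
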
3203When the right argument is a dictionary (or other mapping type), then the
3204formats in the bytes object *must* include a parenthesised mapping key into that
3205dictionary inserted immediately after the ``'%'`` character. The mapping key
3206selects the value to be formatted from the mapping. For example:
3207
3208 >>> print(b'%(language)s has %(number)03d quote types.' %
3209 ... {b'language': b"Python", b"number": 2})
3210 b'Python has 002 quote types.'
3211
3212In this case no ``*`` specifiers may occur in a format (since they require a
3213sequential parameter list).
3214
3215The conversion flag characters are:
3216
+---------+---------------------------------------------------------------------+
| Flag    | Meaning                                                             |
+=========+=====================================================================+
| ``'#'`` | The value conversion will use the "alternate form" (where defined   |
|         | below).                                                             |
+---------+---------------------------------------------------------------------+
| ``'0'`` | The conversion will be zero padded for numeric values.              |
+---------+---------------------------------------------------------------------+
| ``'-'`` | The converted value is left adjusted (overrides the ``'0'``         |
|         | conversion if both are given).                                      |
+---------+---------------------------------------------------------------------+
| ``' '`` | (a space) A blank should be left before a positive number (or empty |
|         | string) produced by a signed conversion.                            |
+---------+---------------------------------------------------------------------+
| ``'+'`` | A sign character (``'+'`` or ``'-'``) will precede the conversion   |
|         | (overrides a "space" flag).                                         |
+---------+---------------------------------------------------------------------+
3234
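A small sketch of how the flags combine may help. The sample value and variable
names are arbitrary; each line exercises one row of the table above.

```python
value = 42

zero_padded = b'%05d' % value      # '0': pad with zeros up to width 5
left_adjusted = b'%-5d|' % value   # '-': pad on the right instead
overridden = b'%-05d|' % value     # '-' overrides '0' when both are given
spaced = b'% d' % value            # ' ': blank before a positive number
signed = b'%+d' % value            # '+': always emit a sign character
plus_wins = b'%+ d' % value        # '+' overrides the space flag
alternate = b'%#x' % 255           # '#': alternate form of the hex conversion

print(zero_padded, left_adjusted, spaced, signed, alternate)
```
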
A length modifier (``h``, ``l``, or ``L``) may be present, but is ignored as it
is not necessary for Python -- so e.g. ``%ld`` is identical to ``%d``.

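As a quick check of that statement, the sketch below formats an integer that
would not fit in a C ``short`` or a 32-bit ``long``. The number is an arbitrary
choice for the example.

```python
big = 2 ** 40

plain = b'%d' % big
with_l = b'%ld' % big    # 'l' is accepted and ignored
with_h = b'%hd' % big    # 'h' is ignored too, so nothing is truncated

print(plain, with_l, with_h)
```
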
The conversion types are:

+------------+-----------------------------------------------------+-------+
| Conversion | Meaning                                             | Notes |
+============+=====================================================+=======+
| ``'d'``    | Signed integer decimal.                             |       |
+------------+-----------------------------------------------------+-------+
| ``'i'``    | Signed integer decimal.                             |       |
+------------+-----------------------------------------------------+-------+
| ``'o'``    | Signed octal value.                                 | \(1)  |
+------------+-----------------------------------------------------+-------+
| ``'u'``    | Obsolete type -- it is identical to ``'d'``.        | \(8)  |
+------------+-----------------------------------------------------+-------+
| ``'x'``    | Signed hexadecimal (lowercase).                     | \(2)  |
+------------+-----------------------------------------------------+-------+
| ``'X'``    | Signed hexadecimal (uppercase).                     | \(2)  |
+------------+-----------------------------------------------------+-------+
| ``'e'``    | Floating point exponential format (lowercase).      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'E'``    | Floating point exponential format (uppercase).      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'f'``    | Floating point decimal format.                      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'F'``    | Floating point decimal format.                      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'g'``    | Floating point format. Uses lowercase exponential   | \(4)  |
|            | format if exponent is less than -4 or not less than |       |
|            | precision, decimal format otherwise.                |       |
+------------+-----------------------------------------------------+-------+
| ``'G'``    | Floating point format. Uses uppercase exponential   | \(4)  |
|            | format if exponent is less than -4 or not less than |       |
|            | precision, decimal format otherwise.                |       |
+------------+-----------------------------------------------------+-------+
| ``'c'``    | Single byte (accepts integer or single              |       |
|            | byte objects).                                      |       |
+------------+-----------------------------------------------------+-------+
| ``'b'``    | Bytes (any object that follows the                  | \(5)  |
|            | :ref:`buffer protocol <bufferobjects>` or has       |       |
|            | :meth:`__bytes__`).                                 |       |
+------------+-----------------------------------------------------+-------+
| ``'s'``    | ``'s'`` is an alias for ``'b'`` and should only     | \(6)  |
|            | be used for Python2/3 code bases.                   |       |
+------------+-----------------------------------------------------+-------+
| ``'a'``    | Bytes (converts any Python object using             | \(5)  |
|            | ``repr(obj).encode('ascii', 'backslashreplace')``). |       |
+------------+-----------------------------------------------------+-------+
| ``'r'``    | ``'r'`` is an alias for ``'a'`` and should only     | \(7)  |
|            | be used for Python2/3 code bases.                   |       |
+------------+-----------------------------------------------------+-------+
| ``'%'``    | No argument is converted, results in a ``'%'``      |       |
|            | character in the result.                            |       |
+------------+-----------------------------------------------------+-------+

Notes:

(1)
   The alternate form causes a leading octal specifier (``'0o'``) to be
   inserted before the first digit.

(2)
   The alternate form causes a leading ``'0x'`` or ``'0X'`` (depending on whether
   the ``'x'`` or ``'X'`` format was used) to be inserted before the first digit.

(3)
   The alternate form causes the result to always contain a decimal point, even if
   no digits follow it.

   The precision determines the number of digits after the decimal point and
   defaults to 6.

(4)
   The alternate form causes the result to always contain a decimal point, and
   trailing zeroes are not removed as they would otherwise be.

   The precision determines the number of significant digits before and after the
   decimal point and defaults to 6.

(5)
   If precision is ``N``, the output is truncated to ``N`` characters.

(6)
   ``b'%s'`` is deprecated, but will not be removed during the 3.x series.

(7)
   ``b'%r'`` is deprecated, but will not be removed during the 3.x series.

(8)
   See :pep:`237`.

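The conversions and notes above can be tried out with a few one-liners. This is
only an illustrative sketch: the sample values and variable names are made up,
and the comments restate what the table and notes describe.

```python
hexa = b'%#X' % 255             # alternate form of 'X' adds a '0X' prefix
point = b'%#.0f' % 3.0          # alternate form keeps the decimal point
default = b'%f' % 1.5           # precision defaults to 6 digits
general = b'%g' % 1000000.0     # exponent 6 is not less than precision 6
trimmed = b'%g' % 100000.0      # exponent 5 is, so decimal format is used
kept = b'%#g' % 1.5             # alternate form keeps trailing zeroes

single = b'%c' % 65             # 'c' with an integer argument
also_single = b'%c' % b'A'      # 'c' with a single-byte object
cut = b'%.2b' % b'abcdef'       # precision truncates the bytes output
escaped = b'%a' % 'caf\xe9'     # non-ASCII text becomes a backslash escape

print(hexa, point, general, trimmed, cut, escaped)
```
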
.. note::

   The bytearray version of this method does *not* operate in place - it
   always produces a new object, even if no changes were made.

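A minimal sketch of what the note means in practice follows. The template
contents and names are hypothetical.

```python
template = bytearray(b'%d items')
result = template % 3

print(result)                # the formatted copy
print(result is template)    # False: a new bytearray was created
print(template)              # the original still holds the format text

# A format with nothing to substitute still yields a fresh object.
unchanged = bytearray(b'plain')
fresh = unchanged % ()
print(fresh == unchanged, fresh is unchanged)
```
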
.. seealso:: :pep:`461`.
.. versionadded:: 3.5

.. _typememoryview:

Memory Views
------------

:class:`memoryview` objects allow Python code to access the internal data
of an object that supports the :ref:`buffer protocol <bufferobjects>` without
copying.

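The difference between a view and a copy can be seen in a short sketch. The
data and variable names are chosen only for illustration.

```python
data = bytearray(b'hello world')
view = memoryview(data)

tail = view[6:]        # a sub-view onto the same buffer, nothing is copied
copied = data[6:]      # slicing the bytearray itself makes a copy

data[6] = ord('W')     # modify the underlying object in place

seen_through_view = bytes(tail)    # reflects the change
seen_in_copy = bytes(copied)       # does not

print(seen_through_view, seen_in_copy)
```

Because the view shares memory with ``data``, it is a cheap way to hand out
slices of a large buffer without duplicating it.
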
.. class:: memoryview(obj)

   Create a :class:`memoryview` that references *obj*. *obj* must support the
   buffer protocol. Built-in objects that support the buffer protocol include
   :class:`bytes` and :class:`bytearray`.

   A :class:`memoryview` has the notion of an *element*, which is the
   atomic memory unit handled by the originating object *obj*. For many
   simple types such as :class:`bytes` and :class:`bytearray`, an element
   is a single byte, but other types such as :class:`array.array` may have
   bigger elements.

   ``len(view)`` is equal to the length of :meth:`~memoryview.tolist`.
   If ``view.ndim == 0``, the length is 1. If ``view.ndim == 1``, the length
   is equal to the number of elements in the view. For higher dimensions,
   the length is equal to the length of the nested list representation of
   the view. The :attr:`~memoryview.itemsize` attribute gives the
   number of bytes in a single element.

   A :class:`memoryview` supports slicing and indexing to expose its data.
   One-dimensional slicing will result in a subview::

      >>> v = memoryview(b'abcefg')
      >>> v[1]
      98
      >>> v[-1]
      103
      >>> v[1:4]
      <memory at 0x7f3ddc9f4350>
      >>> bytes(v[1:4])
      b'bce'

   If :attr:`~memoryview.format` is one of the native format specifiers
   from the :mod:`struct` module, indexing with an integer or a tuple of
   integers is also supported and returns a single *element* with
   the correct type. One-dimensional memoryviews can be indexed
   with an integer or a one-integer tuple. Multi-dimensional memoryviews
   can be indexed with tuples of exactly *ndim* integers where *ndim* is
   the number of dimensions. Zero-dimensional memoryviews can be indexed
   with the empty tuple.

   Here is an example with a non-byte format::

      >>> import array
      >>> a = array.array('l', [-11111111, 22222222, -33333333, 44444444])
      >>> m = memoryview(a)
      >>> m[0]
      -11111111
      >>> m[-1]
      44444444
      >>> m[::2].tolist()
      [-11111111, -33333333]

   If the underlying object is writable, the memoryview supports
   one-dimensional slice assignment. Resizing is not allowed::

      >>> data = bytearray(b'abcefg')
      >>> v = memoryview(data)
      >>> v.readonly
      False
      >>> v[0] = ord(b'z')
      >>> data
      bytearray(b'zbcefg')
      >>> v[1:4] = b'123'
      >>> data
      bytearray(b'z123fg')
      >>> v[2:3] = b'spam'
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      ValueError: memoryview assignment: lvalue and rvalue have different structures
      >>> v[2:6] = b'spam'
      >>> data
      bytearray(b'z1spam')

   One-dimensional memoryviews of hashable (read-only) types with formats
   'B', 'b' or 'c' are also hashable. The hash is defined as
   ``hash(m) == hash(m.tobytes())``::

      >>> v = memoryview(b'abcefg')
      >>> hash(v) == hash(b'abcefg')
      True
      >>> hash(v[2:4]) == hash(b'ce')
      True
      >>> hash(v[::-2]) == hash(b'abcefg'[::-2])
      True

   .. versionchanged:: 3.3
      One-dimensional memoryviews can now be sliced.
      One-dimensional memoryviews with formats 'B', 'b' or 'c' are now hashable.

   .. versionchanged:: 3.4
      memoryview is now registered automatically with
      :class:`collections.abc.Sequence`.

   .. versionchanged:: 3.5
      memoryviews can now be indexed with a tuple of integers.

   :class:`memoryview` has several methods:

   .. method:: __eq__(exporter)

      A memoryview and a :pep:`3118` exporter are equal if their shapes are
      equivalent and if all corresponding values are equal when the operands'
      respective format codes are interpreted using :mod:`struct` syntax.

      For the subset of :mod:`struct` format strings currently supported by
      :meth:`tolist`, ``v`` and ``w`` are equal if ``v.tolist() == w.tolist()``::

         >>> import array
         >>> a = array.array('I', [1, 2, 3, 4, 5])
         >>> b = array.array('d', [1.0, 2.0, 3.0, 4.0, 5.0])
         >>> c = array.array('b', [5, 3, 1])
         >>> x = memoryview(a)
         >>> y = memoryview(b)
         >>> x == a == y == b
         True
         >>> x.tolist() == a.tolist() == y.tolist() == b.tolist()
         True
         >>> z = y[::-2]
         >>> z == c
         True
         >>> z.tolist() == c.tolist()
         True

      If either format string is not supported by the :mod:`struct` module,
      then the objects will always compare as unequal (even if the format
      strings and buffer contents are identical)::

         >>> from ctypes import BigEndianStructure, c_long
         >>> class BEPoint(BigEndianStructure):
         ...     _fields_ = [("x", c_long), ("y", c_long)]
         ...
         >>> point = BEPoint(100, 200)
         >>> a = memoryview(point)
         >>> b = memoryview(point)
         >>> a == point
         False
         >>> a == b
         False

      Note that, as with floating point numbers, ``v is w`` does *not* imply
      ``v == w`` for memoryview objects.

      .. versionchanged:: 3.3
         Previous versions compared the raw memory disregarding the item format
         and the logical array structure.

   .. method:: tobytes()

      Return the data in the buffer as a bytestring. This is equivalent to
      calling the :class:`bytes` constructor on the memoryview. ::

         >>> m = memoryview(b"abc")
         >>> m.tobytes()
         b'abc'
         >>> bytes(m)
         b'abc'

      For non-contiguous arrays the result is equal to the flattened list
      representation with all elements converted to bytes. :meth:`tobytes`
      supports all format strings, including those that are not in
      :mod:`struct` module syntax.

   .. method:: hex()

      Return a string object containing two hexadecimal digits for each
      byte in the buffer. ::

         >>> m = memoryview(b"abc")
         >>> m.hex()
         '616263'

      .. versionadded:: 3.5

   .. method:: tolist()

      Return the data in the buffer as a list of elements. ::

         >>> memoryview(b'abc').tolist()
         [97, 98, 99]
         >>> import array
         >>> a = array.array('d', [1.1, 2.2, 3.3])
         >>> m = memoryview(a)
         >>> m.tolist()
         [1.1, 2.2, 3.3]

      .. versionchanged:: 3.3
         :meth:`tolist` now supports all single character native formats in
         :mod:`struct` module syntax as well as multi-dimensional
         representations.

   .. method:: release()

      Release the underlying buffer exposed by the memoryview object. Many
      objects take special actions when a view is held on them (for example,
      a :class:`bytearray` would temporarily forbid resizing); therefore,
      calling release() is handy to remove these restrictions (and free any
      dangling resources) as soon as possible.

      After this method has been called, any further operation on the view
      raises a :class:`ValueError` (except :meth:`release()` itself which can
      be called multiple times)::

         >>> m = memoryview(b'abc')
         >>> m.release()
         >>> m[0]
         Traceback (most recent call last):
           File "<stdin>", line 1, in <module>
         ValueError: operation forbidden on released memoryview object

      The context management protocol can be used for a similar effect,
      using the ``with`` statement::

         >>> with memoryview(b'abc') as m:
         ...     m[0]
         ...
         97
         >>> m[0]
         Traceback (most recent call last):
           File "<stdin>", line 1, in <module>
         ValueError: operation forbidden on released memoryview object

      .. versionadded:: 3.2

   .. method:: cast(format[, shape])

      Cast a memoryview to a new format or shape. *shape* defaults to
      ``[byte_length//new_itemsize]``, which means that the result view
      will be one-dimensional. The return value is a new memoryview, but
      the buffer itself is not copied. Supported casts are 1D -> C-:term:`contiguous`
      and C-contiguous -> 1D.

      The destination format is restricted to a single element native format in
      :mod:`struct` syntax. One of the formats must be a byte format
      ('B', 'b' or 'c'). The byte length of the result must be the same
      as the original length.

      Cast 1D/long to 1D/unsigned bytes::

         >>> import array
         >>> a = array.array('l', [1,2,3])
         >>> x = memoryview(a)
         >>> x.format
         'l'
         >>> x.itemsize
         8
         >>> len(x)
         3
         >>> x.nbytes
         24
         >>> y = x.cast('B')
         >>> y.format
         'B'
         >>> y.itemsize
         1
         >>> len(y)
         24
         >>> y.nbytes
         24

      Cast 1D/unsigned bytes to 1D/char::

         >>> b = bytearray(b'zyz')
         >>> x = memoryview(b)
         >>> x[0] = b'a'
         Traceback (most recent call last):
           File "<stdin>", line 1, in <module>
         TypeError: memoryview: invalid type for format 'B'
         >>> y = x.cast('c')
         >>> y[0] = b'a'
         >>> b
         bytearray(b'ayz')

      Cast 1D/bytes to 3D/ints to 1D/signed char::

         >>> import struct
         >>> buf = struct.pack("i"*12, *list(range(12)))
         >>> x = memoryview(buf)
         >>> y = x.cast('i', shape=[2,2,3])
         >>> y.tolist()
         [[[0, 1, 2], [3, 4, 5]], [[6, 7, 8], [9, 10, 11]]]
         >>> y.format
         'i'
         >>> y.itemsize
         4
         >>> len(y)
         2
         >>> y.nbytes
         48
         >>> z = y.cast('b')
         >>> z.format
         'b'
         >>> z.itemsize
         1
         >>> len(z)
         48
         >>> z.nbytes
         48

      Cast 1D/unsigned char to 2D/unsigned long::

         >>> buf = struct.pack("L"*6, *list(range(6)))
         >>> x = memoryview(buf)
         >>> y = x.cast('L', shape=[2,3])
         >>> len(y)
         2
         >>> y.nbytes
         48
         >>> y.tolist()
         [[0, 1, 2], [3, 4, 5]]

      .. versionadded:: 3.3

      .. versionchanged:: 3.5
         The source format is no longer restricted when casting to a byte view.

   There are also several readonly attributes available:

   .. attribute:: obj

      The underlying object of the memoryview::

         >>> b = bytearray(b'xyz')
         >>> m = memoryview(b)
         >>> m.obj is b
         True

      .. versionadded:: 3.3

   .. attribute:: nbytes

      ``nbytes == product(shape) * itemsize == len(m.tobytes())``. This is
      the amount of space in bytes that the array would use in a contiguous
      representation. It is not necessarily equal to ``len(m)``::

         >>> import array
         >>> a = array.array('i', [1,2,3,4,5])
         >>> m = memoryview(a)
         >>> len(m)
         5
         >>> m.nbytes
         20
         >>> y = m[::2]
         >>> len(y)
         3
         >>> y.nbytes
         12
         >>> len(y.tobytes())
         12

      Multi-dimensional arrays::

         >>> import struct
         >>> buf = struct.pack("d"*12, *[1.5*x for x in range(12)])
         >>> x = memoryview(buf)
         >>> y = x.cast('d', shape=[3,4])
         >>> y.tolist()
         [[0.0, 1.5, 3.0, 4.5], [6.0, 7.5, 9.0, 10.5], [12.0, 13.5, 15.0, 16.5]]
         >>> len(y)
         3
         >>> y.nbytes
         96

      .. versionadded:: 3.3

   .. attribute:: readonly

      A bool indicating whether the memory is read only.

   .. attribute:: format

      A string containing the format (in :mod:`struct` module style) for each
      element in the view. A memoryview can be created from exporters with
      arbitrary format strings, but some methods (e.g. :meth:`tolist`) are
      restricted to native single element formats.

      .. versionchanged:: 3.3
         format ``'B'`` is now handled according to the struct module syntax.
         This means that ``memoryview(b'abc')[0] == b'abc'[0] == 97``.

   .. attribute:: itemsize

      The size in bytes of each element of the memoryview::

         >>> import array, struct
         >>> m = memoryview(array.array('H', [32000, 32001, 32002]))
         >>> m.itemsize
         2
         >>> m[0]
         32000
         >>> struct.calcsize('H') == m.itemsize
         True

   .. attribute:: ndim

      An integer indicating how many dimensions of a multi-dimensional array the
      memory represents.

   .. attribute:: shape

      A tuple of integers the length of :attr:`ndim` giving the shape of the
      memory as an N-dimensional array.

      .. versionchanged:: 3.3
         An empty tuple instead of ``None`` when ndim = 0.

   .. attribute:: strides

      A tuple of integers the length of :attr:`ndim` giving the size in bytes to
      access each element for each dimension of the array.

      .. versionchanged:: 3.3
         An empty tuple instead of ``None`` when ndim = 0.

   .. attribute:: suboffsets

      Used internally for PIL-style arrays. The value is informational only.

   .. attribute:: c_contiguous

      A bool indicating whether the memory is C-:term:`contiguous`.

      .. versionadded:: 3.3

   .. attribute:: f_contiguous

      A bool indicating whether the memory is Fortran :term:`contiguous`.

      .. versionadded:: 3.3

   .. attribute:: contiguous

      A bool indicating whether the memory is :term:`contiguous`.

      .. versionadded:: 3.3


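The methods and attributes described above can be combined in one short sketch.
It assumes Python 3.5 or later (for tuple indexing) and uses
``struct.calcsize('i')`` instead of a hard-coded item size, since that size is
platform dependent. All names are hypothetical.

```python
import struct

item = struct.calcsize('i')
buf = bytearray(struct.pack('6i', *range(6)))

flat = memoryview(buf)                  # 1D view of unsigned bytes
grid = flat.cast('i', shape=[2, 3])     # same memory seen as 2 x 3 ints

rows = grid.tolist()
corner = grid[1, 2]                     # one index per dimension
layout = (grid.ndim, grid.shape, grid.strides, grid.nbytes)

# While a view is held, the bytearray refuses to change size.
try:
    buf.extend(b'\x00')
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

# Releasing every view lifts the restriction again.
grid.release()
flat.release()
buf.extend(b'\x00')

print(rows, corner, layout, resized_while_exported)
```

Note that ``len(grid)`` would be 2 here (the first dimension), while
``grid.nbytes`` counts every element.
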
.. _types-set:

Set Types --- :class:`set`, :class:`frozenset`
==============================================

.. index:: object: set

A :dfn:`set` object is an unordered collection of distinct :term:`hashable` objects.
Common uses include membership testing, removing duplicates from a sequence, and
computing mathematical operations such as intersection, union, difference, and
symmetric difference.
(For other containers see the built-in :class:`dict`, :class:`list`,
and :class:`tuple` classes, and the :mod:`collections` module.)

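The common uses listed above look like this in practice. The sample data and
names are invented for the example.

```python
words = ['spam', 'eggs', 'spam', 'ham', 'eggs']

unique = set(words)                  # removing duplicates from a sequence
has_ham = 'ham' in unique            # membership testing

breakfast = {'spam', 'eggs', 'ham'}
lunch = {'ham', 'cheese'}

both = breakfast & lunch             # intersection
either = breakfast | lunch           # union
only_breakfast = breakfast - lunch   # difference
exactly_one = breakfast ^ lunch      # symmetric difference

print(sorted(unique), has_ham, sorted(both), sorted(exactly_one))
```
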
Like other collections, sets support ``x in set``, ``len(set)``, and ``for x in
set``. Being an unordered collection, sets do not record element position or
order of insertion. Accordingly, sets do not support indexing, slicing, or
other sequence-like behavior.

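A brief sketch of what is and is not supported follows. Using ``sorted()`` to
get a predictable order is a choice made for this example, not something the
set itself provides.

```python
s = {'b', 'a', 'c'}

size = len(s)               # supported
present = 'a' in s          # supported
seen = [x for x in s]       # supported, but the order is arbitrary

# Indexing is not supported because there is no element position.
try:
    s[0]
    indexable = True
except TypeError:
    indexable = False

ordered = sorted(s)         # build a list when an order is needed
print(size, present, indexable, ordered)
```
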
There are currently two built-in set types, :class:`set` and :class:`frozenset`.
The :class:`set` type is mutable --- the contents can be changed using methods
like :meth:`~set.add` and :meth:`~set.remove`. Since it is mutable, it has no
hash value and cannot be used as either a dictionary key or as an element of
another set. The :class:`frozenset` type is immutable and :term:`hashable` ---
its contents cannot be altered after it is created; it can therefore be used as
a dictionary key or as an element of another set.

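The practical consequence of hashability can be shown with a small sketch
(names are hypothetical):

```python
mutable = {1, 2}
frozen = frozenset(mutable)

lookup = {frozen: 'ok'}               # a frozenset works as a dictionary key
nested = {frozen, frozenset({3})}     # and as an element of another set

# A mutable set has no hash value, so the same attempt fails.
try:
    {mutable: 'fails'}
    set_as_key = True
except TypeError:
    set_as_key = False

print(lookup[frozenset([2, 1])], len(nested), set_as_key)
```
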
Non-empty sets (not frozensets) can be created by placing a comma-separated list
of elements within braces, for example: ``{'jack', 'sjoerd'}``, in addition to the
:class:`set` constructor.

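The word "non-empty" matters here, as the following sketch shows: empty braces
create a dictionary, so an empty set has to come from the constructor.

```python
names = {'jack', 'sjoerd'}           # brace syntax
same = set(['jack', 'sjoerd'])       # constructor, equivalent result

empty_braces = {}                    # this is a dict, not a set
empty_set = set()                    # the way to spell an empty set

print(names == same, type(empty_braces).__name__, type(empty_set).__name__)
```
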
The constructors for both classes work the same:

.. class:: set([iterable])
           frozenset([iterable])

   Return a new set or frozenset object whose elements are taken from
   *iterable*. The elements of a set must be :term:`hashable`. To
   represent sets of sets, the inner sets must be :class:`frozenset`
   objects. If *iterable* is not specified, a new empty set is
   returned.

   Instances of :class:`set` and :class:`frozenset` provide the following
   operations:

   .. describe:: len(s)

      Return the number of elements in set *s* (cardinality of *s*).

   .. describe:: x in s

      Test *x* for membership in *s*.

   .. describe:: x not in s

      Test *x* for non-membership in *s*.

   .. method:: isdisjoint(other)

      Return ``True`` if the set has no elements in common with *other*. Sets are
      disjoint if and only if their intersection is the empty set.

   .. method:: issubset(other)
               set <= other

      Test whether every element in the set is in *other*.

   .. method:: set < other

      Test whether the set is a proper subset of *other*, that is,
      ``set <= other and set != other``.

   .. method:: issuperset(other)
               set >= other

      Test whether every element in *other* is in the set.

   .. method:: set > other

      Test whether the set is a proper superset of *other*, that is, ``set >=
      other and set != other``.

   .. method:: union(other, ...)
               set | other | ...

      Return a new set with elements from the set and all others.

   .. method:: intersection(other, ...)
               set & other & ...

      Return a new set with elements common to the set and all others.

   .. method:: difference(other, ...)
               set - other - ...

      Return a new set with elements in the set that are not in the others.

   .. method:: symmetric_difference(other)
               set ^ other

      Return a new set with elements in either the set or *other* but not both.

   .. method:: copy()

      Return a shallow copy of the set.


   Note, the non-operator versions of the :meth:`union`, :meth:`intersection`,
   :meth:`difference`, :meth:`symmetric_difference`, :meth:`issubset`, and
   :meth:`issuperset` methods will accept any iterable as an argument. In
   contrast, their operator based counterparts require their arguments to be
   sets. This precludes error-prone constructions like ``set('abc') & 'cbs'``
   in favor of the more readable ``set('abc').intersection('cbs')``.

   Both :class:`set` and :class:`frozenset` support set to set comparisons. Two
   sets are equal if and only if every element of each set is contained in the
   other (each is a subset of the other). A set is less than another set if and
   only if the first set is a proper subset of the second set (is a subset, but
   is not equal). A set is greater than another set if and only if the first set
   is a proper superset of the second set (is a superset, but is not equal).

   Instances of :class:`set` are compared to instances of :class:`frozenset`
   based on their members. For example, ``set('abc') == frozenset('abc')``
   returns ``True`` and so does ``set('abc') in set([frozenset('abc')])``.

   The subset and equality comparisons do not generalize to a total ordering
   function. For example, any two nonempty disjoint sets are not equal and are not
   subsets of each other, so *all* of the following return ``False``: ``a<b``,
   ``a==b``, or ``a>b``.

   Since sets only define partial ordering (subset relationships), the output of
   the :meth:`list.sort` method is undefined for lists of sets.

   Set elements, like dictionary keys, must be :term:`hashable`.

   Binary operations that mix :class:`set` instances with :class:`frozenset`
   return the type of the first operand. For example: ``frozenset('ab') |
   set('bc')`` returns an instance of :class:`frozenset`.

   The following table lists operations available for :class:`set` that do not
   apply to immutable instances of :class:`frozenset`:

   .. method:: update(other, ...)
               set |= other | ...

      Update the set, adding elements from all others.

   .. method:: intersection_update(other, ...)
               set &= other & ...

      Update the set, keeping only elements found in it and all others.

   .. method:: difference_update(other, ...)
               set -= other | ...

      Update the set, removing elements found in others.

   .. method:: symmetric_difference_update(other)
               set ^= other

      Update the set, keeping only elements found in either set, but not in both.

   .. method:: add(elem)

      Add element *elem* to the set.

   .. method:: remove(elem)

      Remove element *elem* from the set. Raises :exc:`KeyError` if *elem* is
      not contained in the set.

   .. method:: discard(elem)

      Remove element *elem* from the set if it is present.

   .. method:: pop()

      Remove and return an arbitrary element from the set. Raises
      :exc:`KeyError` if the set is empty.

   .. method:: clear()

Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003966 Remove all elements from the set.
Georg Brandl116aa622007-08-15 14:28:22 +00003967
3968
Alexandre Vassalottia79e33e2008-05-15 22:51:26 +00003969 Note, the non-operator versions of the :meth:`update`,
3970 :meth:`intersection_update`, :meth:`difference_update`, and
3971 :meth:`symmetric_difference_update` methods will accept any iterable as an
3972 argument.
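
   For instance, in the sketch below (values chosen only for illustration) the
   method accepts a list and a string, while the corresponding operator
   requires a set::

      s = {1, 2}
      s.update([2, 3], 'ab')      # any iterables are accepted
      print(sorted(s, key=str))   # [1, 2, 3, 'a', 'b']

      try:
          s |= [4]                # the operator form rejects a plain list
      except TypeError:
          rejected = True
      else:
          rejected = False
      print(rejected)             # True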

   Note, the *elem* argument to the :meth:`__contains__`, :meth:`remove`, and
   :meth:`discard` methods may be a set. To support searching for an equivalent
   frozenset, the *elem* set is temporarily mutated during the search and then
   restored. During the search, the *elem* set should not be read or mutated
   since it does not have a meaningful value.


.. _typesmapping:

Mapping Types --- :class:`dict`
===============================

.. index::
   object: mapping
   object: dictionary
   triple: operations on; mapping; types
   triple: operations on; dictionary; type
   statement: del
   builtin: len

A :term:`mapping` object maps :term:`hashable` values to arbitrary objects.
Mappings are mutable objects. There is currently only one standard mapping
type, the :dfn:`dictionary`. (For other containers see the built-in
:class:`list`, :class:`set`, and :class:`tuple` classes, and the
:mod:`collections` module.)

A dictionary's keys are *almost* arbitrary values. Values that are not
:term:`hashable`, that is, values containing lists, dictionaries or other
mutable types (that are compared by value rather than by object identity) may
not be used as keys. Numeric types used for keys obey the normal rules for
numeric comparison: if two numbers compare equal (such as ``1`` and ``1.0``)
then they can be used interchangeably to index the same dictionary entry. (Note,
however, that since computers store floating-point numbers as approximations, it
is usually unwise to use them as dictionary keys.)
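
The following sketch (with made-up keys) illustrates both rules: numerically
equal keys share one entry, and an unhashable key is rejected::

   d = {1: 'int key'}
   d[1.0] = 'float key'         # 1 == 1.0, so this replaces the existing value
   print(len(d), d[1])          # 1 float key

   try:
       d[[1, 2]] = 'list key'   # lists are not hashable
   except TypeError:
       unhashable_rejected = True
   else:
       unhashable_rejected = False
   print(unhashable_rejected)   # True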

Dictionaries can be created by placing a comma-separated list of ``key: value``
pairs within braces, for example: ``{'jack': 4098, 'sjoerd': 4127}`` or ``{4098:
'jack', 4127: 'sjoerd'}``, or by the :class:`dict` constructor.

.. class:: dict(**kwarg)
           dict(mapping, **kwarg)
           dict(iterable, **kwarg)

   Return a new dictionary initialized from an optional positional argument
   and a possibly empty set of keyword arguments.

   If no positional argument is given, an empty dictionary is created.
   If a positional argument is given and it is a mapping object, a dictionary
   is created with the same key-value pairs as the mapping object. Otherwise,
   the positional argument must be an :term:`iterable` object. Each item in
   the iterable must itself be an iterable with exactly two objects. The
   first object of each item becomes a key in the new dictionary, and the
   second object the corresponding value. If a key occurs more than once, the
   last value for that key becomes the corresponding value in the new
   dictionary.

   If keyword arguments are given, the keyword arguments and their values are
   added to the dictionary created from the positional argument. If a key
   being added is already present, the value from the keyword argument
   replaces the value from the positional argument.

   To illustrate, the following examples all return a dictionary equal to
   ``{"one": 1, "two": 2, "three": 3}``::

      >>> a = dict(one=1, two=2, three=3)
      >>> b = {'one': 1, 'two': 2, 'three': 3}
      >>> c = dict(zip(['one', 'two', 'three'], [1, 2, 3]))
      >>> d = dict([('two', 2), ('one', 1), ('three', 3)])
      >>> e = dict({'three': 3, 'one': 1, 'two': 2})
      >>> a == b == c == d == e
      True

   Providing keyword arguments as in the first example only works for keys that
   are valid Python identifiers. Otherwise, any valid keys can be used.
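
   As a further sketch (the keys are arbitrary), a positional argument can
   supply keys that are not identifiers, and a keyword argument overrides a
   matching key from the positional argument::

      prices = dict([('fish and chips', 5), ('tea', 1)], tea=2)
      print(prices == {'fish and chips': 5, 'tea': 2})   # True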


   These are the operations that dictionaries support (and therefore, custom
   mapping types should support too):

   .. describe:: len(d)

      Return the number of items in the dictionary *d*.

   .. describe:: d[key]

      Return the item of *d* with key *key*. Raises a :exc:`KeyError` if *key* is
      not in the map.

      .. index:: __missing__()

      If a subclass of dict defines a method :meth:`__missing__` and *key*
      is not present, the ``d[key]`` operation calls that method with the key *key*
      as argument. The ``d[key]`` operation then returns or raises whatever is
      returned or raised by the ``__missing__(key)`` call.
      No other operations or methods invoke :meth:`__missing__`. If
      :meth:`__missing__` is not defined, :exc:`KeyError` is raised.
      :meth:`__missing__` must be a method; it cannot be an instance variable::

          >>> class Counter(dict):
          ...     def __missing__(self, key):
          ...         return 0
          >>> c = Counter()
          >>> c['red']
          0
          >>> c['red'] += 1
          >>> c['red']
          1

      The example above shows part of the implementation of
      :class:`collections.Counter`. A different ``__missing__`` method is used
      by :class:`collections.defaultdict`.
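
      Because only ``d[key]`` consults :meth:`__missing__`, other lookups on
      such a subclass behave as usual. A small sketch (``ZeroDict`` is a
      hypothetical name used only for this illustration)::

         class ZeroDict(dict):
             def __missing__(self, key):
                 return 0

         z = ZeroDict()
         print(z['red'])          # 0, supplied by __missing__
         print(z.get('red'))      # None, get() does not call __missing__
         print('red' in z)        # False, nothing was stored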

   .. describe:: d[key] = value

      Set ``d[key]`` to *value*.

   .. describe:: del d[key]

      Remove ``d[key]`` from *d*. Raises a :exc:`KeyError` if *key* is not in the
      map.

   .. describe:: key in d

      Return ``True`` if *d* has a key *key*, else ``False``.

   .. describe:: key not in d

      Equivalent to ``not key in d``.

   .. describe:: iter(d)

      Return an iterator over the keys of the dictionary. This is a shortcut
      for ``iter(d.keys())``.

   .. method:: clear()

      Remove all items from the dictionary.

   .. method:: copy()

      Return a shallow copy of the dictionary.

   .. classmethod:: fromkeys(seq[, value])

      Create a new dictionary with keys from *seq* and values set to *value*.

      :meth:`fromkeys` is a class method that returns a new dictionary. *value*
      defaults to ``None``.
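
      Every key refers to the same *value* object, which matters when *value*
      is mutable. A sketch with arbitrary keys::

         shared = dict.fromkeys(['a', 'b'], [])
         shared['a'].append(1)
         print(shared['b'])       # [1], both keys share one list

         # A dict comprehension creates a separate list for each key.
         separate = {key: [] for key in ['a', 'b']}
         separate['a'].append(1)
         print(separate['b'])     # []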

   .. method:: get(key[, default])

      Return the value for *key* if *key* is in the dictionary, else *default*.
      If *default* is not given, it defaults to ``None``, so that this method
      never raises a :exc:`KeyError`.

   .. method:: items()

      Return a new view of the dictionary's items (``(key, value)`` pairs).
      See the :ref:`documentation of view objects <dict-views>`.

   .. method:: keys()

      Return a new view of the dictionary's keys. See the :ref:`documentation
      of view objects <dict-views>`.

   .. method:: pop(key[, default])

      If *key* is in the dictionary, remove it and return its value, else return
      *default*. If *default* is not given and *key* is not in the dictionary,
      a :exc:`KeyError` is raised.

   .. method:: popitem()

      Remove and return an arbitrary ``(key, value)`` pair from the dictionary.

      :meth:`popitem` is useful to destructively iterate over a dictionary, as
      often used in set algorithms. If the dictionary is empty, calling
      :meth:`popitem` raises a :exc:`KeyError`.

   .. method:: setdefault(key[, default])

      If *key* is in the dictionary, return its value. If not, insert *key*
      with a value of *default* and return *default*. *default* defaults to
      ``None``.
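
      A common use, sketched here with made-up data, is grouping values under
      a key without testing for the key first::

         words = ['apple', 'avocado', 'banana']
         by_letter = {}
         for word in words:
             # Insert an empty list on first sight of the letter, then append.
             by_letter.setdefault(word[0], []).append(word)

         print(by_letter == {'a': ['apple', 'avocado'], 'b': ['banana']})  # True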

   .. method:: update([other])

      Update the dictionary with the key/value pairs from *other*, overwriting
      existing keys. Return ``None``.

      :meth:`update` accepts either another dictionary object or an iterable of
      key/value pairs (as tuples or other iterables of length two). If keyword
      arguments are specified, the dictionary is then updated with those
      key/value pairs: ``d.update(red=1, blue=2)``.

   .. method:: values()

      Return a new view of the dictionary's values. See the
      :ref:`documentation of view objects <dict-views>`.

   Dictionaries compare equal if and only if they have the same ``(key,
   value)`` pairs. Order comparisons (``<``, ``<=``, ``>=``, ``>``) raise
   :exc:`TypeError`.
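
   For example, in this sketch (arbitrary values) equality ignores the order
   in which the pairs were written, while an ordering comparison fails::

      x = {'one': 1, 'two': 2}
      y = {'two': 2, 'one': 1}
      print(x == y)               # True

      try:
          x < y
      except TypeError:
          ordering_supported = False
      else:
          ordering_supported = True
      print(ordering_supported)   # False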

.. seealso::
   :class:`types.MappingProxyType` can be used to create a read-only view
   of a :class:`dict`.
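
A minimal sketch of such a read-only view (the dictionary contents are
arbitrary)::

   import types

   settings = {'debug': False}
   readonly = types.MappingProxyType(settings)

   print(readonly['debug'])       # False
   try:
       readonly['debug'] = True   # item assignment is not supported
   except TypeError:
       write_blocked = True
   else:
       write_blocked = False
   print(write_blocked)           # True

   settings['debug'] = True       # changes to the dict show through the proxy
   print(readonly['debug'])       # True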


.. _dict-views:

Dictionary view objects
-----------------------

The objects returned by :meth:`dict.keys`, :meth:`dict.values` and
:meth:`dict.items` are *view objects*. They provide a dynamic view on the
dictionary's entries, which means that when the dictionary changes, the view
reflects these changes.

Dictionary views can be iterated over to yield their respective data, and
support membership tests:

.. describe:: len(dictview)

   Return the number of entries in the dictionary.

.. describe:: iter(dictview)

   Return an iterator over the keys, values or items (represented as tuples of
   ``(key, value)``) in the dictionary.

   Keys and values are iterated over in an arbitrary order which is non-random,
   varies across Python implementations, and depends on the dictionary's history
   of insertions and deletions. If keys, values and items views are iterated
   over with no intervening modifications to the dictionary, the order of items
   will directly correspond. This allows the creation of ``(value, key)`` pairs
   using :func:`zip`: ``pairs = zip(d.values(), d.keys())``. Another way to
   create the same pairs, as a list, is ``pairs = [(v, k) for (k, v) in d.items()]``.

   Iterating views while adding or deleting entries in the dictionary may raise
   a :exc:`RuntimeError` or fail to iterate over all entries.

.. describe:: x in dictview

   Return ``True`` if *x* is in the underlying dictionary's keys, values or
   items (in the latter case, *x* should be a ``(key, value)`` tuple).


Keys views are set-like since their entries are unique and hashable. If all
values are hashable, so that ``(key, value)`` pairs are unique and hashable,
then the items view is also set-like. (Values views are not treated as set-like
since the entries are generally not unique.) For set-like views, all of the
operations defined for the abstract base class :class:`collections.abc.Set` are
available (for example, ``==``, ``<``, or ``^``).

An example of dictionary view usage::

   >>> dishes = {'eggs': 2, 'sausage': 1, 'bacon': 1, 'spam': 500}
   >>> keys = dishes.keys()
   >>> values = dishes.values()

   >>> # iteration
   >>> n = 0
   >>> for val in values:
   ...     n += val
   >>> print(n)
   504

   >>> # keys and values are iterated over in the same order
   >>> list(keys)
   ['eggs', 'bacon', 'sausage', 'spam']
   >>> list(values)
   [2, 1, 1, 500]

   >>> # view objects are dynamic and reflect dict changes
   >>> del dishes['eggs']
   >>> del dishes['sausage']
   >>> list(keys)
   ['spam', 'bacon']

   >>> # set operations
   >>> keys & {'eggs', 'bacon', 'salad'}
   {'bacon'}
   >>> keys ^ {'sausage', 'juice'}
   {'juice', 'sausage', 'bacon', 'spam'}


.. _typecontextmanager:

Context Manager Types
=====================

.. index::
   single: context manager
   single: context management protocol
   single: protocol; context management

Python's :keyword:`with` statement supports the concept of a runtime context
defined by a context manager. This is implemented using a pair of methods
that allow user-defined classes to define a runtime context that is entered
before the statement body is executed and exited when the statement ends:


.. method:: contextmanager.__enter__()

   Enter the runtime context and return either this object or another object
   related to the runtime context. The value returned by this method is bound to
   the identifier in the :keyword:`as` clause of :keyword:`with` statements using
   this context manager.

   An example of a context manager that returns itself is a :term:`file object`.
   File objects return themselves from :meth:`__enter__` to allow :func:`open` to be
   used as the context expression in a :keyword:`with` statement.

   An example of a context manager that returns a related object is the one
   returned by :func:`decimal.localcontext`. These managers set the active
   decimal context to a copy of the original decimal context and then return the
   copy. This allows changes to be made to the current decimal context in the body
   of the :keyword:`with` statement without affecting code outside the
   :keyword:`with` statement.


.. method:: contextmanager.__exit__(exc_type, exc_val, exc_tb)

   Exit the runtime context and return a Boolean flag indicating if any exception
   that occurred should be suppressed. If an exception occurred while executing the
   body of the :keyword:`with` statement, the arguments contain the exception type,
   value and traceback information. Otherwise, all three arguments are ``None``.

   Returning a true value from this method will cause the :keyword:`with` statement
   to suppress the exception and continue execution with the statement immediately
   following the :keyword:`with` statement. Otherwise the exception continues
   propagating after this method has finished executing. Exceptions that occur
   during execution of this method will replace any exception that occurred in the
   body of the :keyword:`with` statement.

   The exception passed in should never be reraised explicitly; instead, this
   method should return a false value to indicate that the method completed
   successfully and does not want to suppress the raised exception. This allows
   context management code to easily detect whether or not an :meth:`__exit__`
   method has actually failed.
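
As a sketch of the protocol (``Suppress`` is a hypothetical class written for
this illustration, similar in spirit to :func:`contextlib.suppress`), the
following context manager returns itself from :meth:`__enter__` and
suppresses one exception type by returning a true value from
:meth:`__exit__`::

   class Suppress:
       def __init__(self, exc_class):
           self.exc_class = exc_class
           self.caught = None

       def __enter__(self):
           return self                  # bound by the "as" clause

       def __exit__(self, exc_type, exc_val, exc_tb):
           if exc_type is not None and issubclass(exc_type, self.exc_class):
               self.caught = exc_val
               return True              # suppress the exception
           return False                 # let anything else propagate

   with Suppress(KeyError) as manager:
       {}['missing']                    # raises KeyError inside the body

   print(type(manager.caught).__name__)   # KeyError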

Python defines several context managers to support easy thread synchronisation,
prompt closure of files or other objects, and simpler manipulation of the active
decimal arithmetic context. The specific types are not treated specially beyond
their implementation of the context management protocol. See the
:mod:`contextlib` module for some examples.

Python's :term:`generator`\s and the :func:`contextlib.contextmanager` decorator
provide a convenient way to implement these protocols. If a generator function is
decorated with the :func:`contextlib.contextmanager` decorator, it will return a
context manager implementing the necessary :meth:`__enter__` and
:meth:`__exit__` methods, rather than the iterator produced by an undecorated
generator function.
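
A brief sketch of the generator-based form (``tag`` and ``events`` are names
invented for this example)::

   import contextlib

   events = []

   @contextlib.contextmanager
   def tag(name):
       events.append('enter ' + name)      # runs when the context is entered
       try:
           yield name                      # value bound by the "as" clause
       finally:
           events.append('exit ' + name)   # runs when the context is exited

   with tag('section') as value:
       events.append('body ' + value)

   print(events)   # ['enter section', 'body section', 'exit section']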

Note that there is no specific slot for any of these methods in the type
structure for Python objects in the Python/C API. Extension types wanting to
define these methods must provide them as a normal Python-accessible method.
Compared to the overhead of setting up the runtime context, the overhead of a
single class dictionary lookup is negligible.


.. _typesother:

Other Built-in Types
====================

The interpreter supports several other kinds of objects. Most of these support
only one or two operations.


.. _typesmodules:

Modules
-------

The only special operation on a module is attribute access: ``m.name``, where
*m* is a module and *name* accesses a name defined in *m*'s symbol table.
Module attributes can be assigned to. (Note that the :keyword:`import`
statement is not, strictly speaking, an operation on a module object; ``import
foo`` does not require a module object named *foo* to exist, rather it requires
an (external) *definition* for a module named *foo* somewhere.)

A special attribute of every module is :attr:`~object.__dict__`. This is the
dictionary containing the module's symbol table. Modifying this dictionary will
actually change the module's symbol table, but direct assignment to the
:attr:`__dict__` attribute is not possible (you can write
``m.__dict__['a'] = 1``, which defines ``m.a`` to be ``1``, but you can't write
``m.__dict__ = {}``). Modifying :attr:`__dict__` directly is not recommended.
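
A sketch of these rules using a module object created with
:class:`types.ModuleType` (the module name ``demo`` is arbitrary)::

   import types

   m = types.ModuleType('demo')
   m.__dict__['a'] = 1           # defines m.a
   print(m.a)                    # 1

   try:
       m.__dict__ = {}           # direct assignment is rejected
   except AttributeError:
       assignment_rejected = True
   else:
       assignment_rejected = False
   print(assignment_rejected)    # True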

Modules built into the interpreter are written like this: ``<module 'sys'
(built-in)>``. If loaded from a file, they are written as ``<module 'os' from
'/usr/local/lib/pythonX.Y/os.pyc'>``.


.. _typesobjects:

Classes and Class Instances
---------------------------

See :ref:`objects` and :ref:`class` for these.


.. _typesfunctions:

Functions
---------

Function objects are created by function definitions. The only operation on a
function object is to call it: ``func(argument-list)``.

There are really two flavors of function objects: built-in functions and
user-defined functions. Both support the same operation (to call the function),
but the implementation is different, hence the different object types.

See :ref:`function` for more information.


.. _typesmethods:

Methods
-------

.. index:: object: method

Methods are functions that are called using the attribute notation. There are
two flavors: built-in methods (such as :meth:`append` on lists) and class
instance methods. Built-in methods are described with the types that support
them.

If you access a method (a function defined in a class namespace) through an
instance, you get a special object: a :dfn:`bound method` (also called
:dfn:`instance method`) object. When called, it will add the ``self`` argument
to the argument list. Bound methods have two special read-only attributes:
``m.__self__`` is the object on which the method operates, and ``m.__func__`` is
the function implementing the method. Calling ``m(arg-1, arg-2, ..., arg-n)``
is completely equivalent to calling ``m.__func__(m.__self__, arg-1, arg-2, ...,
arg-n)``.
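
The equivalence can be checked directly; in this sketch ``Greeter`` is a
hypothetical class::

   class Greeter:
       def greet(self, name):
           return 'hello ' + name

   g = Greeter()
   m = g.greet                          # bound method object

   print(m.__self__ is g)               # True
   print(m.__func__ is Greeter.greet)   # True
   print(m('world') == m.__func__(m.__self__, 'world'))   # True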

Like function objects, bound method objects support getting arbitrary
attributes. However, since method attributes are actually stored on the
underlying function object (``meth.__func__``), setting method attributes on
bound methods is disallowed. Attempting to set an attribute on a method
results in an :exc:`AttributeError` being raised. In order to set a method
attribute, you need to explicitly set it on the underlying function object::

   >>> class C:
   ...     def method(self):
   ...         pass
   ...
   >>> c = C()
   >>> c.method.whoami = 'my name is method'  # can't set on the method
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   AttributeError: 'method' object has no attribute 'whoami'
   >>> c.method.__func__.whoami = 'my name is method'
   >>> c.method.whoami
   'my name is method'

See :ref:`types` for more information.


.. _bltin-code-objects:

Code Objects
------------

.. index:: object: code

.. index::
   builtin: compile
   single: __code__ (function object attribute)

Code objects are used by the implementation to represent "pseudo-compiled"
executable Python code such as a function body. They differ from function
objects because they don't contain a reference to their global execution
environment. Code objects are returned by the built-in :func:`compile` function
and can be extracted from function objects through their :attr:`__code__`
attribute. See also the :mod:`code` module.

.. index::
   builtin: exec
   builtin: eval

A code object can be executed or evaluated by passing it (instead of a source
string) to the :func:`exec` or :func:`eval` built-in functions.
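
For example (a sketch; the source strings and the ``<example>`` filename are
arbitrary)::

   expression = compile('6 * 7', '<example>', 'eval')
   answer = eval(expression)
   print(answer)                        # 42

   statements = compile('total = sum(range(4))', '<example>', 'exec')
   namespace = {}
   exec(statements, namespace)          # names are stored in the given dict
   print(namespace['total'])            # 6

   def double(x):
       return 2 * x

   # A function's __code__ attribute holds the same kind of object.
   print(type(double.__code__) is type(expression))   # True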

See :ref:`types` for more information.


.. _bltin-type-objects:

Type Objects
------------

.. index::
   builtin: type
   module: types

Type objects represent the various object types. An object's type is accessed
by the built-in function :func:`type`. There are no special operations on
types. The standard module :mod:`types` defines names for all standard built-in
types.

Types are written like this: ``<class 'int'>``.


.. _bltin-null-object:

The Null Object
---------------

This object is returned by functions that don't explicitly return a value. It
supports no special operations. There is exactly one null object, named
``None`` (a built-in name). ``type(None)()`` produces the same singleton.

It is written as ``None``.


.. _bltin-ellipsis-object:

The Ellipsis Object
-------------------

This object is commonly used by slicing (see :ref:`slicings`). It supports no
special operations. There is exactly one ellipsis object, named
:const:`Ellipsis` (a built-in name). ``type(Ellipsis)()`` produces the
:const:`Ellipsis` singleton.

It is written as ``Ellipsis`` or ``...``.


.. _bltin-notimplemented-object:

The NotImplemented Object
-------------------------

This object is returned from comparisons and binary operations when they are
asked to operate on types they don't support. See :ref:`comparisons` for more
information. There is exactly one ``NotImplemented`` object.
``type(NotImplemented)()`` produces the singleton instance.

It is written as ``NotImplemented``.
4517
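As a sketch of the usual pattern, a comparison method can return
``NotImplemented`` for operand types it does not handle. The class ``Meters``
is a hypothetical value type introduced only for this example::

   class Meters:
       # Hypothetical value type used only for illustration.
       def __init__(self, value):
           self.value = value

       def __eq__(self, other):
           if not isinstance(other, Meters):
               # Signal that this comparison is not supported here.
               return NotImplemented
           return self.value == other.value

   assert Meters(3) == Meters(3)

   # Both operands decline, so == falls back to identity and yields False.
   assert (Meters(3) == 3) is False

   assert type(NotImplemented)() is NotImplemented

Returning ``NotImplemented`` rather than ``False`` gives the other operand a
chance to handle the comparison.
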

.. _bltin-boolean-values:

Boolean Values
--------------

Boolean values are the two constant objects ``False`` and ``True``. They are
used to represent truth values (although other values can also be considered
false or true). In numeric contexts (for example when used as the argument to
an arithmetic operator), they behave like the integers 0 and 1, respectively.
The built-in function :func:`bool` can be used to convert any value to a
Boolean, if the value can be interpreted as a truth value (see section
:ref:`truth` above).

.. index::
   single: False
   single: True
   pair: Boolean; values

They are written as ``False`` and ``True``, respectively.

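A short sketch of the numeric behaviour and of conversion with :func:`bool`;
the variable name ``count`` is chosen only for this example::

   # In arithmetic, False and True behave like 0 and 1.
   assert True + True == 2
   assert False * 10 == 0

   # Summing Booleans counts the true values.
   count = sum([True, False, True])
   assert count == 2

   # bool() converts a value using truth value testing.
   assert bool("") is False
   assert bool(0.0) is False
   assert bool([0]) is True
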

.. _typesinternal:

Internal Objects
----------------

See :ref:`types` for this information. It describes stack frame objects,
traceback objects, and slice objects.


.. _specialattrs:

Special Attributes
==================

The implementation adds a few special read-only attributes to several object
types, where they are relevant. Some of these are not reported by the
:func:`dir` built-in function.


.. attribute:: object.__dict__

   A dictionary or other mapping object used to store an object's (writable)
   attributes.


.. attribute:: instance.__class__

   The class to which a class instance belongs.


.. attribute:: class.__bases__

   The tuple of base classes of a class object.


.. attribute:: class.__name__

   The name of the class or type.


.. attribute:: class.__qualname__

   The :term:`qualified name` of the class or type.

   .. versionadded:: 3.3


.. attribute:: class.__mro__

   This attribute is a tuple of classes that are considered when looking for
   base classes during method resolution.


.. method:: class.mro()

   This method can be overridden by a metaclass to customize the method
   resolution order for its instances. It is called when the class is created,
   and its result is stored in :attr:`~class.__mro__`.

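   A minimal sketch of such an override follows. The metaclass ``PreferLast``
   and the classes ``A``, ``B`` and ``C`` are hypothetical names used only for
   illustration; the metaclass reverses the order in which the bases are
   searched::

      class A:
          def who(self):
              return "A"

      class B:
          def who(self):
              return "B"

      class PreferLast(type):
          # Hypothetical metaclass: search the bases in reverse order.
          def mro(cls):
              default = super().mro()        # for C: [C, A, B, object]
              head, *middle, tail = default
              return [head] + middle[::-1] + [tail]

      class C(A, B, metaclass=PreferLast):
          pass

      # The list returned by mro() is stored as a tuple in __mro__.
      assert C.__mro__ == (C, B, A, object)
      assert C().who() == "B"
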

.. method:: class.__subclasses__

   Each class keeps a list of weak references to its immediate subclasses. This
   method returns a list of the subclasses whose references are still alive.
   Example::

      >>> int.__subclasses__()
      [<class 'bool'>]


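The special attributes described in this section can be inspected directly.
The sketch below assumes hypothetical classes ``Base`` and ``Outer.Inner``
defined at module level::

   class Base:
       pass

   class Outer:
       # Hypothetical classes used only for illustration.
       class Inner(Base):
           pass

   obj = Outer.Inner()
   obj.color = "red"

   # Writable instance attributes are stored in __dict__.
   assert obj.__dict__ == {"color": "red"}
   assert obj.__class__ is Outer.Inner

   assert Outer.Inner.__bases__ == (Base,)
   assert Outer.Inner.__name__ == "Inner"

   # 'Outer.Inner' when the classes are defined at module level.
   qualname = Outer.Inner.__qualname__

   assert Outer.Inner.__mro__ == (Outer.Inner, Base, object)
   assert Base.__subclasses__() == [Outer.Inner]
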

.. rubric:: Footnotes

.. [1] Additional information on these special methods may be found in the Python
   Reference Manual (:ref:`customization`).

.. [2] As a consequence, the list ``[1, 2]`` is considered equal to ``[1.0, 2.0]``, and
   similarly for tuples.

.. [3] They must have since the parser can't tell the type of the operands.

.. [4] Cased characters are those with general category property being one of
   "Lu" (Letter, uppercase), "Ll" (Letter, lowercase), or "Lt" (Letter, titlecase).

.. [5] To format only a tuple you should therefore provide a singleton tuple whose only
   element is the tuple to be formatted.