\section{Built-in Types \label{types}}

The following sections describe the standard types that are built into
the interpreter. Historically, Python's built-in types have differed
from user-defined types because it was not possible to use the built-in
types as the basis for object-oriented inheritance. With the 2.2
release this situation has started to change, although the intended
unification of user-defined and built-in types is as yet far from
complete.

The principal built-in types are numerics, sequences, mappings, files,
classes, instances and exceptions.
\indexii{built-in}{types}

Some operations are supported by several object types; in particular,
practically all objects can be compared, tested for truth value,
and converted to a string (with the \code{`\textrm{\ldots}`} notation,
the equivalent \function{repr()} function, or the slightly different
\function{str()} function). The latter
function is implicitly used when an object is written by the
\keyword{print}\stindex{print} statement.
(Information on the \ulink{\keyword{print} statement}{../ref/print.html}
and other language statements can be found in the
\citetitle[../ref/ref.html]{Python Reference Manual} and the
\citetitle[../tut/tut.html]{Python Tutorial}.)


\subsection{Truth Value Testing\label{truth}}

Any object can be tested for truth value, for use in an \keyword{if} or
\keyword{while} condition or as an operand of the Boolean operations below.
The following values are considered false:
\stindex{if}
\stindex{while}
\indexii{truth}{value}
\indexii{Boolean}{operations}
\index{false}

\begin{itemize}

\item \code{None}
  \withsubitem{(Built-in object)}{\ttindex{None}}

\item \code{False}
  \withsubitem{(Built-in object)}{\ttindex{False}}

\item zero of any numeric type, for example, \code{0}, \code{0L},
  \code{0.0}, \code{0j}.

\item any empty sequence, for example, \code{''}, \code{()}, \code{[]}.

\item any empty mapping, for example, \code{\{\}}.

\item instances of user-defined classes, if the class defines a
  \method{__nonzero__()} or \method{__len__()} method, when that
  method returns the integer zero or \class{bool} value
  \code{False}.\footnote{Additional
information on these special methods may be found in the
\citetitle[../ref/ref.html]{Python Reference Manual}.}

\end{itemize}

All other values are considered true --- so objects of many types are
always true.
\index{true}
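
The rules in the list above can be checked with a short sketch. The
classes \class{Basket} and \class{Flag} are hypothetical names invented
for this illustration. The \code{__bool__} alias is an assumption added
so that the sketch also runs on interpreters newer than the ones this
manual describes, where that name replaces \method{__nonzero__()}; the
\code{0L} literal is left out for the same reason.

```python
class Basket(object):
    """Hypothetical container whose truth value follows its length."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        # An empty basket has length 0, so it tests false.
        return len(self.items)


class Flag(object):
    """Hypothetical class that decides its own truth value."""

    def __init__(self, on):
        self.on = on

    def __nonzero__(self):
        # Consulted by the Python 2 series described in this manual.
        return bool(self.on)

    # Assumption: later Python versions look up this name instead.
    __bool__ = __nonzero__


falsy = [None, False, 0, 0.0, 0j, '', (), [], {}, Basket([]), Flag(False)]
truthy = [1, -1, 'a', (0,), [0], {0: 0}, Basket([0]), Flag(True), object()]

# bool() applies the same test that if and while use.
all_false = [bool(value) for value in falsy]
all_true = [bool(value) for value in truthy]
```

Note that \code{(0,)} and \code{[0]} test true: only the emptiness of a
sequence matters, not the truth value of its items.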

Operations and built-in functions that have a Boolean result always
return \code{0} or \code{False} for false and \code{1} or \code{True}
for true, unless otherwise stated. (Important exception: the Boolean
operations \samp{or}\opindex{or} and \samp{and}\opindex{and} always
return one of their operands.)
\index{False}
\index{True}

\subsection{Boolean Operations ---
            \keyword{and}, \keyword{or}, \keyword{not}
            \label{boolean}}

These are the Boolean operations, ordered by ascending priority:
\indexii{Boolean}{operations}

\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  \lineiii{\var{x} or \var{y}}
          {if \var{x} is false, then \var{y}, else \var{x}}{(1)}
  \lineiii{\var{x} and \var{y}}
          {if \var{x} is false, then \var{x}, else \var{y}}{(1)}
  \hline
  \lineiii{not \var{x}}
          {if \var{x} is false, then \code{True}, else \code{False}}{(2)}
\end{tableiii}
\opindex{and}
\opindex{or}
\opindex{not}

\noindent
Notes:

\begin{description}

\item[(1)]
These only evaluate their second argument if needed for their outcome.

\item[(2)]
\samp{not} has a lower priority than non-Boolean operators, so
\code{not \var{a} == \var{b}} is interpreted as \code{not (\var{a} ==
\var{b})}, and \code{\var{a} == not \var{b}} is a syntax error.

\end{description}
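
As a small illustration of the table and notes above, the following
sketch records which operands are actually evaluated. The helper
\function{probe()} and the list \code{calls} are hypothetical names
used only for this example.

```python
calls = []


def probe(name, value):
    """Hypothetical helper: record that an operand was evaluated."""
    calls.append(name)
    return value


# `or` stops at the first true operand and returns it unchanged.
first = probe('a', 0) or probe('b', 'fallback') or probe('c', 'unused')

# `and` stops at the first false operand and returns it unchanged.
second = probe('d', []) and probe('e', 'unused')

# `not` always yields a Boolean and binds more loosely than `==`.
third = not 1 == 2        # parsed as not (1 == 2)
```

Operands \code{'c'} and \code{'e'} never appear in \code{calls},
because the outcome was already decided before they were reached.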


\subsection{Comparisons \label{comparisons}}

Comparison operations are supported by all objects. They all have the
same priority (which is higher than that of the Boolean operations).
Comparisons can be chained arbitrarily; for example, \code{\var{x} <
\var{y} <= \var{z}} is equivalent to \code{\var{x} < \var{y} and
\var{y} <= \var{z}}, except that \var{y} is evaluated only once (but
in both cases \var{z} is not evaluated at all when \code{\var{x} <
\var{y}} is found to be false).
\indexii{chaining}{comparisons}
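
The difference between the chained and the expanded form can be made
visible by tracing evaluations. This is only a sketch; the helper
\function{operand()} and the list \code{evaluated} are hypothetical
names introduced for the illustration.

```python
evaluated = []


def operand(name, value):
    """Hypothetical helper: note each evaluation of an operand."""
    evaluated.append(name)
    return value


# Chained form: the middle operand is evaluated a single time.
chained = operand('x', 1) < operand('y', 5) <= operand('z', 5)
chained_trace = list(evaluated)

# Expanded form: the middle operand is written, and evaluated, twice.
del evaluated[:]
expanded = (operand('x', 1) < operand('y', 5)
            and operand('y', 5) <= operand('z', 5))
expanded_trace = list(evaluated)

# When the first comparison is false, z is never evaluated.
del evaluated[:]
short = operand('x', 9) < operand('y', 5) <= operand('z', 5)
short_trace = list(evaluated)
```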

This table summarizes the comparison operations:

\begin{tableiii}{c|l|c}{code}{Operation}{Meaning}{Notes}
  \lineiii{<}{strictly less than}{}
  \lineiii{<=}{less than or equal}{}
  \lineiii{>}{strictly greater than}{}
  \lineiii{>=}{greater than or equal}{}
  \lineiii{==}{equal}{}
  \lineiii{!=}{not equal}{(1)}
  \lineiii{<>}{not equal}{(1)}
  \lineiii{is}{object identity}{}
  \lineiii{is not}{negated object identity}{}
\end{tableiii}
\indexii{operator}{comparison}
\opindex{==} % XXX *All* others have funny characters < ! >
\opindex{is}
\opindex{is not}

\noindent
Notes:

\begin{description}

\item[(1)]
\code{<>} and \code{!=} are alternate spellings for the same operator.
\code{!=} is the preferred spelling; \code{<>} is obsolescent.

\end{description}

Objects of different types, except different numeric types and
different string types, never
compare equal; such objects are ordered consistently but arbitrarily
(so that sorting a heterogeneous list yields a consistent result).
Furthermore, some types (for example, file objects) support only a
degenerate notion of comparison where any two objects of that type are
unequal. Again, such objects are ordered arbitrarily but
consistently. The \code{<}, \code{<=}, \code{>} and \code{>=}
operators will raise a \exception{TypeError} exception when any operand
is a complex number.
\indexii{object}{numeric}
\indexii{objects}{comparing}

Instances of a class normally compare as non-equal unless the class
\withsubitem{(instance method)}{\ttindex{__cmp__()}}
defines the \method{__cmp__()} method. Refer to the
\citetitle[../ref/customization.html]{Python Reference Manual} for
information on the use of this method to effect object comparisons.

\strong{Implementation note:} Objects of different types except
numbers are ordered by their type names; objects of the same types
that don't support proper comparison are ordered by their address.

Two more operations with the same syntactic priority,
\samp{in}\opindex{in} and \samp{not in}\opindex{not in}, are supported
only by sequence types (below).


\subsection{Numeric Types ---
            \class{int}, \class{float}, \class{long}, \class{complex}
            \label{typesnumeric}}

There are four distinct numeric types: \dfn{plain integers},
\dfn{long integers},
\dfn{floating point numbers}, and \dfn{complex numbers}.
In addition, Booleans are a subtype of plain integers.
Plain integers (also just called \dfn{integers})
are implemented using \ctype{long} in C, which gives them at least 32
bits of precision. Long integers have unlimited precision. Floating
point numbers are implemented using \ctype{double} in C. All bets on
their precision are off unless you happen to know the machine you are
working with.
\obindex{numeric}
\obindex{Boolean}
\obindex{integer}
\obindex{long integer}
\obindex{floating point}
\obindex{complex number}
\indexii{C}{language}

Complex numbers have a real and imaginary part, which are each
implemented using \ctype{double} in C. To extract these parts from
a complex number \var{z}, use \code{\var{z}.real} and \code{\var{z}.imag}.
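
A brief sketch of the attributes just mentioned; the variable names are
chosen for this example only. Both parts come back as floating point
numbers even when the complex number was built from integers.

```python
z = complex(3, -4)          # same value as the literal 3-4j

real_part = z.real          # real part, as a float
imag_part = z.imag          # imaginary part, as a float
magnitude = abs(z)          # distance from the origin
mirrored = z.conjugate()    # same real part, imaginary part negated
```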

Numbers are created by numeric literals or as the result of built-in
functions and operators. Unadorned integer literals (including hex
and octal numbers) yield plain integers unless the value they denote
is too large to be represented as a plain integer, in which case
they yield a long integer. Integer literals with an
\character{L} or \character{l} suffix yield long integers
(\character{L} is preferred because \samp{1l} looks too much like
eleven!). Numeric literals containing a decimal point or an exponent
sign yield floating point numbers. Appending \character{j} or
\character{J} to a numeric literal yields a complex number with a
zero real part. A complex numeric literal is the sum of a real and
an imaginary part.
\indexii{numeric}{literals}
\indexii{integer}{literals}
\indexiii{long}{integer}{literals}
\indexii{floating point}{literals}
\indexii{complex number}{literals}
\indexii{hexadecimal}{literals}
\indexii{octal}{literals}

Python fully supports mixed arithmetic: when a binary arithmetic
operator has operands of different numeric types, the operand with the
``narrower'' type is widened to that of the other, where plain
integer is narrower than long integer is narrower than floating point is
narrower than complex.
Comparisons between numbers of mixed type use the same rule.\footnote{
  As a consequence, the list \code{[1, 2]} is considered equal
  to \code{[1.0, 2.0]}, and similarly for tuples.
} The constructors \function{int()}, \function{long()}, \function{float()},
and \function{complex()} can be used
to produce numbers of a specific type.
\index{arithmetic}
\bifuncindex{int}
\bifuncindex{long}
\bifuncindex{float}
\bifuncindex{complex}
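
The widening rule can be observed directly from the result types. The
following is a minimal sketch with illustrative variable names;
\function{long()} and the \character{L} suffix are omitted so that the
lines also run on later interpreters, which is an assumption about the
reader's environment rather than part of the rule.

```python
int_plus_float = 1 + 2.5          # the int operand is widened to float
float_plus_complex = 2.5 + 1j     # the float operand is widened to complex

# Mixed comparisons follow the same widening rule.
lists_equal = [1, 2] == [1.0, 2.0]
tuples_equal = (1, 2) == (1.0, 2.0)

# The constructors request a specific numeric type explicitly.
as_int = int('42')
as_float = float(3)
as_complex = complex(3)           # imaginary part defaults to zero
```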

All numeric types (except complex) support the following operations,
sorted by ascending priority (operations in the same box have the same
priority; all numeric operations have a higher priority than
comparison operations):

\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  \lineiii{\var{x} + \var{y}}{sum of \var{x} and \var{y}}{}
  \lineiii{\var{x} - \var{y}}{difference of \var{x} and \var{y}}{}
  \hline
  \lineiii{\var{x} * \var{y}}{product of \var{x} and \var{y}}{}
  \lineiii{\var{x} / \var{y}}{quotient of \var{x} and \var{y}}{(1)}
  \lineiii{\var{x} \%{} \var{y}}{remainder of \code{\var{x} / \var{y}}}{(4)}
  \hline
  \lineiii{-\var{x}}{\var{x} negated}{}
  \lineiii{+\var{x}}{\var{x} unchanged}{}
  \hline
  \lineiii{abs(\var{x})}{absolute value or magnitude of \var{x}}{}
  \lineiii{int(\var{x})}{\var{x} converted to integer}{(2)}
  \lineiii{long(\var{x})}{\var{x} converted to long integer}{(2)}
  \lineiii{float(\var{x})}{\var{x} converted to floating point}{}
  \lineiii{complex(\var{re},\var{im})}{a complex number with real part \var{re}, imaginary part \var{im}. \var{im} defaults to zero.}{}
  \lineiii{\var{c}.conjugate()}{conjugate of the complex number \var{c}}{}
  \lineiii{divmod(\var{x}, \var{y})}{the pair \code{(\var{x} // \var{y}, \var{x} \%{} \var{y})}}{(3)(4)}
  \lineiii{pow(\var{x}, \var{y})}{\var{x} to the power \var{y}}{}
  \lineiii{\var{x} ** \var{y}}{\var{x} to the power \var{y}}{}
\end{tableiii}
\indexiii{operations on}{numeric}{types}
\withsubitem{(complex number method)}{\ttindex{conjugate()}}

\noindent
Notes:
\begin{description}

\item[(1)]
For (plain or long) integer division, the result is an integer.
The result is always rounded towards minus infinity: 1/2 is 0,
(-1)/2 is -1, 1/(-2) is -1, and (-1)/(-2) is 0. Note that the result
is a long integer if either operand is a long integer, regardless of
the numeric value.
\indexii{integer}{division}
\indexiii{long}{integer}{division}

\item[(2)]
Conversion from floating point to (long or plain) integer may round or
truncate as in C; see functions \function{floor()} and
\function{ceil()} in the \refmodule{math}\refbimodindex{math} module
for well-defined conversions.
\withsubitem{(in module math)}{\ttindex{floor()}\ttindex{ceil()}}
\indexii{numeric}{conversions}
\indexii{C}{language}

\item[(3)]
See section \ref{built-in-funcs}, ``Built-in Functions,'' for a full
description.

\item[(4)]
Complex floor division operator, modulo operator, and \function{divmod()}.

\deprecated{2.3}{Instead convert to float using \function{abs()}
if appropriate.}

\end{description}
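
The rounding behaviour from notes (1) and (2) can be sketched as
follows. The example spells integer division as \code{//}, the floor
division operator that the \function{divmod()} row of the table also
uses; this choice is an assumption made so that the lines give the same
results on interpreters where \code{/} between integers no longer
floors. Variable names are illustrative.

```python
import math

# Floor division rounds towards minus infinity for every sign combination.
quotients = [1 // 2, (-1) // 2, 1 // (-2), (-1) // (-2)]

# divmod(x, y) returns the pair (x // y, x % y).
pair = divmod(-7, 2)
rebuilt = pair[0] * 2 + pair[1]     # quotient * divisor + remainder

# math.floor() and math.ceil() make the rounding direction explicit
# when converting from floating point.
floored = int(math.floor(-2.5))
ceiled = int(math.ceil(-2.5))
```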
% XXXJH exceptions: overflow (when? what operations?) zerodivision

\subsubsection{Bit-string Operations on Integer Types \label{bitstring-ops}}
\nodename{Bit-string Operations}

Plain and long integer types support additional operations that make
sense only for bit-strings. Negative numbers are treated as their 2's
complement value (for long integers, this assumes a sufficiently large
number of bits that no overflow occurs during the operation).

The priorities of the binary bit-wise operations are all lower than
the numeric operations and higher than the comparisons; the unary
operation \samp{\~} has the same priority as the other unary numeric
operations (\samp{+} and \samp{-}).

This table lists the bit-string operations sorted in ascending
priority (operations in the same box have the same priority):

\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  \lineiii{\var{x} | \var{y}}{bitwise \dfn{or} of \var{x} and \var{y}}{}
  \lineiii{\var{x} \^{} \var{y}}{bitwise \dfn{exclusive or} of \var{x} and \var{y}}{}
  \lineiii{\var{x} \&{} \var{y}}{bitwise \dfn{and} of \var{x} and \var{y}}{}
  % The empty groups below prevent conversion to guillemets.
  \lineiii{\var{x} <{}< \var{n}}{\var{x} shifted left by \var{n} bits}{(1), (2)}
  \lineiii{\var{x} >{}> \var{n}}{\var{x} shifted right by \var{n} bits}{(1), (3)}
  \hline
  \lineiii{\~\var{x}}{the bits of \var{x} inverted}{}
\end{tableiii}
\indexiii{operations on}{integer}{types}
\indexii{bit-string}{operations}
\indexii{shifting}{operations}
\indexii{masking}{operations}

\noindent
Notes:
\begin{description}
\item[(1)] Negative shift counts are illegal and cause a
\exception{ValueError} to be raised.
\item[(2)] A left shift by \var{n} bits is equivalent to
multiplication by \code{pow(2, \var{n})} without overflow check.
\item[(3)] A right shift by \var{n} bits is equivalent to
division by \code{pow(2, \var{n})} without overflow check.
\end{description}
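
A short sketch of the operations in the table, using small values so
the bit patterns are easy to follow. The variable names are
illustrative only.

```python
x, y = 12, 10                 # bit patterns 1100 and 1010

bit_or = x | y                # 1110
bit_xor = x ^ y               # 0110
bit_and = x & y               # 1000

# Shifts behave like multiplication or floor division by pow(2, n).
left = x << 3
right = x >> 2
neg_right = -13 >> 2          # rounds towards minus infinity

# Inversion follows the two's complement identity ~x == -x - 1.
inverted = ~x

# A negative shift count raises ValueError, as note (1) states.
try:
    x << -1
    negative_shift_rejected = False
except ValueError:
    negative_shift_rejected = True
```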


\subsection{Iterator Types \label{typeiter}}

\versionadded{2.2}
\index{iterator protocol}
\index{protocol!iterator}
\index{sequence!iteration}
\index{container!iteration over}

Python supports a concept of iteration over containers. This is
implemented using two distinct methods; these are used to allow
user-defined classes to support iteration. Sequences, described below
in more detail, always support the iteration methods.

One method needs to be defined for container objects to provide
iteration support:

\begin{methoddesc}[container]{__iter__}{}
  Return an iterator object. The object is required to support the
  iterator protocol described below. If a container supports
  different types of iteration, additional methods can be provided to
  specifically request iterators for those iteration types. (An
  example of an object supporting multiple forms of iteration would be
  a tree structure which supports both breadth-first and depth-first
  traversal.) This method corresponds to the \member{tp_iter} slot of
  the type structure for Python objects in the Python/C API.
\end{methoddesc}

The iterator objects themselves are required to support the following
two methods, which together form the \dfn{iterator protocol}:

\begin{methoddesc}[iterator]{__iter__}{}
  Return the iterator object itself. This is required to allow both
  containers and iterators to be used with the \keyword{for} and
  \keyword{in} statements. This method corresponds to the
  \member{tp_iter} slot of the type structure for Python objects in
  the Python/C API.
\end{methoddesc}

\begin{methoddesc}[iterator]{next}{}
  Return the next item from the container. If there are no further
  items, raise the \exception{StopIteration} exception. This method
  corresponds to the \member{tp_iternext} slot of the type structure
  for Python objects in the Python/C API.
\end{methoddesc}

Python defines several iterator objects to support iteration over
general and specific sequence types, dictionaries, and other more
specialized forms. The specific types are not important beyond their
implementation of the iterator protocol.

The intention of the protocol is that once an iterator's
\method{next()} method raises \exception{StopIteration}, it will
continue to do so on subsequent calls. Implementations that
do not obey this property are deemed broken. (This constraint
was added in Python 2.3; in Python 2.2, various iterators are
broken according to this rule.)

Python's generators provide a convenient way to implement the
iterator protocol. If a container object's \method{__iter__()}
method is implemented as a generator, it will automatically
return an iterator object (technically, a generator object)
supplying the \method{__iter__()} and \method{next()} methods.
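
The protocol can be sketched with a hand-written iterator next to a
generator-based one. All three class names are hypothetical and exist
only for this illustration. The \code{__next__} alias is an assumption
added for interpreters newer than those covered here, which look up
that name instead of \method{next()}.

```python
class Countdown(object):
    """Hypothetical container: iterating yields n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # Container side: hand out a fresh iterator for each loop.
        return CountdownIterator(self.n)


class CountdownIterator(object):
    """Hand-written iterator implementing the two protocol methods."""

    def __init__(self, n):
        self.remaining = n

    def __iter__(self):
        # Iterator side: return the iterator object itself.
        return self

    def next(self):
        if self.remaining <= 0:
            # Raised again on every later call, as the protocol intends.
            raise StopIteration
        value = self.remaining
        self.remaining -= 1
        return value

    # Assumption: the spelling that later Python versions look up.
    __next__ = next


class GeneratorCountdown(object):
    """Same behaviour, with __iter__ written as a generator."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        remaining = self.n
        while remaining > 0:
            yield remaining
            remaining -= 1


by_hand = list(Countdown(3))
by_generator = list(GeneratorCountdown(3))
```

The generator version needs no separate iterator class, which is the
convenience the paragraph above refers to.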
410
Fred Drake93656e72001-05-02 20:18:03 +0000411
Raymond Hettinger24d75212005-06-14 08:45:43 +0000412\subsection{Sequence Types ---
413 \class{str}, \class{unicode}, \class{list},
414 \class{tuple}, \class{buffer}, \class{xrange}
415 \label{typesseq}}
Fred Drake64e3b431998-07-24 13:56:11 +0000416
Fred Drake107b9672000-08-14 15:37:59 +0000417There are six sequence types: strings, Unicode strings, lists,
Fred Drake512bb722000-08-18 03:12:38 +0000418tuples, buffers, and xrange objects.
Fred Drake64e3b431998-07-24 13:56:11 +0000419
Steve Holden1e4519f2002-06-14 09:16:40 +0000420String literals are written in single or double quotes:
Fred Drake38e5d272000-04-03 20:13:55 +0000421\code{'xyzzy'}, \code{"frobozz"}. See chapter 2 of the
Fred Drake4de96c22000-08-12 03:36:23 +0000422\citetitle[../ref/strings.html]{Python Reference Manual} for more about
423string literals. Unicode strings are much like strings, but are
Raymond Hettinger68804312005-01-01 00:28:46 +0000424specified in the syntax using a preceding \character{u} character:
Fred Drake4de96c22000-08-12 03:36:23 +0000425\code{u'abc'}, \code{u"def"}. Lists are constructed with square brackets,
Fred Drake37f15741999-11-10 16:21:37 +0000426separating items with commas: \code{[a, b, c]}. Tuples are
427constructed by the comma operator (not within square brackets), with
428or without enclosing parentheses, but an empty tuple must have the
Raymond Hettingerb67449d2003-09-08 18:52:18 +0000429enclosing parentheses, such as \code{a, b, c} or \code{()}. A single
430item tuple must have a trailing comma, such as \code{(d,)}.
Fred Drake0b4e25d2000-10-04 04:21:19 +0000431\obindex{sequence}
432\obindex{string}
433\obindex{Unicode}
Fred Drake0b4e25d2000-10-04 04:21:19 +0000434\obindex{tuple}
435\obindex{list}
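
The construction rules for tuples and lists can be checked with a few
assignments. The names \code{a} to \code{d} mirror the placeholders in
the paragraph above; the values bound to them are arbitrary.

```python
a, b, c, d = 1, 2, 3, 4

with_parens = (a, b, c)
without_parens = a, b, c      # the commas build the tuple
empty = ()                    # parentheses are mandatory here
single = (d,)                 # trailing comma marks a one-item tuple
not_a_tuple = (d)             # just a parenthesized expression

a_list = [a, b, c]            # square brackets build a list instead
```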

Buffer objects are not directly supported by Python syntax, but can be
created by calling the builtin function
\function{buffer()}.\bifuncindex{buffer} They don't support
concatenation or repetition.
\obindex{buffer}

Xrange objects are similar to buffers in that there is no specific
syntax to create them, but they are created using the \function{xrange()}
function.\bifuncindex{xrange} They don't support slicing,
concatenation or repetition, and using \code{in}, \code{not in},
\function{min()} or \function{max()} on them is inefficient.
\obindex{xrange}

Most sequence types support the following operations. The \samp{in} and
\samp{not in} operations have the same priorities as the comparison
operations. The \samp{+} and \samp{*} operations have the same
priority as the corresponding numeric operations.\footnote{They must
have since the parser can't tell the type of the operands.}

This table lists the sequence operations sorted in ascending priority
(operations in the same box have the same priority). In the table,
\var{s} and \var{t} are sequences of the same type; \var{n}, \var{i}
and \var{j} are integers:

\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  \lineiii{\var{x} in \var{s}}{\code{True} if an item of \var{s} is equal to \var{x}, else \code{False}}{(1)}
  \lineiii{\var{x} not in \var{s}}{\code{False} if an item of \var{s} is
equal to \var{x}, else \code{True}}{(1)}
  \hline
  \lineiii{\var{s} + \var{t}}{the concatenation of \var{s} and \var{t}}{(6)}
  \lineiii{\var{s} * \var{n}\textrm{,} \var{n} * \var{s}}{\var{n} shallow copies of \var{s} concatenated}{(2)}
  \hline
  \lineiii{\var{s}[\var{i}]}{\var{i}'th item of \var{s}, origin 0}{(3)}
  \lineiii{\var{s}[\var{i}:\var{j}]}{slice of \var{s} from \var{i} to \var{j}}{(3), (4)}
  \lineiii{\var{s}[\var{i}:\var{j}:\var{k}]}{slice of \var{s} from \var{i} to \var{j} with step \var{k}}{(3), (5)}
  \hline
  \lineiii{len(\var{s})}{length of \var{s}}{}
  \lineiii{min(\var{s})}{smallest item of \var{s}}{}
  \lineiii{max(\var{s})}{largest item of \var{s}}{}
\end{tableiii}
\indexiii{operations on}{sequence}{types}
\bifuncindex{len}
\bifuncindex{min}
\bifuncindex{max}
\indexii{concatenation}{operation}
\indexii{repetition}{operation}
\indexii{subscript}{operation}
\indexii{slice}{operation}
\indexii{extended slice}{operation}
\opindex{in}
\opindex{not in}
488
489\noindent
490Notes:
491
492\begin{description}
Barry Warsaw817918c2002-08-06 16:58:21 +0000493\item[(1)] When \var{s} is a string or Unicode string object the
494\code{in} and \code{not in} operations act like a substring test. In
495Python versions before 2.3, \var{x} had to be a string of length 1.
496In Python 2.3 and beyond, \var{x} may be a string of any length.
497
498\item[(2)] Values of \var{n} less than \code{0} are treated as
Fred Drake38e5d272000-04-03 20:13:55 +0000499 \code{0} (which yields an empty sequence of the same type as
Fred Draked800cff2001-08-28 14:56:05 +0000500 \var{s}). Note also that the copies are shallow; nested structures
501 are not copied. This often haunts new Python programmers; consider:
502
503\begin{verbatim}
504>>> lists = [[]] * 3
505>>> lists
506[[], [], []]
507>>> lists[0].append(3)
508>>> lists
509[[3], [3], [3]]
510\end{verbatim}
511
Armin Rigo80adba62004-11-04 11:29:09 +0000512 What has happened is that \code{[[]]} is a one-element list containing
513 an empty list, so all three elements of \code{[[]] * 3} are (pointers to)
514 this single empty list. Modifying any of the elements of \code{lists}
515 modifies this single list. You can create a list of different lists this
516 way:
Fred Draked800cff2001-08-28 14:56:05 +0000517
518\begin{verbatim}
519>>> lists = [[] for i in range(3)]
520>>> lists[0].append(3)
521>>> lists[1].append(5)
522>>> lists[2].append(7)
523>>> lists
524[[3], [5], [7]]
525\end{verbatim}
Fred Drake38e5d272000-04-03 20:13:55 +0000526
Barry Warsaw817918c2002-08-06 16:58:21 +0000527\item[(3)] If \var{i} or \var{j} is negative, the index is relative to
Fred Drake907e76b2001-07-06 20:30:11 +0000528 the end of the string: \code{len(\var{s}) + \var{i}} or
Fred Drake64e3b431998-07-24 13:56:11 +0000529 \code{len(\var{s}) + \var{j}} is substituted. But note that \code{-0} is
530 still \code{0}.
Tim Peters8f01b682002-03-12 03:04:44 +0000531
Barry Warsaw817918c2002-08-06 16:58:21 +0000532\item[(4)] The slice of \var{s} from \var{i} to \var{j} is defined as
Fred Drake64e3b431998-07-24 13:56:11 +0000533 the sequence of items with index \var{k} such that \code{\var{i} <=
534 \var{k} < \var{j}}. If \var{i} or \var{j} is greater than
535 \code{len(\var{s})}, use \code{len(\var{s})}. If \var{i} is omitted,
536 use \code{0}. If \var{j} is omitted, use \code{len(\var{s})}. If
537 \var{i} is greater than or equal to \var{j}, the slice is empty.
Michael W. Hudson9c206152003-03-05 14:42:09 +0000538
539\item[(5)] The slice of \var{s} from \var{i} to \var{j} with step
540 \var{k} is defined as the sequence of items with index
Armin Rigo80adba62004-11-04 11:29:09 +0000541 \code{\var{x} = \var{i} + \var{n}*\var{k}} such that
542 $0 \leq n < \frac{j-i}{k}$. In other words, the indices
543 are \code{i}, \code{i+k}, \code{i+2*k}, \code{i+3*k} and so on, stopping when
544 \var{j} is reached (but never including \var{j}). If \var{i} or \var{j}
Michael W. Hudson9c206152003-03-05 14:42:09 +0000545 is greater than \code{len(\var{s})}, use \code{len(\var{s})}. If
Raymond Hettinger81702002003-08-30 23:31:31 +0000546 \var{i} or \var{j} are omitted then they become ``end'' values
547 (which end depends on the sign of \var{k}). Note, \var{k} cannot
548 be zero.
Michael W. Hudson9c206152003-03-05 14:42:09 +0000549
Raymond Hettinger52a21b82004-08-06 18:43:09 +0000550\item[(6)] If \var{s} and \var{t} are both strings, some Python
Andrew M. Kuchling34ed2b02004-08-06 18:55:09 +0000551implementations such as CPython can usually perform an in-place optimization
Raymond Hettinger52a21b82004-08-06 18:43:09 +0000552for assignments of the form \code{\var{s}=\var{s}+\var{t}} or
553\code{\var{s}+=\var{t}}. When applicable, this optimization makes
554quadratic run-time much less likely. This optimization is both version
555and implementation dependent. For performance sensitive code, it is
Raymond Hettinger68804312005-01-01 00:28:46 +0000556preferable to use the \method{str.join()} method which assures consistent
Raymond Hettinger52a21b82004-08-06 18:43:09 +0000557linear concatenation performance across versions and implementations.
Andrew M. Kuchling34ed2b02004-08-06 18:55:09 +0000558\versionchanged[Formerly, string concatenation never occurred in-place]{2.4}
Raymond Hettinger52a21b82004-08-06 18:43:09 +0000559
Fred Drake64e3b431998-07-24 13:56:11 +0000560\end{description}
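
As an illustration of notes (5) and (6), the following sketch shows which
items an extended slice selects and how a string can be built with
\method{str.join()} instead of repeated concatenation. The variable
names are chosen for this example only.

```python
s = 'abcdefghij'

# Start at i=1 and add k=3 each time, stopping before j=9:
# the selected indices are 1, 4 and 7.
forward = s[1:9:3]

# With a negative step the omitted bounds become the "end" values,
# so this walks the whole string backwards.
backward = s[::-1]

# A bound larger than len(s) is replaced by len(s).
clipped = s[7:100:2]

# Collect the pieces in a list and join them once at the end.
pieces = []
for n in range(5):
    pieces.append(str(n))
joined = ''.join(pieces)
```

Joining once keeps the cost linear in the total length, whether or not
the interpreter applies the in-place concatenation optimization.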
561
Fred Drake9474d861999-02-12 22:05:33 +0000562
Fred Drake4de96c22000-08-12 03:36:23 +0000563\subsubsection{String Methods \label{string-methods}}
564
565These are the string methods which both 8-bit strings and Unicode
566objects support:
567
568\begin{methoddesc}[string]{capitalize}{}
569Return a copy of the string with only its first character capitalized.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000570
571For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000572\end{methoddesc}
573
Raymond Hettinger4f8f9762003-11-26 08:21:35 +0000574\begin{methoddesc}[string]{center}{width\optional{, fillchar}}
Return the string centered in a string of length \var{width}. Padding is done
Raymond Hettinger4f8f9762003-11-26 08:21:35 +0000576using the specified \var{fillchar} (default is a space).
Neal Norwitz72452652003-11-26 14:54:56 +0000577\versionchanged[Support for the \var{fillchar} argument]{2.4}
Fred Drake4de96c22000-08-12 03:36:23 +0000578\end{methoddesc}
579
580\begin{methoddesc}[string]{count}{sub\optional{, start\optional{, end}}}
Return the number of occurrences of substring \var{sub} in the slice
\code{\var{s}[\var{start}:\var{end}]}. Optional arguments \var{start} and
\var{end} are interpreted as in slice notation.
584\end{methoddesc}
585
Fred Drake6048ce92001-12-10 16:43:08 +0000586\begin{methoddesc}[string]{decode}{\optional{encoding\optional{, errors}}}
587Decodes the string using the codec registered for \var{encoding}.
588\var{encoding} defaults to the default string encoding. \var{errors}
589may be given to set a different error handling scheme. The default is
590\code{'strict'}, meaning that encoding errors raise
Walter Dörwaldac1075a2004-07-01 19:58:47 +0000591\exception{UnicodeError}. Other possible values are \code{'ignore'},
592\code{'replace'} and any other name registered via
593\function{codecs.register_error}.
Fred Drake6048ce92001-12-10 16:43:08 +0000594\versionadded{2.2}
Walter Dörwaldac1075a2004-07-01 19:58:47 +0000595\versionchanged[Support for other error handling schemes added]{2.3}
Fred Drake6048ce92001-12-10 16:43:08 +0000596\end{methoddesc}
597
\begin{methoddesc}[string]{encode}{\optional{encoding\optional{, errors}}}
599Return an encoded version of the string. Default encoding is the current
600default string encoding. \var{errors} may be given to set a different
601error handling scheme. The default for \var{errors} is
602\code{'strict'}, meaning that encoding errors raise a
Walter Dörwaldac1075a2004-07-01 19:58:47 +0000603\exception{UnicodeError}. Other possible values are \code{'ignore'},
604\code{'replace'}, \code{'xmlcharrefreplace'}, \code{'backslashreplace'}
605and any other name registered via \function{codecs.register_error}.
606For a list of possible encodings, see section~\ref{standard-encodings}.
Fred Drake1dba66c2000-10-25 21:03:55 +0000607\versionadded{2.0}
Walter Dörwaldac1075a2004-07-01 19:58:47 +0000608\versionchanged[Support for \code{'xmlcharrefreplace'} and
609\code{'backslashreplace'} and other error handling schemes added]{2.3}
Fred Drake4de96c22000-08-12 03:36:23 +0000610\end{methoddesc}
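
The error handling schemes can be compared on a short string containing
one character that the target encoding cannot represent. This is a
sketch that assumes a current Python 3 interpreter, where
\method{encode()} returns a bytes object; the variable names are
illustrative.

```python
text = u'caf\xe9'  # the last character is not representable in ASCII

# 'strict' is the default scheme and raises a UnicodeError subclass.
try:
    text.encode('ascii')
    strict_failed = False
except UnicodeError:
    strict_failed = True

ignored = text.encode('ascii', 'ignore')             # character dropped
replaced = text.encode('ascii', 'replace')           # replaced by '?'
as_xml = text.encode('ascii', 'xmlcharrefreplace')   # '&#233;'
escaped = text.encode('ascii', 'backslashreplace')   # backslash escape

# Encoding to UTF-8 and decoding again gives back the original text.
round_trip = text.encode('utf-8').decode('utf-8')
```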
611
612\begin{methoddesc}[string]{endswith}{suffix\optional{, start\optional{, end}}}
Michael W. Hudson9c206152003-03-05 14:42:09 +0000613Return \code{True} if the string ends with the specified \var{suffix},
614otherwise return \code{False}. With optional \var{start}, test beginning at
Fred Drake4de96c22000-08-12 03:36:23 +0000615that position. With optional \var{end}, stop comparing at that position.
616\end{methoddesc}
617
618\begin{methoddesc}[string]{expandtabs}{\optional{tabsize}}
619Return a copy of the string where all tab characters are expanded
620using spaces. If \var{tabsize} is not given, a tab size of \code{8}
621characters is assumed.
622\end{methoddesc}
623
624\begin{methoddesc}[string]{find}{sub\optional{, start\optional{, end}}}
625Return the lowest index in the string where substring \var{sub} is
626found, such that \var{sub} is contained in the range [\var{start},
627\var{end}). Optional arguments \var{start} and \var{end} are
628interpreted as in slice notation. Return \code{-1} if \var{sub} is
629not found.
630\end{methoddesc}
631
632\begin{methoddesc}[string]{index}{sub\optional{, start\optional{, end}}}
633Like \method{find()}, but raise \exception{ValueError} when the
634substring is not found.
635\end{methoddesc}
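
The difference between \method{find()} and \method{index()} only shows
when the substring is missing. A short sketch, with an example string
chosen for illustration:

```python
s = 'spam and eggs and spam'

first = s.find('spam')          # lowest index of a match
later = s.find('spam', 1)       # search starts at position 1
missing = s.find('ham')         # no match is reported as -1

# 'and' occupies positions 5..7, so it is not contained in s[0:7].
cut_off = s.find('and', 0, 7)

# index() reports a missing substring with an exception instead.
try:
    s.index('ham')
    raised = False
except ValueError:
    raised = True
```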
636
637\begin{methoddesc}[string]{isalnum}{}
638Return true if all characters in the string are alphanumeric and there
639is at least one character, false otherwise.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000640
641For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000642\end{methoddesc}
643
644\begin{methoddesc}[string]{isalpha}{}
645Return true if all characters in the string are alphabetic and there
646is at least one character, false otherwise.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000647
648For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000649\end{methoddesc}
650
651\begin{methoddesc}[string]{isdigit}{}
Martin v. Löwis6828e182003-10-18 09:55:08 +0000652Return true if all characters in the string are digits and there
653is at least one character, false otherwise.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000654
655For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000656\end{methoddesc}
657
658\begin{methoddesc}[string]{islower}{}
659Return true if all cased characters in the string are lowercase and
660there is at least one cased character, false otherwise.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000661
662For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000663\end{methoddesc}
664
665\begin{methoddesc}[string]{isspace}{}
666Return true if there are only whitespace characters in the string and
Martin v. Löwis6828e182003-10-18 09:55:08 +0000667there is at least one character, false otherwise.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000668
669For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000670\end{methoddesc}
671
672\begin{methoddesc}[string]{istitle}{}
Return true if the string is a titlecased string and there is at least one
character; for example, uppercase characters may only follow uncased
characters and lowercase characters only cased ones. Return false
otherwise.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000677
678For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000679\end{methoddesc}
680
681\begin{methoddesc}[string]{isupper}{}
682Return true if all cased characters in the string are uppercase and
683there is at least one cased character, false otherwise.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000684
685For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000686\end{methoddesc}
687
688\begin{methoddesc}[string]{join}{seq}
689Return a string which is the concatenation of the strings in the
690sequence \var{seq}. The separator between elements is the string
691providing this method.
692\end{methoddesc}
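
Because the separator is the string on which the method is called, the
call reads ``separator first''. A minimal sketch with made-up data:

```python
words = ['usr', 'local', 'lib']

path = '/'.join(words)          # separator placed between the elements
glued = ''.join(words)          # an empty separator simply concatenates
single = ', '.join(['only'])    # one element: no separator is inserted
```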
693
Raymond Hettinger4f8f9762003-11-26 08:21:35 +0000694\begin{methoddesc}[string]{ljust}{width\optional{, fillchar}}
Fred Drake4de96c22000-08-12 03:36:23 +0000695Return the string left justified in a string of length \var{width}.
Raymond Hettinger4f8f9762003-11-26 08:21:35 +0000696Padding is done using the specified \var{fillchar} (default is a
697space). The original string is returned if
Fred Drake4de96c22000-08-12 03:36:23 +0000698\var{width} is less than \code{len(\var{s})}.
Neal Norwitz72452652003-11-26 14:54:56 +0000699\versionchanged[Support for the \var{fillchar} argument]{2.4}
Fred Drake4de96c22000-08-12 03:36:23 +0000700\end{methoddesc}
701
702\begin{methoddesc}[string]{lower}{}
703Return a copy of the string converted to lowercase.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000704
705For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000706\end{methoddesc}
707
Fred Drake8b1c47b2002-04-13 02:43:39 +0000708\begin{methoddesc}[string]{lstrip}{\optional{chars}}
Raymond Hettinger7bebbe72005-05-31 10:26:28 +0000709Return a copy of the string with leading characters removed. The
710\var{chars} argument is a string specifying the set of characters
711to be removed. If omitted or \code{None}, the \var{chars} argument
712defaults to removing whitespace. The \var{chars} argument is not
713a prefix; rather, all combinations of its values are stripped:
714\begin{verbatim}
715 >>> ' spacious '.lstrip()
716 'spacious '
717 >>> 'www.example.com'.lstrip('cmowz.')
718 'example.com'
719\end{verbatim}
Fred Drake91718012002-11-16 00:41:55 +0000720\versionchanged[Support for the \var{chars} argument]{2.2.2}
Fred Drake4de96c22000-08-12 03:36:23 +0000721\end{methoddesc}
722
Fred Draked22bb652003-10-22 02:56:40 +0000723\begin{methoddesc}[string]{replace}{old, new\optional{, count}}
Fred Drake4de96c22000-08-12 03:36:23 +0000724Return a copy of the string with all occurrences of substring
725\var{old} replaced by \var{new}. If the optional argument
Fred Draked22bb652003-10-22 02:56:40 +0000726\var{count} is given, only the first \var{count} occurrences are
Fred Drake4de96c22000-08-12 03:36:23 +0000727replaced.
728\end{methoddesc}
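
A short sketch of the effect of the optional \var{count} argument (the
strings are illustrative):

```python
s = 'spam spam spam'

everything = s.replace('spam', 'eggs')      # all occurrences
first_two = s.replace('spam', 'eggs', 2)    # only the first two
unchanged = s.replace('ham', 'eggs')        # nothing to replace
```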
729
730\begin{methoddesc}[string]{rfind}{sub \optional{,start \optional{,end}}}
Return the highest index in the string where substring \var{sub} is
found, such that \var{sub} is contained within
\code{\var{s}[\var{start}:\var{end}]}. Optional
arguments \var{start} and \var{end} are interpreted as in slice
notation. Return \code{-1} on failure.
735\end{methoddesc}
736
737\begin{methoddesc}[string]{rindex}{sub\optional{, start\optional{, end}}}
738Like \method{rfind()} but raises \exception{ValueError} when the
739substring \var{sub} is not found.
740\end{methoddesc}
741
Raymond Hettinger4f8f9762003-11-26 08:21:35 +0000742\begin{methoddesc}[string]{rjust}{width\optional{, fillchar}}
Fred Drake4de96c22000-08-12 03:36:23 +0000743Return the string right justified in a string of length \var{width}.
Raymond Hettinger4f8f9762003-11-26 08:21:35 +0000744Padding is done using the specified \var{fillchar} (default is a space).
745The original string is returned if
Fred Drake4de96c22000-08-12 03:36:23 +0000746\var{width} is less than \code{len(\var{s})}.
Neal Norwitz72452652003-11-26 14:54:56 +0000747\versionchanged[Support for the \var{fillchar} argument]{2.4}
Fred Drake4de96c22000-08-12 03:36:23 +0000748\end{methoddesc}
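
The three padding methods \method{ljust()}, \method{rjust()} and
\method{center()} differ only in where the padding goes. A small
sketch, assuming an interpreter that supports the \var{fillchar}
argument:

```python
word = 'py'

left = word.ljust(5, '.')       # padding added on the right
right = word.rjust(5)           # default fill character is a space
middle = word.center(6, '*')    # padding split between both sides

# A width smaller than the string leaves the string unchanged.
too_narrow = 'python'.rjust(3)
```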
749
Hye-Shik Changc6f066f2003-12-17 02:49:03 +0000750\begin{methoddesc}[string]{rsplit}{\optional{sep \optional{,maxsplit}}}
751Return a list of the words in the string, using \var{sep} as the
752delimiter string. If \var{maxsplit} is given, at most \var{maxsplit}
Fred Drake401d1e32003-12-30 22:21:18 +0000753splits are done, the \emph{rightmost} ones. If \var{sep} is not specified
Raymond Hettinger770184b2005-01-25 10:21:19 +0000754or \code{None}, any whitespace string is a separator. Except for splitting
755from the right, \method{rsplit()} behaves like \method{split()} which
756is described in detail below.
Hye-Shik Chang3ae811b2003-12-15 18:49:53 +0000757\versionadded{2.4}
758\end{methoddesc}
759
Fred Drake8b1c47b2002-04-13 02:43:39 +0000760\begin{methoddesc}[string]{rstrip}{\optional{chars}}
Raymond Hettinger7bebbe72005-05-31 10:26:28 +0000761Return a copy of the string with trailing characters removed. The
762\var{chars} argument is a string specifying the set of characters
763to be removed. If omitted or \code{None}, the \var{chars} argument
764defaults to removing whitespace. The \var{chars} argument is not
765a suffix; rather, all combinations of its values are stripped:
766\begin{verbatim}
767 >>> ' spacious '.rstrip()
768 ' spacious'
769 >>> 'mississippi'.rstrip('ipz')
770 'mississ'
771\end{verbatim}
Fred Drake91718012002-11-16 00:41:55 +0000772\versionchanged[Support for the \var{chars} argument]{2.2.2}
Fred Drake4de96c22000-08-12 03:36:23 +0000773\end{methoddesc}
774
775\begin{methoddesc}[string]{split}{\optional{sep \optional{,maxsplit}}}
776Return a list of the words in the string, using \var{sep} as the
777delimiter string. If \var{maxsplit} is given, at most \var{maxsplit}
splits are done (thus, the list will have at most \code{\var{maxsplit}+1}
elements). If \var{maxsplit} is not specified, then there
Raymond Hettinger18c69602004-09-06 00:12:04 +0000780is no limit on the number of splits (all possible splits are made).
781Consecutive delimiters are not grouped together and are
782deemed to delimit empty strings (for example, \samp{'1,,2'.split(',')}
Raymond Hettingerbb30af42004-09-06 00:42:14 +0000783returns \samp{['1', '', '2']}). The \var{sep} argument may consist of
Raymond Hettinger18c69602004-09-06 00:12:04 +0000784multiple characters (for example, \samp{'1, 2, 3'.split(', ')} returns
Raymond Hettingerbb30af42004-09-06 00:42:14 +0000785\samp{['1', '2', '3']}). Splitting an empty string with a specified
Raymond Hettinger87bd3fe2005-04-19 04:29:44 +0000786separator returns \samp{['']}.
Raymond Hettinger18c69602004-09-06 00:12:04 +0000787
788If \var{sep} is not specified or is \code{None}, a different splitting
Raymond Hettinger770184b2005-01-25 10:21:19 +0000789algorithm is applied. First, whitespace characters (spaces, tabs,
790newlines, returns, and formfeeds) are stripped from both ends. Then,
791words are separated by arbitrary length strings of whitespace
792characters. Consecutive whitespace delimiters are treated as a single
793delimiter (\samp{'1 2 3'.split()} returns \samp{['1', '2', '3']}).
794Splitting an empty string or a string consisting of just whitespace
Raymond Hettinger87bd3fe2005-04-19 04:29:44 +0000795returns an empty list.
Fred Drake4de96c22000-08-12 03:36:23 +0000796\end{methoddesc}
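
The two splitting algorithms, and the difference between
\method{split()} and \method{rsplit()} when \var{maxsplit} is given,
can be seen side by side in the following sketch (the input strings are
illustrative):

```python
# Explicit separator: consecutive delimiters delimit empty strings.
with_empty = '1,,2'.split(',')

# maxsplit counts from the left for split() and from the right for rsplit().
from_left = 'a,b,c,d'.split(',', 2)
from_right = 'a,b,c,d'.rsplit(',', 2)

# No separator: runs of whitespace act as a single delimiter and
# leading and trailing whitespace is discarded.
words = '  1   2 \t 3\n'.split()

# Edge cases described above.
empty_with_sep = ''.split(',')
only_whitespace = '   '.split()
```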
797
798\begin{methoddesc}[string]{splitlines}{\optional{keepends}}
799Return a list of the lines in the string, breaking at line
800boundaries. Line breaks are not included in the resulting list unless
801\var{keepends} is given and true.
802\end{methoddesc}
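
A brief sketch of the \var{keepends} argument, using a string that
mixes a line feed with a carriage return plus line feed:

```python
text = 'one\ntwo\r\nthree'

without_ends = text.splitlines()      # line breaks are dropped
with_ends = text.splitlines(True)     # line breaks are kept
```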
803
Fred Drake8b1c47b2002-04-13 02:43:39 +0000804\begin{methoddesc}[string]{startswith}{prefix\optional{,
805 start\optional{, end}}}
Return \code{True} if the string starts with \var{prefix}, otherwise
return \code{False}. With optional \var{start}, test the string beginning at
that position. With optional \var{end}, stop comparing the string at that
position.
810\end{methoddesc}
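
The optional \var{start} and \var{end} arguments of
\method{startswith()} and \method{endswith()} restrict the test to a
slice of the string. A sketch with an illustrative file name:

```python
name = 'libstdtypes.tex'

has_prefix = name.startswith('lib')
# Test begins at position 3, so the 'lib' prefix is skipped.
shifted = name.startswith('std', 3)

has_suffix = name.endswith('.tex')
# Comparison stops at position 11, so the '.tex' suffix is ignored.
trimmed = name.endswith('types', 0, 11)
```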
811
Fred Drake8b1c47b2002-04-13 02:43:39 +0000812\begin{methoddesc}[string]{strip}{\optional{chars}}
Raymond Hettinger7bebbe72005-05-31 10:26:28 +0000813Return a copy of the string with the leading and trailing characters
814removed. The \var{chars} argument is a string specifying the set of
815characters to be removed. If omitted or \code{None}, the \var{chars}
816argument defaults to removing whitespace. The \var{chars} argument is not
817a prefix or suffix; rather, all combinations of its values are stripped:
818\begin{verbatim}
819 >>> ' spacious '.strip()
820 'spacious'
821 >>> 'www.example.com'.strip('cmowz.')
822 'example'
823\end{verbatim}
Fred Drake91718012002-11-16 00:41:55 +0000824\versionchanged[Support for the \var{chars} argument]{2.2.2}
Fred Drake4de96c22000-08-12 03:36:23 +0000825\end{methoddesc}
826
827\begin{methoddesc}[string]{swapcase}{}
828Return a copy of the string with uppercase characters converted to
829lowercase and vice versa.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000830
831For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000832\end{methoddesc}
833
834\begin{methoddesc}[string]{title}{}
Fred Drake907e76b2001-07-06 20:30:11 +0000835Return a titlecased version of the string: words start with uppercase
Fred Drake4de96c22000-08-12 03:36:23 +0000836characters, all remaining cased characters are lowercase.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000837
838For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000839\end{methoddesc}
840
841\begin{methoddesc}[string]{translate}{table\optional{, deletechars}}
842Return a copy of the string where all characters occurring in the
843optional argument \var{deletechars} are removed, and the remaining
844characters have been mapped through the given translation table, which
845must be a string of length 256.
Raymond Hettinger46f681c2003-07-16 05:11:27 +0000846
847For Unicode objects, the \method{translate()} method does not
848accept the optional \var{deletechars} argument. Instead, it
returns a copy of \var{s} in which all characters have been mapped
850through the given translation table which must be a mapping of
851Unicode ordinals to Unicode ordinals, Unicode strings or \code{None}.
852Unmapped characters are left untouched. Characters mapped to \code{None}
853are deleted. Note, a more flexible approach is to create a custom
854character mapping codec using the \refmodule{codecs} module (see
855\module{encodings.cp1251} for an example).
Fred Drake4de96c22000-08-12 03:36:23 +0000856\end{methoddesc}
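
The mapping form described for Unicode objects can be sketched as
follows. The example assumes a current Python 3 interpreter, where every
string is a Unicode object; the table contents are made up for
illustration.

```python
table = {
    ord(u'a'): ord(u'A'),   # ordinal mapped to another ordinal
    ord(u'e'): u'EE',       # ordinal mapped to a string
    ord(u'x'): None,        # ordinal mapped to None: character is deleted
}

# Characters without an entry in the table are left untouched.
result = u'example text'.translate(table)
```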
857
858\begin{methoddesc}[string]{upper}{}
859Return a copy of the string converted to uppercase.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000860
861For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000862\end{methoddesc}
863
Walter Dörwald068325e2002-04-15 13:36:47 +0000864\begin{methoddesc}[string]{zfill}{width}
865Return the numeric string left filled with zeros in a string
866of length \var{width}. The original string is returned if
867\var{width} is less than \code{len(\var{s})}.
Fred Drakee55bec22002-11-16 00:44:00 +0000868\versionadded{2.2.2}
Walter Dörwald068325e2002-04-15 13:36:47 +0000869\end{methoddesc}
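
A short sketch of \method{zfill()}, including a string with a leading
sign, which stays in front of the inserted zeros:

```python
padded = '42'.zfill(5)
signed = '-42'.zfill(5)         # zeros go after the sign
too_long = '12345'.zfill(3)     # returned unchanged
```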
870
Fred Drake4de96c22000-08-12 03:36:23 +0000871
872\subsubsection{String Formatting Operations \label{typesseq-strings}}
Fred Drake64e3b431998-07-24 13:56:11 +0000873
Fred Drakeb38784e2001-12-03 22:15:56 +0000874\index{formatting, string (\%{})}
Fred Drakeab2dc1d2001-12-26 20:06:40 +0000875\index{interpolation, string (\%{})}
Fred Drake66d32b12000-09-14 17:57:42 +0000876\index{string!formatting}
Fred Drakeab2dc1d2001-12-26 20:06:40 +0000877\index{string!interpolation}
Fred Drake66d32b12000-09-14 17:57:42 +0000878\index{printf-style formatting}
879\index{sprintf-style formatting}
Fred Drakeb38784e2001-12-03 22:15:56 +0000880\index{\protect\%{} formatting}
Fred Drakeab2dc1d2001-12-26 20:06:40 +0000881\index{\protect\%{} interpolation}
Fred Drake66d32b12000-09-14 17:57:42 +0000882
Fred Drake8c071d42001-01-26 20:48:35 +0000883String and Unicode objects have one unique built-in operation: the
Fred Drakeab2dc1d2001-12-26 20:06:40 +0000884\code{\%} operator (modulo). This is also known as the string
885\emph{formatting} or \emph{interpolation} operator. Given
886\code{\var{format} \% \var{values}} (where \var{format} is a string or
887Unicode object), \code{\%} conversion specifications in \var{format}
888are replaced with zero or more elements of \var{values}. The effect
is similar to using \cfunction{sprintf()} in the C language. If
890\var{format} is a Unicode object, or if any of the objects being
891converted using the \code{\%s} conversion are Unicode objects, the
Steve Holden1e4519f2002-06-14 09:16:40 +0000892result will also be a Unicode object.
Fred Drake64e3b431998-07-24 13:56:11 +0000893
Fred Drake8c071d42001-01-26 20:48:35 +0000894If \var{format} requires a single argument, \var{values} may be a
Fred Drake401d1e32003-12-30 22:21:18 +0000895single non-tuple object.\footnote{To format only a tuple you
Steve Holden1e4519f2002-06-14 09:16:40 +0000896should therefore provide a singleton tuple whose only element
897is the tuple to be formatted.} Otherwise, \var{values} must be a tuple with
Fred Drake8c071d42001-01-26 20:48:35 +0000898exactly the number of items specified by the format string, or a
899single mapping object (for example, a dictionary).
Fred Drake64e3b431998-07-24 13:56:11 +0000900
Fred Drake8c071d42001-01-26 20:48:35 +0000901A conversion specifier contains two or more characters and has the
902following components, which must occur in this order:
903
904\begin{enumerate}
905 \item The \character{\%} character, which marks the start of the
906 specifier.
Steve Holden1e4519f2002-06-14 09:16:40 +0000907 \item Mapping key (optional), consisting of a parenthesised sequence
908 of characters (for example, \code{(somename)}).
Fred Drake8c071d42001-01-26 20:48:35 +0000909 \item Conversion flags (optional), which affect the result of some
910 conversion types.
911 \item Minimum field width (optional). If specified as an
912 \character{*} (asterisk), the actual width is read from the
913 next element of the tuple in \var{values}, and the object to
914 convert comes after the minimum field width and optional
915 precision.
  \item Precision (optional), given as a \character{.} (dot) followed
        by the precision. If specified as \character{*} (an
        asterisk), the actual precision is read from the next element of
        the tuple in \var{values}, and the value to convert comes after
        the precision.
921 \item Length modifier (optional).
922 \item Conversion type.
923\end{enumerate}
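
The components listed above can be combined as in the following sketch.
The values are arbitrary and the trailing \code{|} only makes the
padding visible.

```python
# Flag '-', minimum width 8, precision 3, conversion type 'f'.
fixed = '%-8.3f|' % 3.14159

# Width and precision given as '*': both are read from the tuple,
# followed by the value to convert.
starred = '%*.*f' % (8, 2, 3.14159)

# Mapping key, flag '0', minimum width 5, conversion type 'd'.
keyed = '%(n)05d' % {'n': 42}
```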
Fred Drake64e3b431998-07-24 13:56:11 +0000924
Steve Holden1e4519f2002-06-14 09:16:40 +0000925When the right argument is a dictionary (or other mapping type), then
926the formats in the string \emph{must} include a parenthesised mapping key into
Fred Drake8c071d42001-01-26 20:48:35 +0000927that dictionary inserted immediately after the \character{\%}
Steve Holden1e4519f2002-06-14 09:16:40 +0000928character. The mapping key selects the value to be formatted from the
Fred Drake8c071d42001-01-26 20:48:35 +0000929mapping. For example:
Fred Drake64e3b431998-07-24 13:56:11 +0000930
931\begin{verbatim}
Steve Holden1e4519f2002-06-14 09:16:40 +0000932>>> print '%(language)s has %(#)03d quote types.' % \
933 {'language': "Python", "#": 2}
Fred Drake64e3b431998-07-24 13:56:11 +0000934Python has 002 quote types.
935\end{verbatim}
936
937In this case no \code{*} specifiers may occur in a format (since they
938require a sequential parameter list).
939
Fred Drake8c071d42001-01-26 20:48:35 +0000940The conversion flag characters are:
941
942\begin{tableii}{c|l}{character}{Flag}{Meaning}
943 \lineii{\#}{The value conversion will use the ``alternate form''
944 (where defined below).}
Neal Norwitzf927f142003-02-17 18:57:06 +0000945 \lineii{0}{The conversion will be zero padded for numeric values.}
Fred Drake8c071d42001-01-26 20:48:35 +0000946 \lineii{-}{The converted value is left adjusted (overrides
Fred Drakef5968262002-10-25 16:55:51 +0000947 the \character{0} conversion if both are given).}
Fred Drake8c071d42001-01-26 20:48:35 +0000948 \lineii{{~}}{(a space) A blank should be left before a positive number
949 (or empty string) produced by a signed conversion.}
  \lineii{+}{A sign character (\character{+} or \character{-}) will
             precede the conversion (overrides a ``space'' flag).}
952\end{tableii}
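
The effect of each flag, and of the two override rules in the table, can
be checked with a few small conversions. This is a sketch with arbitrary
values; the trailing \code{|} only makes the padding visible.

```python
alternate = '%#x' % 255         # alternate form adds the '0x' prefix
zero_pad = '%05d' % 42          # padded with zeros to width 5
left_adj = '%-5d|' % 42         # left adjusted, padded with spaces
blank = '% d' % 42              # blank left before a positive number
signed = '%+d' % 42             # sign always shown

# '-' overrides '0', and '+' overrides the space flag.
minus_wins = '%-05d|' % 42
plus_wins = '%+ d' % 42
```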
953
A length modifier (\code{h}, \code{l}, or \code{L}) may be
present, but is ignored as it is not necessary for Python.
956
957The conversion types are:
958
Fred Drakef5968262002-10-25 16:55:51 +0000959\begin{tableiii}{c|l|c}{character}{Conversion}{Meaning}{Notes}
960 \lineiii{d}{Signed integer decimal.}{}
961 \lineiii{i}{Signed integer decimal.}{}
962 \lineiii{o}{Unsigned octal.}{(1)}
963 \lineiii{u}{Unsigned decimal.}{}
Raymond Hettinger68804312005-01-01 00:28:46 +0000964 \lineiii{x}{Unsigned hexadecimal (lowercase).}{(2)}
965 \lineiii{X}{Unsigned hexadecimal (uppercase).}{(2)}
Fred Drakef5968262002-10-25 16:55:51 +0000966 \lineiii{e}{Floating point exponential format (lowercase).}{}
967 \lineiii{E}{Floating point exponential format (uppercase).}{}
968 \lineiii{f}{Floating point decimal format.}{}
969 \lineiii{F}{Floating point decimal format.}{}
  \lineiii{g}{Same as \character{e} if exponent is less than -4 or
              not less than precision, \character{f} otherwise.}{}
  \lineiii{G}{Same as \character{E} if exponent is less than -4 or
              not less than precision, \character{F} otherwise.}{}
974 \lineiii{c}{Single character (accepts integer or single character
975 string).}{}
  \lineiii{r}{String (converts any Python object using
              \function{repr()}).}{(3)}
  \lineiii{s}{String (converts any Python object using
              \function{str()}).}{(4)}
Fred Drakef5968262002-10-25 16:55:51 +0000980 \lineiii{\%}{No argument is converted, results in a \character{\%}
981 character in the result.}{}
982\end{tableiii}
983
984\noindent
985Notes:
986\begin{description}
987 \item[(1)]
988 The alternate form causes a leading zero (\character{0}) to be
989 inserted between left-hand padding and the formatting of the
990 number if the leading character of the result is not already a
991 zero.
992 \item[(2)]
993 The alternate form causes a leading \code{'0x'} or \code{'0X'}
994 (depending on whether the \character{x} or \character{X} format
995 was used) to be inserted between left-hand padding and the
996 formatting of the number if the leading character of the result is
997 not already a zero.
998 \item[(3)]
999 The \code{\%r} conversion was added in Python 2.0.
Raymond Hettinger2bd15682003-01-13 04:29:19 +00001000 \item[(4)]
1001 If the object or format provided is a \class{unicode} string,
1002 the resulting string will also be \class{unicode}.
Fred Drakef5968262002-10-25 16:55:51 +00001003\end{description}
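
A few of the conversion types are sketched below with arbitrary values.
The \code{\%g} lines show the switch to exponential format for very
small and very large exponents, and the last line shows the singleton
tuple needed to format a tuple as a single value.

```python
lower_hex = '%x' % 255
upper_hex = '%X' % 255
exponent = '%e' % 1234.5
decimal = '%f' % 1234.5

# %g uses decimal format for moderate exponents ...
g_small = '%g' % 0.0001
g_plain = '%g' % 123456.0
# ... and exponential format otherwise.
g_tiny = '%g' % 0.00001
g_large = '%g' % 1234567.0

char = '%c' % 65                # an integer is accepted for %c
as_repr = '%r' % 'hi'           # repr() keeps the quotes
as_str = '%s' % 'hi'            # str() does not
percent = '%d%%' % 50           # '%%' produces a literal percent sign

# A tuple to be formatted as one value is wrapped in a singleton tuple.
whole_tuple = '%s' % ((1, 2),)
```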
Fred Drake8c071d42001-01-26 20:48:35 +00001004
1005% XXX Examples?
1006
Fred Drake8c071d42001-01-26 20:48:35 +00001007Since Python strings have an explicit length, \code{\%s} conversions
1008do not assume that \code{'\e0'} is the end of the string.
1009
1010For safety reasons, floating point precisions are clipped to 50;
1011\code{\%f} conversions for numbers whose absolute value is over 1e25
1012are replaced by \code{\%g} conversions.\footnote{
1013 These numbers are fairly arbitrary. They are intended to
1014 avoid printing endless strings of meaningless digits without hampering
1015 correct use and without having to know the exact precision of floating
1016 point values on a particular machine.
1017} All other errors raise exceptions.
1018
Fred Drake14f5c5f2001-12-03 18:33:13 +00001019Additional string operations are defined in standard modules
Fred Drake401d1e32003-12-30 22:21:18 +00001020\refmodule{string}\refstmodindex{string}\ and
Tim Peters8f01b682002-03-12 03:04:44 +00001021\refmodule{re}.\refstmodindex{re}
Fred Drake64e3b431998-07-24 13:56:11 +00001022
Fred Drake107b9672000-08-14 15:37:59 +00001023
Fred Drake512bb722000-08-18 03:12:38 +00001024\subsubsection{XRange Type \label{typesseq-xrange}}
Fred Drake107b9672000-08-14 15:37:59 +00001025
Fred Drake401d1e32003-12-30 22:21:18 +00001026The \class{xrange}\obindex{xrange} type is an immutable sequence which
1027is commonly used for looping. The advantage of the \class{xrange}
1028type is that an \class{xrange} object will always take the same amount
1029of memory, no matter the size of the range it represents. There are
1030no consistent performance advantages.
Fred Drake107b9672000-08-14 15:37:59 +00001031
Raymond Hettingerd2bef822002-12-11 07:14:03 +00001032XRange objects have very little behavior: they only support indexing,
1033iteration, and the \function{len()} function.
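
The three supported operations can be sketched as follows. The fallback
to \code{range} is an assumption made so that the example also runs on
interpreters that no longer provide \class{xrange}; it is not part of
the type described here.

```python
try:
    xrange                      # available on interpreters described here
except NameError:
    xrange = range              # assumption: stand-in on newer interpreters

r = xrange(0, 10**6, 2)         # no list of 500000 items is built

length = len(r)                 # the len() function
tenth = r[10]                   # indexing

total = 0
for x in xrange(5):             # iteration
    total += x
```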
Fred Drake107b9672000-08-14 15:37:59 +00001034
1035
Fred Drake9474d861999-02-12 22:05:33 +00001036\subsubsection{Mutable Sequence Types \label{typesseq-mutable}}
Fred Drake64e3b431998-07-24 13:56:11 +00001037
1038List objects support additional operations that allow in-place
1039modification of the object.
Steve Holden1e4519f2002-06-14 09:16:40 +00001040Other mutable sequence types (when added to the language) should
1041also support these operations.
1042Strings and tuples are immutable sequence types: such objects cannot
Fred Drake64e3b431998-07-24 13:56:11 +00001043be modified once created.
1044The following operations are defined on mutable sequence types (where
1045\var{x} is an arbitrary object):
1046\indexiii{mutable}{sequence}{types}
Fred Drake0b4e25d2000-10-04 04:21:19 +00001047\obindex{list}
Fred Drake64e3b431998-07-24 13:56:11 +00001048
1049\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
1050 \lineiii{\var{s}[\var{i}] = \var{x}}
1051 {item \var{i} of \var{s} is replaced by \var{x}}{}
1052 \lineiii{\var{s}[\var{i}:\var{j}] = \var{t}}
1053 {slice of \var{s} from \var{i} to \var{j} is replaced by \var{t}}{}
1054 \lineiii{del \var{s}[\var{i}:\var{j}]}
1055 {same as \code{\var{s}[\var{i}:\var{j}] = []}}{}
Michael W. Hudson9c206152003-03-05 14:42:09 +00001056 \lineiii{\var{s}[\var{i}:\var{j}:\var{k}] = \var{t}}
1057 {the elements of \code{\var{s}[\var{i}:\var{j}:\var{k}]} are replaced by those of \var{t}}{(1)}
1058 \lineiii{del \var{s}[\var{i}:\var{j}:\var{k}]}
1059 {removes the elements of \code{\var{s}[\var{i}:\var{j}:\var{k}]} from the list}{}
Fred Drake64e3b431998-07-24 13:56:11 +00001060 \lineiii{\var{s}.append(\var{x})}
Michael W. Hudson9c206152003-03-05 14:42:09 +00001061 {same as \code{\var{s}[len(\var{s}):len(\var{s})] = [\var{x}]}}{(2)}
Barry Warsawafd974c1998-10-09 16:39:58 +00001062 \lineiii{\var{s}.extend(\var{x})}
Michael W. Hudson9c206152003-03-05 14:42:09 +00001063 {same as \code{\var{s}[len(\var{s}):len(\var{s})] = \var{x}}}{(3)}
Fred Drake64e3b431998-07-24 13:56:11 +00001064 \lineiii{\var{s}.count(\var{x})}
1065 {return number of \var{i}'s for which \code{\var{s}[\var{i}] == \var{x}}}{}
Walter Dörwald93719b52003-06-17 16:19:56 +00001066 \lineiii{\var{s}.index(\var{x}\optional{, \var{i}\optional{, \var{j}}})}
1067 {return smallest \var{k} such that \code{\var{s}[\var{k}] == \var{x}} and
1068 \code{\var{i} <= \var{k} < \var{j}}}{(4)}
Fred Drake64e3b431998-07-24 13:56:11 +00001069 \lineiii{\var{s}.insert(\var{i}, \var{x})}
Guido van Rossum3a3cca52003-04-14 20:58:14 +00001070 {same as \code{\var{s}[\var{i}:\var{i}] = [\var{x}]}}{(5)}
Fred Drake64e3b431998-07-24 13:56:11 +00001071 \lineiii{\var{s}.pop(\optional{\var{i}})}
Michael W. Hudson9c206152003-03-05 14:42:09 +00001072 {same as \code{\var{x} = \var{s}[\var{i}]; del \var{s}[\var{i}]; return \var{x}}}{(6)}
Fred Drake64e3b431998-07-24 13:56:11 +00001073 \lineiii{\var{s}.remove(\var{x})}
Michael W. Hudson9c206152003-03-05 14:42:09 +00001074 {same as \code{del \var{s}[\var{s}.index(\var{x})]}}{(4)}
Fred Drake64e3b431998-07-24 13:56:11 +00001075 \lineiii{\var{s}.reverse()}
Michael W. Hudson9c206152003-03-05 14:42:09 +00001076 {reverses the items of \var{s} in place}{(7)}
Fred Drake401d1e32003-12-30 22:21:18 +00001077 \lineiii{\var{s}.sort(\optional{\var{cmp}\optional{,
1078 \var{key}\optional{, \var{reverse}}}})}
Michael W. Hudson9c206152003-03-05 14:42:09 +00001079 {sort the items of \var{s} in place}{(7), (8), (9), (10)}
Fred Drake64e3b431998-07-24 13:56:11 +00001080\end{tableiii}
1081\indexiv{operations on}{mutable}{sequence}{types}
1082\indexiii{operations on}{sequence}{types}
1083\indexiii{operations on}{list}{type}
1084\indexii{subscript}{assignment}
1085\indexii{slice}{assignment}
Michael W. Hudson9c206152003-03-05 14:42:09 +00001086\indexii{extended slice}{assignment}
Fred Drake64e3b431998-07-24 13:56:11 +00001087\stindex{del}
Fred Drake9474d861999-02-12 22:05:33 +00001088\withsubitem{(list method)}{
Fred Drake68921df1999-08-09 17:05:12 +00001089 \ttindex{append()}\ttindex{extend()}\ttindex{count()}\ttindex{index()}
1090 \ttindex{insert()}\ttindex{pop()}\ttindex{remove()}\ttindex{reverse()}
Fred Drakee8391991998-11-25 17:09:19 +00001091 \ttindex{sort()}}
Fred Drake64e3b431998-07-24 13:56:11 +00001092\noindent
1093Notes:
1094\begin{description}
Michael W. Hudson9c206152003-03-05 14:42:09 +00001095\item[(1)] \var{t} must have the same length as the slice it is
1096 replacing.
Michael W. Hudson5efaf7e2002-06-11 10:55:12 +00001097
Michael W. Hudson9c206152003-03-05 14:42:09 +00001098\item[(2)] The C implementation of Python has historically accepted
1099 multiple parameters and implicitly joined them into a tuple; this
1100 no longer works in Python 2.0. Use of this misfeature has been
1101 deprecated since Python 1.4.
Fred Drake38e5d272000-04-03 20:13:55 +00001102
Raymond Hettinger4e9907c2005-02-09 23:19:25 +00001103\item[(3)] \var{x} can be any iterable object.
Michael W. Hudson9c206152003-03-05 14:42:09 +00001104
1105\item[(4)] Raises \exception{ValueError} when \var{x} is not found in
Walter Dörwald93719b52003-06-17 16:19:56 +00001106 \var{s}. When a negative index is passed as the second or third parameter
1107 to the \method{index()} method, the list length is added, as for slice
1108 indices. If it is still negative, it is truncated to zero, as for
1109 slice indices. \versionchanged[Previously, \method{index()} didn't
1110 have arguments for specifying start and stop positions]{2.3}
Fred Drake68921df1999-08-09 17:05:12 +00001111
Michael W. Hudson9c206152003-03-05 14:42:09 +00001112\item[(5)] When a negative index is passed as the first parameter to
Guido van Rossum3a3cca52003-04-14 20:58:14 +00001113 the \method{insert()} method, the list length is added, as for slice
1114 indices. If it is still negative, it is truncated to zero, as for
1115 slice indices. \versionchanged[Previously, all negative indices
1116 were truncated to zero]{2.3}
Fred Drakeef428a22001-10-26 18:57:14 +00001117
Michael W. Hudson9c206152003-03-05 14:42:09 +00001118\item[(6)] The \method{pop()} method is only supported by the list and
Fred Drakefbd3b452000-07-31 23:42:23 +00001119 array types. The optional argument \var{i} defaults to \code{-1},
1120 so that by default the last item is removed and returned.
Fred Drake38e5d272000-04-03 20:13:55 +00001121
Michael W. Hudson9c206152003-03-05 14:42:09 +00001122\item[(7)] The \method{sort()} and \method{reverse()} methods modify the
Fred Drake38e5d272000-04-03 20:13:55 +00001123 list in place for economy of space when sorting or reversing a large
Skip Montanaro41d7d582001-07-25 16:18:19 +00001124 list. To remind you that they operate by side effect, they don't return
1125 the sorted or reversed list.
Fred Drake38e5d272000-04-03 20:13:55 +00001126
Raymond Hettinger64958a12003-12-17 20:43:33 +00001127\item[(8)] The \method{sort()} method takes optional arguments for
Andrew M. Kuchling55be9ea2004-09-10 12:59:54 +00001128 controlling the comparisons.
Raymond Hettinger42b1ba32003-10-16 03:41:09 +00001129
1130 \var{cmp} specifies a custom comparison function of two arguments
1131 (list items) which should return a negative, zero or positive number
1132 depending on whether the first argument is considered smaller than,
1133 equal to, or larger than the second argument:
1134 \samp{\var{cmp}=\keyword{lambda} \var{x},\var{y}:
1135 \function{cmp}(x.lower(), y.lower())}
1136
1137 \var{key} specifies a function of one argument that is used to
1138 extract a comparison key from each list element:
Raymond Hettinger5d6057f2004-12-02 08:31:41 +00001139 \samp{\var{key}=\function{str.lower}}
Raymond Hettinger42b1ba32003-10-16 03:41:09 +00001140
1141 \var{reverse} is a boolean value. If set to \code{True}, then the
1142 list elements are sorted as if each comparison were reversed.
1143
1144 In general, the \var{key} and \var{reverse} conversion processes are
1145 much faster than specifying an equivalent \var{cmp} function. This is
1146 because \var{cmp} is called multiple times for each list element while
Fred Drake5b6150e2003-10-21 17:04:21 +00001147 \var{key} and \var{reverse} touch each element only once.
Raymond Hettinger42b1ba32003-10-16 03:41:09 +00001148
Fred Drake4cee2202003-03-20 22:17:59 +00001149 \versionchanged[Support for \code{None} as an equivalent to omitting
Fred Drake401d1e32003-12-30 22:21:18 +00001150 \var{cmp} was added]{2.3}
Fred Drake4cee2202003-03-20 22:17:59 +00001151
Fred Drake5b6150e2003-10-21 17:04:21 +00001152 \versionchanged[Support for \var{key} and \var{reverse} was added]{2.4}
Fred Drake4cee2202003-03-20 22:17:59 +00001153
Fred Drake401d1e32003-12-30 22:21:18 +00001154\item[(9)] Starting with Python 2.3, the \method{sort()} method is
Raymond Hettinger64958a12003-12-17 20:43:33 +00001155 guaranteed to be stable. A sort is stable if it guarantees not to
Raymond Hettinger42b1ba32003-10-16 03:41:09 +00001156 change the relative order of elements that compare equal --- this is
1157 helpful for sorting in multiple passes (for example, sort by
1158 department, then by salary grade).
Tim Petersb9099c32002-11-12 22:08:10 +00001159
Michael W. Hudson9c206152003-03-05 14:42:09 +00001160\item[(10)] While a list is being sorted, the effect of attempting to
Fred Drake401d1e32003-12-30 22:21:18 +00001161 mutate, or even inspect, the list is undefined. The C
1162 implementation of Python 2.3 and newer makes the list appear empty
1163 for the duration, and raises \exception{ValueError} if it can detect
1164 that the list has been mutated during a sort.
Fred Drake64e3b431998-07-24 13:56:11 +00001165\end{description}
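
Notes (4) through (6) can be illustrated with a short sketch. The lists
below are hypothetical sample data chosen only for this illustration, and
results are kept in variables rather than printed:

```python
# Hypothetical sample data, chosen only for illustration.
s = ['a', 'b', 'c', 'b', 'd']

first_b = s.index('b')       # search the whole list
later_b = s.index('b', 2)    # search positions 2 and later only
neg_b = s.index('b', -2)     # -2 has len(s) added, giving start position 3

try:
    s.index('b', 4)          # no 'b' at position 4 or later
    missing = False
except ValueError:
    missing = True

t = [1, 2, 3]
t.insert(-1, 'x')            # same as t[-1:-1] = ['x']

u = [1, 2, 3]
u.insert(-10, 'y')           # still negative after adding len(u), so truncated to 0

v = [1, 2, 3]
last = v.pop()               # i defaults to -1: the last item
first = v.pop(0)             # an explicit index removes that item instead
```

After this runs, \code{t} holds \code{[1, 2, 'x', 3]} and \code{u} holds
\code{['y', 1, 2, 3]}.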
1166
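
As a sketch of notes (7) through (9), the following sorts with \var{key}
and \var{reverse} and then uses stability for a two-pass sort. The staff
records and their field layout are assumptions made up for this example;
to order by department and, within each department, by salary grade, the
secondary key is sorted first:

```python
names = ['bob', 'Alice', 'carol', 'Dave']
returned = names.sort(key=str.lower)     # sorts in place; nothing is returned

descending = list(names)
descending.sort(key=str.lower, reverse=True)

# Hypothetical records: (name, department, salary grade).
staff = [('ann', 'ops', 2), ('bob', 'dev', 3),
         ('cy', 'ops', 1), ('di', 'dev', 1)]

staff.sort(key=lambda record: record[2])   # pass 1: secondary key (grade)
staff.sort(key=lambda record: record[1])   # pass 2: primary key (department)
# Because the sort is stable, records within one department keep
# the grade order established by the first pass.
```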
Raymond Hettinger24d75212005-06-14 08:45:43 +00001167\subsection{Set Types ---
1168 \class{set}, \class{frozenset}
1169 \label{types-set}}
Raymond Hettingerf5f41bf2003-11-24 02:57:33 +00001170\obindex{set}
1171
1172A \dfn{set} object is an unordered collection of immutable values.
1173Common uses include membership testing, removing duplicates from a sequence,
1174and computing mathematical operations such as intersection, union, difference,
1175and symmetric difference.
1176\versionadded{2.4}
1177
1178Like other collections, sets support \code{\var{x} in \var{set}},
1179\code{len(\var{set})}, and \code{for \var{x} in \var{set}}. Being an
1180unordered collection, sets do not record element position or order of
1181insertion. Accordingly, sets do not support indexing, slicing, or
1182other sequence-like behavior.
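
A minimal sketch of these operations follows; the sample values are
assumptions chosen for illustration. \function{sorted()} is used only
to obtain a predictable order, since the set itself has none:

```python
letters = set('hello')        # duplicates collapse: 'l' is stored once

has_h = 'h' in letters        # membership test
size = len(letters)           # number of distinct elements
in_order = sorted(letters)    # iterate over the set, imposing an order

unique = set([3, 1, 3, 2, 1]) # removing duplicates from a sequence

try:
    letters[0]                # sets record no positions
    indexable = True
except TypeError:
    indexable = False
```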
1183
There are currently two built-in set types, \class{set} and \class{frozenset}.
1185The \class{set} type is mutable --- the contents can be changed using methods
1186like \method{add()} and \method{remove()}. Since it is mutable, it has no
1187hash value and cannot be used as either a dictionary key or as an element of
another set.  The \class{frozenset} type is immutable and hashable: its
contents cannot be altered after it is created, so it can be used as
a dictionary key or as an element of another set.
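
The difference in hashability can be sketched as follows. The dictionary
and its contents are hypothetical and serve only to show a
\class{frozenset} in the key position:

```python
try:
    hash(set('ab'))                     # mutable sets have no hash value
    set_hashable = True
except TypeError:
    set_hashable = False

key = frozenset('ab')
table = {key: 'value'}                  # a frozenset works as a dictionary key
lookup = table[frozenset('ba')]         # equal frozensets find the same entry

nested = set([frozenset('ab'), frozenset('cd')])   # a set of frozensets
inner_found = frozenset('ab') in nested
```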
1191
1192Instances of \class{set} and \class{frozenset} provide the following operations:
1193
1194\begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result}
1195 \lineiii{len(\var{s})}{}{cardinality of set \var{s}}
1196
1197 \hline
1198 \lineiii{\var{x} in \var{s}}{}
1199 {test \var{x} for membership in \var{s}}
1200 \lineiii{\var{x} not in \var{s}}{}
1201 {test \var{x} for non-membership in \var{s}}
1202 \lineiii{\var{s}.issubset(\var{t})}{\code{\var{s} <= \var{t}}}
1203 {test whether every element in \var{s} is in \var{t}}
1204 \lineiii{\var{s}.issuperset(\var{t})}{\code{\var{s} >= \var{t}}}
1205 {test whether every element in \var{t} is in \var{s}}
1206
1207 \hline
1208 \lineiii{\var{s}.union(\var{t})}{\var{s} | \var{t}}
1209 {new set with elements from both \var{s} and \var{t}}
1210 \lineiii{\var{s}.intersection(\var{t})}{\var{s} \&\ \var{t}}
1211 {new set with elements common to \var{s} and \var{t}}
1212 \lineiii{\var{s}.difference(\var{t})}{\var{s} - \var{t}}
1213 {new set with elements in \var{s} but not in \var{t}}
1214 \lineiii{\var{s}.symmetric_difference(\var{t})}{\var{s} \^\ \var{t}}
1215 {new set with elements in either \var{s} or \var{t} but not both}
1216 \lineiii{\var{s}.copy()}{}
1217 {new set with a shallow copy of \var{s}}
1218\end{tableiii}
1219
Note that the \method{union()}, \method{intersection()},
\method{difference()}, \method{symmetric_difference()},
\method{issubset()}, and \method{issuperset()} methods will accept any
iterable as an argument.  In contrast, their operator-based counterparts
1224require their arguments to be sets. This precludes error-prone constructions
1225like \code{set('abc') \&\ 'cbs'} in favor of the more readable
1226\code{set('abc').intersection('cbs')}.
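
The contrast can be sketched with the same values used above. The
variable names are assumptions introduced for the example:

```python
common = set('abc').intersection('cbs')   # the method accepts any iterable

try:
    set('abc') & 'cbs'                     # the operator requires a set
    operator_ok = True
except TypeError:
    operator_ok = False

merged = set('abc').union(['d', 'e'])      # a list is accepted as well
is_sub = set('ab').issubset('abc')
```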
1227
1228Both \class{set} and \class{frozenset} support set to set comparisons.
1229Two sets are equal if and only if every element of each set is contained in
1230the other (each is a subset of the other).
1231A set is less than another set if and only if the first set is a proper
1232subset of the second set (is a subset, but is not equal).
1233A set is greater than another set if and only if the first set is a proper
1234superset of the second set (is a superset, but is not equal).
1235
Raymond Hettinger68804312005-01-01 00:28:46 +00001236Instances of \class{set} are compared to instances of \class{frozenset} based
Raymond Hettingercab5b942004-07-22 19:33:53 +00001237on their members. For example, \samp{set('abc') == frozenset('abc')} returns
1238\code{True}.
1239
The subset and equality comparisons do not generalize to a complete
ordering function.  For example, any two non-empty disjoint sets are not
equal and are not subsets of each other, so \emph{all} of the following
return \code{False}: \code{\var{a}<\var{b}}, \code{\var{a}==\var{b}}, and
\code{\var{a}>\var{b}}.
Accordingly, sets do not implement the \method{__cmp__} method.
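
A short sketch of this partial ordering, using two small sets invented
for the example:

```python
a = set([1, 2])
b = set([3, 4])                 # disjoint from a

comparisons = [a < b, a == b, a > b]   # none of these holds

proper = set([1]) < set([1, 2])        # proper subset
same = set([1, 2]) < set([1, 2])       # equal sets are not proper subsets
subset_or_equal = set([1, 2]) <= set([1, 2])
```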
1246
1247Since sets only define partial ordering (subset relationships), the output
1248of the \method{list.sort()} method is undefined for lists of sets.
1249
Raymond Hettingere4905022005-04-10 17:32:35 +00001250Set elements are like dictionary keys; they need to define both
1251\method{__hash__} and \method{__eq__} methods.
1252
Raymond Hettingercab5b942004-07-22 19:33:53 +00001253Binary operations that mix \class{set} instances with \class{frozenset}
1254return the type of the first operand. For example:
1255\samp{frozenset('ab') | set('bc')} returns an instance of \class{frozenset}.
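
As a sketch, the result type follows the left operand while the members
are the same either way:

```python
left = frozenset('ab') | set('bc')    # first operand is a frozenset
right = set('ab') | frozenset('bc')   # first operand is a set

left_type = type(left)
right_type = type(right)
same_members = (left == right)        # comparison is based on members only
```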
Raymond Hettingerf5f41bf2003-11-24 02:57:33 +00001256
1257The following table lists operations available for \class{set}
1258that do not apply to immutable instances of \class{frozenset}:
1259
1260\begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result}
1261 \lineiii{\var{s}.update(\var{t})}
1262 {\var{s} |= \var{t}}
1263 {return set \var{s} with elements added from \var{t}}
1264 \lineiii{\var{s}.intersection_update(\var{t})}
1265 {\var{s} \&= \var{t}}
1266 {return set \var{s} keeping only elements also found in \var{t}}
1267 \lineiii{\var{s}.difference_update(\var{t})}
1268 {\var{s} -= \var{t}}
1269 {return set \var{s} after removing elements found in \var{t}}
1270 \lineiii{\var{s}.symmetric_difference_update(\var{t})}
1271 {\var{s} \textasciicircum= \var{t}}
1272 {return set \var{s} with elements from \var{s} or \var{t}
1273 but not both}
1274
1275 \hline
1276 \lineiii{\var{s}.add(\var{x})}{}
1277 {add element \var{x} to set \var{s}}
1278 \lineiii{\var{s}.remove(\var{x})}{}
         {remove \var{x} from set \var{s}; raises \exception{KeyError} if not present}
  \lineiii{\var{s}.discard(\var{x})}{}
         {remove \var{x} from set \var{s} if present}
1282 \lineiii{\var{s}.pop()}{}
1283 {remove and return an arbitrary element from \var{s}; raises
1284 \exception{KeyError} if empty}
1285 \lineiii{\var{s}.clear()}{}
1286 {remove all elements from set \var{s}}
1287\end{tableiii}
1288
1289Note, the non-operator versions of the \method{update()},
1290\method{intersection_update()}, \method{difference_update()}, and
1291\method{symmetric_difference_update()} methods will accept any iterable
1292as an argument.
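
The mutating operations can be sketched as follows. The element values
are arbitrary and chosen only so that each step changes the set:

```python
s = set('abc')
s.update('cd')                               # any iterable: now a, b, c, d
s.intersection_update(['a', 'b', 'd', 'q'])  # keeps a, b, d
s.difference_update('a')                     # leaves b, d
s.symmetric_difference_update('de')          # d is in both, so: b, e

s.discard('z')                 # absent element: silently ignored
try:
    s.remove('z')              # absent element: KeyError
    remove_raised = False
except KeyError:
    remove_raised = True

empty = set()
try:
    empty.pop()                # nothing to return from an empty set
    pop_raised = False
except KeyError:
    pop_raised = True
```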
1293
Fred Drake64e3b431998-07-24 13:56:11 +00001294
\subsection{Mapping Types --- \class{dict} \label{typesmapping}}
Fred Drake0b4e25d2000-10-04 04:21:19 +00001296\obindex{mapping}
1297\obindex{dictionary}
Fred Drake64e3b431998-07-24 13:56:11 +00001298
Steve Holden1e4519f2002-06-14 09:16:40 +00001299A \dfn{mapping} object maps immutable values to
Fred Drake64e3b431998-07-24 13:56:11 +00001300arbitrary objects. Mappings are mutable objects. There is currently
1301only one standard mapping type, the \dfn{dictionary}. A dictionary's keys are
Steve Holden1e4519f2002-06-14 09:16:40 +00001302almost arbitrary values. Only values containing lists, dictionaries
1303or other mutable types (that are compared by value rather than by
1304object identity) may not be used as keys.
Fred Drake64e3b431998-07-24 13:56:11 +00001305Numeric types used for keys obey the normal rules for numeric
Raymond Hettinger74c8e552003-09-12 00:02:37 +00001306comparison: if two numbers compare equal (such as \code{1} and
Fred Drake64e3b431998-07-24 13:56:11 +00001307\code{1.0}) then they can be used interchangeably to index the same
1308dictionary entry.
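
A brief sketch of both rules, using a throwaway dictionary invented for
the example:

```python
d = {1: 'one'}
via_float = d[1.0]        # 1 and 1.0 compare equal, so this is the same entry
d[1.0] = 'uno'            # replaces the value; no second entry is created
entry_count = len(d)

try:
    bad = {[1, 2]: 'x'}   # a list is mutable and compared by value
    list_key_ok = True
except TypeError:
    list_key_ok = False
```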
1309
Fred Drake64e3b431998-07-24 13:56:11 +00001310Dictionaries are created by placing a comma-separated list of
1311\code{\var{key}: \var{value}} pairs within braces, for example:
1312\code{\{'jack': 4098, 'sjoerd': 4127\}} or
1313\code{\{4098: 'jack', 4127: 'sjoerd'\}}.
1314
Fred Drake9c5cc141999-06-10 22:37:34 +00001315The following operations are defined on mappings (where \var{a} and
1316\var{b} are mappings, \var{k} is a key, and \var{v} and \var{x} are
1317arbitrary objects):
Fred Drake64e3b431998-07-24 13:56:11 +00001318\indexiii{operations on}{mapping}{types}
1319\indexiii{operations on}{dictionary}{type}
1320\stindex{del}
1321\bifuncindex{len}
Fred Drake9474d861999-02-12 22:05:33 +00001322\withsubitem{(dictionary method)}{
1323 \ttindex{clear()}
1324 \ttindex{copy()}
1325 \ttindex{has_key()}
Raymond Hettinger74c8e552003-09-12 00:02:37 +00001326 \ttindex{fromkeys()}
Fred Drake9474d861999-02-12 22:05:33 +00001327 \ttindex{items()}
1328 \ttindex{keys()}
1329 \ttindex{update()}
1330 \ttindex{values()}
Michael W. Hudson9c206152003-03-05 14:42:09 +00001331 \ttindex{get()}
1332 \ttindex{setdefault()}
1333 \ttindex{pop()}
1334 \ttindex{popitem()}
1335 \ttindex{iteritems()}
Raymond Hettinger0dfd7a92003-05-10 07:40:56 +00001336 \ttindex{iterkeys()}
Michael W. Hudson9c206152003-03-05 14:42:09 +00001337 \ttindex{itervalues()}}
Fred Drake9c5cc141999-06-10 22:37:34 +00001338
1339\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
1340 \lineiii{len(\var{a})}{the number of items in \var{a}}{}
1341 \lineiii{\var{a}[\var{k}]}{the item of \var{a} with key \var{k}}{(1)}
Fred Drake1e75e172000-07-31 16:34:46 +00001342 \lineiii{\var{a}[\var{k}] = \var{v}}
1343 {set \code{\var{a}[\var{k}]} to \var{v}}
Fred Drake9c5cc141999-06-10 22:37:34 +00001344 {}
1345 \lineiii{del \var{a}[\var{k}]}
1346 {remove \code{\var{a}[\var{k}]} from \var{a}}
1347 {(1)}
  \lineiii{\var{a}.clear()}{remove all items from \var{a}}{}
  \lineiii{\var{a}.copy()}{a (shallow) copy of \var{a}}{}
Guido van Rossum8b3d6ca2001-04-23 13:22:59 +00001350 \lineiii{\var{a}.has_key(\var{k})}
Raymond Hettinger6e13bcc2003-08-08 11:07:59 +00001351 {\code{True} if \var{a} has a key \var{k}, else \code{False}}
Fred Drake9c5cc141999-06-10 22:37:34 +00001352 {}
  \lineiii{\var{k} in \var{a}}
          {equivalent to \code{\var{a}.has_key(\var{k})}}
          {(2)}
  \lineiii{\var{k} not in \var{a}}
          {equivalent to \code{not \var{a}.has_key(\var{k})}}
          {(2)}
Fred Drake9c5cc141999-06-10 22:37:34 +00001359 \lineiii{\var{a}.items()}
1360 {a copy of \var{a}'s list of (\var{key}, \var{value}) pairs}
Fred Drakec6d8f8d2001-05-25 04:24:37 +00001361 {(3)}
Fred Drake4a6c5c52001-06-12 03:31:56 +00001362 \lineiii{\var{a}.keys()}{a copy of \var{a}'s list of keys}{(3)}
Raymond Hettinger31017ae2004-03-04 08:25:44 +00001363 \lineiii{\var{a}.update(\optional{\var{b}})}
1364 {updates (and overwrites) key/value pairs from \var{b}}
1365 {(9)}
Raymond Hettingere33d3df2002-11-27 07:29:33 +00001366 \lineiii{\var{a}.fromkeys(\var{seq}\optional{, \var{value}})}
1367 {Creates a new dictionary with keys from \var{seq} and values set to \var{value}}
1368 {(7)}
Fred Drake4a6c5c52001-06-12 03:31:56 +00001369 \lineiii{\var{a}.values()}{a copy of \var{a}'s list of values}{(3)}
Fred Drake9c5cc141999-06-10 22:37:34 +00001370 \lineiii{\var{a}.get(\var{k}\optional{, \var{x}})}
Fred Drake4cacec52001-04-21 05:56:06 +00001371 {\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
Fred Drake9c5cc141999-06-10 22:37:34 +00001372 else \var{x}}
Barry Warsawe9218a12001-06-26 20:32:59 +00001373 {(4)}
Guido van Rossum8141cf52000-08-08 16:15:49 +00001374 \lineiii{\var{a}.setdefault(\var{k}\optional{, \var{x}})}
Fred Drake4cacec52001-04-21 05:56:06 +00001375 {\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
Guido van Rossum8141cf52000-08-08 16:15:49 +00001376 else \var{x} (also setting it)}
Barry Warsawe9218a12001-06-26 20:32:59 +00001377 {(5)}
Raymond Hettingera3e1e4c2003-03-06 23:54:28 +00001378 \lineiii{\var{a}.pop(\var{k}\optional{, \var{x}})}
1379 {\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
           else \var{x} (and remove \var{k})}
1381 {(8)}
Guido van Rossumff63f202000-12-12 22:03:47 +00001382 \lineiii{\var{a}.popitem()}
1383 {remove and return an arbitrary (\var{key}, \var{value}) pair}
Barry Warsawe9218a12001-06-26 20:32:59 +00001384 {(6)}
Fred Drakec6d8f8d2001-05-25 04:24:37 +00001385 \lineiii{\var{a}.iteritems()}
1386 {return an iterator over (\var{key}, \var{value}) pairs}
Fred Drake01777832002-08-19 21:58:58 +00001387 {(2), (3)}
Fred Drakec6d8f8d2001-05-25 04:24:37 +00001388 \lineiii{\var{a}.iterkeys()}
1389 {return an iterator over the mapping's keys}
Fred Drake01777832002-08-19 21:58:58 +00001390 {(2), (3)}
Fred Drakec6d8f8d2001-05-25 04:24:37 +00001391 \lineiii{\var{a}.itervalues()}
1392 {return an iterator over the mapping's values}
Fred Drake01777832002-08-19 21:58:58 +00001393 {(2), (3)}
Fred Drake9c5cc141999-06-10 22:37:34 +00001394\end{tableiii}
1395
Fred Drake64e3b431998-07-24 13:56:11 +00001396\noindent
1397Notes:
1398\begin{description}
Fred Drake9c5cc141999-06-10 22:37:34 +00001399\item[(1)] Raises a \exception{KeyError} exception if \var{k} is not
1400in the map.
Fred Drake64e3b431998-07-24 13:56:11 +00001401
Fred Drakec6d8f8d2001-05-25 04:24:37 +00001402\item[(2)] \versionadded{2.2}
1403
Raymond Hettinger23ce5842004-11-25 05:16:19 +00001404\item[(3)] Keys and values are listed in an arbitrary order which is
1405non-random, varies across Python implementations, and depends on the
1406dictionary's history of insertions and deletions.
1407If \method{items()}, \method{keys()}, \method{values()},
Fred Drake01777832002-08-19 21:58:58 +00001408\method{iteritems()}, \method{iterkeys()}, and \method{itervalues()}
1409are called with no intervening modifications to the dictionary, the
1410lists will directly correspond. This allows the creation of
1411\code{(\var{value}, \var{key})} pairs using \function{zip()}:
1412\samp{pairs = zip(\var{a}.values(), \var{a}.keys())}. The same
1413relationship holds for the \method{iterkeys()} and
1414\method{itervalues()} methods: \samp{pairs = zip(\var{a}.itervalues(),
1415\var{a}.iterkeys())} provides the same value for \code{pairs}.
1416Another way to create the same list is \samp{pairs = [(v, k) for (k,
1417v) in \var{a}.iteritems()]}.
Fred Drake64e3b431998-07-24 13:56:11 +00001418
\item[(4)] Never raises an exception if \var{k} is not in the map;
instead it returns \var{x}.  \var{x} is optional; when \var{x} is not
provided and \var{k} is not in the map, \code{None} is returned.
Guido van Rossum8141cf52000-08-08 16:15:49 +00001422
\item[(5)] \function{setdefault()} is like \function{get()}, except
that if \var{k} is missing, \var{x} is both returned and inserted into
the dictionary as the value of \var{k}.  \var{x} defaults to \code{None}.
Guido van Rossumff63f202000-12-12 22:03:47 +00001426
Barry Warsawe9218a12001-06-26 20:32:59 +00001427\item[(6)] \function{popitem()} is useful to destructively iterate
Raymond Hettinger631bfe62005-05-27 10:43:55 +00001428over a dictionary, as often used in set algorithms. If the dictionary
1429is empty, calling \function{popitem()} raises a \exception{KeyError}.
Fred Drake64e3b431998-07-24 13:56:11 +00001430
Raymond Hettingere33d3df2002-11-27 07:29:33 +00001431\item[(7)] \function{fromkeys()} is a class method that returns a
1432new dictionary. \var{value} defaults to \code{None}. \versionadded{2.3}
Raymond Hettingera3e1e4c2003-03-06 23:54:28 +00001433
1434\item[(8)] \function{pop()} raises a \exception{KeyError} when no default
1435value is given and the key is not found. \versionadded{2.3}
Raymond Hettingere33d3df2002-11-27 07:29:33 +00001436
\item[(9)] \function{update()} accepts either another mapping object
or an iterable of key/value pairs (as a tuple or other iterable of
length two).  If keyword arguments are specified, the mapping is
then updated with those key/value pairs:
\samp{d.update(red=1, blue=2)}.
1442\versionchanged[Allowed the argument to be an iterable of key/value
1443 pairs and allowed keyword arguments]{2.4}
Fred Drake64e3b431998-07-24 13:56:11 +00001444
Hye-Shik Chang9168c702004-03-09 05:53:15 +00001445\end{description}
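
Notes (4) through (9) can be sketched together. The dictionary reuses
the names from the example above; the extra keys are assumptions added
for illustration:

```python
a = {'jack': 4098}

got = a.get('sjoerd')                  # missing key: None, no exception
got_default = a.get('sjoerd', 0)       # missing key: the supplied default
still_missing = 'sjoerd' not in a      # get() never inserts

inserted = a.setdefault('sjoerd', 4127)  # missing key: stored and returned
existing = a.setdefault('jack', 0)       # present key: value left untouched

popped = a.pop('jack')                 # returns the value and removes the key
fallback = a.pop('jack', None)         # key is gone: the default is returned
try:
    a.pop('jack')                      # key is gone and no default given
    pop_raised = False
except KeyError:
    pop_raised = True

a.update(red=1, blue=2)                # keyword arguments
a.update([('green', 3)])               # an iterable of key/value pairs

fresh = dict.fromkeys(['x', 'y'])      # class method; values default to None

try:
    {}.popitem()                       # empty dictionary
    popitem_raised = False
except KeyError:
    popitem_raised = True
```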
1446
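
The correspondence described in note (3) can be checked with a small
sketch. The dictionary contents are hypothetical, and
\method{items()} is used here in place of \method{iteritems()} purely
as a simplification for the example:

```python
a = {'jack': 4098, 'sjoerd': 4127, 'guido': 4127}

# values() and keys() are called with no modification in between,
# so position i in one corresponds to position i in the other.
pairs = list(zip(a.values(), a.keys()))

# The same (value, key) pairs built from the items directly.
same_pairs = [(v, k) for (k, v) in a.items()]
```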
Fred Drake99de2182001-10-30 06:23:14 +00001447\subsection{File Objects
1448 \label{bltin-file-objects}}
Fred Drake64e3b431998-07-24 13:56:11 +00001449
Fred Drake99de2182001-10-30 06:23:14 +00001450File objects\obindex{file} are implemented using C's \code{stdio}
1451package and can be created with the built-in constructor
Tim Peters8f01b682002-03-12 03:04:44 +00001452\function{file()}\bifuncindex{file} described in section
Tim Peters003047a2001-10-30 05:54:04 +00001453\ref{built-in-funcs}, ``Built-in Functions.''\footnote{\function{file()}
1454is new in Python 2.2. The older built-in \function{open()} is an
Fred Drake401d1e32003-12-30 22:21:18 +00001455alias for \function{file()}.} File objects are also returned
Fred Drake907e76b2001-07-06 20:30:11 +00001456by some other built-in functions and methods, such as
Fred Drake4de96c22000-08-12 03:36:23 +00001457\function{os.popen()} and \function{os.fdopen()} and the
Fred Drake130072d1998-10-28 20:08:35 +00001458\method{makefile()} method of socket objects.
Fred Drake4de96c22000-08-12 03:36:23 +00001459\refstmodindex{os}
Fred Drake64e3b431998-07-24 13:56:11 +00001460\refbimodindex{socket}
1461
1462When a file operation fails for an I/O-related reason, the exception
Fred Drake84538cd1998-11-30 21:51:25 +00001463\exception{IOError} is raised. This includes situations where the
1464operation is not defined for some reason, like \method{seek()} on a tty
Fred Drake64e3b431998-07-24 13:56:11 +00001465device or writing a file opened for reading.
1466
1467Files have the following methods:
1468
1469
1470\begin{methoddesc}[file]{close}{}
Steve Holden1e4519f2002-06-14 09:16:40 +00001471 Close the file. A closed file cannot be read or written any more.
Fred Drakea776cea2000-11-06 20:17:37 +00001472 Any operation which requires that the file be open will raise a
1473 \exception{ValueError} after the file has been closed. Calling
Fred Drake752ba392000-09-19 15:18:51 +00001474 \method{close()} more than once is allowed.
Fred Drake64e3b431998-07-24 13:56:11 +00001475\end{methoddesc}
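
A minimal sketch of this behavior follows. It assumes a scratch file
created with \function{tempfile.mkstemp()} and uses \function{open()},
the alias of \function{file()}:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()     # scratch file for the example
os.close(fd)

f = open(path, 'w')
f.write('data')
f.close()
f.close()                         # a second close() is allowed

try:
    f.write('more')               # the file must be open for this
    write_raised = False
except ValueError:
    write_raised = True

os.remove(path)
```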
1476
1477\begin{methoddesc}[file]{flush}{}
Fred Drake752ba392000-09-19 15:18:51 +00001478 Flush the internal buffer, like \code{stdio}'s
1479 \cfunction{fflush()}. This may be a no-op on some file-like
1480 objects.
Fred Drake64e3b431998-07-24 13:56:11 +00001481\end{methoddesc}
1482
Fred Drake64e3b431998-07-24 13:56:11 +00001483\begin{methoddesc}[file]{fileno}{}
Fred Drake752ba392000-09-19 15:18:51 +00001484 \index{file descriptor}
1485 \index{descriptor, file}
1486 Return the integer ``file descriptor'' that is used by the
1487 underlying implementation to request I/O operations from the
1488 operating system. This can be useful for other, lower level
Fred Drake907e76b2001-07-06 20:30:11 +00001489 interfaces that use file descriptors, such as the
1490 \refmodule{fcntl}\refbimodindex{fcntl} module or
Fred Drake0aa811c2001-10-20 04:24:09 +00001491 \function{os.read()} and friends. \note{File-like objects
Fred Drake907e76b2001-07-06 20:30:11 +00001492 which do not have a real file descriptor should \emph{not} provide
Fred Drake0aa811c2001-10-20 04:24:09 +00001493 this method!}
Fred Drake64e3b431998-07-24 13:56:11 +00001494\end{methoddesc}
1495
Guido van Rossum0fc01862002-08-06 17:01:28 +00001496\begin{methoddesc}[file]{isatty}{}
1497 Return \code{True} if the file is connected to a tty(-like) device, else
1498 \code{False}. \note{If a file-like object is not associated
1499 with a real file, this method should \emph{not} be implemented.}
1500\end{methoddesc}
1501
1502\begin{methoddesc}[file]{next}{}
A file object is its own iterator; for example, \code{iter(\var{f})} returns
\var{f} (unless \var{f} is closed).  When a file is used as an
1505iterator, typically in a \keyword{for} loop (for example,
1506\code{for line in f: print line}), the \method{next()} method is
1507called repeatedly. This method returns the next input line, or raises
1508\exception{StopIteration} when \EOF{} is hit. In order to make a
1509\keyword{for} loop the most efficient way of looping over the lines of
1510a file (a very common operation), the \method{next()} method uses a
1511hidden read-ahead buffer. As a consequence of using a read-ahead
1512buffer, combining \method{next()} with other file methods (like
1513\method{readline()}) does not work right. However, using
1514\method{seek()} to reposition the file to an absolute position will
1515flush the read-ahead buffer.
1516\versionadded{2.3}
1517\end{methoddesc}
1518
Fred Drake64e3b431998-07-24 13:56:11 +00001519\begin{methoddesc}[file]{read}{\optional{size}}
1520 Read at most \var{size} bytes from the file (less if the read hits
Fred Drakef4cbada1999-04-14 14:31:53 +00001521 \EOF{} before obtaining \var{size} bytes). If the \var{size}
1522 argument is negative or omitted, read all data until \EOF{} is
1523 reached. The bytes are returned as a string object. An empty
1524 string is returned when \EOF{} is encountered immediately. (For
1525 certain files, like ttys, it makes sense to continue reading after
1526 an \EOF{} is hit.) Note that this method may call the underlying
1527 C function \cfunction{fread()} more than once in an effort to
Gustavo Niemeyer786ddb22002-12-16 18:12:53 +00001528 acquire as close to \var{size} bytes as possible. Also note that
1529 when in non-blocking mode, less data than what was requested may
1530 be returned, even if no \var{size} parameter was given.
Fred Drake64e3b431998-07-24 13:56:11 +00001531\end{methoddesc}
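
As a sketch of the \var{size} argument, assuming an ordinary disk file
with six characters of made-up content:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()     # scratch file for the example
os.close(fd)
f = open(path, 'w')
f.write('abcdef')
f.close()

f = open(path)
head = f.read(4)        # four bytes are available
tail = f.read(4)        # only two remain before EOF
at_eof = f.read(4)      # EOF encountered immediately
f.seek(0)
everything = f.read()   # size omitted: read until EOF
f.close()
os.remove(path)
```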
1532
1533\begin{methoddesc}[file]{readline}{\optional{size}}
1534 Read one entire line from the file. A trailing newline character is
Fred Drake401d1e32003-12-30 22:21:18 +00001535 kept in the string (but may be absent when a file ends with an
1536 incomplete line).\footnote{
Steve Holden1e4519f2002-06-14 09:16:40 +00001537 The advantage of leaving the newline on is that
1538 returning an empty string is then an unambiguous \EOF{}
1539 indication. It is also possible (in cases where it might
1540 matter, for example, if you
Tim Peters8f01b682002-03-12 03:04:44 +00001541 want to make an exact copy of a file while scanning its lines)
Steve Holden1e4519f2002-06-14 09:16:40 +00001542 to tell whether the last line of a file ended in a newline
Fred Drake4de96c22000-08-12 03:36:23 +00001543 or not (yes this happens!).
Fred Drake401d1e32003-12-30 22:21:18 +00001544 } If the \var{size} argument is present and
Fred Drake64e3b431998-07-24 13:56:11 +00001545 non-negative, it is a maximum byte count (including the trailing
1546 newline) and an incomplete line may be returned.
Steve Holden1e4519f2002-06-14 09:16:40 +00001547 An empty string is returned \emph{only} when \EOF{} is encountered
Fred Drake0aa811c2001-10-20 04:24:09 +00001548 immediately. \note{Unlike \code{stdio}'s \cfunction{fgets()}, the
Fred Drake752ba392000-09-19 15:18:51 +00001549 returned string contains null characters (\code{'\e 0'}) if they
Fred Drake0aa811c2001-10-20 04:24:09 +00001550 occurred in the input.}
Fred Drake64e3b431998-07-24 13:56:11 +00001551\end{methoddesc}
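
The handling of the trailing newline can be sketched as follows. The
file content is an assumption for the example and deliberately ends
with an incomplete line:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()     # scratch file for the example
os.close(fd)
f = open(path, 'w')
f.write('first\nsecond')          # the last line has no newline
f.close()

f = open(path)
line1 = f.readline()     # newline is kept
line2 = f.readline()     # incomplete last line: no newline to keep
line3 = f.readline()     # empty string: unambiguous EOF
f.seek(0)
partial = f.readline(3)  # size limit reached before the end of the line
f.close()
os.remove(path)
```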
1552
1553\begin{methoddesc}[file]{readlines}{\optional{sizehint}}
1554 Read until \EOF{} using \method{readline()} and return a list containing
1555 the lines thus read. If the optional \var{sizehint} argument is
Fred Drakec37b65e2001-11-28 07:26:15 +00001556 present, instead of reading up to \EOF, whole lines totalling
Fred Drake64e3b431998-07-24 13:56:11 +00001557 approximately \var{sizehint} bytes (possibly after rounding up to an
Fred Drake752ba392000-09-19 15:18:51 +00001558 internal buffer size) are read. Objects implementing a file-like
1559 interface may choose to ignore \var{sizehint} if it cannot be
1560 implemented, or cannot be implemented efficiently.
Fred Drake64e3b431998-07-24 13:56:11 +00001561\end{methoddesc}
1562
Guido van Rossum20ab9e92001-01-17 01:18:00 +00001563\begin{methoddesc}[file]{xreadlines}{}
Guido van Rossum0fc01862002-08-06 17:01:28 +00001564 This method returns the same thing as \code{iter(f)}.
Fred Drake82f93c62001-04-22 01:56:51 +00001565 \versionadded{2.1}
Fred Drake401d1e32003-12-30 22:21:18 +00001566 \deprecated{2.3}{Use \samp{for \var{line} in \var{file}} instead.}
Guido van Rossum20ab9e92001-01-17 01:18:00 +00001567\end{methoddesc}
1568
Fred Drake64e3b431998-07-24 13:56:11 +00001569\begin{methoddesc}[file]{seek}{offset\optional{, whence}}
1570 Set the file's current position, like \code{stdio}'s \cfunction{fseek()}.
1571 The \var{whence} argument is optional and defaults to \code{0}
1572 (absolute file positioning); other values are \code{1} (seek
1573 relative to the current position) and \code{2} (seek relative to the
Fred Drake19ae7832001-01-04 05:16:39 +00001574 file's end). There is no return value. Note that if the file is
1575 opened for appending (mode \code{'a'} or \code{'a+'}), any
1576 \method{seek()} operations will be undone at the next write. If the
1577 file is only opened for writing in append mode (mode \code{'a'}),
1578 this method is essentially a no-op, but it remains useful for files
Martin v. Löwis849a9722003-10-18 09:38:01 +00001579 opened in append mode with reading enabled (mode \code{'a+'}). If the
1580 file is opened in text mode (mode \code{'t'}), only offsets returned
1581 by \method{tell()} are legal. Use of other offsets causes undefined
1582 behavior.
1583
1584 Note that not all file objects are seekable.
Fred Drake64e3b431998-07-24 13:56:11 +00001585\end{methoddesc}
1586
1587\begin{methoddesc}[file]{tell}{}
1588 Return the file's current position, like \code{stdio}'s
1589 \cfunction{ftell()}.
1590\end{methoddesc}
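
The three \var{whence} values can be sketched together with
\method{tell()}. The example assumes a ten-byte scratch file and opens
it in binary mode so that arbitrary offsets are legal:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()     # scratch file for the example
os.close(fd)
f = open(path, 'w')
f.write('0123456789')             # ten bytes
f.close()

f = open(path, 'rb')
f.seek(4)                         # whence 0: absolute position
pos_abs = f.tell()
f.read(2)                         # reading advances the position
pos_after_read = f.tell()
f.seek(-2, 1)                     # whence 1: relative to current position
pos_rel = f.tell()
f.seek(-3, 2)                     # whence 2: relative to the file's end
pos_end = f.tell()
tail_length = len(f.read())       # bytes left between position 7 and EOF
f.close()
os.remove(path)
```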

\begin{methoddesc}[file]{truncate}{\optional{size}}
  Truncate the file's size. If the optional \var{size} argument is
  present, the file is truncated to (at most) that size. The size
  defaults to the current position. The current file position is
  not changed. Note that if a specified size exceeds the file's
  current size, the result is platform-dependent: possibilities
  include that the file may remain unchanged, increase to the specified
  size as if zero-filled, or increase to the specified size with
  undefined new content.
  Availability: Windows, many \UNIX{} variants.
\end{methoddesc}
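
A minimal sketch of both forms of \method{truncate()} follows, assuming a
platform on which the method is available. The scratch file is
hypothetical, and only sizes smaller than the current file size are used,
since growing a file this way is platform-dependent.

```python
import os
import tempfile

# Hypothetical scratch file; the name and contents are illustrative only.
fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, 'w+b')
f.write(b'0123456789')

f.seek(4)
f.truncate()               # no size given: cut at the current position
f.seek(0)
after_default = f.read()

f.seek(0)
f.truncate(2)              # explicit size: keep the first two bytes
pos = f.tell()             # the file position is left where it was
f.seek(0)
after_explicit = f.read()

f.close()
os.remove(path)
```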

\begin{methoddesc}[file]{write}{str}
  Write a string to the file. There is no return value. Due to
  buffering, the string may not actually show up in the file until
  the \method{flush()} or \method{close()} method is called.
\end{methoddesc}

\begin{methoddesc}[file]{writelines}{sequence}
  Write a sequence of strings to the file. The sequence can be any
  iterable object producing strings, typically a list of strings.
  There is no return value.
  (The name is intended to match \method{readlines()};
  \method{writelines()} does not add line separators.)
\end{methoddesc}
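
Because \method{writelines()} does not add line separators, the caller
must supply them. The sketch below uses a hypothetical scratch file to
show that the items are written back to back.

```python
import os
import tempfile

# Hypothetical scratch file; the name and contents are illustrative only.
fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, 'wb')
f.writelines([b'alpha', b'beta'])      # no separator is inserted
f.writelines([b'\n', b'gamma\n'])      # separators must be supplied
f.close()

f = open(path, 'rb')
data = f.read()
f.close()
os.remove(path)
```

After this runs, \code{data} holds \code{'alphabeta\e ngamma\e n'}.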


Files support the iterator protocol. Each iteration returns the same
result as \code{\var{file}.readline()}, and iteration ends when the
\method{readline()} method returns an empty string.
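
As an illustration, the loop below collects the lines of a small text
file. The scratch file and its three lines are assumptions made for the
example; note that each line keeps its trailing newline when one is
present.

```python
import os
import tempfile

# Hypothetical scratch file; the name and contents are illustrative only.
fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, 'w')
f.write('one\ntwo\nthree')     # the last line has no trailing newline
f.close()

seen = []
f = open(path)
for line in f:                 # each step yields what readline() would
    seen.append(line)
f.close()
os.remove(path)
```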


File objects also offer a number of other interesting attributes.
These are not required for file-like objects, but should be
implemented if they make sense for the particular object.

\begin{memberdesc}[file]{closed}
Boolean indicating the current state of the file object. This is a
read-only attribute; the \method{close()} method changes the value.
It may not be available on all file-like objects.
\end{memberdesc}

\begin{memberdesc}[file]{encoding}
The encoding that this file uses. When Unicode strings are written
to a file, they will be converted to byte strings using this encoding.
In addition, when the file is connected to a terminal, the attribute
gives the encoding that the terminal is likely to use (that
information might be incorrect if the user has misconfigured the
terminal). The attribute is read-only and may not be present on
all file-like objects. It may also be \code{None}, in which case
the file uses the system default encoding for converting Unicode
strings.

\versionadded{2.3}
\end{memberdesc}

\begin{memberdesc}[file]{mode}
The I/O mode for the file. If the file was created using the
\function{open()} built-in function, this will be the value of the
\var{mode} parameter. This is a read-only attribute and may not be
present on all file-like objects.
\end{memberdesc}

\begin{memberdesc}[file]{name}
If the file object was created using \function{open()}, the name of
the file. Otherwise, some string that indicates the source of the
file object, of the form \samp{<\mbox{\ldots}>}. This is a read-only
attribute and may not be present on all file-like objects.
\end{memberdesc}

\begin{memberdesc}[file]{newlines}
If Python was built with the \longprogramopt{with-universal-newlines}
option to \program{configure} (the default), this read-only attribute
exists, and for files opened in
universal newline read mode it keeps track of the types of newlines
encountered while reading the file. The values it can take are
\code{'\e r'}, \code{'\e n'}, \code{'\e r\e n'}, \code{None} (unknown,
no newlines read yet) or a tuple containing all the newline
types seen, to indicate that multiple
newline conventions were encountered. For files not opened in universal
newline read mode the value of this attribute will be \code{None}.
\end{memberdesc}

\begin{memberdesc}[file]{softspace}
Boolean that indicates whether a space character needs to be printed
before another value when using the \keyword{print} statement.
Classes that are trying to simulate a file object should also have a
writable \member{softspace} attribute, which should be initialized to
zero. This will be automatic for most classes implemented in Python
(care may be needed for objects that override attribute access); types
implemented in C will have to provide a writable
\member{softspace} attribute.
\note{This attribute is not used to control the
\keyword{print} statement, but to allow the implementation of
\keyword{print} to keep track of its internal state.}
\end{memberdesc}


\subsection{Other Built-in Types \label{typesother}}

The interpreter supports several other kinds of objects.
Most of these support only one or two operations.


\subsubsection{Modules \label{typesmodules}}

The only special operation on a module is attribute access:
\code{\var{m}.\var{name}}, where \var{m} is a module and \var{name}
accesses a name defined in \var{m}'s symbol table. Module attributes
can be assigned to. (Note that the \keyword{import} statement is not,
strictly speaking, an operation on a module object; \code{import
\var{foo}} does not require a module object named \var{foo} to exist;
rather, it requires an (external) \emph{definition} for a module named
\var{foo} somewhere.)

A special member of every module is \member{__dict__}.
This is the dictionary containing the module's symbol table.
Modifying this dictionary will actually change the module's symbol
table, but direct assignment to the \member{__dict__} attribute is not
possible (you can write \code{\var{m}.__dict__['a'] = 1}, which
defines \code{\var{m}.a} to be \code{1}, but you can't write
\code{\var{m}.__dict__ = \{\}}). Modifying \member{__dict__} directly
is not recommended.
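
The relationship between module attributes and \member{__dict__} can be
sketched as follows. The module name \code{scratch} is hypothetical, and
the module object is created with \code{types.ModuleType} only to avoid
touching a real module. The exception raised on rebinding
\member{__dict__} has varied between Python versions, so the sketch
accepts either \exception{TypeError} or \exception{AttributeError}.

```python
import types

m = types.ModuleType('scratch')    # hypothetical, empty module object

m.a = 1                            # ordinary attribute assignment
m.__dict__['b'] = 2                # same effect, via the symbol table

try:
    m.__dict__ = {}                # rebinding __dict__ itself is refused
except (TypeError, AttributeError):
    assignment_rejected = True
else:
    assignment_rejected = False
```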

Modules built into the interpreter are written like this:
\code{<module 'sys' (built-in)>}. If loaded from a file, they are
written as \code{<module 'os' from
'/usr/local/lib/python\shortversion/os.pyc'>}.


\subsubsection{Classes and Class Instances \label{typesobjects}}
\nodename{Classes and Instances}

See chapters 3 and 7 of the \citetitle[../ref/ref.html]{Python
Reference Manual} for these.


\subsubsection{Functions \label{typesfunctions}}

Function objects are created by function definitions. The only
operation on a function object is to call it:
\code{\var{func}(\var{argument-list})}.

There are really two flavors of function objects: built-in functions
and user-defined functions. Both support the same operation (to call
the function), but the implementation is different, hence the
different object types.

See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
information.

\subsubsection{Methods \label{typesmethods}}
\obindex{method}

Methods are functions that are called using the attribute notation.
There are two flavors: built-in methods (such as \method{append()} on
lists) and class instance methods. Built-in methods are described
with the types that support them.

The implementation adds two special read-only attributes to class
instance methods: \code{\var{m}.im_self} is the object on which the
method operates, and \code{\var{m}.im_func} is the function
implementing the method. Calling \code{\var{m}(\var{arg-1},
\var{arg-2}, \textrm{\ldots}, \var{arg-n})} is completely equivalent to
calling \code{\var{m}.im_func(\var{m}.im_self, \var{arg-1},
\var{arg-2}, \textrm{\ldots}, \var{arg-n})}.

Class instance methods are either \emph{bound} or \emph{unbound},
referring to whether the method was accessed through an instance or a
class, respectively. When a method is unbound, its \code{im_self}
attribute will be \code{None} and if called, an explicit \code{self}
object must be passed as the first argument. In this case,
\code{self} must be an instance of the unbound method's class (or a
subclass of that class), otherwise a \code{TypeError} is raised.

Like function objects, method objects support getting
arbitrary attributes. However, since method attributes are actually
stored on the underlying function object (\code{meth.im_func}),
setting method attributes on either bound or unbound methods is
disallowed. Attempting to set a method attribute results in a
\code{TypeError} being raised. In order to set a method attribute,
you need to explicitly set it on the underlying function object:

\begin{verbatim}
class C:
    def method(self):
        pass

c = C()
# Attributes live on the function object, so set them through im_func.
c.method.im_func.whoami = 'my name is c'
\end{verbatim}

See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
information.


\subsubsection{Code Objects \label{bltin-code-objects}}
\obindex{code}

Code objects are used by the implementation to represent
``pseudo-compiled'' executable Python code such as a function body.
They differ from function objects because they don't contain a
reference to their global execution environment. Code objects are
returned by the built-in \function{compile()} function and can be
extracted from function objects through their \member{func_code}
attribute.
\bifuncindex{compile}
\withsubitem{(function object attribute)}{\ttindex{func_code}}

A code object can be executed or evaluated by passing it (instead of a
source string) to the \keyword{exec} statement or the built-in
\function{eval()} function.
\stindex{exec}
\bifuncindex{eval}
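
Since a code object carries no reference to a global environment, the
same object can be evaluated against different namespaces. The sketch
below shows this with \function{compile()} and \function{eval()}; the
expression and the \code{'<example>'} file name are arbitrary choices
made for illustration.

```python
# Compile an expression once; 'eval' mode yields a code object that
# eval() can evaluate.
expr = compile('x * 2 + 1', '<example>', 'eval')

# The namespace is supplied at evaluation time, not at compile time.
first = eval(expr, {'x': 20})
second = eval(expr, {'x': 0})
```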

See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
information.


\subsubsection{Type Objects \label{bltin-type-objects}}

Type objects represent the various object types. An object's type is
accessed by the built-in function \function{type()}. There are no special
operations on types. The standard module \refmodule{types} defines names
for all standard built-in types.
\bifuncindex{type}
\refstmodindex{types}

Types are written like this: \code{<type 'int'>}.


\subsubsection{The Null Object \label{bltin-null-object}}

This object is returned by functions that don't explicitly return a
value. It supports no special operations. There is exactly one null
object, named \code{None} (a built-in name).

It is written as \code{None}.
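
For instance, calling a function whose body never executes a
\keyword{return} statement yields \code{None}. The function name below is
hypothetical.

```python
def no_return():
    # The body finishes without executing a return statement.
    pass

result = no_return()

# There is exactly one null object, so an identity test is appropriate.
is_null = result is None
```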


\subsubsection{The Ellipsis Object \label{bltin-ellipsis-object}}

This object is used by extended slice notation (see the
\citetitle[../ref/ref.html]{Python Reference Manual}). It supports no
special operations. There is exactly one ellipsis object, named
\constant{Ellipsis} (a built-in name).

It is written as \code{Ellipsis}.

\subsubsection{Boolean Values}

Boolean values are the two constant objects \code{False} and
\code{True}. They are used to represent truth values (although other
values can also be considered false or true). In numeric contexts
(for example when used as the argument to an arithmetic operator),
they behave like the integers 0 and 1, respectively. The built-in
function \function{bool()} can be used to cast any value to a Boolean,
if the value can be interpreted as a truth value (see section Truth
Value Testing above).

They are written as \code{False} and \code{True}, respectively.
\index{False}
\index{True}
\indexii{Boolean}{values}
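
A short sketch of both behaviours: Boolean values taking part in
arithmetic, and \function{bool()} applied to values of other types. The
sample values are arbitrary.

```python
# In numeric contexts False and True behave like 0 and 1.
total = True + True
scaled = False * 10

# bool() maps any value to its truth value.
flags = [bool(0), bool(''), bool([]), bool(3), bool('x'), bool([0])]

# Summing comparison results counts how many of them are true.
count = sum([n > 2 for n in [1, 3, 5]])
```

Here \code{total} is 2, \code{scaled} is 0, the first three entries of
\code{flags} are false and the last three are true, and \code{count} is 2.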


\subsubsection{Internal Objects \label{typesinternal}}

See the \citetitle[../ref/ref.html]{Python Reference Manual} for this
information. It describes stack frame objects, traceback objects, and
slice objects.


\subsection{Special Attributes \label{specialattrs}}

The implementation adds a few special read-only attributes to several
object types, where they are relevant. Some of these are not reported
by the \function{dir()} built-in function.

\begin{memberdesc}[object]{__dict__}
A dictionary or other mapping object used to store an
object's (writable) attributes.
\end{memberdesc}

\begin{memberdesc}[object]{__methods__}
\deprecated{2.2}{Use the built-in function \function{dir()} to get a
list of an object's attributes. This attribute is no longer available.}
\end{memberdesc}

\begin{memberdesc}[object]{__members__}
\deprecated{2.2}{Use the built-in function \function{dir()} to get a
list of an object's attributes. This attribute is no longer available.}
\end{memberdesc}

\begin{memberdesc}[instance]{__class__}
The class to which a class instance belongs.
\end{memberdesc}

\begin{memberdesc}[class]{__bases__}
The tuple of base classes of a class object. If there are no base
classes, this will be an empty tuple.
\end{memberdesc}

\begin{memberdesc}[class]{__name__}
The name of the class or type.
\end{memberdesc}
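
The attributes above can be inspected directly. The sketch below uses two
hypothetical classes, \class{Base} and \class{Derived}, both deriving from
\class{object}; the class names and the \code{colour} attribute are
assumptions made for illustration.

```python
class Base(object):
    pass

class Derived(Base):
    pass

d = Derived()
d.colour = 'red'                   # stored in the instance's __dict__

instance_class = d.__class__       # the class the instance belongs to
bases = Derived.__bases__          # tuple of direct base classes
root_bases = object.__bases__      # object itself has no base classes
class_name = Derived.__name__
attributes = d.__dict__
```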