\section{Built-in Types \label{types}}

The following sections describe the standard types that are built into
the interpreter. Historically, Python's built-in types have differed
from user-defined types because it was not possible to use the built-in
types as the basis for object-oriented inheritance. With the 2.2
release this situation has started to change, although the intended
unification of user-defined and built-in types is as yet far from
complete.

The principal built-in types are numerics, sequences, mappings, files,
classes, instances and exceptions.
\indexii{built-in}{types}

Some operations are supported by several object types; in particular,
practically all objects can be compared, tested for truth value,
and converted to a string (with the \code{`\textrm{\ldots}`} notation,
the equivalent \function{repr()} function, or the slightly different
\function{str()} function). The latter
function is implicitly used when an object is written by the
\keyword{print}\stindex{print} statement.
(Information on the \ulink{\keyword{print} statement}{../ref/print.html}
and other language statements can be found in the
\citetitle[../ref/ref.html]{Python Reference Manual} and the
\citetitle[../tut/tut.html]{Python Tutorial}.)


\subsection{Truth Value Testing\label{truth}}

Any object can be tested for truth value, for use in an \keyword{if} or
\keyword{while} condition or as operand of the Boolean operations below.
The following values are considered false:
\stindex{if}
\stindex{while}
\indexii{truth}{value}
\indexii{Boolean}{operations}
\index{false}

\begin{itemize}

\item \code{None}
  \withsubitem{(Built-in object)}{\ttindex{None}}

\item \code{False}
  \withsubitem{(Built-in object)}{\ttindex{False}}

\item zero of any numeric type, for example, \code{0}, \code{0L},
  \code{0.0}, \code{0j}.

\item any empty sequence, for example, \code{''}, \code{()}, \code{[]}.

\item any empty mapping, for example, \code{\{\}}.

\item instances of user-defined classes, if the class defines a
  \method{__nonzero__()} or \method{__len__()} method, when that
  method returns the integer zero or \class{bool} value
  \code{False}.\footnote{Additional
information on these special methods may be found in the
\citetitle[../ref/ref.html]{Python Reference Manual}.}

\end{itemize}

All other values are considered true --- so objects of many types are
always true.
\index{true}

Operations and built-in functions that have a Boolean result always
return \code{0} or \code{False} for false and \code{1} or \code{True}
for true, unless otherwise stated. (Important exception: the Boolean
operations \samp{or}\opindex{or} and \samp{and}\opindex{and} always
return one of their operands.)
\index{False}
\index{True}
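
The rules above can be checked with a short script. This is only a
sketch: \class{Basket} is a hypothetical class made up for the
illustration, it relies on \method{__len__()} rather than
\method{__nonzero__()}, and \function{bool()} is used simply to make
each truth value visible.

\begin{verbatim}
class Basket:
    """Hypothetical container; its truth value follows its length."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


# Every value in this list belongs to one of the false categories above.
falsy = [None, False, 0, 0.0, 0j, '', (), [], {}, Basket([])]
all_false = [bool(value) for value in falsy]

# A non-empty instance, like most other objects, is true.
full = Basket(['egg'])
\end{verbatim}

Here every entry of \code{all_false} is \code{False}, while
\code{bool(full)} is \code{True}.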

\subsection{Boolean Operations \label{boolean}}

These are the Boolean operations, ordered by ascending priority:
\indexii{Boolean}{operations}

\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  \lineiii{\var{x} or \var{y}}
          {if \var{x} is false, then \var{y}, else \var{x}}{(1)}
  \lineiii{\var{x} and \var{y}}
          {if \var{x} is false, then \var{x}, else \var{y}}{(1)}
  \hline
  \lineiii{not \var{x}}
          {if \var{x} is false, then \code{True}, else \code{False}}{(2)}
\end{tableiii}
\opindex{and}
\opindex{or}
\opindex{not}

\noindent
Notes:

\begin{description}

\item[(1)]
These only evaluate their second argument if needed for their outcome.

\item[(2)]
\samp{not} has a lower priority than non-Boolean operators, so
\code{not \var{a} == \var{b}} is interpreted as \code{not (\var{a} ==
\var{b})}, and \code{\var{a} == not \var{b}} is a syntax error.

\end{description}
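
As an illustration of the table and both notes, the following sketch
records which operands are actually evaluated. The helper
\function{record()} and all variable names are hypothetical and serve
only this example.

\begin{verbatim}
calls = []

def record(value):
    """Hypothetical helper: note that it was called, then return value."""
    calls.append(value)
    return value

# 'or' and 'and' return one of their operands, not necessarily a Boolean.
first = [] or 'default'     # left operand is false: the right one is returned
second = 'abc' and 0        # left operand is true: the right one is returned

# The second operand is evaluated only when it is needed.
skipped = record(0) and record(1)   # record(1) is never called
taken = record(0) or record(2)      # record(2) is called

# 'not' binds less tightly than '==', so this means not (1 == 2).
negated = not 1 == 2
\end{verbatim}

After this runs, \code{calls} holds \code{[0, 0, 2]}: the value
\code{1} never appears because the first \samp{and} stopped at its
false left operand.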


\subsection{Comparisons \label{comparisons}}

Comparison operations are supported by all objects. They all have the
same priority (which is higher than that of the Boolean operations).
Comparisons can be chained arbitrarily; for example, \code{\var{x} <
\var{y} <= \var{z}} is equivalent to \code{\var{x} < \var{y} and
\var{y} <= \var{z}}, except that \var{y} is evaluated only once (but
in both cases \var{z} is not evaluated at all when \code{\var{x} <
\var{y}} is found to be false).
\indexii{chaining}{comparisons}
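
The evaluation order of a chained comparison can be observed with a
small sketch. The helper \function{probe()} and the list
\code{evaluated} are hypothetical names introduced only to log which
operands are evaluated.

\begin{verbatim}
evaluated = []

def probe(name, value):
    """Hypothetical helper: log the operand name and return its value."""
    evaluated.append(name)
    return value

# x < y is true, so y is compared with z; y is still evaluated only once.
in_order = probe('x', 1) < probe('y', 2) <= probe('z', 2)
first_log = list(evaluated)

# x < y is false, so z is never evaluated.
del evaluated[:]
out_of_order = probe('x', 3) < probe('y', 2) <= probe('z', 5)
second_log = list(evaluated)
\end{verbatim}

The first log is \code{['x', 'y', 'z']} and the second is
\code{['x', 'y']}.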

This table summarizes the comparison operations:

\begin{tableiii}{c|l|c}{code}{Operation}{Meaning}{Notes}
  \lineiii{<}{strictly less than}{}
  \lineiii{<=}{less than or equal}{}
  \lineiii{>}{strictly greater than}{}
  \lineiii{>=}{greater than or equal}{}
  \lineiii{==}{equal}{}
  \lineiii{!=}{not equal}{(1)}
  \lineiii{<>}{not equal}{(1)}
  \lineiii{is}{object identity}{}
  \lineiii{is not}{negated object identity}{}
\end{tableiii}
\indexii{operator}{comparison}
\opindex{==} % XXX *All* others have funny characters < ! >
\opindex{is}
\opindex{is not}

\noindent
Notes:

\begin{description}

\item[(1)]
\code{<>} and \code{!=} are alternate spellings for the same operator.
\code{!=} is the preferred spelling; \code{<>} is obsolescent.

\end{description}

Objects of different types, except different numeric types and
different string types, never
compare equal; such objects are ordered consistently but arbitrarily
(so that sorting a heterogeneous array yields a consistent result).
Furthermore, some types (for example, file objects) support only a
degenerate notion of comparison where any two objects of that type are
unequal. Again, such objects are ordered arbitrarily but
consistently. The \code{<}, \code{<=}, \code{>} and \code{>=}
operators will raise a \exception{TypeError} exception when any operand
is a complex number.
\indexii{object}{numeric}
\indexii{objects}{comparing}

Instances of a class normally compare as non-equal unless the class
\withsubitem{(instance method)}{\ttindex{__cmp__()}}
defines the \method{__cmp__()} method. Refer to the
\citetitle[../ref/customization.html]{Python Reference Manual} for
information on the use of this method to effect object comparisons.

\strong{Implementation note:} Objects of different types except
numbers are ordered by their type names; objects of the same types
that don't support proper comparison are ordered by their address.

Two more operations with the same syntactic priority,
\samp{in}\opindex{in} and \samp{not in}\opindex{not in}, are supported
only by sequence types (below).


\subsection{Numeric Types \label{typesnumeric}}

There are four distinct numeric types: \dfn{plain integers},
\dfn{long integers},
\dfn{floating point numbers}, and \dfn{complex numbers}.
In addition, Booleans are a subtype of plain integers.
Plain integers (also just called \dfn{integers})
are implemented using \ctype{long} in C, which gives them at least 32
bits of precision. Long integers have unlimited precision. Floating
point numbers are implemented using \ctype{double} in C. All bets on
their precision are off unless you happen to know the machine you are
working with.
\obindex{numeric}
\obindex{Boolean}
\obindex{integer}
\obindex{long integer}
\obindex{floating point}
\obindex{complex number}
\indexii{C}{language}

Complex numbers have a real and imaginary part, which are each
implemented using \ctype{double} in C. To extract these parts from
a complex number \var{z}, use \code{\var{z}.real} and \code{\var{z}.imag}.

Numbers are created by numeric literals or as the result of built-in
functions and operators. Unadorned integer literals (including hex
and octal numbers) yield plain integers unless the value they denote
is too large to be represented as a plain integer, in which case
they yield a long integer. Integer literals with an
\character{L} or \character{l} suffix yield long integers
(\character{L} is preferred because \samp{1l} looks too much like
eleven!). Numeric literals containing a decimal point or an exponent
sign yield floating point numbers. Appending \character{j} or
\character{J} to a numeric literal yields a complex number with a
zero real part. A complex numeric literal is the sum of a real and
an imaginary part.
\indexii{numeric}{literals}
\indexii{integer}{literals}
\indexiii{long}{integer}{literals}
\indexii{floating point}{literals}
\indexii{complex number}{literals}
\indexii{hexadecimal}{literals}
\indexii{octal}{literals}

Python fully supports mixed arithmetic: when a binary arithmetic
operator has operands of different numeric types, the operand with the
``narrower'' type is widened to that of the other, where plain
integer is narrower than long integer is narrower than floating point is
narrower than complex.
Comparisons between numbers of mixed type use the same rule.\footnote{
  As a consequence, the list \code{[1, 2]} is considered equal
  to \code{[1.0, 2.0]}, and similarly for tuples.
} The constructors \function{int()}, \function{long()}, \function{float()},
and \function{complex()} can be used
to produce numbers of a specific type.
\index{arithmetic}
\bifuncindex{int}
\bifuncindex{long}
\bifuncindex{float}
\bifuncindex{complex}
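
The widening rule can be seen in a few lines. This is a minimal
sketch with arbitrary variable names; it leaves out long integers so
that it does not depend on the \character{L} literal suffix.

\begin{verbatim}
# The narrower operand is widened to the type of the other one
# before the operation is carried out.
int_plus_float = 1 + 2.5          # integer widened to floating point
float_plus_complex = 1.5 + 2j     # floating point widened to complex

# Comparisons between numbers of mixed type follow the same rule.
same_value = (1 == 1.0)
same_lists = ([1, 2] == [1.0, 2.0])

# The constructors produce a number of a specific type.
as_float = float(3)
as_complex = complex(3, 4)
\end{verbatim}

Here \code{int_plus_float} is the floating point number \code{3.5},
and both comparisons are true.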

All numeric types (except complex) support the following operations,
sorted by ascending priority (operations in the same box have the same
priority; all numeric operations have a higher priority than
comparison operations):

\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  \lineiii{\var{x} + \var{y}}{sum of \var{x} and \var{y}}{}
  \lineiii{\var{x} - \var{y}}{difference of \var{x} and \var{y}}{}
  \hline
  \lineiii{\var{x} * \var{y}}{product of \var{x} and \var{y}}{}
  \lineiii{\var{x} / \var{y}}{quotient of \var{x} and \var{y}}{(1)}
  \lineiii{\var{x} \%{} \var{y}}{remainder of \code{\var{x} / \var{y}}}{(4)}
  \hline
  \lineiii{-\var{x}}{\var{x} negated}{}
  \lineiii{+\var{x}}{\var{x} unchanged}{}
  \hline
  \lineiii{abs(\var{x})}{absolute value or magnitude of \var{x}}{}
  \lineiii{int(\var{x})}{\var{x} converted to integer}{(2)}
  \lineiii{long(\var{x})}{\var{x} converted to long integer}{(2)}
  \lineiii{float(\var{x})}{\var{x} converted to floating point}{}
  \lineiii{complex(\var{re},\var{im})}{a complex number with real part \var{re}, imaginary part \var{im}. \var{im} defaults to zero.}{}
  \lineiii{\var{c}.conjugate()}{conjugate of the complex number \var{c}}{}
  \lineiii{divmod(\var{x}, \var{y})}{the pair \code{(\var{x} / \var{y}, \var{x} \%{} \var{y})}}{(3)(4)}
  \lineiii{pow(\var{x}, \var{y})}{\var{x} to the power \var{y}}{}
  \lineiii{\var{x} ** \var{y}}{\var{x} to the power \var{y}}{}
\end{tableiii}
\indexiii{operations on}{numeric}{types}
\withsubitem{(complex number method)}{\ttindex{conjugate()}}

\noindent
Notes:
\begin{description}

\item[(1)]
For (plain or long) integer division, the result is an integer.
The result is always rounded towards minus infinity: 1/2 is 0,
(-1)/2 is -1, 1/(-2) is -1, and (-1)/(-2) is 0. Note that the result
is a long integer if either operand is a long integer, regardless of
the numeric value.
\indexii{integer}{division}
\indexiii{long}{integer}{division}

\item[(2)]
Conversion from floating point to (long or plain) integer may round or
truncate as in C; see functions \function{floor()} and
\function{ceil()} in the \refmodule{math}\refbimodindex{math} module
for well-defined conversions.
\withsubitem{(in module math)}{\ttindex{floor()}\ttindex{ceil()}}
\indexii{numeric}{conversions}
\indexii{C}{language}

\item[(3)]
See section \ref{built-in-funcs}, ``Built-in Functions,'' for a full
description.

\item[(4)]
The floor division operator, the modulo operator, and
\function{divmod()} are deprecated for complex operands.

\deprecated{2.3}{Instead convert to float using \function{abs()}
if appropriate.}

\end{description}
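
The rounding rule in note (1) and the pair returned by
\function{divmod()} can be illustrated as follows. The sketch assumes
the \code{//} floor division operator (available since Python 2.2) in
place of \code{/}, so that the results do not depend on whether true
division is in effect; the variable names are arbitrary.

\begin{verbatim}
# Integer division rounds towards minus infinity, whatever the signs.
quotients = [1 // 2, (-1) // 2, 1 // (-2), (-1) // (-2)]

# divmod() pairs the quotient with the remainder; the two always
# satisfy x == q * y + r.
x, y = -7, 2
q, r = divmod(x, y)
\end{verbatim}

The list \code{quotients} is \code{[0, -1, -1, 0]}, matching the four
cases in note (1), and \code{divmod(-7, 2)} gives \code{(-4, 1)}.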
% XXXJH exceptions: overflow (when? what operations?) zerodivision

\subsubsection{Bit-string Operations on Integer Types \label{bitstring-ops}}
\nodename{Bit-string Operations}

Plain and long integer types support additional operations that make
sense only for bit-strings. Negative numbers are treated as their 2's
complement value (for long integers, this assumes a sufficiently large
number of bits that no overflow occurs during the operation).

The priorities of the binary bit-wise operations are all lower than
the numeric operations and higher than the comparisons; the unary
operation \samp{\~} has the same priority as the other unary numeric
operations (\samp{+} and \samp{-}).

This table lists the bit-string operations sorted in ascending
priority (operations in the same box have the same priority):

\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  \lineiii{\var{x} | \var{y}}{bitwise \dfn{or} of \var{x} and \var{y}}{}
  \lineiii{\var{x} \^{} \var{y}}{bitwise \dfn{exclusive or} of \var{x} and \var{y}}{}
  \lineiii{\var{x} \&{} \var{y}}{bitwise \dfn{and} of \var{x} and \var{y}}{}
  % The empty groups below prevent conversion to guillemets.
  \lineiii{\var{x} <{}< \var{n}}{\var{x} shifted left by \var{n} bits}{(1), (2)}
  \lineiii{\var{x} >{}> \var{n}}{\var{x} shifted right by \var{n} bits}{(1), (3)}
  \hline
  \lineiii{\~\var{x}}{the bits of \var{x} inverted}{}
\end{tableiii}
\indexiii{operations on}{integer}{types}
\indexii{bit-string}{operations}
\indexii{shifting}{operations}
\indexii{masking}{operations}

\noindent
Notes:
\begin{description}
\item[(1)] Negative shift counts are illegal and cause a
\exception{ValueError} to be raised.
\item[(2)] A left shift by \var{n} bits is equivalent to
multiplication by \code{pow(2, \var{n})} without overflow check.
\item[(3)] A right shift by \var{n} bits is equivalent to
division by \code{pow(2, \var{n})} without overflow check.
\end{description}
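
The table and its notes can be tried out on a small value. This is a
sketch with arbitrarily chosen numbers and variable names; the
division in note (3) is written with the \code{//} floor division
operator, which is an assumption made so that the comparison holds
regardless of the division mode.

\begin{verbatim}
x, n = 13, 2         # 13 is 1101 in binary

left = x << n        # same as x * pow(2, n)
right = x >> n       # same as x // pow(2, n)

# Negative numbers behave as 2's complement values, so inverting
# the bits of x gives -x - 1.
inverted = ~x

low_bits = x & 3     # 1101 & 0011
merged = x | 2       # 1101 | 0010
toggled = x ^ 5      # 1101 ^ 0101

# A negative shift count is rejected.
negative_shift_rejected = False
try:
    1 << -1
except ValueError:
    negative_shift_rejected = True
\end{verbatim}

With these inputs, \code{left} is 52, \code{right} is 3,
\code{inverted} is -14, and the three masking results are 1, 15 and 8.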


\subsection{Iterator Types \label{typeiter}}

\versionadded{2.2}
\index{iterator protocol}
\index{protocol!iterator}
\index{sequence!iteration}
\index{container!iteration over}

Python supports a concept of iteration over containers. This is
implemented using two distinct methods; these are used to allow
user-defined classes to support iteration. Sequences, described below
in more detail, always support the iteration methods.

One method needs to be defined for container objects to provide
iteration support:

\begin{methoddesc}[container]{__iter__}{}
  Return an iterator object. The object is required to support the
  iterator protocol described below. If a container supports
  different types of iteration, additional methods can be provided to
  specifically request iterators for those iteration types. (An
  example of an object supporting multiple forms of iteration would be
  a tree structure which supports both breadth-first and depth-first
  traversal.) This method corresponds to the \member{tp_iter} slot of
  the type structure for Python objects in the Python/C API.
\end{methoddesc}

The iterator objects themselves are required to support the following
two methods, which together form the \dfn{iterator protocol}:

\begin{methoddesc}[iterator]{__iter__}{}
  Return the iterator object itself. This is required to allow both
  containers and iterators to be used with the \keyword{for} and
  \keyword{in} statements. This method corresponds to the
  \member{tp_iter} slot of the type structure for Python objects in
  the Python/C API.
\end{methoddesc}

\begin{methoddesc}[iterator]{next}{}
  Return the next item from the container. If there are no further
  items, raise the \exception{StopIteration} exception. This method
  corresponds to the \member{tp_iternext} slot of the type structure
  for Python objects in the Python/C API.
\end{methoddesc}

Python defines several iterator objects to support iteration over
general and specific sequence types, dictionaries, and other more
specialized forms. The specific types are not important beyond their
implementation of the iterator protocol.

The intention of the protocol is that once an iterator's
\method{next()} method raises \exception{StopIteration}, it will
continue to do so on subsequent calls. Implementations that
do not obey this property are deemed broken. (This constraint
was added in Python 2.3; in Python 2.2, various iterators are
broken according to this rule.)

Python's generators provide a convenient way to implement the
iterator protocol. If a container object's \method{__iter__()}
method is implemented as a generator, it will automatically
return an iterator object (technically, a generator object)
supplying the \method{__iter__()} and \method{next()} methods.
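
Both ways of supporting iteration might be sketched as follows. The
classes \class{Countdown} and \class{Shelf} are hypothetical and exist
only for this illustration. The \code{__next__} alias is not part of
the protocol described here; it is an addition so that the sketch also
runs on later Python versions, which spell the method that way.

\begin{verbatim}
class Countdown:
    """Hypothetical iterator that counts from start down to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself.
        return self

    def next(self):
        if self.current <= 0:
            # Raised again on every later call, as the protocol requires.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    # Alias for later Python versions (an assumption of this sketch).
    __next__ = next


class Shelf:
    """Hypothetical container whose __iter__() is a generator."""

    def __init__(self, books):
        self.books = list(books)

    def __iter__(self):
        for book in self.books:
            yield book


counted = list(Countdown(3))
titles = [book for book in Shelf(['a', 'b'])]
\end{verbatim}

Here \code{counted} is \code{[3, 2, 1]} and \code{titles} is
\code{['a', 'b']}. The generator version needs no explicit
\method{next()} method because the generator object supplies it.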
406
Fred Drake93656e72001-05-02 20:18:03 +0000407
Fred Drake7a2f0661998-09-10 18:25:58 +0000408\subsection{Sequence Types \label{typesseq}}
Fred Drake64e3b431998-07-24 13:56:11 +0000409
Fred Drake107b9672000-08-14 15:37:59 +0000410There are six sequence types: strings, Unicode strings, lists,
Fred Drake512bb722000-08-18 03:12:38 +0000411tuples, buffers, and xrange objects.
Fred Drake64e3b431998-07-24 13:56:11 +0000412
Steve Holden1e4519f2002-06-14 09:16:40 +0000413String literals are written in single or double quotes:
Fred Drake38e5d272000-04-03 20:13:55 +0000414\code{'xyzzy'}, \code{"frobozz"}. See chapter 2 of the
Fred Drake4de96c22000-08-12 03:36:23 +0000415\citetitle[../ref/strings.html]{Python Reference Manual} for more about
416string literals. Unicode strings are much like strings, but are
417specified in the syntax using a preceeding \character{u} character:
418\code{u'abc'}, \code{u"def"}. Lists are constructed with square brackets,
Fred Drake37f15741999-11-10 16:21:37 +0000419separating items with commas: \code{[a, b, c]}. Tuples are
420constructed by the comma operator (not within square brackets), with
421or without enclosing parentheses, but an empty tuple must have the
Raymond Hettingerb67449d2003-09-08 18:52:18 +0000422enclosing parentheses, such as \code{a, b, c} or \code{()}. A single
423item tuple must have a trailing comma, such as \code{(d,)}.
Fred Drake0b4e25d2000-10-04 04:21:19 +0000424\obindex{sequence}
425\obindex{string}
426\obindex{Unicode}
Fred Drake0b4e25d2000-10-04 04:21:19 +0000427\obindex{tuple}
428\obindex{list}
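
The list and tuple forms described above look like this in practice.
The names are placeholders chosen for the sketch.

\begin{verbatim}
a, b, c = 1, 2, 3

items = [a, b, c]     # list: square brackets
triple = a, b, c      # tuple: made by the commas, parentheses optional
same = (a, b, c)      # the same tuple with enclosing parentheses
empty = ()            # the empty tuple needs the parentheses
single = (a,)         # a one-item tuple needs the trailing comma
not_a_tuple = (a)     # without the comma this is just the value of a
\end{verbatim}

Note that \code{not_a_tuple} is the integer \code{1}, not a tuple,
which is why the trailing comma matters.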

Buffer objects are not directly supported by Python syntax, but can be
created by calling the builtin function
\function{buffer()}.\bifuncindex{buffer} They don't support
concatenation or repetition.
\obindex{buffer}

Xrange objects are similar to buffers in that there is no specific
syntax to create them, but they are created using the \function{xrange()}
function.\bifuncindex{xrange} They don't support slicing,
concatenation or repetition, and using \code{in}, \code{not in},
\function{min()} or \function{max()} on them is inefficient.
\obindex{xrange}

Most sequence types support the following operations. The \samp{in} and
\samp{not in} operations have the same priorities as the comparison
operations. The \samp{+} and \samp{*} operations have the same
priority as the corresponding numeric operations.\footnote{They must
have since the parser can't tell the type of the operands.}

This table lists the sequence operations sorted in ascending priority
(operations in the same box have the same priority). In the table,
\var{s} and \var{t} are sequences of the same type; \var{n}, \var{i}
and \var{j} are integers:

\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  \lineiii{\var{x} in \var{s}}{\code{True} if an item of \var{s} is equal to \var{x}, else \code{False}}{(1)}
  \lineiii{\var{x} not in \var{s}}{\code{False} if an item of \var{s} is
equal to \var{x}, else \code{True}}{(1)}
  \hline
  \lineiii{\var{s} + \var{t}}{the concatenation of \var{s} and \var{t}}{(6)}
  \lineiii{\var{s} * \var{n}\textrm{,} \var{n} * \var{s}}{\var{n} shallow copies of \var{s} concatenated}{(2)}
  \hline
  \lineiii{\var{s}[\var{i}]}{\var{i}'th item of \var{s}, origin 0}{(3)}
  \lineiii{\var{s}[\var{i}:\var{j}]}{slice of \var{s} from \var{i} to \var{j}}{(3), (4)}
  \lineiii{\var{s}[\var{i}:\var{j}:\var{k}]}{slice of \var{s} from \var{i} to \var{j} with step \var{k}}{(3), (5)}
  \hline
  \lineiii{len(\var{s})}{length of \var{s}}{}
  \lineiii{min(\var{s})}{smallest item of \var{s}}{}
  \lineiii{max(\var{s})}{largest item of \var{s}}{}
\end{tableiii}
\indexiii{operations on}{sequence}{types}
\bifuncindex{len}
\bifuncindex{min}
\bifuncindex{max}
\indexii{concatenation}{operation}
\indexii{repetition}{operation}
\indexii{subscript}{operation}
\indexii{slice}{operation}
\indexii{extended slice}{operation}
\opindex{in}
\opindex{not in}

\noindent
Notes:

\begin{description}
\item[(1)] When \var{s} is a string or Unicode string object the
\code{in} and \code{not in} operations act like a substring test. In
Python versions before 2.3, \var{x} had to be a string of length 1.
In Python 2.3 and beyond, \var{x} may be a string of any length.

\item[(2)] Values of \var{n} less than \code{0} are treated as
  \code{0} (which yields an empty sequence of the same type as
  \var{s}). Note also that the copies are shallow; nested structures
  are not copied. This often haunts new Python programmers; consider:

\begin{verbatim}
>>> lists = [[]] * 3
>>> lists
[[], [], []]
>>> lists[0].append(3)
>>> lists
[[3], [3], [3]]
\end{verbatim}

  What has happened is that \code{[[]]} is a one-element list containing
  an empty list, so all three elements of \code{[[]] * 3} are (pointers to)
  this single empty list. Modifying any of the elements of \code{lists}
  modifies this single list. You can create a list of different lists this
  way:

\begin{verbatim}
>>> lists = [[] for i in range(3)]
>>> lists[0].append(3)
>>> lists[1].append(5)
>>> lists[2].append(7)
>>> lists
[[3], [5], [7]]
\end{verbatim}

\item[(3)] If \var{i} or \var{j} is negative, the index is relative to
  the end of the sequence: \code{len(\var{s}) + \var{i}} or
  \code{len(\var{s}) + \var{j}} is substituted. But note that \code{-0} is
  still \code{0}.

\item[(4)] The slice of \var{s} from \var{i} to \var{j} is defined as
  the sequence of items with index \var{k} such that \code{\var{i} <=
  \var{k} < \var{j}}. If \var{i} or \var{j} is greater than
  \code{len(\var{s})}, use \code{len(\var{s})}. If \var{i} is omitted,
  use \code{0}. If \var{j} is omitted, use \code{len(\var{s})}. If
  \var{i} is greater than or equal to \var{j}, the slice is empty.

\item[(5)] The slice of \var{s} from \var{i} to \var{j} with step
  \var{k} is defined as the sequence of items with index
  \code{\var{x} = \var{i} + \var{n}*\var{k}} such that
  $0 \leq n < \frac{j-i}{k}$. In other words, the indices
  are \code{i}, \code{i+k}, \code{i+2*k}, \code{i+3*k} and so on, stopping when
  \var{j} is reached (but never including \var{j}). If \var{i} or \var{j}
  is greater than \code{len(\var{s})}, use \code{len(\var{s})}. If
  \var{i} or \var{j} is omitted then it becomes an ``end'' value
  (which end depends on the sign of \var{k}). Note that \var{k} cannot
  be zero.

\item[(6)] If \var{s} and \var{t} are both strings, some Python
implementations such as CPython can usually perform an in-place optimization
for assignments of the form \code{\var{s}=\var{s}+\var{t}} or
\code{\var{s}+=\var{t}}. When applicable, this optimization makes
quadratic run-time much less likely. This optimization is both version
and implementation dependent. For performance-sensitive code, it is
preferable to use the \method{str.join()} method, which assures consistent
linear concatenation performance across versions and implementations.
\versionchanged[Formerly, string concatenation never occurred in-place]{2.4}

\end{description}
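
Notes (2) through (5) can be followed on a short string. This is a
sketch with an arbitrary sample value; the step form of slicing on
strings assumes Python 2.3 or later.

\begin{verbatim}
s = 'abcdefgh'

last = s[-1]             # same as s[len(s) - 1]
middle = s[2:5]          # items with index 2, 3 and 4
clipped = s[5:100]       # an end beyond len(s) is replaced by len(s)
empty = s[5:2]           # start >= end gives an empty slice
every_other = s[1:7:2]   # indices 1, 3 and 5
backwards = s[::-1]      # omitted bounds become the "end" values

# A negative repeat count is treated as 0.
nothing = [1, 2] * -1
\end{verbatim}

The results are \code{'h'}, \code{'cde'}, \code{'fgh'}, \code{''},
\code{'bdf'} and \code{'hgfedcba'}, and \code{nothing} is an empty
list.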
554
Fred Drake9474d861999-02-12 22:05:33 +0000555
Fred Drake4de96c22000-08-12 03:36:23 +0000556\subsubsection{String Methods \label{string-methods}}
557
558These are the string methods which both 8-bit strings and Unicode
559objects support:
560
561\begin{methoddesc}[string]{capitalize}{}
562Return a copy of the string with only its first character capitalized.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000563
564For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000565\end{methoddesc}
566
\begin{methoddesc}[string]{center}{width\optional{, fillchar}}
Return the string centered in a string of length \var{width}.  Padding
is done using the specified \var{fillchar} (default is a space).
\versionchanged[Support for the \var{fillchar} argument]{2.4}
\end{methoddesc}
572
573\begin{methoddesc}[string]{count}{sub\optional{, start\optional{, end}}}
574Return the number of occurrences of substring \var{sub} in string
575S\code{[\var{start}:\var{end}]}. Optional arguments \var{start} and
576\var{end} are interpreted as in slice notation.
577\end{methoddesc}
578
Fred Drake6048ce92001-12-10 16:43:08 +0000579\begin{methoddesc}[string]{decode}{\optional{encoding\optional{, errors}}}
580Decodes the string using the codec registered for \var{encoding}.
581\var{encoding} defaults to the default string encoding. \var{errors}
582may be given to set a different error handling scheme. The default is
583\code{'strict'}, meaning that encoding errors raise
Walter Dörwaldac1075a2004-07-01 19:58:47 +0000584\exception{UnicodeError}. Other possible values are \code{'ignore'},
585\code{'replace'} and any other name registered via
586\function{codecs.register_error}.
Fred Drake6048ce92001-12-10 16:43:08 +0000587\versionadded{2.2}
Walter Dörwaldac1075a2004-07-01 19:58:47 +0000588\versionchanged[Support for other error handling schemes added]{2.3}
Fred Drake6048ce92001-12-10 16:43:08 +0000589\end{methoddesc}
590
Fred Drake4de96c22000-08-12 03:36:23 +0000591\begin{methoddesc}[string]{encode}{\optional{encoding\optional{,errors}}}
592Return an encoded version of the string. Default encoding is the current
593default string encoding. \var{errors} may be given to set a different
594error handling scheme. The default for \var{errors} is
595\code{'strict'}, meaning that encoding errors raise a
Walter Dörwaldac1075a2004-07-01 19:58:47 +0000596\exception{UnicodeError}. Other possible values are \code{'ignore'},
597\code{'replace'}, \code{'xmlcharrefreplace'}, \code{'backslashreplace'}
598and any other name registered via \function{codecs.register_error}.
599For a list of possible encodings, see section~\ref{standard-encodings}.
Fred Drake1dba66c2000-10-25 21:03:55 +0000600\versionadded{2.0}
Walter Dörwaldac1075a2004-07-01 19:58:47 +0000601\versionchanged[Support for \code{'xmlcharrefreplace'} and
602\code{'backslashreplace'} and other error handling schemes added]{2.3}
Fred Drake4de96c22000-08-12 03:36:23 +0000603\end{methoddesc}
604
605\begin{methoddesc}[string]{endswith}{suffix\optional{, start\optional{, end}}}
Michael W. Hudson9c206152003-03-05 14:42:09 +0000606Return \code{True} if the string ends with the specified \var{suffix},
607otherwise return \code{False}. With optional \var{start}, test beginning at
Fred Drake4de96c22000-08-12 03:36:23 +0000608that position. With optional \var{end}, stop comparing at that position.
609\end{methoddesc}
610
611\begin{methoddesc}[string]{expandtabs}{\optional{tabsize}}
612Return a copy of the string where all tab characters are expanded
613using spaces. If \var{tabsize} is not given, a tab size of \code{8}
614characters is assumed.
615\end{methoddesc}
616
617\begin{methoddesc}[string]{find}{sub\optional{, start\optional{, end}}}
618Return the lowest index in the string where substring \var{sub} is
619found, such that \var{sub} is contained in the range [\var{start},
620\var{end}). Optional arguments \var{start} and \var{end} are
621interpreted as in slice notation. Return \code{-1} if \var{sub} is
622not found.
623\end{methoddesc}
624
625\begin{methoddesc}[string]{index}{sub\optional{, start\optional{, end}}}
626Like \method{find()}, but raise \exception{ValueError} when the
627substring is not found.
628\end{methoddesc}
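
The difference between \method{find()} and \method{index()} can be
sketched as below.  The sample string and variable names are
illustrative only.

```python
text = "spam and eggs"

found_at = text.find("eggs")   # lowest index where the substring occurs
not_found = text.find("ham")   # find() signals failure with -1

try:
    text.index("ham")          # index() raises ValueError instead
    index_raised = False
except ValueError:
    index_raised = True

# start (and end) restrict the search as in slice notation.
in_range = text.find("a", 3)
```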
629
630\begin{methoddesc}[string]{isalnum}{}
631Return true if all characters in the string are alphanumeric and there
632is at least one character, false otherwise.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000633
634For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000635\end{methoddesc}
636
637\begin{methoddesc}[string]{isalpha}{}
638Return true if all characters in the string are alphabetic and there
639is at least one character, false otherwise.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000640
641For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000642\end{methoddesc}
643
644\begin{methoddesc}[string]{isdigit}{}
Martin v. Löwis6828e182003-10-18 09:55:08 +0000645Return true if all characters in the string are digits and there
646is at least one character, false otherwise.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000647
648For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000649\end{methoddesc}
650
651\begin{methoddesc}[string]{islower}{}
652Return true if all cased characters in the string are lowercase and
653there is at least one cased character, false otherwise.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000654
655For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000656\end{methoddesc}
657
658\begin{methoddesc}[string]{isspace}{}
659Return true if there are only whitespace characters in the string and
Martin v. Löwis6828e182003-10-18 09:55:08 +0000660there is at least one character, false otherwise.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000661
662For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000663\end{methoddesc}
664
665\begin{methoddesc}[string]{istitle}{}
Martin v. Löwis6828e182003-10-18 09:55:08 +0000666Return true if the string is a titlecased string and there is at least one
Raymond Hettinger0a9b9da2003-10-29 06:54:43 +0000667character, for example uppercase characters may only follow uncased
Martin v. Löwis6828e182003-10-18 09:55:08 +0000668characters and lowercase characters only cased ones. Return false
669otherwise.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000670
671For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000672\end{methoddesc}
673
674\begin{methoddesc}[string]{isupper}{}
675Return true if all cased characters in the string are uppercase and
676there is at least one cased character, false otherwise.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000677
678For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000679\end{methoddesc}
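
A short sketch of how the predicate methods above treat empty strings
and uncased characters.  It uses plain ASCII input, so the locale caveat
should not come into play; the variable names are illustrative.

```python
all_letters = "abc".isalpha()         # every character is alphabetic
mixed = "abc1".isalpha()              # the digit makes this false
digits = "123".isdigit()
empty = "".isdigit()                  # at least one character is required

lower_with_digit = "abc1".islower()   # only cased characters are checked
no_cased = "123".islower()            # no cased character at all

title_ok = "Hello World".istitle()
title_bad = "Hello world".istitle()   # lowercase letter follows a space
```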
680
681\begin{methoddesc}[string]{join}{seq}
682Return a string which is the concatenation of the strings in the
683sequence \var{seq}. The separator between elements is the string
684providing this method.
685\end{methoddesc}
686
Raymond Hettinger4f8f9762003-11-26 08:21:35 +0000687\begin{methoddesc}[string]{ljust}{width\optional{, fillchar}}
Fred Drake4de96c22000-08-12 03:36:23 +0000688Return the string left justified in a string of length \var{width}.
Raymond Hettinger4f8f9762003-11-26 08:21:35 +0000689Padding is done using the specified \var{fillchar} (default is a
690space). The original string is returned if
Fred Drake4de96c22000-08-12 03:36:23 +0000691\var{width} is less than \code{len(\var{s})}.
Neal Norwitz72452652003-11-26 14:54:56 +0000692\versionchanged[Support for the \var{fillchar} argument]{2.4}
Fred Drake4de96c22000-08-12 03:36:23 +0000693\end{methoddesc}
694
695\begin{methoddesc}[string]{lower}{}
696Return a copy of the string converted to lowercase.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000697
698For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000699\end{methoddesc}
700
Fred Drake8b1c47b2002-04-13 02:43:39 +0000701\begin{methoddesc}[string]{lstrip}{\optional{chars}}
702Return a copy of the string with leading characters removed. If
703\var{chars} is omitted or \code{None}, whitespace characters are
704removed. If given and not \code{None}, \var{chars} must be a string;
705the characters in the string will be stripped from the beginning of
706the string this method is called on.
Fred Drake91718012002-11-16 00:41:55 +0000707\versionchanged[Support for the \var{chars} argument]{2.2.2}
Fred Drake4de96c22000-08-12 03:36:23 +0000708\end{methoddesc}
709
Fred Draked22bb652003-10-22 02:56:40 +0000710\begin{methoddesc}[string]{replace}{old, new\optional{, count}}
Fred Drake4de96c22000-08-12 03:36:23 +0000711Return a copy of the string with all occurrences of substring
712\var{old} replaced by \var{new}. If the optional argument
Fred Draked22bb652003-10-22 02:56:40 +0000713\var{count} is given, only the first \var{count} occurrences are
Fred Drake4de96c22000-08-12 03:36:23 +0000714replaced.
715\end{methoddesc}
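
As a small illustration of the \var{count} argument (the sample string
is arbitrary):

```python
original = "aaaa"

all_replaced = original.replace("a", "b")      # every occurrence
first_two = original.replace("a", "b", 2)      # only the first two

# A copy is returned; the original string is left untouched.
still_original = original
```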
716
\begin{methoddesc}[string]{rfind}{sub\optional{, start\optional{, end}}}
Return the highest index in the string where substring \var{sub} is
found, such that \var{sub} is contained within
\code{\var{s}[\var{start}:\var{end}]}.  Optional
arguments \var{start} and \var{end} are interpreted as in slice
notation.  Return \code{-1} on failure.
\end{methoddesc}
723
724\begin{methoddesc}[string]{rindex}{sub\optional{, start\optional{, end}}}
725Like \method{rfind()} but raises \exception{ValueError} when the
726substring \var{sub} is not found.
727\end{methoddesc}
728
Raymond Hettinger4f8f9762003-11-26 08:21:35 +0000729\begin{methoddesc}[string]{rjust}{width\optional{, fillchar}}
Fred Drake4de96c22000-08-12 03:36:23 +0000730Return the string right justified in a string of length \var{width}.
Raymond Hettinger4f8f9762003-11-26 08:21:35 +0000731Padding is done using the specified \var{fillchar} (default is a space).
732The original string is returned if
Fred Drake4de96c22000-08-12 03:36:23 +0000733\var{width} is less than \code{len(\var{s})}.
Neal Norwitz72452652003-11-26 14:54:56 +0000734\versionchanged[Support for the \var{fillchar} argument]{2.4}
Fred Drake4de96c22000-08-12 03:36:23 +0000735\end{methoddesc}
736
Hye-Shik Changc6f066f2003-12-17 02:49:03 +0000737\begin{methoddesc}[string]{rsplit}{\optional{sep \optional{,maxsplit}}}
738Return a list of the words in the string, using \var{sep} as the
739delimiter string. If \var{maxsplit} is given, at most \var{maxsplit}
Fred Drake401d1e32003-12-30 22:21:18 +0000740splits are done, the \emph{rightmost} ones. If \var{sep} is not specified
Hye-Shik Changc6f066f2003-12-17 02:49:03 +0000741or \code{None}, any whitespace string is a separator.
Hye-Shik Chang3ae811b2003-12-15 18:49:53 +0000742\versionadded{2.4}
743\end{methoddesc}
744
Fred Drake8b1c47b2002-04-13 02:43:39 +0000745\begin{methoddesc}[string]{rstrip}{\optional{chars}}
746Return a copy of the string with trailing characters removed. If
747\var{chars} is omitted or \code{None}, whitespace characters are
748removed. If given and not \code{None}, \var{chars} must be a string;
749the characters in the string will be stripped from the end of the
750string this method is called on.
Fred Drake91718012002-11-16 00:41:55 +0000751\versionchanged[Support for the \var{chars} argument]{2.2.2}
Fred Drake4de96c22000-08-12 03:36:23 +0000752\end{methoddesc}
753
\begin{methoddesc}[string]{split}{\optional{sep\optional{, maxsplit}}}
Return a list of the words in the string, using \var{sep} as the
delimiter string.  If \var{maxsplit} is given, at most \var{maxsplit}
splits are done (thus, the list will have at most \code{\var{maxsplit}+1}
elements).  If \var{maxsplit} is not specified, then there
is no limit on the number of splits (all possible splits are made).
Consecutive delimiters are not grouped together and are
deemed to delimit empty strings (for example, \samp{'1,,2'.split(',')}
returns \samp{['1', '', '2']}).  The \var{sep} argument may consist of
multiple characters (for example, \samp{'1, 2, 3'.split(', ')} returns
\samp{['1', '2', '3']}).  Splitting an empty string with a specified
separator returns \samp{['']}.

If \var{sep} is not specified or is \code{None}, a different splitting
algorithm is applied.  Words are separated by arbitrary length strings of
whitespace characters (spaces, tabs, newlines, returns, and formfeeds).
Consecutive whitespace delimiters are treated as a single delimiter
(\samp{'1 2 3'.split()} returns \samp{['1', '2', '3']}).  Splitting an
empty string returns an empty list.
\end{methoddesc}
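
The two splitting algorithms, and the difference between
\method{split()} and \method{rsplit()} when \var{maxsplit} is given, can
be sketched as follows.  The input strings are illustrative only.

```python
# Explicit separator: consecutive delimiters produce empty strings.
fields = "1,,2".split(",")

# maxsplit limits the number of splits; split() works from the left,
# rsplit() from the right.
limited = "a,b,c".split(",", 1)
from_right = "a,b,c".rsplit(",", 1)

# A separator may be longer than one character.
multi = "1, 2, 3".split(", ")

# No separator: runs of whitespace count as a single delimiter.
words = " 1   2  3 ".split()
```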
774
775\begin{methoddesc}[string]{splitlines}{\optional{keepends}}
776Return a list of the lines in the string, breaking at line
777boundaries. Line breaks are not included in the resulting list unless
778\var{keepends} is given and true.
779\end{methoddesc}
780
Fred Drake8b1c47b2002-04-13 02:43:39 +0000781\begin{methoddesc}[string]{startswith}{prefix\optional{,
782 start\optional{, end}}}
Michael W. Hudson9c206152003-03-05 14:42:09 +0000783Return \code{True} if string starts with the \var{prefix}, otherwise
784return \code{False}. With optional \var{start}, test string beginning at
Fred Drake4de96c22000-08-12 03:36:23 +0000785that position. With optional \var{end}, stop comparing string at that
786position.
787\end{methoddesc}
788
\begin{methoddesc}[string]{strip}{\optional{chars}}
Return a copy of the string with leading and trailing characters
removed.  If \var{chars} is omitted or \code{None}, whitespace
characters are removed.  If given and not \code{None}, \var{chars}
must be a string; the characters in the string will be stripped from
both ends of the string this method is called on.
\versionchanged[Support for the \var{chars} argument]{2.2.2}
\end{methoddesc}
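
A brief sketch of the three stripping methods.  Note that \var{chars}
is treated as a set of characters to remove, not as a prefix or suffix;
the sample strings are arbitrary.

```python
padded = "   spacious   "

both = padded.strip()      # whitespace removed from both ends
left = padded.lstrip()     # leading whitespace only
right = padded.rstrip()    # trailing whitespace only

# Every leading and trailing character found in chars is removed,
# in whatever order it appears.
domain = "www.example.com".strip("cmowz.")
```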
797
798\begin{methoddesc}[string]{swapcase}{}
799Return a copy of the string with uppercase characters converted to
800lowercase and vice versa.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000801
802For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000803\end{methoddesc}
804
805\begin{methoddesc}[string]{title}{}
Fred Drake907e76b2001-07-06 20:30:11 +0000806Return a titlecased version of the string: words start with uppercase
Fred Drake4de96c22000-08-12 03:36:23 +0000807characters, all remaining cased characters are lowercase.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000808
809For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000810\end{methoddesc}
811
\begin{methoddesc}[string]{translate}{table\optional{, deletechars}}
Return a copy of the string where all characters occurring in the
optional argument \var{deletechars} are removed, and the remaining
characters have been mapped through the given translation table, which
must be a string of length 256.

For Unicode objects, the \method{translate()} method does not
accept the optional \var{deletechars} argument.  Instead, it
returns a copy of \var{s} where all characters have been mapped
through the given translation table, which must be a mapping of
Unicode ordinals to Unicode ordinals, Unicode strings or \code{None}.
Unmapped characters are left untouched.  Characters mapped to \code{None}
are deleted.  Note, a more flexible approach is to create a custom
character mapping codec using the \refmodule{codecs} module (see
\module{encodings.cp1251} for an example).
\end{methoddesc}
828
829\begin{methoddesc}[string]{upper}{}
830Return a copy of the string converted to uppercase.
Martin v. Löwis4a9b8062004-06-03 09:47:01 +0000831
832For 8-bit strings, this method is locale-dependent.
Fred Drake4de96c22000-08-12 03:36:23 +0000833\end{methoddesc}
834
Walter Dörwald068325e2002-04-15 13:36:47 +0000835\begin{methoddesc}[string]{zfill}{width}
836Return the numeric string left filled with zeros in a string
837of length \var{width}. The original string is returned if
838\var{width} is less than \code{len(\var{s})}.
Fred Drakee55bec22002-11-16 00:44:00 +0000839\versionadded{2.2.2}
Walter Dörwald068325e2002-04-15 13:36:47 +0000840\end{methoddesc}
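
The padding methods can be compared side by side.  This sketch assumes
an interpreter that supports the \var{fillchar} argument; the values are
illustrative.

```python
centered = "ab".center(6, "*")
left = "ab".ljust(5, ".")
right = "ab".rjust(5)             # default fill character is a space
unchanged = "abcdef".ljust(3)     # width smaller than the string length
zero_filled = "42".zfill(5)
```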
841
Fred Drake4de96c22000-08-12 03:36:23 +0000842
843\subsubsection{String Formatting Operations \label{typesseq-strings}}
Fred Drake64e3b431998-07-24 13:56:11 +0000844
Fred Drakeb38784e2001-12-03 22:15:56 +0000845\index{formatting, string (\%{})}
Fred Drakeab2dc1d2001-12-26 20:06:40 +0000846\index{interpolation, string (\%{})}
Fred Drake66d32b12000-09-14 17:57:42 +0000847\index{string!formatting}
Fred Drakeab2dc1d2001-12-26 20:06:40 +0000848\index{string!interpolation}
Fred Drake66d32b12000-09-14 17:57:42 +0000849\index{printf-style formatting}
850\index{sprintf-style formatting}
Fred Drakeb38784e2001-12-03 22:15:56 +0000851\index{\protect\%{} formatting}
Fred Drakeab2dc1d2001-12-26 20:06:40 +0000852\index{\protect\%{} interpolation}
Fred Drake66d32b12000-09-14 17:57:42 +0000853
String and Unicode objects have one unique built-in operation: the
\code{\%} operator (modulo).  This is also known as the string
\emph{formatting} or \emph{interpolation} operator.  Given
\code{\var{format} \% \var{values}} (where \var{format} is a string or
Unicode object), \code{\%} conversion specifications in \var{format}
are replaced with zero or more elements of \var{values}.  The effect
is similar to using \cfunction{sprintf()} in the C language.  If
\var{format} is a Unicode object, or if any of the objects being
converted using the \code{\%s} conversion are Unicode objects, the
result will also be a Unicode object.
Fred Drake64e3b431998-07-24 13:56:11 +0000864
Fred Drake8c071d42001-01-26 20:48:35 +0000865If \var{format} requires a single argument, \var{values} may be a
Fred Drake401d1e32003-12-30 22:21:18 +0000866single non-tuple object.\footnote{To format only a tuple you
Steve Holden1e4519f2002-06-14 09:16:40 +0000867should therefore provide a singleton tuple whose only element
868is the tuple to be formatted.} Otherwise, \var{values} must be a tuple with
Fred Drake8c071d42001-01-26 20:48:35 +0000869exactly the number of items specified by the format string, or a
870single mapping object (for example, a dictionary).
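
The rules for \var{values} can be sketched as follows, including the
singleton-tuple idiom mentioned in the footnote.  Variable names are
illustrative.

```python
single = "%s" % 42                     # one non-tuple object
several = "%s and %s" % ("a", "b")     # tuple with one item per specifier
from_mapping = "%(x)s" % {"x": 1}      # single mapping object

point = (1, 2)
try:
    "%s" % point                       # the tuple is unpacked: two items,
    raised = False                     # but only one specifier
except TypeError:
    raised = True

# Wrapping the tuple in a singleton tuple formats the tuple itself.
as_tuple = "%s" % (point,)
```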
Fred Drake64e3b431998-07-24 13:56:11 +0000871
A conversion specifier contains two or more characters and has the
following components, which must occur in this order:

\begin{enumerate}
  \item  The \character{\%} character, which marks the start of the
         specifier.
  \item  Mapping key (optional), consisting of a parenthesised sequence
         of characters (for example, \code{(somename)}).
  \item  Conversion flags (optional), which affect the result of some
         conversion types.
  \item  Minimum field width (optional).  If specified as an
         \character{*} (asterisk), the actual width is read from the
         next element of the tuple in \var{values}, and the object to
         convert comes after the minimum field width and optional
         precision.
  \item  Precision (optional), given as a \character{.} (dot) followed
         by the precision.  If specified as \character{*} (an
         asterisk), the actual precision is read from the next element of
         the tuple in \var{values}, and the value to convert comes after
         the precision.
  \item  Length modifier (optional).
  \item  Conversion type.
\end{enumerate}
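
As a sketch of how these components combine, the following compares a
literal width and precision with the \character{*} form, and shows a
specifier that uses a mapping key, flags and a width together.  The
numbers are arbitrary.

```python
value = 3.14159

# Width 8 and precision 2 written directly in the format...
fixed = "%8.2f" % value

# ...or read from the tuple: width first, then precision, then the value.
starred = "%*.*f" % (8, 2, value)

# Mapping key, then flags (+ and 0), then minimum width, then type.
keyed = "%(n)+05d" % {"n": 42}
```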
Fred Drake64e3b431998-07-24 13:56:11 +0000895
Steve Holden1e4519f2002-06-14 09:16:40 +0000896When the right argument is a dictionary (or other mapping type), then
897the formats in the string \emph{must} include a parenthesised mapping key into
Fred Drake8c071d42001-01-26 20:48:35 +0000898that dictionary inserted immediately after the \character{\%}
Steve Holden1e4519f2002-06-14 09:16:40 +0000899character. The mapping key selects the value to be formatted from the
Fred Drake8c071d42001-01-26 20:48:35 +0000900mapping. For example:
Fred Drake64e3b431998-07-24 13:56:11 +0000901
\begin{verbatim}
>>> print '%(language)s has %(#)03d quote types.' % \
      {'language': "Python", "#": 2}
Python has 002 quote types.
\end{verbatim}
907
908In this case no \code{*} specifiers may occur in a format (since they
909require a sequential parameter list).
910
The conversion flag characters are:

\begin{tableii}{c|l}{character}{Flag}{Meaning}
  \lineii{\#}{The value conversion will use the ``alternate form''
              (where defined below).}
  \lineii{0}{The conversion will be zero padded for numeric values.}
  \lineii{-}{The converted value is left adjusted (overrides
             the \character{0} conversion if both are given).}
  \lineii{{~}}{(a space) A blank should be left before a positive number
             (or empty string) produced by a signed conversion.}
  \lineii{+}{A sign character (\character{+} or \character{-}) will
             precede the conversion (overrides a ``space'' flag).}
\end{tableii}
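
The flags can be sketched one at a time.  The trailing \code{|} in two
of the formats is only there to make the padding visible; all values are
illustrative.

```python
alternate = "%#x" % 255            # alternate form adds the 0x prefix
zero_padded = "%05d" % 42
left_adjusted = "%-5d|" % 42
dash_beats_zero = "%-05d|" % 42    # - overrides 0 when both are given

space_flag = "% d" % 42            # blank before a positive number
negative = "% d" % -42             # negative numbers keep their sign
plus_flag = "%+d" % 42
plus_beats_space = "%+ d" % 42     # + overrides the space flag
```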
924
A length modifier (\code{h}, \code{l}, or \code{L}) may be
present, but is ignored as it is not necessary for Python.

The conversion types are:

\begin{tableiii}{c|l|c}{character}{Conversion}{Meaning}{Notes}
  \lineiii{d}{Signed integer decimal.}{}
  \lineiii{i}{Signed integer decimal.}{}
  \lineiii{o}{Unsigned octal.}{(1)}
  \lineiii{u}{Unsigned decimal.}{}
  \lineiii{x}{Unsigned hexadecimal (lowercase).}{(2)}
  \lineiii{X}{Unsigned hexadecimal (uppercase).}{(2)}
  \lineiii{e}{Floating point exponential format (lowercase).}{}
  \lineiii{E}{Floating point exponential format (uppercase).}{}
  \lineiii{f}{Floating point decimal format.}{}
  \lineiii{F}{Floating point decimal format.}{}
  \lineiii{g}{Same as \character{e} if exponent is less than -4 or
              not less than precision, \character{f} otherwise.}{}
  \lineiii{G}{Same as \character{E} if exponent is less than -4 or
              not less than precision, \character{F} otherwise.}{}
  \lineiii{c}{Single character (accepts integer or single character
              string).}{}
  \lineiii{r}{String (converts any Python object using
              \function{repr()}).}{(3)}
  \lineiii{s}{String (converts any Python object using
              \function{str()}).}{(4)}
  \lineiii{\%}{No argument is converted, results in a \character{\%}
               character in the result.}{}
\end{tableiii}
954
955\noindent
956Notes:
957\begin{description}
958 \item[(1)]
959 The alternate form causes a leading zero (\character{0}) to be
960 inserted between left-hand padding and the formatting of the
961 number if the leading character of the result is not already a
962 zero.
963 \item[(2)]
964 The alternate form causes a leading \code{'0x'} or \code{'0X'}
965 (depending on whether the \character{x} or \character{X} format
966 was used) to be inserted between left-hand padding and the
967 formatting of the number if the leading character of the result is
968 not already a zero.
969 \item[(3)]
970 The \code{\%r} conversion was added in Python 2.0.
Raymond Hettinger2bd15682003-01-13 04:29:19 +0000971 \item[(4)]
972 If the object or format provided is a \class{unicode} string,
973 the resulting string will also be \class{unicode}.
Fred Drakef5968262002-10-25 16:55:51 +0000974\end{description}
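
A few of the conversion types in use.  This is a small sketch with
arbitrary values, not an exhaustive tour of the table.

```python
as_str = "%s" % "spam"           # str() of the object
as_repr = "%r" % "spam"          # repr() keeps the quotes

char_from_int = "%c" % 65        # integer character code
char_from_str = "%c" % "A"       # or a single-character string

percent = "%d%%" % 50            # %% yields a literal percent sign

hex_lower = "%x" % 255
hex_upper = "%X" % 255

exponent = "%e" % 12345.678      # default precision is six digits
fixed = "%f" % 2.5
```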
Fred Drake8c071d42001-01-26 20:48:35 +0000975
976% XXX Examples?
977
Fred Drake8c071d42001-01-26 20:48:35 +0000978Since Python strings have an explicit length, \code{\%s} conversions
979do not assume that \code{'\e0'} is the end of the string.
980
981For safety reasons, floating point precisions are clipped to 50;
982\code{\%f} conversions for numbers whose absolute value is over 1e25
983are replaced by \code{\%g} conversions.\footnote{
984 These numbers are fairly arbitrary. They are intended to
985 avoid printing endless strings of meaningless digits without hampering
986 correct use and without having to know the exact precision of floating
987 point values on a particular machine.
988} All other errors raise exceptions.
989
Fred Drake14f5c5f2001-12-03 18:33:13 +0000990Additional string operations are defined in standard modules
Fred Drake401d1e32003-12-30 22:21:18 +0000991\refmodule{string}\refstmodindex{string}\ and
Tim Peters8f01b682002-03-12 03:04:44 +0000992\refmodule{re}.\refstmodindex{re}
Fred Drake64e3b431998-07-24 13:56:11 +0000993
Fred Drake107b9672000-08-14 15:37:59 +0000994
Fred Drake512bb722000-08-18 03:12:38 +0000995\subsubsection{XRange Type \label{typesseq-xrange}}
Fred Drake107b9672000-08-14 15:37:59 +0000996
Fred Drake401d1e32003-12-30 22:21:18 +0000997The \class{xrange}\obindex{xrange} type is an immutable sequence which
998is commonly used for looping. The advantage of the \class{xrange}
999type is that an \class{xrange} object will always take the same amount
1000of memory, no matter the size of the range it represents. There are
1001no consistent performance advantages.
Fred Drake107b9672000-08-14 15:37:59 +00001002
Raymond Hettingerd2bef822002-12-11 07:14:03 +00001003XRange objects have very little behavior: they only support indexing,
1004iteration, and the \function{len()} function.
Fred Drake107b9672000-08-14 15:37:59 +00001005
1006
Fred Drake9474d861999-02-12 22:05:33 +00001007\subsubsection{Mutable Sequence Types \label{typesseq-mutable}}
Fred Drake64e3b431998-07-24 13:56:11 +00001008
1009List objects support additional operations that allow in-place
1010modification of the object.
Steve Holden1e4519f2002-06-14 09:16:40 +00001011Other mutable sequence types (when added to the language) should
1012also support these operations.
1013Strings and tuples are immutable sequence types: such objects cannot
Fred Drake64e3b431998-07-24 13:56:11 +00001014be modified once created.
1015The following operations are defined on mutable sequence types (where
1016\var{x} is an arbitrary object):
1017\indexiii{mutable}{sequence}{types}
Fred Drake0b4e25d2000-10-04 04:21:19 +00001018\obindex{list}
Fred Drake64e3b431998-07-24 13:56:11 +00001019
\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  \lineiii{\var{s}[\var{i}] = \var{x}}
	{item \var{i} of \var{s} is replaced by \var{x}}{}
  \lineiii{\var{s}[\var{i}:\var{j}] = \var{t}}
  	{slice of \var{s} from \var{i} to \var{j} is replaced by \var{t}}{}
  \lineiii{del \var{s}[\var{i}:\var{j}]}
	{same as \code{\var{s}[\var{i}:\var{j}] = []}}{}
  \lineiii{\var{s}[\var{i}:\var{j}:\var{k}] = \var{t}}
  	{the elements of \code{\var{s}[\var{i}:\var{j}:\var{k}]} are replaced by those of \var{t}}{(1)}
  \lineiii{del \var{s}[\var{i}:\var{j}:\var{k}]}
	{removes the elements of \code{\var{s}[\var{i}:\var{j}:\var{k}]} from the list}{}
  \lineiii{\var{s}.append(\var{x})}
	{same as \code{\var{s}[len(\var{s}):len(\var{s})] = [\var{x}]}}{(2)}
  \lineiii{\var{s}.extend(\var{x})}
        {same as \code{\var{s}[len(\var{s}):len(\var{s})] = \var{x}}}{(3)}
  \lineiii{\var{s}.count(\var{x})}
	{return number of \var{i}'s for which \code{\var{s}[\var{i}] == \var{x}}}{}
  \lineiii{\var{s}.index(\var{x}\optional{, \var{i}\optional{, \var{j}}})}
	{return smallest \var{k} such that \code{\var{s}[\var{k}] == \var{x}} and
	\code{\var{i} <= \var{k} < \var{j}}}{(4)}
  \lineiii{\var{s}.insert(\var{i}, \var{x})}
	{same as \code{\var{s}[\var{i}:\var{i}] = [\var{x}]}}{(5)}
  \lineiii{\var{s}.pop(\optional{\var{i}})}
	{same as \code{\var{x} = \var{s}[\var{i}]; del \var{s}[\var{i}]; return \var{x}}}{(6)}
  \lineiii{\var{s}.remove(\var{x})}
	{same as \code{del \var{s}[\var{s}.index(\var{x})]}}{(4)}
  \lineiii{\var{s}.reverse()}
	{reverses the items of \var{s} in place}{(7)}
  \lineiii{\var{s}.sort(\optional{\var{cmp}\optional{,
                        \var{key}\optional{, \var{reverse}}}})}
	{sort the items of \var{s} in place}{(7), (8), (9), (10)}
\end{tableiii}
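
Several of the operations in the table can be traced on a small list.
This is a sketch with arbitrary values; the snapshot variables
(\code{after_slice} and so on) are hypothetical names used only to record
the intermediate states.

```python
s = [0, 1, 2, 3, 4, 5]

s[1:3] = ["a", "b", "c"]    # a slice may be replaced by a longer list
after_slice = list(s)

del s[1:4]                  # same as s[1:4] = []
after_del = list(s)

s[::2] = ["x", "y"]         # extended slice: lengths must match
after_extended = list(s)

s.append(6)                 # add one item
s.extend([7, 8])            # add every item of another list
s.insert(0, "first")        # insert before index 0
after_growth = list(s)

last = s.pop()              # removes and returns the last item
s.remove("x")               # removes the first matching item
s.reverse()                 # reverses in place
final = list(s)
```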
1052\indexiv{operations on}{mutable}{sequence}{types}
1053\indexiii{operations on}{sequence}{types}
1054\indexiii{operations on}{list}{type}
1055\indexii{subscript}{assignment}
1056\indexii{slice}{assignment}
\indexii{extended slice}{assignment}
\stindex{del}
\withsubitem{(list method)}{
  \ttindex{append()}\ttindex{extend()}\ttindex{count()}\ttindex{index()}
  \ttindex{insert()}\ttindex{pop()}\ttindex{remove()}\ttindex{reverse()}
  \ttindex{sort()}}
\noindent
Notes:
\begin{description}
\item[(1)] \var{t} must have the same length as the slice it is
  replacing.

\item[(2)] The C implementation of Python has historically accepted
  multiple parameters and implicitly joined them into a tuple; this
  no longer works in Python 2.0. Use of this misfeature has been
  deprecated since Python 1.4.

\item[(3)] Raises an exception when \var{x} is not a list object.

\item[(4)] Raises \exception{ValueError} when \var{x} is not found in
  \var{s}. When a negative index is passed as the second or third parameter
  to the \method{index()} method, the list length is added, as for slice
  indices. If it is still negative, it is truncated to zero, as for
  slice indices. \versionchanged[Previously, \method{index()} didn't
  have arguments for specifying start and stop positions]{2.3}

\item[(5)] When a negative index is passed as the first parameter to
  the \method{insert()} method, the list length is added, as for slice
  indices. If it is still negative, it is truncated to zero, as for
  slice indices. \versionchanged[Previously, all negative indices
  were truncated to zero]{2.3}

\item[(6)] The \method{pop()} method is only supported by the list and
  array types. The optional argument \var{i} defaults to \code{-1},
  so that by default the last item is removed and returned.

\item[(7)] The \method{sort()} and \method{reverse()} methods modify the
  list in place for economy of space when sorting or reversing a large
  list. To remind you that they operate by side effect, they don't return
  the sorted or reversed list.

\item[(8)] The \method{sort()} method takes optional arguments for
  controlling the comparisons.

  \var{cmp} specifies a custom comparison function of two arguments
  (list items) which should return a negative, zero or positive number
  depending on whether the first argument is considered smaller than,
  equal to, or larger than the second argument:
  \samp{\var{cmp}=\keyword{lambda} \var{x},\var{y}:
  \function{cmp}(x.lower(), y.lower())}

  \var{key} specifies a function of one argument that is used to
  extract a comparison key from each list element:
  \samp{\var{key}=\function{str.lower}}

  \var{reverse} is a boolean value. If set to \code{True}, then the
  list elements are sorted as if each comparison were reversed.

  In general, the \var{key} and \var{reverse} conversion processes are
  much faster than specifying an equivalent \var{cmp} function. This is
  because \var{cmp} is called multiple times for each list element while
  \var{key} and \var{reverse} touch each element only once.

  \versionchanged[Support for \code{None} as an equivalent to omitting
  \var{cmp} was added]{2.3}

  \versionchanged[Support for \var{key} and \var{reverse} was added]{2.4}

\item[(9)] Starting with Python 2.3, the \method{sort()} method is
  guaranteed to be stable. A sort is stable if it guarantees not to
  change the relative order of elements that compare equal --- this is
  helpful for sorting in multiple passes (for example, sort by
  department, then by salary grade).

\item[(10)] While a list is being sorted, the effect of attempting to
  mutate, or even inspect, the list is undefined. The C
  implementation of Python 2.3 and newer makes the list appear empty
  for the duration, and raises \exception{ValueError} if it can detect
  that the list has been mutated during a sort.
\end{description}
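Notes (7) to (9) can be illustrated with a short sketch. The sample
data below is invented for illustration, and only the \var{key} and
\var{reverse} arguments are used, so the sketch assumes an interpreter
that supports them (2.4 or newer).

```python
# Sample data is hypothetical; only key= and reverse= are exercised.
words = ["banana", "Apple", "cherry"]

# key: a comparison key is extracted once per element (case-insensitive).
words.sort(key=str.lower)

# reverse: sort as if each comparison were reversed.
descending = [3, 1, 2]
descending.sort(reverse=True)

# Stability allows sorting in multiple passes: sort by the secondary
# key (salary grade) first, then by the primary key (department).
staff = [("ops", 2, "Ann"), ("dev", 1, "Bob"), ("ops", 1, "Cy"), ("dev", 2, "Di")]
staff.sort(key=lambda rec: rec[1])   # pass 1: salary grade
staff.sort(key=lambda rec: rec[0])   # pass 2: department

# sort() operates by side effect and does not return the sorted list.
result = [2, 1].sort()
```

After the second pass, records within each department remain ordered by
salary grade because the sort does not reorder elements that compare
equal.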

\subsection{Set Types \label{types-set}}
\obindex{set}

A \dfn{set} object is an unordered collection of immutable values.
Common uses include membership testing, removing duplicates from a sequence,
and computing mathematical operations such as intersection, union, difference,
and symmetric difference.
\versionadded{2.4}

Like other collections, sets support \code{\var{x} in \var{set}},
\code{len(\var{set})}, and \code{for \var{x} in \var{set}}. Being an
unordered collection, sets do not record element position or order of
insertion. Accordingly, sets do not support indexing, slicing, or
other sequence-like behavior.

There are currently two builtin set types, \class{set} and \class{frozenset}.
The \class{set} type is mutable --- the contents can be changed using methods
like \method{add()} and \method{remove()}. Since it is mutable, it has no
hash value and cannot be used as either a dictionary key or as an element of
another set. The \class{frozenset} type is immutable and hashable --- its
contents cannot be altered after it is created; however, it can be used as
a dictionary key or as an element of another set.

Instances of \class{set} and \class{frozenset} provide the following operations:

\begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result}
  \lineiii{len(\var{s})}{}{cardinality of set \var{s}}

  \hline
  \lineiii{\var{x} in \var{s}}{}
         {test \var{x} for membership in \var{s}}
  \lineiii{\var{x} not in \var{s}}{}
         {test \var{x} for non-membership in \var{s}}
  \lineiii{\var{s}.issubset(\var{t})}{\code{\var{s} <= \var{t}}}
         {test whether every element in \var{s} is in \var{t}}
  \lineiii{\var{s}.issuperset(\var{t})}{\code{\var{s} >= \var{t}}}
         {test whether every element in \var{t} is in \var{s}}

  \hline
  \lineiii{\var{s}.union(\var{t})}{\var{s} | \var{t}}
         {new set with elements from both \var{s} and \var{t}}
  \lineiii{\var{s}.intersection(\var{t})}{\var{s} \&\ \var{t}}
         {new set with elements common to \var{s} and \var{t}}
  \lineiii{\var{s}.difference(\var{t})}{\var{s} - \var{t}}
         {new set with elements in \var{s} but not in \var{t}}
  \lineiii{\var{s}.symmetric_difference(\var{t})}{\var{s} \^\ \var{t}}
         {new set with elements in either \var{s} or \var{t} but not both}
  \lineiii{\var{s}.copy()}{}
         {new set with a shallow copy of \var{s}}
\end{tableiii}

Note that the non-operator methods \method{union()},
\method{intersection()}, \method{difference()},
\method{symmetric_difference()}, \method{issubset()}, and
\method{issuperset()} accept any iterable as an argument. In contrast,
their operator-based counterparts require their arguments to be sets.
This precludes error-prone constructions like
\code{set('abc') \&\ 'cbs'} in favor of the more readable
\code{set('abc').intersection('cbs')}.
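The difference between the method and operator forms can be sketched as
follows. The variable names are chosen for illustration only.

```python
letters = set('abc')

# The method form accepts any iterable, here a plain string.
common = letters.intersection('cbs')
is_sub = set('ab').issubset('abc')

# The operator form requires both operands to be sets.
try:
    letters & 'cbs'
    operator_rejected = False
except TypeError:
    operator_rejected = True

# Converting the right-hand operand explicitly makes the operator valid.
same = letters & set('cbs')
```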

Both \class{set} and \class{frozenset} support set-to-set comparisons.
Two sets are equal if and only if every element of each set is contained in
the other (each is a subset of the other).
A set is less than another set if and only if the first set is a proper
subset of the second set (is a subset, but is not equal).
A set is greater than another set if and only if the first set is a proper
superset of the second set (is a superset, but is not equal).

Instances of \class{set} are compared to instances of \class{frozenset} based
on their members. For example, \samp{set('abc') == frozenset('abc')} returns
\code{True}.

The subset and equality comparisons do not generalize to a complete
ordering function. For example, any two non-empty disjoint sets are not
equal and are not subsets of each other, so \emph{all} of the following
return \code{False}: \code{\var{a}<\var{b}}, \code{\var{a}==\var{b}}, and
\code{\var{a}>\var{b}}.
Accordingly, sets do not implement the \method{__cmp__} method.
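A brief sketch of the partial ordering, using two arbitrary disjoint
sets chosen for illustration:

```python
a = set([1, 2])
b = set([3, 4])

# For non-empty disjoint sets, none of the three comparisons holds.
results = [a < b, a == b, a > b]

# A proper subset compares as "less than" its superset.
proper = set([1]) < set([1, 2])

# Equality depends only on the members, not on the concrete set type.
mixed_equal = set('abc') == frozenset('abc')
```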

Since sets only define partial ordering (subset relationships), the output
of the \method{list.sort()} method is undefined for lists of sets.

Binary operations that mix \class{set} instances with \class{frozenset}
return the type of the first operand. For example:
\samp{frozenset('ab') | set('bc')} returns an instance of \class{frozenset}.
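The rule for mixed operands can be checked directly; the names below
are illustrative only.

```python
# The result takes the type of the first (left-hand) operand.
left_frozen = frozenset('ab') | set('bc')
left_mutable = set('ab') | frozenset('bc')

frozen_type = type(left_frozen)
mutable_type = type(left_mutable)
```

Both results contain the same three elements; only the type differs.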

The following table lists operations available for \class{set}
that do not apply to immutable instances of \class{frozenset}:

\begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result}
  \lineiii{\var{s}.update(\var{t})}
         {\var{s} |= \var{t}}
         {return set \var{s} with elements added from \var{t}}
  \lineiii{\var{s}.intersection_update(\var{t})}
         {\var{s} \&= \var{t}}
         {return set \var{s} keeping only elements also found in \var{t}}
  \lineiii{\var{s}.difference_update(\var{t})}
         {\var{s} -= \var{t}}
         {return set \var{s} after removing elements found in \var{t}}
  \lineiii{\var{s}.symmetric_difference_update(\var{t})}
         {\var{s} \textasciicircum= \var{t}}
         {return set \var{s} with elements from \var{s} or \var{t}
          but not both}

  \hline
  \lineiii{\var{s}.add(\var{x})}{}
         {add element \var{x} to set \var{s}}
  \lineiii{\var{s}.remove(\var{x})}{}
         {remove \var{x} from set \var{s}; raises \exception{KeyError}
          if not present}
  \lineiii{\var{s}.discard(\var{x})}{}
         {remove \var{x} from set \var{s} if present}
  \lineiii{\var{s}.pop()}{}
         {remove and return an arbitrary element from \var{s}; raises
          \exception{KeyError} if empty}
  \lineiii{\var{s}.clear()}{}
         {remove all elements from set \var{s}}
\end{tableiii}

Note that the non-operator methods \method{update()},
\method{intersection_update()}, \method{difference_update()}, and
\method{symmetric_difference_update()} accept any iterable
as an argument.
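The in-place operations can be sketched as follows. The element values
are arbitrary, and the comments track the contents of \var{s} after
each step.

```python
s = set('abc')

s.update('cd')                               # a, b, c, d
s.intersection_update(['a', 'b', 'd', 'z'])  # a, b, d
s.difference_update('d')                     # a, b
s.symmetric_difference_update('bc')          # a, c
after_updates = set(s)

# discard() ignores a missing element; remove() raises KeyError.
s.discard('q')
try:
    s.remove('q')
    remove_raised = False
except KeyError:
    remove_raised = True

# pop() returns an arbitrary element, so only membership is predictable.
item = s.pop()

s.clear()
try:
    s.pop()
    pop_raised = False
except KeyError:
    pop_raised = True
```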


\subsection{Mapping Types \label{typesmapping}}
\obindex{mapping}
\obindex{dictionary}

A \dfn{mapping} object maps immutable values to
arbitrary objects. Mappings are mutable objects. There is currently
only one standard mapping type, the \dfn{dictionary}. A dictionary's keys are
almost arbitrary values. Only values containing lists, dictionaries
or other mutable types (that are compared by value rather than by
object identity) may not be used as keys.
Numeric types used for keys obey the normal rules for numeric
comparison: if two numbers compare equal (such as \code{1} and
\code{1.0}) then they can be used interchangeably to index the same
dictionary entry.

Dictionaries are created by placing a comma-separated list of
\code{\var{key}: \var{value}} pairs within braces, for example:
\code{\{'jack': 4098, 'sjoerd': 4127\}} or
\code{\{4098: 'jack', 4127: 'sjoerd'\}}.
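The rules for keys can be sketched briefly. The names \code{counts} and
\code{point} are introduced here purely for illustration.

```python
by_name = {'jack': 4098, 'sjoerd': 4127}
by_number = {4098: 'jack', 4127: 'sjoerd'}

# 1 and 1.0 compare equal, so they index the same dictionary entry.
counts = {1: 'int key'}
counts[1.0] = 'float key'    # replaces the value stored under 1
size = len(counts)

# A tuple of immutable values is an acceptable key; a list is not.
point = {(0, 0): 'origin'}
try:
    bad = {[0, 0]: 'origin'}
    list_key_rejected = False
except TypeError:
    list_key_rejected = True
```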

The following operations are defined on mappings (where \var{a} and
\var{b} are mappings, \var{k} is a key, and \var{v} and \var{x} are
arbitrary objects):
\indexiii{operations on}{mapping}{types}
\indexiii{operations on}{dictionary}{type}
\stindex{del}
\bifuncindex{len}
\withsubitem{(dictionary method)}{
  \ttindex{clear()}
  \ttindex{copy()}
  \ttindex{has_key()}
  \ttindex{fromkeys()}
  \ttindex{items()}
  \ttindex{keys()}
  \ttindex{update()}
  \ttindex{values()}
  \ttindex{get()}
  \ttindex{setdefault()}
  \ttindex{pop()}
  \ttindex{popitem()}
  \ttindex{iteritems()}
  \ttindex{iterkeys()}
  \ttindex{itervalues()}}

\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  \lineiii{len(\var{a})}{the number of items in \var{a}}{}
  \lineiii{\var{a}[\var{k}]}{the item of \var{a} with key \var{k}}{(1)}
  \lineiii{\var{a}[\var{k}] = \var{v}}
          {set \code{\var{a}[\var{k}]} to \var{v}}
          {}
  \lineiii{del \var{a}[\var{k}]}
          {remove \code{\var{a}[\var{k}]} from \var{a}}
          {(1)}
  \lineiii{\var{a}.clear()}{remove all items from \code{a}}{}
  \lineiii{\var{a}.copy()}{a (shallow) copy of \code{a}}{}
  \lineiii{\var{a}.has_key(\var{k})}
          {\code{True} if \var{a} has a key \var{k}, else \code{False}}
          {}
  \lineiii{\var{k} \code{in} \var{a}}
          {Equivalent to \var{a}.has_key(\var{k})}
          {(2)}
  \lineiii{\var{k} not in \var{a}}
          {Equivalent to \code{not} \var{a}.has_key(\var{k})}
          {(2)}
  \lineiii{\var{a}.items()}
          {a copy of \var{a}'s list of (\var{key}, \var{value}) pairs}
          {(3)}
  \lineiii{\var{a}.keys()}{a copy of \var{a}'s list of keys}{(3)}
  \lineiii{\var{a}.update(\optional{\var{b}})}
          {updates (and overwrites) key/value pairs from \var{b}}
          {(9)}
  \lineiii{\var{a}.fromkeys(\var{seq}\optional{, \var{value}})}
          {Creates a new dictionary with keys from \var{seq} and values set to \var{value}}
          {(7)}
  \lineiii{\var{a}.values()}{a copy of \var{a}'s list of values}{(3)}
  \lineiii{\var{a}.get(\var{k}\optional{, \var{x}})}
          {\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
           else \var{x}}
          {(4)}
  \lineiii{\var{a}.setdefault(\var{k}\optional{, \var{x}})}
          {\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
           else \var{x} (also setting it)}
          {(5)}
  \lineiii{\var{a}.pop(\var{k}\optional{, \var{x}})}
          {\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
           else \var{x} (and remove k)}
          {(8)}
  \lineiii{\var{a}.popitem()}
          {remove and return an arbitrary (\var{key}, \var{value}) pair}
          {(6)}
  \lineiii{\var{a}.iteritems()}
          {return an iterator over (\var{key}, \var{value}) pairs}
          {(2), (3)}
  \lineiii{\var{a}.iterkeys()}
          {return an iterator over the mapping's keys}
          {(2), (3)}
  \lineiii{\var{a}.itervalues()}
          {return an iterator over the mapping's values}
          {(2), (3)}
\end{tableiii}

\noindent
Notes:
\begin{description}
\item[(1)] Raises a \exception{KeyError} exception if \var{k} is not
in the map.

\item[(2)] \versionadded{2.2}

\item[(3)] Keys and values are listed in an arbitrary order. If
\method{items()}, \method{keys()}, \method{values()},
\method{iteritems()}, \method{iterkeys()}, and \method{itervalues()}
are called with no intervening modifications to the dictionary, the
lists will directly correspond. This allows the creation of
\code{(\var{value}, \var{key})} pairs using \function{zip()}:
\samp{pairs = zip(\var{a}.values(), \var{a}.keys())}. The same
relationship holds for the \method{iterkeys()} and
\method{itervalues()} methods: \samp{pairs = zip(\var{a}.itervalues(),
\var{a}.iterkeys())} provides the same value for \code{pairs}.
Another way to create the same list is \samp{pairs = [(v, k) for (k,
v) in \var{a}.iteritems()]}.

\item[(4)] Never raises an exception if \var{k} is not in the map;
instead it returns \var{x}. \var{x} is optional; when \var{x} is not
provided and \var{k} is not in the map, \code{None} is returned.

\item[(5)] \function{setdefault()} is like \function{get()}, except
that if \var{k} is missing, \var{x} is both returned and inserted into
the dictionary as the value of \var{k}. \var{x} defaults to \code{None}.

\item[(6)] \function{popitem()} is useful to destructively iterate
over a dictionary, as often used in set algorithms.

\item[(7)] \function{fromkeys()} is a class method that returns a
new dictionary. \var{value} defaults to \code{None}. \versionadded{2.3}

\item[(8)] \function{pop()} raises a \exception{KeyError} when no default
value is given and the key is not found. \versionadded{2.3}

\item[(9)] \function{update()} accepts either another mapping object
or an iterable of key/value pairs (as a tuple or other iterable of
length two). If keyword arguments are specified, the mapping is
then updated with those key/value pairs:
\samp{d.update(red=1, blue=2)}.
\versionchanged[Allowed the argument to be an iterable of key/value
                pairs and allowed keyword arguments]{2.4}

\end{description}
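Several of the notes above can be illustrated together. This is a
minimal sketch with invented keys and values. It deliberately avoids
\method{has_key()} and the \method{iter*()} methods and wraps
\function{zip()} in \function{list()}, on the assumption that the
reader may be running a later interpreter where those details differ.

```python
a = {'jack': 4098, 'sjoerd': 4127}

# Note (4): get() never raises for a missing key.
missing = a.get('guido')
fallback = a.get('guido', 0)

# Note (5): setdefault() returns the default and also stores it.
stored = a.setdefault('guido', 4127)

# Note (8): pop() removes the key; with a default it does not raise.
removed = a.pop('jack')
absent = a.pop('jack', None)

# Note (9): update() takes a mapping, an iterable of pairs, or keywords.
a.update({'irv': 4127})
a.update([('tim', 1)])
a.update(red=1, blue=2)

# Note (7): fromkeys() is a class method; values default to None.
blank = dict.fromkeys(['x', 'y'])

# Note (3): values() and keys() correspond when nothing is modified
# between the calls.
pairs = list(zip(a.values(), a.keys()))
corresponds = pairs == [(a[k], k) for k in a.keys()]

# Note (6): popitem() supports destructive iteration.
drained = []
while a:
    drained.append(a.popitem())
```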

\subsection{File Objects
            \label{bltin-file-objects}}

File objects\obindex{file} are implemented using C's \code{stdio}
package and can be created with the built-in constructor
\function{file()}\bifuncindex{file} described in section
\ref{built-in-funcs}, ``Built-in Functions.''\footnote{\function{file()}
is new in Python 2.2. The older built-in \function{open()} is an
alias for \function{file()}.} File objects are also returned
by some other built-in functions and methods, such as
\function{os.popen()} and \function{os.fdopen()} and the
\method{makefile()} method of socket objects.
\refstmodindex{os}
\refbimodindex{socket}

When a file operation fails for an I/O-related reason, the exception
\exception{IOError} is raised. This includes situations where the
operation is not defined for some reason, like \method{seek()} on a tty
device or writing a file opened for reading.

Files have the following methods:


\begin{methoddesc}[file]{close}{}
  Close the file. A closed file cannot be read or written any more.
  Any operation which requires that the file be open will raise a
  \exception{ValueError} after the file has been closed. Calling
  \method{close()} more than once is allowed.
\end{methoddesc}

\begin{methoddesc}[file]{flush}{}
  Flush the internal buffer, like \code{stdio}'s
  \cfunction{fflush()}. This may be a no-op on some file-like
  objects.
\end{methoddesc}

\begin{methoddesc}[file]{fileno}{}
  \index{file descriptor}
  \index{descriptor, file}
  Return the integer ``file descriptor'' that is used by the
  underlying implementation to request I/O operations from the
  operating system. This can be useful for other, lower level
  interfaces that use file descriptors, such as the
  \refmodule{fcntl}\refbimodindex{fcntl} module or
  \function{os.read()} and friends. \note{File-like objects
  which do not have a real file descriptor should \emph{not} provide
  this method!}
\end{methoddesc}

\begin{methoddesc}[file]{isatty}{}
  Return \code{True} if the file is connected to a tty(-like) device, else
  \code{False}. \note{If a file-like object is not associated
  with a real file, this method should \emph{not} be implemented.}
\end{methoddesc}

\begin{methoddesc}[file]{next}{}
A file object is its own iterator; for example, \code{iter(\var{f})} returns
\var{f} (unless \var{f} is closed). When a file is used as an
iterator, typically in a \keyword{for} loop (for example,
\code{for line in f: print line}), the \method{next()} method is
called repeatedly. This method returns the next input line, or raises
\exception{StopIteration} when \EOF{} is hit. In order to make a
\keyword{for} loop the most efficient way of looping over the lines of
a file (a very common operation), the \method{next()} method uses a
hidden read-ahead buffer. As a consequence of using a read-ahead
buffer, combining \method{next()} with other file methods (like
\method{readline()}) does not work correctly. However, using
\method{seek()} to reposition the file to an absolute position will
flush the read-ahead buffer.
\versionadded{2.3}
\end{methoddesc}

\begin{methoddesc}[file]{read}{\optional{size}}
  Read at most \var{size} bytes from the file (less if the read hits
  \EOF{} before obtaining \var{size} bytes). If the \var{size}
  argument is negative or omitted, read all data until \EOF{} is
  reached. The bytes are returned as a string object. An empty
  string is returned when \EOF{} is encountered immediately. (For
  certain files, like ttys, it makes sense to continue reading after
  an \EOF{} is hit.) Note that this method may call the underlying
  C function \cfunction{fread()} more than once in an effort to
  acquire as close to \var{size} bytes as possible. Also note that
  when in non-blocking mode, less data than what was requested may
  be returned, even if no \var{size} parameter was given.
\end{methoddesc}

\begin{methoddesc}[file]{readline}{\optional{size}}
  Read one entire line from the file. A trailing newline character is
  kept in the string (but may be absent when a file ends with an
  incomplete line).\footnote{
        The advantage of leaving the newline on is that
        returning an empty string is then an unambiguous \EOF{}
        indication. It is also possible (in cases where it might
        matter, for example, if you
        want to make an exact copy of a file while scanning its lines)
        to tell whether the last line of a file ended in a newline
        or not (yes this happens!).
  } If the \var{size} argument is present and
  non-negative, it is a maximum byte count (including the trailing
  newline) and an incomplete line may be returned.
  An empty string is returned \emph{only} when \EOF{} is encountered
  immediately. \note{Unlike \code{stdio}'s \cfunction{fgets()}, the
  returned string contains null characters (\code{'\e 0'}) if they
  occurred in the input.}
\end{methoddesc}
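The behaviour of \method{readline()} and \method{read()} at the end of
a file can be sketched as follows. The scratch file is created with
\function{tempfile.mkstemp()} purely for illustration, and
\function{open()} is used rather than \function{file()} on the
assumption that the reader may be using a later interpreter.

```python
import os
import tempfile

# Create a scratch file whose last line has no trailing newline.
fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, 'w')
f.write('first\nsecond')
f.close()

f = open(path)
line1 = f.readline()      # the trailing newline is kept
line2 = f.readline()      # incomplete last line, no newline
at_eof = f.readline()     # empty string only at end of file
f.seek(0)
partial = f.readline(3)   # size limits the returned string
rest = f.read()           # no size: read everything up to EOF
f.close()
os.remove(path)
```

Because the newline is retained, the empty string returned by the
third call is an unambiguous end-of-file indication.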

\begin{methoddesc}[file]{readlines}{\optional{sizehint}}
  Read until \EOF{} using \method{readline()} and return a list containing
  the lines thus read. If the optional \var{sizehint} argument is
  present, instead of reading up to \EOF, whole lines totalling
  approximately \var{sizehint} bytes (possibly after rounding up to an
  internal buffer size) are read. Objects implementing a file-like
  interface may choose to ignore \var{sizehint} if it cannot be
  implemented, or cannot be implemented efficiently.
\end{methoddesc}

\begin{methoddesc}[file]{xreadlines}{}
  This method returns the same thing as \code{iter(f)}.
  \versionadded{2.1}
  \deprecated{2.3}{Use \samp{for \var{line} in \var{file}} instead.}
\end{methoddesc}

\begin{methoddesc}[file]{seek}{offset\optional{, whence}}
  Set the file's current position, like \code{stdio}'s \cfunction{fseek()}.
  The \var{whence} argument is optional and defaults to \code{0}
  (absolute file positioning); other values are \code{1} (seek
  relative to the current position) and \code{2} (seek relative to the
  file's end). There is no return value. Note that if the file is
  opened for appending (mode \code{'a'} or \code{'a+'}), any
  \method{seek()} operations will be undone at the next write. If the
  file is only opened for writing in append mode (mode \code{'a'}),
  this method is essentially a no-op, but it remains useful for files
  opened in append mode with reading enabled (mode \code{'a+'}). If the
  file is opened in text mode (mode \code{'t'}), only offsets returned
  by \method{tell()} are legal. Use of other offsets causes undefined
  behavior.

  Note that not all file objects are seekable.
\end{methoddesc}

\begin{methoddesc}[file]{tell}{}
  Return the file's current position, like \code{stdio}'s
  \cfunction{ftell()}.
\end{methoddesc}

\begin{methoddesc}[file]{truncate}{\optional{size}}
  Truncate the file's size. If the optional \var{size} argument is
  present, the file is truncated to (at most) that size. The size
  defaults to the current position. The current file position is
  not changed. Note that if a specified size exceeds the file's
  current size, the result is platform-dependent: possibilities
  include that the file may remain unchanged, increase to the specified
  size as if zero-filled, or increase to the specified size with
  undefined new content.
  Availability: Windows, many \UNIX{} variants.
\end{methoddesc}

\begin{methoddesc}[file]{write}{str}
  Write a string to the file. There is no return value. Due to
  buffering, the string may not actually show up in the file until
  the \method{flush()} or \method{close()} method is called.
\end{methoddesc}

\begin{methoddesc}[file]{writelines}{sequence}
  Write a sequence of strings to the file. The sequence can be any
  iterable object producing strings, typically a list of strings.
  There is no return value.
  (The name is intended to match \method{readlines()};
  \method{writelines()} does not add line separators.)
\end{methoddesc}
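A short sketch of \method{writelines()}, \method{tell()}, and
\method{seek()} working together. The scratch file and its contents are
invented for illustration; the only offset passed to \method{seek()},
apart from \code{0}, is one previously returned by \method{tell()}, as
required for files opened in text mode.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, 'w')
f.writelines(['alpha\n', 'beta\n'])   # the caller supplies the newlines
f.writelines(['no', 'breaks'])        # written back to back
f.close()

f = open(path)
first = f.readline()
mark = f.tell()          # remember the position after the first line
second = f.readline()
f.seek(mark)             # return to the remembered position
again = f.readline()
f.seek(0)
lines = f.readlines()
f.close()
os.remove(path)
```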


Files support the iterator protocol. Each iteration returns the same
result as \code{\var{file}.readline()}, and iteration ends when the
\method{readline()} method returns an empty string.
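Iteration over a file can be sketched as follows, again using a
scratch file created only for this illustration.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, 'w')
f.write('one\ntwo\n')
f.close()

f = open(path)
same_object = iter(f) is f   # a file object is its own iterator
collected = []
for line in f:               # each item is what readline() would return
    collected.append(line)
f.close()
os.remove(path)
```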
1585
1586
Fred Drake752ba392000-09-19 15:18:51 +00001587File objects also offer a number of other interesting attributes.
1588These are not required for file-like objects, but should be
1589implemented if they make sense for the particular object.
Fred Drake64e3b431998-07-24 13:56:11 +00001590
1591\begin{memberdesc}[file]{closed}
Neal Norwitz6b353702002-04-09 18:15:00 +00001592bool indicating the current state of the file object. This is a
Fred Drake64e3b431998-07-24 13:56:11 +00001593read-only attribute; the \method{close()} method changes the value.
Fred Drake752ba392000-09-19 15:18:51 +00001594It may not be available on all file-like objects.
Fred Drake64e3b431998-07-24 13:56:11 +00001595\end{memberdesc}

\begin{memberdesc}[file]{encoding}
The encoding that this file uses. When Unicode strings are written
to a file, they will be converted to byte strings using this encoding.
In addition, when the file is connected to a terminal, the attribute
gives the encoding that the terminal is likely to use (that
information might be incorrect if the user has misconfigured the
terminal). The attribute is read-only and may not be present on
all file-like objects. It may also be \code{None}, in which case
the file uses the system default encoding for converting Unicode
strings.

\versionadded{2.3}
\end{memberdesc}

\begin{memberdesc}[file]{mode}
The I/O mode for the file. If the file was created using the
\function{open()} built-in function, this will be the value of the
\var{mode} parameter. This is a read-only attribute and may not be
present on all file-like objects.
\end{memberdesc}

\begin{memberdesc}[file]{name}
If the file object was created using \function{open()}, the name of
the file. Otherwise, some string that indicates the source of the
file object, of the form \samp{<\mbox{\ldots}>}. This is a read-only
attribute and may not be present on all file-like objects.
\end{memberdesc}

\begin{memberdesc}[file]{newlines}
If Python was built with the \longprogramopt{with-universal-newlines}
option to \program{configure} (the default), this read-only attribute
exists. For files opened in universal newline read mode, it keeps
track of the types of newlines encountered while reading the file.
The values it can take are
\code{'\e r'}, \code{'\e n'}, \code{'\e r\e n'}, \code{None} (unknown,
no newlines read yet), or a tuple containing all the newline
types seen, to indicate that multiple
newline conventions were encountered. For files not opened in universal
newline read mode, the value of this attribute will be \code{None}.
\end{memberdesc}

\begin{memberdesc}[file]{softspace}
Boolean that indicates whether a space character needs to be printed
before another value when using the \keyword{print} statement.
Classes that are trying to simulate a file object should also have a
writable \member{softspace} attribute, which should be initialized to
zero. This will be automatic for most classes implemented in Python
(care may be needed for objects that override attribute access); types
implemented in C will have to provide a writable
\member{softspace} attribute.
\note{This attribute is not used to control the
\keyword{print} statement, but to allow the implementation of
\keyword{print} to keep track of its internal state.}
\end{memberdesc}


\subsection{Other Built-in Types \label{typesother}}

The interpreter supports several other kinds of objects.
Most of these support only one or two operations.


\subsubsection{Modules \label{typesmodules}}

The only special operation on a module is attribute access:
\code{\var{m}.\var{name}}, where \var{m} is a module and \var{name}
is a name defined in \var{m}'s symbol table. Module attributes
can be assigned to. (Note that the \keyword{import} statement is not,
strictly speaking, an operation on a module object; \code{import
\var{foo}} does not require a module object named \var{foo} to exist;
rather, it requires an (external) \emph{definition} for a module named
\var{foo} somewhere.)

A special member of every module is \member{__dict__}.
This is the dictionary containing the module's symbol table.
Modifying this dictionary will actually change the module's symbol
table, but direct assignment to the \member{__dict__} attribute is not
possible (you can write \code{\var{m}.__dict__['a'] = 1}, which
defines \code{\var{m}.a} to be \code{1}, but you can't write
\code{\var{m}.__dict__ = \{\}}). Modifying \member{__dict__} directly
is not recommended.
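
The behaviour described above can be sketched as follows. The module
name \code{'demo'} and the variable names are hypothetical, and the
exact exception raised on assignment to \member{__dict__} is assumed
to vary between interpreter versions, so the sketch catches both likely
candidates.

\begin{verbatim}
import types

m = types.ModuleType('demo')    # 'demo' is a made-up module name

m.__dict__['a'] = 1             # same effect as: m.a = 1
value = m.a

try:
    m.__dict__ = {}             # rebinding __dict__ itself is rejected
    replaced = True
except (TypeError, AttributeError):
    # The exception type is an assumption; it depends on the interpreter.
    replaced = False
\end{verbatim}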

Modules built into the interpreter are written like this:
\code{<module 'sys' (built-in)>}. If loaded from a file, they are
written as \code{<module 'os' from
'/usr/local/lib/python\shortversion/os.pyc'>}.


\subsubsection{Classes and Class Instances \label{typesobjects}}
\nodename{Classes and Instances}

See chapters 3 and 7 of the \citetitle[../ref/ref.html]{Python
Reference Manual} for these.


\subsubsection{Functions \label{typesfunctions}}

Function objects are created by function definitions. The only
operation on a function object is to call it:
\code{\var{func}(\var{argument-list})}.

There are really two flavors of function objects: built-in functions
and user-defined functions. Both support the same operation (to call
the function), but the implementation is different, hence the
different object types.

See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
information.

\subsubsection{Methods \label{typesmethods}}
\obindex{method}

Methods are functions that are called using the attribute notation.
There are two flavors: built-in methods (such as \method{append()} on
lists) and class instance methods. Built-in methods are described
with the types that support them.

The implementation adds two special read-only attributes to class
instance methods: \code{\var{m}.im_self} is the object on which the
method operates, and \code{\var{m}.im_func} is the function
implementing the method. Calling \code{\var{m}(\var{arg-1},
\var{arg-2}, \textrm{\ldots}, \var{arg-n})} is completely equivalent to
calling \code{\var{m}.im_func(\var{m}.im_self, \var{arg-1},
\var{arg-2}, \textrm{\ldots}, \var{arg-n})}.
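
A small sketch of this equivalence follows. The class \class{C} and the
variable names are hypothetical. The fallback to \code{__func__} and
\code{__self__} is an assumption added only so that the sketch also runs
on later interpreters that spell these attributes differently; it is not
part of the interface described in this section.

\begin{verbatim}
class C:
    def method(self, x, y):
        return (self, x + y)

c = C()
m = c.method                      # a bound method

if hasattr(m, 'im_func'):
    # The attribute names described in this section.
    func, obj = m.im_func, m.im_self
else:
    # Assumed spelling on later interpreters.
    func, obj = m.__func__, m.__self__

direct = m(1, 2)                  # call through the bound method
spelled_out = func(obj, 1, 2)     # call the function with explicit self
\end{verbatim}

Both calls return the same tuple, and \code{obj} is the instance
\code{c} itself.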

Class instance methods are either \emph{bound} or \emph{unbound},
referring to whether the method was accessed through an instance or a
class, respectively. When a method is unbound, its \code{im_self}
attribute will be \code{None} and if called, an explicit \code{self}
object must be passed as the first argument. In this case,
\code{self} must be an instance of the unbound method's class (or a
subclass of that class), otherwise a \code{TypeError} is raised.

Like function objects, method objects support getting
arbitrary attributes. However, since method attributes are actually
stored on the underlying function object (\code{meth.im_func}),
setting method attributes on either bound or unbound methods is
disallowed. Attempting to set a method attribute results in a
\code{TypeError} being raised. In order to set a method attribute,
you need to explicitly set it on the underlying function object:

\begin{verbatim}
class C:
    def method(self):
        pass

c = C()
c.method.im_func.whoami = 'my name is c'
\end{verbatim}

See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
information.


\subsubsection{Code Objects \label{bltin-code-objects}}
\obindex{code}

Code objects are used by the implementation to represent
``pseudo-compiled'' executable Python code such as a function body.
They differ from function objects because they don't contain a
reference to their global execution environment. Code objects are
returned by the built-in \function{compile()} function and can be
extracted from function objects through their \member{func_code}
attribute.
\bifuncindex{compile}
\withsubitem{(function object attribute)}{\ttindex{func_code}}

A code object can be executed or evaluated by passing it (instead of a
source string) to the \keyword{exec} statement or the built-in
\function{eval()} function.
\stindex{exec}
\bifuncindex{eval}
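
As an illustrative sketch, the code object below is compiled once and
then evaluated against two different global namespaces, which is possible
because the code object itself carries no reference to an execution
environment. The source string, the file name \code{'<example>'} and
the variable names are made up for this example.

\begin{verbatim}
source = 'x * 2 + 1'
code = compile(source, '<example>', 'eval')   # '<example>' is a made-up name

# The same code object evaluated in two unrelated environments.
first = eval(code, {'x': 20})
second = eval(code, {'x': 1})
\end{verbatim}

Here \code{first} is \code{41} and \code{second} is \code{3}.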

See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
information.


\subsubsection{Type Objects \label{bltin-type-objects}}

Type objects represent the various object types. An object's type is
accessed by the built-in function \function{type()}. There are no special
operations on types. The standard module \refmodule{types} defines names
for all standard built-in types.
\bifuncindex{type}
\refstmodindex{types}

Types are written like this: \code{<type 'int'>}.


\subsubsection{The Null Object \label{bltin-null-object}}

This object is returned by functions that don't explicitly return a
value. It supports no special operations. There is exactly one null
object, named \code{None} (a built-in name).

It is written as \code{None}.


\subsubsection{The Ellipsis Object \label{bltin-ellipsis-object}}

This object is used by extended slice notation (see the
\citetitle[../ref/ref.html]{Python Reference Manual}). It supports no
special operations. There is exactly one ellipsis object, named
\constant{Ellipsis} (a built-in name).

It is written as \code{Ellipsis}.
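
A minimal sketch of where the ellipsis object shows up in practice: a
class whose \method{__getitem__()} simply returns the key it receives.
The class \class{Recorder} and the variable names are hypothetical.

\begin{verbatim}
class Recorder:
    def __getitem__(self, key):
        return key              # hand back whatever the subscript produced

r = Recorder()
single = r[...]                 # the key is the Ellipsis object itself
multi = r[0, ..., 1]            # the key is a tuple containing Ellipsis
\end{verbatim}

In this sketch \code{single} is \constant{Ellipsis} and \code{multi} is
the tuple \code{(0, Ellipsis, 1)}.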

\subsubsection{Boolean Values}

Boolean values are the two constant objects \code{False} and
\code{True}. They are used to represent truth values (although other
values can also be considered false or true). In numeric contexts
(for example when used as the argument to an arithmetic operator),
they behave like the integers 0 and 1, respectively. The built-in
function \function{bool()} can be used to cast any value to a Boolean,
if the value can be interpreted as a truth value (see
section~\ref{truth}, ``Truth Value Testing,'' above).

They are written as \code{False} and \code{True}, respectively.
\index{False}
\index{True}
\indexii{Boolean}{values}
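
A short sketch of both behaviours, using arbitrary sample values chosen
for this example:

\begin{verbatim}
# In numeric contexts, False and True act like 0 and 1.
total = True + True
scaled = False * 10

# bool() maps any value to its truth value.
flags = [bool(0), bool(''), bool([]), bool(3), bool('x'), bool([0])]
\end{verbatim}

Here \code{total} is \code{2}, \code{scaled} is \code{0}, and
\code{flags} is \code{[False, False, False, True, True, True]}.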


\subsubsection{Internal Objects \label{typesinternal}}

See the \citetitle[../ref/ref.html]{Python Reference Manual} for this
information. It describes stack frame objects, traceback objects, and
slice objects.


\subsection{Special Attributes \label{specialattrs}}

The implementation adds a few special read-only attributes to several
object types, where they are relevant. Some of these are not reported
by the \function{dir()} built-in function.

\begin{memberdesc}[object]{__dict__}
A dictionary or other mapping object used to store an
object's (writable) attributes.
\end{memberdesc}

\begin{memberdesc}[object]{__methods__}
\deprecated{2.2}{Use the built-in function \function{dir()} to get a
list of an object's attributes. This attribute is no longer available.}
\end{memberdesc}

\begin{memberdesc}[object]{__members__}
\deprecated{2.2}{Use the built-in function \function{dir()} to get a
list of an object's attributes. This attribute is no longer available.}
\end{memberdesc}

\begin{memberdesc}[instance]{__class__}
The class to which a class instance belongs.
\end{memberdesc}

\begin{memberdesc}[class]{__bases__}
The tuple of base classes of a class object. If there are no base
classes, this will be an empty tuple.
\end{memberdesc}

\begin{memberdesc}[class]{__name__}
The name of the class or type.
\end{memberdesc}
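
The attributes above can be inspected as in the following sketch. The
classes \class{Base} and \class{Derived} and the attribute
\member{colour} are hypothetical names chosen for the example.

\begin{verbatim}
class Base(object):
    pass

class Derived(Base):
    def __init__(self):
        self.colour = 'red'     # stored in the instance's __dict__

d = Derived()

instance_class = d.__class__    # the class Derived
bases = Derived.__bases__       # the tuple (Base,)
name = Derived.__name__         # the string 'Derived'
attributes = d.__dict__         # the mapping {'colour': 'red'}
\end{verbatim}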