Blame - Doc/ref/ref2.tex - platform/external/python/cpython2

Fred Drake	a1cce71	1998-07-24 22:12:32 +0000	[diff] [blame]	1	\chapter{Lexical analysis\label{lexical}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	2
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	3	A Python program is read by a \emph{parser}. Input to the parser is a
				4	stream of \emph{tokens}, generated by the \emph{lexical analyzer}. This
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	5	chapter describes how the lexical analyzer breaks a file into tokens.
				6	\index{lexical analysis}
				7	\index{parser}
				8	\index{token}
				9
Martin v. Löwis	00f1e3f	2002-08-04 17:29:52 +0000	[diff] [blame]	10	Python uses the 7-bit \ASCII{} character set for program text.
				11	\versionadded[An encoding declaration can be used to indicate that
Georg Brandl	a635fbb	2006-01-15 07:55:35 +0000	[diff] [blame]	12	string literals and comments use an encoding different from ASCII]{2.3}
Martin v. Löwis	00f1e3f	2002-08-04 17:29:52 +0000	[diff] [blame]	13	For compatibility with older versions, Python only warns if it finds
				14	8-bit characters; those warnings should be corrected by either declaring
				15	an explicit encoding, or using escape sequences if those bytes are binary
				16	data, instead of characters.
				17
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	18
				19	The run-time character set depends on the I/O devices connected to the
Fred Drake	c37b65e	2001-11-28 07:26:15 +0000	[diff] [blame]	20	program but is generally a superset of \ASCII.
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	21
				22	\strong{Future compatibility note:} It may be tempting to assume that the
				23	character set for 8-bit characters is ISO Latin-1 (an \ASCII{}
				24	superset that covers most western languages that use the Latin
				25	alphabet), but it is possible that in the future Unicode text editors
				26	will become common. These generally use the UTF-8 encoding, which is
				27	also an \ASCII{} superset, but with very different use for the
				28	characters with ordinals 128-255. While there is no consensus on this
				29	subject yet, it is unwise to assume either Latin-1 or UTF-8, even
				30	though the current implementation appears to favor Latin-1. This
				31	applies both to the source character set and the run-time character
				32	set.
				33
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	34
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	35	\section{Line structure\label{line-structure}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	36
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	37	A Python program is divided into a number of \emph{logical lines}.
				38	\index{line structure}
				39
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	40
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	41	\subsection{Logical lines\label{logical}}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	42
				43	The end of
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	44	a logical line is represented by the token NEWLINE. Statements cannot
				45	cross logical line boundaries except where NEWLINE is allowed by the
Guido van Rossum	7c0240f	1998-07-24 15:36:43 +0000	[diff] [blame]	46	syntax (e.g., between statements in compound statements).
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	47	A logical line is constructed from one or more \emph{physical lines}
				48	by following the explicit or implicit \emph{line joining} rules.
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	49	\index{logical line}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	50	\index{physical line}
				51	\index{line joining}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	52	\index{NEWLINE token}
				53
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	54
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	55	\subsection{Physical lines\label{physical}}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	56
Fred Drake	db22958	2005-05-25 05:29:17 +0000	[diff] [blame]	57	A physical line is a sequence of characters terminated by an end-of-line
				58	sequence. In source files, any of the standard platform line
				59	termination sequences can be used: the \UNIX{} form using \ASCII{} LF
				60	(linefeed), the Windows form using the \ASCII{} sequence CR LF (return
				61	followed by linefeed), or the Macintosh form using the \ASCII{} CR
				62	(return) character. All of these forms can be used equally, regardless
				63	of platform.
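
As a minimal sketch of the rule above, the following splits a source text
into physical lines, treating LF, CR LF and CR as equivalent terminators.
The helper name \code{split_physical_lines} is hypothetical and is not part
of any Python API; the code is written for a current Python 3 interpreter
rather than the version described in this chapter.

```python
import re

# CR LF must be tried before a lone CR so the pair counts as one terminator.
_EOL_RE = re.compile(r"\r\n|\r|\n")


def split_physical_lines(source):
    """Return the physical lines of source without their terminators.

    Hypothetical helper for illustration only.
    """
    lines = _EOL_RE.split(source)
    # A trailing terminator leaves one empty piece at the end; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines
```

All three conventions may appear in the same text; the sketch treats them
identically, which matches the statement that the forms can be used equally
regardless of platform.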
				64
				65	When embedding Python, source code strings should be passed to Python
				66	APIs using the standard C conventions for newline characters (the
				67	\code{\e n} character, representing \ASCII{} LF, is the line
				68	terminator).
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	69
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	70
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	71	\subsection{Comments\label{comments}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	72
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	73	A comment starts with a hash character (\code{\#}) that is not part of
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	74	a string literal, and ends at the end of the physical line. A comment
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	75	signifies the end of the logical line unless the implicit line joining
				76	rules are invoked.
				77	Comments are ignored by the syntax; they are not tokens.
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	78	\index{comment}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	79	\index{hash character}
				80
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	81
Martin v. Löwis	00f1e3f	2002-08-04 17:29:52 +0000	[diff] [blame]	82	\subsection{Encoding declarations\label{encodings}}
Martin v. Löwis	266a436	2004-09-14 07:52:22 +0000	[diff] [blame]	83	\index{source character set}
				84	\index{encodings}
Martin v. Löwis	00f1e3f	2002-08-04 17:29:52 +0000	[diff] [blame]	85
				86	If a comment in the first or second line of the Python script matches
Martin v. Löwis	ae075b6	2004-08-18 13:25:05 +0000	[diff] [blame]	87	the regular expression \regexp{coding[=:]\e s*([-\e w.]+)}, this comment is
Martin v. Löwis	00f1e3f	2002-08-04 17:29:52 +0000	[diff] [blame]	88	processed as an encoding declaration; the first group of this
				89	expression names the encoding of the source code file. The recommended
				90	forms of this expression are
				91
				92	\begin{verbatim}
				93	# -*- coding: <encoding-name> -*-
				94	\end{verbatim}
				95
				96	which is recognized also by GNU Emacs, and
				97
				98	\begin{verbatim}
				99	# vim:fileencoding=<encoding-name>
				100	\end{verbatim}
				101
Raymond Hettinger	3fd9779	2004-02-08 20:18:26 +0000	[diff] [blame]	102	which is recognized by Bram Moolenaar's VIM. In addition, if the first
Fred Drake	31f3db3	2002-08-06 21:36:06 +0000	[diff] [blame]	103	bytes of the file are the UTF-8 byte-order mark
				104	(\code{'\e xef\e xbb\e xbf'}), the declared file encoding is UTF-8
				105	(this is supported, among others, by Microsoft's \program{notepad}).
Martin v. Löwis	00f1e3f	2002-08-04 17:29:52 +0000	[diff] [blame]	106
				107	If an encoding is declared, the encoding name must be recognized by
				108	Python. % XXX there should be a list of supported encodings.
				109	The encoding is used for all lexical analysis, in particular to find
				110	the end of a string, and to interpret the contents of Unicode literals.
				111	String literals are converted to Unicode for syntactical analysis,
				112	then converted back to their original encoding before interpretation
Martin v. Löwis	f62a89b	2002-09-03 11:52:44 +0000	[diff] [blame]	113	starts. The encoding declaration must appear on a line of its own.
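
The detection rule described in this section can be sketched roughly as
follows. The function name \code{find_declared_encoding} is hypothetical and
not part of any Python API, and the sketch is written for a current Python 3
interpreter. It simplifies in one respect, labelled here as an assumption:
everything from the first \code{\#} on a line is treated as a comment, so a
hash character inside a string literal would mislead it.

```python
import re

# Pattern quoted in the text above; group 1 names the encoding.
CODING_RE = re.compile(r"coding[=:]\s*([-\w.]+)")
UTF8_BOM = b"\xef\xbb\xbf"


def find_declared_encoding(source_bytes):
    """Return the declared encoding name, or None if none is found.

    Hypothetical helper for illustration only.
    """
    # A UTF-8 byte-order mark at the start of the file declares UTF-8.
    if source_bytes.startswith(UTF8_BOM):
        return "utf-8"
    # Only the first two physical lines are inspected.
    for line in source_bytes.splitlines()[:2]:
        text = line.decode("ascii", "replace")
        hash_pos = text.find("#")
        if hash_pos < 0:
            continue  # no comment on this line
        match = CODING_RE.search(text, hash_pos)
        if match:
            return match.group(1)
    return None
```

Both recommended forms are matched by the same pattern, because
\code{fileencoding=} ends in \code{coding=}.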
Martin v. Löwis	00f1e3f	2002-08-04 17:29:52 +0000	[diff] [blame]	114
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	115	\subsection{Explicit line joining\label{explicit-joining}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	116
				117	Two or more physical lines may be joined into logical lines using
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	118	backslash characters (\code{\e}), as follows: when a physical line ends
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	119	in a backslash that is not part of a string literal or comment, it is
				120	joined with the following line, forming a single logical line, deleting the
				121	backslash and the following end-of-line character. For example:
				122	\index{physical line}
				123	\index{line joining}
				124	\index{line continuation}
				125	\index{backslash character}
				126	%
				127	\begin{verbatim}
				128	if 1900 < year < 2100 and 1 <= month <= 12 \
				129	   and 1 <= day <= 31 and 0 <= hour < 24 \
				130	   and 0 <= minute < 60 and 0 <= second < 60:   # Looks like a valid date
				131	        return 1
				132	\end{verbatim}
				133
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	134	A line ending in a backslash cannot carry a comment. A backslash does
				135	not continue a comment. A backslash does not continue a token except
				136	for string literals (i.e., tokens other than string literals cannot be
				137	split across physical lines using a backslash). A backslash is
				138	illegal elsewhere on a line outside a string literal.
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	139
Fred Drake	c411fa6	1999-02-22 14:32:18 +0000	[diff] [blame]	140
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	141	\subsection{Implicit line joining\label{implicit-joining}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	142
				143	Expressions in parentheses, square brackets or curly braces can be
				144	split over more than one physical line without using backslashes.
				145	For example:
				146
				147	\begin{verbatim}
				148	month_names = ['Januari', 'Februari', 'Maart',      # These are the
				149	               'April',   'Mei',      'Juni',       # Dutch names
				150	               'Juli',    'Augustus', 'September',  # for the months
				151	               'Oktober', 'November', 'December']   # of the year
				152	\end{verbatim}
				153
				154	Implicitly continued lines can carry comments. The indentation of the
				155	continuation lines is not important. Blank continuation lines are
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	156	allowed. There is no NEWLINE token between implicit continuation
				157	lines. Implicitly continued lines can also occur within triple-quoted
				158	strings (see below); in that case they cannot carry comments.
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	159
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	160
Fred Drake	79713fd	2002-10-24 19:57:37 +0000	[diff] [blame]	161	\subsection{Blank lines \label{blank-lines}}
Fred Drake	c411fa6	1999-02-22 14:32:18 +0000	[diff] [blame]	162
Fred Drake	79713fd	2002-10-24 19:57:37 +0000	[diff] [blame]	163	\index{blank line}
Fred Drake	c411fa6	1999-02-22 14:32:18 +0000	[diff] [blame]	164	A logical line that contains only spaces, tabs, formfeeds and possibly
				165	a comment is ignored (i.e., no NEWLINE token is generated). During
				166	interactive input of statements, handling of a blank line may differ
				167	depending on the implementation of the read-eval-print loop. In the
				168	standard implementation, an entirely blank logical line (i.e.\ one
				169	containing not even whitespace or a comment) terminates a multi-line
				170	statement.
				171
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	172
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	173	\subsection{Indentation\label{indentation}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	174
				175	Leading whitespace (spaces and tabs) at the beginning of a logical
				176	line is used to compute the indentation level of the line, which in
				177	turn is used to determine the grouping of statements.
				178	\index{indentation}
				179	\index{whitespace}
				180	\index{leading whitespace}
				181	\index{space}
				182	\index{tab}
				183	\index{grouping}
				184	\index{statement grouping}
				185
				186	First, tabs are replaced (from left to right) by one to eight spaces
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	187	such that the total number of characters up to and including the
				188	replacement is a multiple of
Fred Drake	c37b65e	2001-11-28 07:26:15 +0000	[diff] [blame]	189	eight (this is intended to be the same rule as used by \UNIX). The
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	190	total number of spaces preceding the first non-blank character then
				191	determines the line's indentation. Indentation cannot be split over
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	192	multiple physical lines using backslashes; the whitespace up to the
				193	first backslash determines the indentation.
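
The tab replacement rule can be sketched as a small function that counts
the columns of leading whitespace. The name \code{indentation_width} and the
\code{tab_size} parameter are hypothetical, chosen for illustration; the
sketch stops at the first character that is neither a space nor a tab and
does not model the formfeed handling described further on.

```python
def indentation_width(line, tab_size=8):
    """Return the indentation of line in columns.

    Hypothetical helper: each tab advances the column to the next
    multiple of tab_size, which adds between one and tab_size columns.
    """
    column = 0
    for ch in line:
        if ch == " ":
            column += 1
        elif ch == "\t":
            column = (column // tab_size + 1) * tab_size
        else:
            break  # first non-blank character ends the indentation
    return column
```

Under this rule three spaces followed by a tab and a single tab both give an
indentation of eight, which is why mixing the two is easy to get wrong.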
				194
				195	\strong{Cross-platform compatibility note:} because of the nature of
				196	text editors on non-UNIX platforms, it is unwise to use a mixture of
Martin v. Löwis	171be76	2003-06-21 13:40:02 +0000	[diff] [blame]	197	spaces and tabs for the indentation in a single source file. It
				198	should also be noted that different platforms may explicitly limit the
				199	maximum indentation level.
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	200
				201	A formfeed character may be present at the start of the line; it will
Fred Drake	e15956b	2000-04-03 04:51:13 +0000	[diff] [blame]	202	be ignored for the indentation calculations above. Formfeed
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	203	characters occurring elsewhere in the leading whitespace have an
				204	undefined effect (for instance, they may reset the space count to
				205	zero).
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	206
				207	The indentation levels of consecutive lines are used to generate
				208	INDENT and DEDENT tokens, using a stack, as follows.
				209	\index{INDENT token}
				210	\index{DEDENT token}
				211
				212	Before the first line of the file is read, a single zero is pushed on
				213	the stack; this will never be popped off again. The numbers pushed on
				214	the stack will always be strictly increasing from bottom to top. At
				215	the beginning of each logical line, the line's indentation level is
				216	compared to the top of the stack. If it is equal, nothing happens.
				217	If it is larger, it is pushed on the stack, and one INDENT token is
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	218	generated. If it is smaller, it \emph{must} be one of the numbers
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	219	occurring on the stack; all numbers on the stack that are larger are
				220	popped off, and for each number popped off a DEDENT token is
				221	generated. At the end of the file, a DEDENT token is generated for
				222	each number remaining on the stack that is larger than zero.
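
The stack procedure just described can be sketched as follows. The function
name \code{indent\_tokens} is hypothetical; it takes the indentation level of
each logical line (blank and comment-only lines are assumed to have been
dropped already) and returns the INDENT and DEDENT tokens as strings. It is
an illustration of the rule, not the tokenizer's actual code.

```python
def indent_tokens(levels):
    """Return the INDENT/DEDENT tokens produced by a sequence of
    per-line indentation levels (hypothetical helper)."""
    stack = [0]          # the initial zero is never popped
    tokens = []
    for level in levels:
        if level > stack[-1]:
            stack.append(level)
            tokens.append("INDENT")
        elif level < stack[-1]:
            # A smaller level must match a number already on the stack.
            if level not in stack:
                raise IndentationError("inconsistent dedent")
            while stack[-1] > level:
                stack.pop()
                tokens.append("DEDENT")
    # End of file: one DEDENT per remaining number larger than zero.
    tokens.extend("DEDENT" for _ in stack[1:])
    return tokens
```

For example, the levels 0, 4, 4, 8, 0 yield two INDENT tokens followed by
two DEDENT tokens, while 0, 8, 4 is rejected because 4 never appeared on the
stack.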
				223
				224	Here is an example of a correctly (though confusingly) indented piece
				225	of Python code:
				226
				227	\begin{verbatim}
				228	def perm(l):
				229	        # Compute the list of all permutations of l
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	230	    if len(l) <= 1:
				231	                  return [l]
				232	    r = []
				233	    for i in range(len(l)):
				234	             s = l[:i] + l[i+1:]
				235	             p = perm(s)
				236	             for x in p:
				237	              r.append(l[i:i+1] + x)
				238	    return r
				239	\end{verbatim}
				240
				241	The following example shows various indentation errors:
				242
				243	\begin{verbatim}
Fred Drake	1d3e6c1	2001-12-11 17:46:38 +0000	[diff] [blame]	244	 def perm(l):                       # error: first line indented
				245	for i in range(len(l)):             # error: not indented
				246	    s = l[:i] + l[i+1:]
				247	        p = perm(l[:i] + l[i+1:])   # error: unexpected indent
				248	        for x in p:
				249	                r.append(l[i:i+1] + x)
				250	            return r                # error: inconsistent dedent
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	251	\end{verbatim}
				252
				253	(Actually, the first three errors are detected by the parser; only the
				254	last error is found by the lexical analyzer --- the indentation of
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	255	\code{return r} does not match a level popped off the stack.)
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	256
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	257
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	258	\subsection{Whitespace between tokens\label{whitespace}}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	259
				260	Except at the beginning of a logical line or in string literals, the
				261	whitespace characters space, tab and formfeed can be used
				262	interchangeably to separate tokens. Whitespace is needed between two
				263	tokens only if their concatenation could otherwise be interpreted as a
				264	different token (e.g., ab is one token, but a b is two tokens).
				265
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	266
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	267	\section{Other tokens\label{other-tokens}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	268
				269	Besides NEWLINE, INDENT and DEDENT, the following categories of tokens
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	270	exist: \emph{identifiers}, \emph{keywords}, \emph{literals},
				271	\emph{operators}, and \emph{delimiters}.
				272	Whitespace characters (other than line terminators, discussed earlier)
				273	are not tokens, but serve to delimit tokens.
				274	Where
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	275	ambiguity exists, a token comprises the longest possible string that
				276	forms a legal token, when read from left to right.
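
The longest-match rule can be illustrated with a small sketch. The operator
list below is a deliberately tiny, hypothetical subset chosen for the
example, and \code{longest\_operator} is not a real API.

```python
# Hypothetical subset of operator tokens, for illustration only.
OPERATORS = ["<", "<=", "<<", "<<=", "=", "=="]


def longest_operator(text, pos=0):
    """Return the longest operator from OPERATORS that starts at
    text[pos], or the empty string if none does."""
    best = ""
    for op in OPERATORS:
        if text.startswith(op, pos) and len(op) > len(best):
            best = op
    return best
```

Reading \code{<<=1} from left to right, the sketch selects \code{<<=} as one
token rather than \code{<} followed by \code{<=}.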
				277
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	278
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	279	\section{Identifiers and keywords\label{identifiers}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	280
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	281	Identifiers (also referred to as \emph{names}) are described by the following
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	282	lexical definitions:
				283	\index{identifier}
				284	\index{name}
				285
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	286	\begin{productionlist}
				287	\production{identifier}
				288	{(\token{letter}\|"_") (\token{letter} \| \token{digit} \| "_")*}
				289	\production{letter}
				290	{\token{lowercase} \| \token{uppercase}}
				291	\production{lowercase}
				292	{"a"..."z"}
				293	\production{uppercase}
				294	{"A"..."Z"}
				295	\production{digit}
				296	{"0"..."9"}
				297	\end{productionlist}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	298
				299	Identifiers are unlimited in length. Case is significant.
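
The productions above correspond to a simple regular expression. The
following sketch, using the hypothetical name \code{is\_identifier}, checks a
string against that grammar only; it does not exclude keywords, and it
reflects the \ASCII{}-only rule of this chapter rather than the rules of
later Python versions.

```python
import re

# (letter | "_") (letter | digit | "_")*, anchored at the end of the string.
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def is_identifier(text):
    """Return True if text matches the identifier production
    (hypothetical helper; keywords are not excluded)."""
    return IDENTIFIER_RE.match(text) is not None
```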
				300
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	301
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	302	\subsection{Keywords\label{keywords}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	303
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	304	The following identifiers are used as reserved words, or
				305	\emph{keywords} of the language, and cannot be used as ordinary
				306	identifiers. They must be spelled exactly as written here:%
				307	\index{keyword}%
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	308	\index{reserved word}
				309
				310	\begin{verbatim}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	311	and       del       for       is        raise
				312	assert    elif      from      lambda    return
				313	break     else      global    not       try
Guido van Rossum	41c6719	2001-12-04 20:38:44 +0000	[diff] [blame]	314	class     except    if        or        while
				315	continue  exec      import    pass      yield
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	316	def       finally   in        print
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	317	\end{verbatim}
				318
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	319	% When adding keywords, use reswords.py for reformatting
				320
Fred Drake	a23b573	2002-06-18 19:17:14 +0000	[diff] [blame]	321	Note that although the identifier \code{as} can be used as part of the
				322	syntax of \keyword{import} statements, it is not currently a reserved
				323	word.
				324
				325	In some future version of Python, the identifiers \code{as} and
				326	\code{None} will both become keywords.
				327
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	328
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	329	\subsection{Reserved classes of identifiers\label{id-classes}}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	330
				331	Certain classes of identifiers (besides keywords) have special
Fred Drake	38f6b88	2003-09-06 03:50:07 +0000	[diff] [blame]	332	meanings. These classes are identified by the patterns of leading and
				333	trailing underscore characters:
Fred Drake	39fc1bc	1999-03-05 18:30:21 +0000	[diff] [blame]	334
				335	\begin{description}
Fred Drake	38f6b88	2003-09-06 03:50:07 +0000	[diff] [blame]	336
				337	\item[\code{_*}]
				338	Not imported by \samp{from \var{module} import *}. The special
				339	identifier \samp{_} is used in the interactive interpreter to store
				340	the result of the last evaluation; it is stored in the
				341	\module{__builtin__} module. When not in interactive mode, \samp{_}
				342	has no special meaning and is not defined.
				343	See section~\ref{import}, ``The \keyword{import} statement.''
				344
				345	\note{The name \samp{_} is often used in conjunction with
				346	internationalization; refer to the documentation for the
				347	\ulink{\module{gettext} module}{../lib/module-gettext.html} for more
				348	information on this convention.}
				349
				350	\item[\code{__*__}]
				351	System-defined names. These names are defined by the interpreter
Neil Schemenauer	c493229	2005-06-18 17:54:13 +0000	[diff] [blame]	352	and its implementation (including the standard library);
Fred Drake	38f6b88	2003-09-06 03:50:07 +0000	[diff] [blame]	353	applications should not expect to define additional names using this
				354	convention. The set of names of this class defined by Python may be
				355	extended in future versions.
				356	See section~\ref{specialnames}, ``Special method names.''
				357
				358	\item[\code{__*}]
				359	Class-private names. Names in this category, when used within the
Martin v. Löwis	13ff116	2004-06-02 12:48:20 +0000	[diff] [blame]	360	context of a class definition, are re-written to use a mangled form
Fred Drake	38f6b88	2003-09-06 03:50:07 +0000	[diff] [blame]	361	to help avoid name clashes between ``private'' attributes of base
				362	and derived classes.
				363	See section~\ref{atom-identifiers}, ``Identifiers (Names).''
				364
Fred Drake	39fc1bc	1999-03-05 18:30:21 +0000	[diff] [blame]	365	\end{description}
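
As a rough sketch, the three patterns can be applied to a name in order of
specificity. The function \code{classify\_identifier} and the labels it
returns are hypothetical; the sketch applies the patterns literally and is
not the logic the interpreter uses.

```python
def classify_identifier(name):
    """Return a label for the reserved class a name falls into
    (hypothetical helper applying the patterns literally)."""
    # __*__ : needs at least the four underscores themselves.
    if len(name) >= 4 and name.startswith("__") and name.endswith("__"):
        return "system-defined"
    # __* : leading double underscore without the trailing pair.
    if name.startswith("__"):
        return "class-private"
    # _* : any other name with a leading underscore.
    if name.startswith("_"):
        return "not-star-imported"
    return "ordinary"
```

The most specific pattern is tested first because a name such as
\code{__init__} also matches \code{__*} and \code{_*}.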
				366
				367
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	368	\section{Literals\label{literals}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	369
				370	Literals are notations for constant values of some built-in types.
				371	\index{literal}
				372	\index{constant}
				373
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	374
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	375	\subsection{String literals\label{strings}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	376
				377	String literals are described by the following lexical definitions:
				378	\index{string literal}
				379
Fred Drake	c37b65e	2001-11-28 07:26:15 +0000	[diff] [blame]	380	\index{ASCII@\ASCII}
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	381	\begin{productionlist}
				382	\production{stringliteral}
Fred Drake	c0cf726	2001-08-14 21:43:31 +0000	[diff] [blame]	383	{[\token{stringprefix}](\token{shortstring} \| \token{longstring})}
				384	\production{stringprefix}
				385	{"r" \| "u" \| "ur" \| "R" \| "U" \| "UR" \| "Ur" \| "uR"}
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	386	\production{shortstring}
				387	{"'" \token{shortstringitem}* "'"
				388	\| '"' \token{shortstringitem}* '"'}
				389	\production{longstring}
Fred Drake	5381588	2002-03-15 23:21:37 +0000	[diff] [blame]	390	{"'''" \token{longstringitem}* "'''"}
				391	\productioncont{\| '"""' \token{longstringitem}* '"""'}
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	392	\production{shortstringitem}
				393	{\token{shortstringchar} \| \token{escapeseq}}
				394	\production{longstringitem}
				395	{\token{longstringchar} \| \token{escapeseq}}
				396	\production{shortstringchar}
Martin v. Löwis	266a436	2004-09-14 07:52:22 +0000	[diff] [blame]	397	{<any source character except "\e" or newline or the quote>}
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	398	\production{longstringchar}
Martin v. Löwis	266a436	2004-09-14 07:52:22 +0000	[diff] [blame]	399	{<any source character except "\e">}
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	400	\production{escapeseq}
				401	{"\e" <any ASCII character>}
				402	\end{productionlist}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	403
Fred Drake	c0cf726	2001-08-14 21:43:31 +0000	[diff] [blame]	404	One syntactic restriction not indicated by these productions is that
				405	whitespace is not allowed between the \grammartoken{stringprefix} and
Martin v. Löwis	266a436	2004-09-14 07:52:22 +0000	[diff] [blame]	406	the rest of the string literal. The source character set is defined
Fred Drake	5b00059	2004-11-10 16:51:17 +0000	[diff] [blame]	407	by the encoding declaration; it is \ASCII{} if no encoding declaration
				408	is given in the source file; see section~\ref{encodings}.
Fred Drake	c0cf726	2001-08-14 21:43:31 +0000	[diff] [blame]	409
Fred Drake	dea764d	2000-12-19 04:52:03 +0000	[diff] [blame]	410	\index{triple-quoted string}
				411	\index{Unicode Consortium}
				412	\index{string!Unicode}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	413	In plain English: String literals can be enclosed in matching single
				414	quotes (\code{'}) or double quotes (\code{"}). They can also be
				415	enclosed in matching groups of three single or double quotes (these
				416	are generally referred to as \emph{triple-quoted strings}). The
				417	backslash (\code{\e}) character is used to escape characters that
				418	otherwise have a special meaning, such as newline, backslash itself,
				419	or the quote character. String literals may optionally be prefixed
Raymond Hettinger	83dcf5a	2002-08-07 16:53:17 +0000	[diff] [blame]	420	with a letter \character{r} or \character{R}; such strings are called
				421	\dfn{raw strings}\index{raw string} and use different rules for interpreting
				422	backslash escape sequences. A prefix of \character{u} or \character{U}
				423	makes the string a Unicode string. Unicode strings use the Unicode character
				424	set as defined by the Unicode Consortium and ISO~10646. Some additional
Fred Drake	dea764d	2000-12-19 04:52:03 +0000	[diff] [blame]	425	escape sequences, described below, are available in Unicode strings.
Raymond Hettinger	83dcf5a	2002-08-07 16:53:17 +0000	[diff] [blame]	426	The two prefix characters may be combined; in this case, \character{u} must
				427	appear before \character{r}.
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	428
				429	In triple-quoted strings,
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	430	unescaped newlines and quotes are allowed (and are retained), except
				431	that three unescaped quotes in a row terminate the string. (A
				432	``quote'' is the character used to open the string, i.e.\ either
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	433	\code{'} or \code{"}.)
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	434
Raymond Hettinger	83dcf5a	2002-08-07 16:53:17 +0000	[diff] [blame]	435	Unless an \character{r} or \character{R} prefix is present, escape
				436	sequences in strings are interpreted according to rules similar
Fred Drake	9079164	2001-07-20 15:33:23 +0000	[diff] [blame]	437	to those used by Standard C. The recognized escape sequences are:
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	438	\index{physical line}
				439	\index{escape sequence}
				440	\index{Standard C}
				441	\index{C}
				442
Fred Drake	3e930ba	2002-09-24 21:08:37 +0000	[diff] [blame]	443	\begin{tableiii}{l\|l\|c}{code}{Escape Sequence}{Meaning}{Notes}
				444	\lineiii{\e\var{newline}} {Ignored}{}
				445	\lineiii{\e\e} {Backslash (\code{\e})}{}
				446	\lineiii{\e'} {Single quote (\code{'})}{}
				447	\lineiii{\e"} {Double quote (\code{"})}{}
				448	\lineiii{\e a} {\ASCII{} Bell (BEL)}{}
				449	\lineiii{\e b} {\ASCII{} Backspace (BS)}{}
				450	\lineiii{\e f} {\ASCII{} Formfeed (FF)}{}
				451	\lineiii{\e n} {\ASCII{} Linefeed (LF)}{}
				452	\lineiii{\e N\{\var{name}\}}
				453	{Character named \var{name} in the Unicode database (Unicode only)}{}
				454	\lineiii{\e r} {\ASCII{} Carriage Return (CR)}{}
				455	\lineiii{\e t} {\ASCII{} Horizontal Tab (TAB)}{}
				456	\lineiii{\e u\var{xxxx}}
				457	{Character with 16-bit hex value \var{xxxx} (Unicode only)}{(1)}
				458	\lineiii{\e U\var{xxxxxxxx}}
				459	{Character with 32-bit hex value \var{xxxxxxxx} (Unicode only)}{(2)}
				460	\lineiii{\e v} {\ASCII{} Vertical Tab (VT)}{}
Martin v. Löwis	266a436	2004-09-14 07:52:22 +0000	[diff] [blame]	461	\lineiii{\e\var{ooo}} {Character with octal value \var{ooo}}{(3,5)}
				462	\lineiii{\e x\var{hh}} {Character with hex value \var{hh}}{(4,5)}
Fred Drake	3e930ba	2002-09-24 21:08:37 +0000	[diff] [blame]	463	\end{tableiii}
Fred Drake	c37b65e	2001-11-28 07:26:15 +0000	[diff] [blame]	464	\index{ASCII@\ASCII}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	465
Fred Drake	3e930ba	2002-09-24 21:08:37 +0000	[diff] [blame]	466	\noindent
				467	Notes:
				468
				469	\begin{itemize}
				470	\item[(1)]
				471	Individual code units which form parts of a surrogate pair can be
				472	encoded using this escape sequence.
				473	\item[(2)]
				474	Any Unicode character can be encoded this way, but characters
				475	outside the Basic Multilingual Plane (BMP) will be encoded using a
				476	surrogate pair if Python is compiled to use 16-bit code units (the
				477	default). Individual code units which form parts of a surrogate
				478	pair can be encoded using this escape sequence.
				479	\item[(3)]
				480	As in Standard C, up to three octal digits are accepted.
				481	\item[(4)]
				482	Unlike in Standard C, at most two hex digits are accepted.
Martin v. Löwis	266a436	2004-09-14 07:52:22 +0000	[diff] [blame]	483	\item[(5)]
				484	In a string literal, hexadecimal and octal escapes denote the
				485	byte with the given value; it is not necessary that the byte
				486	encodes a character in the source character set. In a Unicode
				487	literal, these escapes denote a Unicode character with the given
				488	value.
Fred Drake	3e930ba	2002-09-24 21:08:37 +0000	[diff] [blame]	489	\end{itemize}


Unlike Standard \index{unrecognized escape sequence}C,
all unrecognized escape sequences are left in the string unchanged,
i.e., \emph{the backslash is left in the string}. (This behavior is
useful when debugging: if an escape sequence is mistyped, the
resulting output is more easily recognized as broken.) It is also
important to note that the escape sequences marked as ``(Unicode
only)'' in the table above fall into the category of unrecognized
escapes for non-Unicode string literals.

When an \character{r} or \character{R} prefix is present, a character
following a backslash is included in the string without change, and \emph{all
backslashes are left in the string}. For example, the string literal
\code{r"\e n"} consists of two characters: a backslash and a lowercase
\character{n}. String quotes can be escaped with a backslash, but the
backslash remains in the string; for example, \code{r"\e""} is a valid string
literal consisting of two characters: a backslash and a double quote;
\code{r"\e"} is not a valid string literal (even a raw string cannot
end in an odd number of backslashes). Specifically, \emph{a raw
string cannot end in a single backslash} (since the backslash would
escape the following quote character). Note also that a single
backslash followed by a newline is interpreted as those two characters
as part of the string, \emph{not} as a line continuation.
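
The raw-string rules above can be exercised directly. The following is a
minimal sketch, assuming a current Python 3 interpreter, where these
particular rules are believed to behave the same way. The helper that
compiles a piece of source text is hypothetical and exists only for this
illustration.

```python
# r"\n" keeps both characters: a backslash and a lowercase n.
newline_raw = r"\n"
assert len(newline_raw) == 2
assert newline_raw == "\\" + "n"

# A backslash lets the quote appear inside the literal, but the backslash
# itself stays in the string.
quote_raw = r"\""
assert len(quote_raw) == 2
assert quote_raw == "\\" + '"'

# Backslash-newline inside a raw string is two ordinary characters,
# not a line continuation.
two_lines = r"a\
b"
assert two_lines == "a" + "\\" + "\n" + "b"


def is_valid_literal(source_text):
    """Hypothetical helper: True if source_text compiles as an expression."""
    try:
        compile(source_text, "<literal>", "eval")
    except SyntaxError:
        return False
    return True


BACKSLASH = chr(92)
# r"\\" ends in an even number of backslashes: accepted.
assert is_valid_literal('r"' + BACKSLASH * 2 + '"')
# r"\" ends in a single backslash: rejected, the quote is escaped.
assert not is_valid_literal('r"' + BACKSLASH + '"')
```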

When an \character{r} or \character{R} prefix is used in conjunction
with a \character{u} or \character{U} prefix, then the \code{\e uXXXX}
and \code{\e UXXXXXXXX} escape sequences are processed while
\emph{all other backslashes are left in the string}.
For example, the string literal
\code{ur"\e{}u0062\e n"} consists of three Unicode characters: `LATIN
SMALL LETTER B', `REVERSE SOLIDUS', and `LATIN SMALL LETTER N'.
Backslashes can be escaped with a preceding backslash; however, both
remain in the string. As a result, \code{\e uXXXX} escape sequences
are only recognized when the \character{u} is preceded by an odd
number of backslashes.

\subsection{String literal concatenation\label{string-catenation}}

Multiple adjacent string literals (delimited by whitespace), possibly
using different quoting conventions, are allowed, and their meaning is
the same as their concatenation. Thus, \code{"hello" 'world'} is
equivalent to \code{"helloworld"}. This feature can be used to reduce
the number of backslashes needed, to split long strings conveniently
across several lines, or even to add comments to parts of strings, for
example:

\begin{verbatim}
re.compile("[A-Za-z_]"       # letter or underscore
           "[A-Za-z0-9_]*"   # letter, digit or underscore
          )
\end{verbatim}

Note that this feature is defined at the syntactical level, but
implemented at compile time. The `+' operator must be used to
concatenate string expressions at run time. Also note that literal
concatenation can use different quoting styles for each component
(even mixing raw strings and triple quoted strings).
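
A minimal sketch of the points above, assuming a current Python 3
interpreter (literal concatenation is believed to work the same way
there). The variable names are illustrative only.

```python
import re

# Adjacent literals with different quoting styles form a single string.
assert "hello" 'world' == "helloworld"

# The commented pattern from the text, built from two adjacent literals.
identifier = re.compile("[A-Za-z_]"       # letter or underscore
                        "[A-Za-z0-9_]*"   # letter, digit or underscore
                        )
assert identifier.pattern == "[A-Za-z_][A-Za-z0-9_]*"
assert identifier.match("spam_42").group() == "spam_42"

# A raw string and a triple-quoted string can be mixed.
mixed = r"\d+" """-suffix"""
assert mixed == "\\d+-suffix"

# Only literals are joined this way; values computed at run time
# need the + operator.
prefix = "hello"
greeting = prefix + "world"
assert greeting == "helloworld"
```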


\subsection{Numeric literals\label{numbers}}

There are four types of numeric literals: plain integers, long
integers, floating point numbers, and imaginary numbers. There are no
complex literals (complex numbers can be formed by adding a real
number and an imaginary number).
\index{number}
\index{numeric literal}
\index{integer literal}
\index{plain integer literal}
\index{long integer literal}
\index{floating point literal}
\index{hexadecimal literal}
\index{octal literal}
\index{decimal literal}
\index{imaginary literal}
\index{complex!literal}

Note that numeric literals do not include a sign; a phrase like
\code{-1} is actually an expression composed of the unary operator
`\code{-}' and the literal \code{1}.
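
One way to observe this is to tokenize the text \code{-1}. The sketch
below uses the standard \module{tokenize} module of a current Python 3
interpreter as a stand-in for the lexical analyzer described in this
chapter; that substitution is an assumption, and the helper function is
hypothetical.

```python
import io
import tokenize


def significant_tokens(source):
    """Hypothetical helper: (token name, text) for operators and numbers."""
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type in (tokenize.OP, tokenize.NUMBER)
    ]


# The minus sign and the digit arrive as two separate tokens.
tokens = significant_tokens("-1\n")
assert tokens == [("OP", "-"), ("NUMBER", "1")]
```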


\subsection{Integer and long integer literals\label{integers}}

Integer and long integer literals are described by the following
lexical definitions:

\begin{productionlist}
  \production{longinteger}
             {\token{integer} ("l" | "L")}
  \production{integer}
             {\token{decimalinteger} | \token{octinteger} | \token{hexinteger}}
  \production{decimalinteger}
             {\token{nonzerodigit} \token{digit}* | "0"}
  \production{octinteger}
             {"0" \token{octdigit}+}
  \production{hexinteger}
             {"0" ("x" | "X") \token{hexdigit}+}
  \production{nonzerodigit}
             {"1"..."9"}
  \production{octdigit}
             {"0"..."7"}
  \production{hexdigit}
             {\token{digit} | "a"..."f" | "A"..."F"}
\end{productionlist}

Although both lower case \character{l} and upper case \character{L} are
allowed as a suffix for long integers, it is strongly recommended to always
use \character{L}, since the letter \character{l} looks too much like the
digit \character{1}.

Plain integer literals that are above the largest representable plain
integer (e.g., 2147483647 when using 32-bit arithmetic) are accepted
as if they were long integers instead.\footnote{In versions of Python
prior to 2.4, octal and hexadecimal literals in the range just above
the largest representable plain integer but no larger than the largest
unsigned 32-bit number (on a machine using 32-bit arithmetic),
4294967295, were taken as the negative plain integer obtained by
subtracting 4294967296 from their unsigned value.}  There is no limit
for long integer literals apart from what can be stored in available
memory.
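
The arithmetic in the footnote can be worked through in a few lines. This
is only a simulation of the rule as stated there, written for a current
Python 3 interpreter; the function and constant names are hypothetical,
and values outside the range the footnote mentions are simply returned
unchanged.

```python
WORD = 2 ** 32            # 4294967296
MAX_PLAIN = 2 ** 31 - 1   # 2147483647, largest 32-bit plain integer


def pre_24_value(unsigned_value):
    """Hypothetical helper simulating the pre-2.4 reading of an octal or
    hexadecimal literal on a machine using 32-bit arithmetic."""
    if MAX_PLAIN < unsigned_value < WORD:
        return unsigned_value - WORD
    return unsigned_value


assert pre_24_value(0xdeadbeef) == -559038737   # 3735928559 - 4294967296
assert pre_24_value(0xffffffff) == -1
assert pre_24_value(0x7fffffff) == 2147483647   # still a plain integer
```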

Some examples of plain integer literals (first row) and long integer
literals (second and third rows):

\begin{verbatim}
7     2147483647                        0177
3L    79228162514264337593543950336L    0377L   0x100000000L
      79228162514264337593543950336             0xdeadbeef
\end{verbatim}
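The productions for integers can be transcribed into regular expressions
to check the examples. This is a sketch for a current Python 3
interpreter, not the tokenizer's actual implementation; the names are
taken from the productions, and the function is hypothetical. The
classification is purely lexical: an unsuffixed literal that is too large
for a plain integer still matches \code{integer}, even though its value
is accepted as a long integer.

```python
import re

# One regular expression per production in the grammar above.
DECIMALINTEGER = r"(?:[1-9][0-9]*|0)"
OCTINTEGER = r"0[0-7]+"
HEXINTEGER = r"0[xX][0-9a-fA-F]+"
INTEGER = "(?:" + "|".join([DECIMALINTEGER, OCTINTEGER, HEXINTEGER]) + ")"
LONGINTEGER = INTEGER + "[lL]"


def classify(text):
    """Hypothetical helper: name of the production matching text, or None."""
    if re.fullmatch(LONGINTEGER, text):
        return "longinteger"
    if re.fullmatch(INTEGER, text):
        return "integer"
    return None


for literal in ["7", "2147483647", "0177", "0xdeadbeef",
                "79228162514264337593543950336"]:
    assert classify(literal) == "integer"

for literal in ["3L", "79228162514264337593543950336L", "0377L",
                "0x100000000L"]:
    assert classify(literal) == "longinteger"

# 8 is not an octal digit, and a sign is not part of the literal.
assert classify("08") is None
assert classify("-1") is None
```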


\subsection{Floating point literals\label{floating}}

Floating point literals are described by the following lexical
definitions:

\begin{productionlist}
  \production{floatnumber}
             {\token{pointfloat} | \token{exponentfloat}}
  \production{pointfloat}
             {[\token{intpart}] \token{fraction} | \token{intpart} "."}
  \production{exponentfloat}
             {(\token{intpart} | \token{pointfloat})
              \token{exponent}}
  \production{intpart}
             {\token{digit}+}
  \production{fraction}
             {"." \token{digit}+}
  \production{exponent}
             {("e" | "E") ["+" | "-"] \token{digit}+}
\end{productionlist}

Note that the integer and exponent parts of floating point numbers
can look like octal integers, but are interpreted using radix 10. For
example, \samp{077e010} is legal, and denotes the same number
as \samp{77e10}.
The allowed range of floating point literals is
implementation-dependent.
Some examples of floating point literals:

\begin{verbatim}
3.14    10.    .001    1e100    3.14e-10    0e0
\end{verbatim}
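As a rough check, the floating point productions can be written as
regular expressions and applied to the examples. This sketch assumes a
current Python 3 interpreter; the names mirror the productions, and the
built-in \function{float} conversion is used only as a stand-in to show
the radix 10 reading of \samp{077e010}.

```python
import re

# One regular expression per production in the grammar above.
INTPART = r"[0-9]+"
FRACTION = r"\.[0-9]+"
EXPONENT = r"[eE][+-]?[0-9]+"
POINTFLOAT = r"(?:(?:%s)?%s|%s\.)" % (INTPART, FRACTION, INTPART)
EXPONENTFLOAT = r"(?:%s|%s)%s" % (INTPART, POINTFLOAT, EXPONENT)
FLOATNUMBER = r"(?:%s|%s)" % (POINTFLOAT, EXPONENTFLOAT)


def is_floatnumber(text):
    """Hypothetical helper: True if text matches the floatnumber production."""
    return re.fullmatch(FLOATNUMBER, text) is not None


for literal in ["3.14", "10.", ".001", "1e100", "3.14e-10", "0e0", "077e010"]:
    assert is_floatnumber(literal)

# Missing a point or exponent, or missing digits: not a floatnumber.
for text in ["10", "1e", ".", "e10"]:
    assert not is_floatnumber(text)

# The leading zeros do not select octal; both spellings give the same value.
assert float("077e010") == float("77e10") == 77e10
```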

Note that numeric literals do not include a sign; a phrase like
\code{-1} is actually an expression composed of the unary operator
\code{-} and the literal \code{1}.


\subsection{Imaginary literals\label{imaginary}}

Imaginary literals are described by the following lexical definitions:

\begin{productionlist}
  \production{imagnumber}{(\token{floatnumber} | \token{intpart}) ("j" | "J")}
\end{productionlist}

An imaginary literal yields a complex number with a real part of
0.0. Complex numbers are represented as a pair of floating point
numbers and have the same restrictions on their range. To create a
complex number with a nonzero real part, add a floating point number
to it, e.g., \code{(3+4j)}. Some examples of imaginary literals:

\begin{verbatim}
3.14j   10.j    10j     .001j   1e100j  3.14e-10j
\end{verbatim}
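A short sketch of these statements, assuming a current Python 3
interpreter, where imaginary literals are believed to behave the same
way. The variable names are illustrative only.

```python
# An imaginary literal is a complex number whose real part is 0.0.
z = 4j
assert isinstance(z, complex)
assert z.real == 0.0 and z.imag == 4.0

# A nonzero real part comes from an addition, not from the literal itself.
w = 3 + 4j
assert w == complex(3, 4)
assert abs(w) == 5.0

# Either suffix letter is accepted, after an integer part or a float.
assert 10.j == 10j == 10J
```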


\section{Operators\label{operators}}

The following tokens are operators:
\index{operators}

\begin{verbatim}
+       -       *       **      /       //      %
<<      >>      &       |       ^       ~
<       >       <=      >=      ==      !=      <>
\end{verbatim}

The comparison operators \code{<>} and \code{!=} are alternate
spellings of the same operator. \code{!=} is the preferred spelling;
\code{<>} is obsolescent.


\section{Delimiters\label{delimiters}}

The following tokens serve as delimiters in the grammar:
\index{delimiters}

\begin{verbatim}
(       )       [       ]       {       }      @
,       :       .       `       =       ;
+=      -=      *=      /=      //=     %=
&=      |=      ^=      >>=     <<=     **=
\end{verbatim}

The period can also occur in floating-point and imaginary literals. A
sequence of three periods has a special meaning as an ellipsis in slices.
The second half of the list, the augmented assignment operators, serve
lexically as delimiters, but also perform an operation.

The following printing \ASCII{} characters have special meaning as part
of other tokens or are otherwise significant to the lexical analyzer:

\begin{verbatim}
'       "       #       \
\end{verbatim}

The following printing \ASCII{} characters are not used in Python. Their
occurrence outside string literals and comments is an unconditional
error:
\index{ASCII@\ASCII}

\begin{verbatim}
$       ?
\end{verbatim}