Blame - Doc/ref/ref2.tex - platform/external/python/cpython3

blob: b8ddacbbf1d75521509dce6e46278f7f042be71b [file] [log] [blame]

Fred Drake	a1cce71	1998-07-24 22:12:32 +0000	[diff] [blame]	1	\chapter{Lexical analysis\label{lexical}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	2
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	3	A Python program is read by a \emph{parser}. Input to the parser is a
				4	stream of \emph{tokens}, generated by the \emph{lexical analyzer}. This
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	5	chapter describes how the lexical analyzer breaks a file into tokens.
				6	\index{lexical analysis}
				7	\index{parser}
				8	\index{token}
				9
Martin v. Löwis	00f1e3f	2002-08-04 17:29:52 +0000	[diff] [blame]	10	Python uses the 7-bit \ASCII{} character set for program text.
				11	\versionadded[An encoding declaration can be used to indicate that
				12	string literals and comments use an encoding different from ASCII.]{2.3}
				13	For compatibility with older versions, Python only warns if it finds
				14	8-bit characters; those warnings should be corrected by either declaring
				15	an explicit encoding, or using escape sequences if those bytes are binary
				16	data, instead of characters.
				17
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	18
				19	The run-time character set depends on the I/O devices connected to the
Fred Drake	c37b65e	2001-11-28 07:26:15 +0000	[diff] [blame]	20	program but is generally a superset of \ASCII.
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	21
				22	\strong{Future compatibility note:} It may be tempting to assume that the
				23	character set for 8-bit characters is ISO Latin-1 (an \ASCII{}
				24	superset that covers most western languages that use the Latin
				25	alphabet), but it is possible that in the future Unicode text editors
				26	will become common. These generally use the UTF-8 encoding, which is
				27	also an \ASCII{} superset, but with very different use for the
				28	characters with ordinals 128-255. While there is no consensus on this
				29	subject yet, it is unwise to assume either Latin-1 or UTF-8, even
				30	though the current implementation appears to favor Latin-1. This
				31	applies both to the source character set and the run-time character
				32	set.
				33
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	34
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	35	\section{Line structure\label{line-structure}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	36
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	37	A Python program is divided into a number of \emph{logical lines}.
				38	\index{line structure}
				39
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	40
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	41	\subsection{Logical lines\label{logical}}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	42
				43	The end of
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	44	a logical line is represented by the token NEWLINE. Statements cannot
				45	cross logical line boundaries except where NEWLINE is allowed by the
Guido van Rossum	7c0240f	1998-07-24 15:36:43 +0000	[diff] [blame]	46	syntax (e.g., between statements in compound statements).
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	47	A logical line is constructed from one or more \emph{physical lines}
				48	by following the explicit or implicit \emph{line joining} rules.
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	49	\index{logical line}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	50	\index{physical line}
				51	\index{line joining}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	52	\index{NEWLINE token}
				53
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	54
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	55	\subsection{Physical lines\label{physical}}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	56
				57	A physical line ends in whatever the current platform's convention is
Fred Drake	c37b65e	2001-11-28 07:26:15 +0000	[diff] [blame]	58	for terminating lines. On \UNIX, this is the \ASCII{} LF (linefeed)
Martin v. Löwis	36a4d8c	2002-10-10 18:24:54 +0000	[diff] [blame]	59	character. On Windows, it is the \ASCII{} sequence CR LF (return
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	60	followed by linefeed). On Macintosh, it is the \ASCII{} CR (return)
				61	character.
				62
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	63
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	64	\subsection{Comments\label{comments}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	65
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	66	A comment starts with a hash character (\code{\#}) that is not part of
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	67	a string literal, and ends at the end of the physical line. A comment
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	68	signifies the end of the logical line unless the implicit line joining
				69	rules are invoked.
				70	Comments are ignored by the syntax; they are not tokens.
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	71	\index{comment}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	72	\index{hash character}
				73
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	74
Martin v. Löwis	00f1e3f	2002-08-04 17:29:52 +0000	[diff] [blame]	75	\subsection{Encoding declarations\label{encodings}}
Martin v. Löwis	266a436	2004-09-14 07:52:22 +0000	[diff] [blame]	76	\index{source character set}
				77	\index{encodings}
Martin v. Löwis	00f1e3f	2002-08-04 17:29:52 +0000	[diff] [blame]	78
				79	If a comment in the first or second line of the Python script matches
Martin v. Löwis	ae075b6	2004-08-18 13:25:05 +0000	[diff] [blame]	80	the regular expression \regexp{coding[=:]\e s*([-\e w.]+)}, this comment is
Martin v. Löwis	00f1e3f	2002-08-04 17:29:52 +0000	[diff] [blame]	81	processed as an encoding declaration; the first group of this
				82	expression names the encoding of the source code file. The recommended
				83	forms of this expression are
				84
				85	\begin{verbatim}
				86	# -*- coding: <encoding-name> -*-
				87	\end{verbatim}
				88
				89	which is also recognized by GNU Emacs, and
				90
				91	\begin{verbatim}
				92	# vim:fileencoding=<encoding-name>
				93	\end{verbatim}
				94
Raymond Hettinger	3fd9779	2004-02-08 20:18:26 +0000	[diff] [blame]	95	which is recognized by Bram Moolenaar's VIM. In addition, if the first
Fred Drake	31f3db3	2002-08-06 21:36:06 +0000	[diff] [blame]	96	bytes of the file are the UTF-8 byte-order mark
				97	(\code{'\e xef\e xbb\e xbf'}), the declared file encoding is UTF-8
				98	(this is supported, among others, by Microsoft's \program{notepad}).
Martin v. Löwis	00f1e3f	2002-08-04 17:29:52 +0000	[diff] [blame]	99
				100	If an encoding is declared, the encoding name must be recognized by
				101	Python. % XXX there should be a list of supported encodings.
				102	The encoding is used for all lexical analysis, in particular to find
				103	the end of a string, and to interpret the contents of Unicode literals.
				104	String literals are converted to Unicode for syntactical analysis,
				105	then converted back to their original encoding before interpretation
Martin v. Löwis	f62a89b	2002-09-03 11:52:44 +0000	[diff] [blame]	106	starts. The encoding declaration must appear on a line of its own.
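The recognition rules above can be sketched in a few lines. This is a minimal illustration written for a current Python interpreter, not the interpreter's actual implementation. The helper name \code{declared_encoding} is hypothetical, and the sketch simplifies in two ways: it treats any \code{\#} on the first two lines as the start of a comment, and it does not check that the declaration stands on a line of its own or that the encoding name is known to Python.

```python
import re

# The regular expression quoted in the text, applied to raw bytes.
CODING_RE = re.compile(rb"coding[=:]\s*([-\w.]+)")
UTF8_BOM = b"\xef\xbb\xbf"


def declared_encoding(source):
    """Return the declared encoding name of `source` (bytes), or None.

    Hypothetical helper: a sketch of the rules, not CPython's own code.
    """
    # A leading UTF-8 byte-order mark declares UTF-8.
    if source.startswith(UTF8_BOM):
        return "utf-8"
    # Only the first or second line may carry the declaration.
    for line in source.splitlines()[:2]:
        hash_pos = line.find(b"#")
        if hash_pos == -1:
            continue  # no comment on this line
        match = CODING_RE.search(line, hash_pos)
        if match:
            # The first group names the encoding.
            return match.group(1).decode("ascii")
    return None


emacs_style = declared_encoding(b"#!/usr/bin/python\n# -*- coding: latin-1 -*-\n")
vim_style = declared_encoding(b"# vim:fileencoding=utf-8\n")
```

Both recommended comment forms match because the expression only requires the text \code{coding} followed by \code{:} or \code{=}; a declaration on the third line or later is not considered.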
Martin v. Löwis	00f1e3f	2002-08-04 17:29:52 +0000	[diff] [blame]	107
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	108	\subsection{Explicit line joining\label{explicit-joining}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	109
				110	Two or more physical lines may be joined into logical lines using
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	111	backslash characters (\code{\e}), as follows: when a physical line ends
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	112	in a backslash that is not part of a string literal or comment, it is
				113	joined with the following line to form a single logical line, deleting the
				114	backslash and the following end-of-line character. For example:
				115	\index{physical line}
				116	\index{line joining}
				117	\index{line continuation}
				118	\index{backslash character}
				119	%
				120	\begin{verbatim}
				121	if 1900 < year < 2100 and 1 <= month <= 12 \
				122	   and 1 <= day <= 31 and 0 <= hour < 24 \
				123	   and 0 <= minute < 60 and 0 <= second < 60:   # Looks like a valid date
				124	        return 1
				125	\end{verbatim}
				126
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	127	A line ending in a backslash cannot carry a comment. A backslash does
				128	not continue a comment. A backslash does not continue a token except
				129	for string literals (i.e., tokens other than string literals cannot be
				130	split across physical lines using a backslash). A backslash is
				131	illegal elsewhere on a line outside a string literal.
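One way to observe explicit line joining is to count NEWLINE tokens, since each logical line ends in exactly one. The sketch below assumes a current Python interpreter and uses the standard \module{tokenize} module; the helper name \code{count\_logical\_lines} and the sample source are illustrative only.

```python
import io
import tokenize


def count_logical_lines(source):
    """Count NEWLINE tokens, i.e. the logical lines in `source`."""
    readline = io.StringIO(source).readline
    return sum(1 for tok in tokenize.generate_tokens(readline)
               if tok.type == tokenize.NEWLINE)


# Three physical lines; the trailing backslash joins the first two.
SAMPLE = (
    "if 1900 < year < 2100 and 1 <= month <= 12 \\\n"
    "   and 1 <= day <= 31:\n"
    "    pass\n"
)
logical_lines = count_logical_lines(SAMPLE)
```

Here \code{logical\_lines} is 2: the joined \keyword{if} header counts once and the \keyword{pass} statement counts once.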
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	132
Fred Drake	c411fa6	1999-02-22 14:32:18 +0000	[diff] [blame]	133
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	134	\subsection{Implicit line joining\label{implicit-joining}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	135
				136	Expressions in parentheses, square brackets or curly braces can be
				137	split over more than one physical line without using backslashes.
				138	For example:
				139
				140	\begin{verbatim}
				141	month_names = ['Januari', 'Februari', 'Maart',      # These are the
				142	               'April',   'Mei',      'Juni',       # Dutch names
				143	               'Juli',    'Augustus', 'September',  # for the months
				144	               'Oktober', 'November', 'December']   # of the year
				145	\end{verbatim}
				146
				147	Implicitly continued lines can carry comments. The indentation of the
				148	continuation lines is not important. Blank continuation lines are
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	149	allowed. There is no NEWLINE token between implicit continuation
				150	lines. Implicitly continued lines can also occur within triple-quoted
				151	strings (see below); in that case they cannot carry comments.
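The absence of NEWLINE tokens between implicitly continued lines can be checked with the standard \module{tokenize} module. This is a sketch for a current Python interpreter. Note that \module{tokenize} reports line breaks that do not end a logical line under the name \code{NL}, a token type this chapter does not define; the helper name \code{line\_break\_tokens} is hypothetical.

```python
import io
import tokenize


def line_break_tokens(source):
    """Return the names of all line-break tokens found in `source`."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)
            if tok.type in (tokenize.NL, tokenize.NEWLINE)]


# A bracketed list spread over several physical lines, with a comment
# and a blank continuation line inside the brackets.
BRACKETED = (
    "names = ['Januari',   # first month\n"
    "\n"
    "         'Februari']\n"
)
breaks = line_break_tokens(BRACKETED)
```

Only the final line break, after the closing bracket, is reported as NEWLINE; the breaks inside the brackets are not.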
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	152
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	153
Fred Drake	79713fd	2002-10-24 19:57:37 +0000	[diff] [blame]	154	\subsection{Blank lines \label{blank-lines}}
Fred Drake	c411fa6	1999-02-22 14:32:18 +0000	[diff] [blame]	155
Fred Drake	79713fd	2002-10-24 19:57:37 +0000	[diff] [blame]	156	\index{blank line}
Fred Drake	c411fa6	1999-02-22 14:32:18 +0000	[diff] [blame]	157	A logical line that contains only spaces, tabs, formfeeds and possibly
				158	a comment, is ignored (i.e., no NEWLINE token is generated). During
				159	interactive input of statements, handling of a blank line may differ
				160	depending on the implementation of the read-eval-print loop. In the
				161	standard implementation, an entirely blank logical line (i.e.\ one
				162	containing not even whitespace or a comment) terminates a multi-line
				163	statement.
				164
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	165
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	166	\subsection{Indentation\label{indentation}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	167
				168	Leading whitespace (spaces and tabs) at the beginning of a logical
				169	line is used to compute the indentation level of the line, which in
				170	turn is used to determine the grouping of statements.
				171	\index{indentation}
				172	\index{whitespace}
				173	\index{leading whitespace}
				174	\index{space}
				175	\index{tab}
				176	\index{grouping}
				177	\index{statement grouping}
				178
				179	First, tabs are replaced (from left to right) by one to eight spaces
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	180	such that the total number of characters up to and including the
				181	replacement is a multiple of
Fred Drake	c37b65e	2001-11-28 07:26:15 +0000	[diff] [blame]	182	eight (this is intended to be the same rule as used by \UNIX). The
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	183	total number of spaces preceding the first non-blank character then
				184	determines the line's indentation. Indentation cannot be split over
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	185	multiple physical lines using backslashes; the whitespace up to the
				186	first backslash determines the indentation.
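The tab replacement rule amounts to rounding the running column up to the next multiple of eight. A minimal sketch, assuming a hypothetical helper named \code{indentation\_width} and ignoring formfeed characters entirely:

```python
def indentation_width(line, tab_size=8):
    """Return the indentation of `line` after tab replacement.

    Illustrative only: formfeed handling is not modelled.
    """
    column = 0
    for ch in line:
        if ch == " ":
            column += 1
        elif ch == "\t":
            # Advance to the next multiple of tab_size (one to eight spaces).
            column = (column // tab_size + 1) * tab_size
        else:
            break  # first non-blank character ends the indentation
    return column


widths = [indentation_width(text)
          for text in ("    x", "\tx", "  \tx", "        \tx", "\t x")]
```

A tab after two spaces contributes six spaces, while a tab after eight spaces contributes a full eight, so the second and third samples have the same indentation.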
				187
				188	\strong{Cross-platform compatibility note:} because of the nature of
				189	text editors on non-UNIX platforms, it is unwise to use a mixture of
Martin v. Löwis	171be76	2003-06-21 13:40:02 +0000	[diff] [blame]	190	spaces and tabs for the indentation in a single source file. It
				191	should also be noted that different platforms may explicitly limit the
				192	maximum indentation level.
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	193
				194	A formfeed character may be present at the start of the line; it will
Fred Drake	e15956b	2000-04-03 04:51:13 +0000	[diff] [blame]	195	be ignored for the indentation calculations above. Formfeed
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	196	characters occurring elsewhere in the leading whitespace have an
				197	undefined effect (for instance, they may reset the space count to
				198	zero).
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	199
				200	The indentation levels of consecutive lines are used to generate
				201	INDENT and DEDENT tokens, using a stack, as follows.
				202	\index{INDENT token}
				203	\index{DEDENT token}
				204
				205	Before the first line of the file is read, a single zero is pushed on
				206	the stack; this will never be popped off again. The numbers pushed on
				207	the stack will always be strictly increasing from bottom to top. At
				208	the beginning of each logical line, the line's indentation level is
				209	compared to the top of the stack. If it is equal, nothing happens.
				210	If it is larger, it is pushed on the stack, and one INDENT token is
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	211	generated. If it is smaller, it \emph{must} be one of the numbers
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	212	occurring on the stack; all numbers on the stack that are larger are
				213	popped off, and for each number popped off a DEDENT token is
				214	generated. At the end of the file, a DEDENT token is generated for
				215	each number remaining on the stack that is larger than zero.
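The stack algorithm just described can be sketched directly. The function below is an illustration under simplifying assumptions, not the tokenizer's real code: it takes the already computed indentation level of each logical line and returns only the INDENT and DEDENT tokens, and the name \code{indent\_tokens} is hypothetical.

```python
def indent_tokens(levels):
    """Return the INDENT/DEDENT tokens for a sequence of indentation levels."""
    stack = [0]          # the initial zero is never popped
    tokens = []
    for level in levels:
        if level > stack[-1]:
            stack.append(level)
            tokens.append("INDENT")
        elif level < stack[-1]:
            # A smaller level must already be on the stack.
            if level not in stack:
                raise IndentationError("inconsistent dedent")
            while stack[-1] > level:
                stack.pop()
                tokens.append("DEDENT")
        # Equal levels generate nothing.
    # At end of file, close every level still open.
    while stack[-1] > 0:
        stack.pop()
        tokens.append("DEDENT")
    return tokens


nested = indent_tokens([0, 4, 8, 4, 0])
unterminated = indent_tokens([0, 4, 8])
```

Because the stack is strictly increasing, a single line can generate several DEDENT tokens but never more than one INDENT token.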
				216
				217	Here is an example of a correctly (though confusingly) indented piece
				218	of Python code:
				219
				220	\begin{verbatim}
				221	def perm(l):
				222	        # Compute the list of all permutations of l
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	223	    if len(l) <= 1:
				224	                  return [l]
				225	    r = []
				226	    for i in range(len(l)):
				227	             s = l[:i] + l[i+1:]
				228	             p = perm(s)
				229	             for x in p:
				230	              r.append(l[i:i+1] + x)
				231	    return r
				232	\end{verbatim}
				233
				234	The following example shows various indentation errors:
				235
				236	\begin{verbatim}
Fred Drake	1d3e6c1	2001-12-11 17:46:38 +0000	[diff] [blame]	237	 def perm(l):                       # error: first line indented
				238	for i in range(len(l)):             # error: not indented
				239	    s = l[:i] + l[i+1:]
				240	        p = perm(l[:i] + l[i+1:])   # error: unexpected indent
				241	        for x in p:
				242	                r.append(l[i:i+1] + x)
				243	            return r                # error: inconsistent dedent
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	244	\end{verbatim}
				245
				246	(Actually, the first three errors are detected by the parser; only the
				247	last error is found by the lexical analyzer --- the indentation of
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	248	\code{return r} does not match a level popped off the stack.)
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	249
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	250
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	251	\subsection{Whitespace between tokens\label{whitespace}}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	252
				253	Except at the beginning of a logical line or in string literals, the
				254	whitespace characters space, tab and formfeed can be used
				255	interchangeably to separate tokens. Whitespace is needed between two
				256	tokens only if their concatenation could otherwise be interpreted as a
				257	different token (e.g., ab is one token, but a b is two tokens).
				258
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	259
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	260	\section{Other tokens\label{other-tokens}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	261
				262	Besides NEWLINE, INDENT and DEDENT, the following categories of tokens
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	263	exist: \emph{identifiers}, \emph{keywords}, \emph{literals},
				264	\emph{operators}, and \emph{delimiters}.
				265	Whitespace characters (other than line terminators, discussed earlier)
				266	are not tokens, but serve to delimit tokens.
				267	Where
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	268	ambiguity exists, a token comprises the longest possible string that
				269	forms a legal token, when read from left to right.
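Both the whitespace rule and the longest-match rule can be observed with the standard \module{tokenize} module. The sketch assumes a current Python interpreter; the helper name \code{token\_strings} is hypothetical, and it keeps only name, number and operator tokens.

```python
import io
import tokenize


def token_strings(source):
    """Return the text of each name, number and operator token."""
    readline = io.StringIO(source).readline
    wanted = (tokenize.NAME, tokenize.NUMBER, tokenize.OP)
    return [tok.string
            for tok in tokenize.generate_tokens(readline)
            if tok.type in wanted]


one_name = token_strings("ab\n")        # no whitespace: a single identifier
two_names = token_strings("a b\n")      # whitespace separates two identifiers
longest = token_strings("a<=b\n")       # "<=" is preferred over "<" then "="
separated = token_strings("a< =b\n")    # whitespace forces two operators
```

The lexical analyzer does not judge whether the resulting token sequence is meaningful; \code{a< =b} tokenizes without complaint and is rejected only later, by the parser.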
				270
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	271
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	272	\section{Identifiers and keywords\label{identifiers}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	273
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	274	Identifiers (also referred to as \emph{names}) are described by the following
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	275	lexical definitions:
				276	\index{identifier}
				277	\index{name}
				278
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	279	\begin{productionlist}
				280	\production{identifier}
				281	{(\token{letter}|"_") (\token{letter} | \token{digit} | "_")*}
				282	\production{letter}
				283	{\token{lowercase} | \token{uppercase}}
				284	\production{lowercase}
				285	{"a"..."z"}
				286	\production{uppercase}
				287	{"A"..."Z"}
				288	\production{digit}
				289	{"0"..."9"}
				290	\end{productionlist}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	291
				292	Identifiers are unlimited in length. Case is significant.
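The productions above translate into a short regular expression. This is a sketch of the grammar exactly as written here (\ASCII{} letters, digits and underscore only); the helper name \code{is\_identifier} is hypothetical, and the check does not exclude keywords.

```python
import re

# (letter | "_") (letter | digit | "_")*
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(text):
    """Return True if `text` matches the identifier production."""
    return IDENTIFIER_RE.fullmatch(text) is not None


results = {name: is_identifier(name)
           for name in ("spam", "_spam1", "Spam", "1spam", "a-b", "")}
```

Since case is significant, \code{spam} and \code{Spam} both match but name different identifiers.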
				293
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	294
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	295	\subsection{Keywords\label{keywords}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	296
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	297	The following identifiers are used as reserved words, or
				298	\emph{keywords} of the language, and cannot be used as ordinary
				299	identifiers. They must be spelled exactly as written here:%
				300	\index{keyword}%
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	301	\index{reserved word}
				302
				303	\begin{verbatim}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	304	and       del       for       is        raise
				305	assert    elif      from      lambda    return
				306	break     else      global    not       try
Guido van Rossum	41c6719	2001-12-04 20:38:44 +0000	[diff] [blame]	307	class     except    if        or        while
				308	continue  exec      import    pass      yield
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	309	def       finally   in        print
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	310	\end{verbatim}
				311
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	312	% When adding keywords, use reswords.py for reformatting
				313
Fred Drake	a23b573	2002-06-18 19:17:14 +0000	[diff] [blame]	314	Note that although the identifier \code{as} can be used as part of the
				315	syntax of \keyword{import} statements, it is not currently a reserved
				316	word.
				317
				318	In some future version of Python, the identifiers \code{as} and
				319	\code{None} will both become keywords.
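Whether a name is reserved can be tested at run time with the standard \module{keyword} module. Note that the module reports the keyword set of the interpreter that is running, which need not match the table above; the wrapper name \code{is\_reserved} is hypothetical.

```python
import keyword


def is_reserved(name):
    """Return True if `name` is a keyword of the running interpreter."""
    return keyword.iskeyword(name)


# Keep only the names that the running interpreter reserves.
reserved_sample = [name for name in ("lambda", "while", "spam")
                   if is_reserved(name)]
```

Checking against \code{keyword.iskeyword} rather than a hard-coded list keeps such a test correct when the set of keywords changes between versions.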
				320
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	321
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	322	\subsection{Reserved classes of identifiers\label{id-classes}}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	323
				324	Certain classes of identifiers (besides keywords) have special
Fred Drake	38f6b88	2003-09-06 03:50:07 +0000	[diff] [blame]	325	meanings. These classes are identified by the patterns of leading and
				326	trailing underscore characters:
Fred Drake	39fc1bc	1999-03-05 18:30:21 +0000	[diff] [blame]	327
				328	\begin{description}
Fred Drake	38f6b88	2003-09-06 03:50:07 +0000	[diff] [blame]	329
				330	\item[\code{_*}]
				331	Not imported by \samp{from \var{module} import *}. The special
				332	identifier \samp{_} is used in the interactive interpreter to store
				333	the result of the last evaluation; it is stored in the
				334	\module{__builtin__} module. When not in interactive mode, \samp{_}
				335	has no special meaning and is not defined.
				336	See section~\ref{import}, ``The \keyword{import} statement.''
				337
				338	\note{The name \samp{_} is often used in conjunction with
				339	internationalization; refer to the documentation for the
				340	\ulink{\module{gettext} module}{../lib/module-gettext.html} for more
				341	information on this convention.}
				342
				343	\item[\code{__*__}]
				344	System-defined names. These names are defined by the interpreter
				345	and its implementation (including the standard library);
				346	applications should not expect to define additional names using this
				347	convention. The set of names of this class defined by Python may be
				348	extended in future versions.
				349	See section~\ref{specialnames}, ``Special method names.''
				350
				351	\item[\code{__*}]
				352	Class-private names. Names in this category, when used within the
Martin v. Löwis	13ff116	2004-06-02 12:48:20 +0000	[diff] [blame]	353	context of a class definition, are re-written to use a mangled form
Fred Drake	38f6b88	2003-09-06 03:50:07 +0000	[diff] [blame]	354	to help avoid name clashes between ``private'' attributes of base
				355	and derived classes.
				356	See section~\ref{atom-identifiers}, ``Identifiers (Names).''
				357
Fred Drake	39fc1bc	1999-03-05 18:30:21 +0000	[diff] [blame]	358	\end{description}
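The effect of class-private name mangling can be seen by inspecting an instance. In this sketch the class names \code{Base} and \code{Derived} and the attribute name \code{\_\_token} are made up for illustration; the mangled form prefixes the name with an underscore and the name of the enclosing class.

```python
class Base:
    def __init__(self):
        self.__token = "base"        # stored as _Base__token


class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.__token = "derived"     # stored as _Derived__token


obj = Derived()
attribute_names = sorted(vars(obj))
```

Both attributes coexist on the same instance, which is how the rewriting avoids clashes between the base class and the derived class.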
				359
				360
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	361	\section{Literals\label{literals}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	362
				363	Literals are notations for constant values of some built-in types.
				364	\index{literal}
				365	\index{constant}
				366
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	367
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	368	\subsection{String literals\label{strings}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	369
				370	String literals are described by the following lexical definitions:
				371	\index{string literal}
				372
Fred Drake	c37b65e	2001-11-28 07:26:15 +0000	[diff] [blame]	373	\index{ASCII@\ASCII}
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	374	\begin{productionlist}
				375	\production{stringliteral}
Fred Drake	c0cf726	2001-08-14 21:43:31 +0000	[diff] [blame]	376	{[\token{stringprefix}](\token{shortstring} | \token{longstring})}
				377	\production{stringprefix}
				378	{"r" | "u" | "ur" | "R" | "U" | "UR" | "Ur" | "uR"}
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	379	\production{shortstring}
				380	{"'" \token{shortstringitem}* "'"
				381	| '"' \token{shortstringitem}* '"'}
				382	\production{longstring}
Fred Drake	5381588	2002-03-15 23:21:37 +0000	[diff] [blame]	383	{"'''" \token{longstringitem}* "'''"}
				384	\productioncont{| '"""' \token{longstringitem}* '"""'}
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	385	\production{shortstringitem}
				386	{\token{shortstringchar} | \token{escapeseq}}
				387	\production{longstringitem}
				388	{\token{longstringchar} | \token{escapeseq}}
				389	\production{shortstringchar}
Martin v. Löwis	266a436	2004-09-14 07:52:22 +0000	[diff] [blame]	390	{<any source character except "\e" or newline or the quote>}
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	391	\production{longstringchar}
Martin v. Löwis	266a436	2004-09-14 07:52:22 +0000	[diff] [blame]	392	{<any source character except "\e">}
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	393	\production{escapeseq}
				394	{"\e" <any ASCII character>}
				395	\end{productionlist}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	396
Fred Drake	c0cf726	2001-08-14 21:43:31 +0000	[diff] [blame]	397	One syntactic restriction not indicated by these productions is that
				398	whitespace is not allowed between the \grammartoken{stringprefix} and
Martin v. Löwis	266a436	2004-09-14 07:52:22 +0000	[diff] [blame]	399	the rest of the string literal. The source character set is defined
Fred Drake	5b00059	2004-11-10 16:51:17 +0000	[diff] [blame]	400	by the encoding declaration; it is \ASCII{} if no encoding declaration
				401	is given in the source file; see section~\ref{encodings}.
Fred Drake	c0cf726	2001-08-14 21:43:31 +0000	[diff] [blame]	402
Fred Drake	dea764d	2000-12-19 04:52:03 +0000	[diff] [blame]	403	\index{triple-quoted string}
				404	\index{Unicode Consortium}
				405	\index{string!Unicode}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	406	In plain English: String literals can be enclosed in matching single
				407	quotes (\code{'}) or double quotes (\code{"}). They can also be
				408	enclosed in matching groups of three single or double quotes (these
				409	are generally referred to as \emph{triple-quoted strings}). The
				410	backslash (\code{\e}) character is used to escape characters that
				411	otherwise have a special meaning, such as newline, backslash itself,
				412	or the quote character. String literals may optionally be prefixed
Raymond Hettinger	83dcf5a	2002-08-07 16:53:17 +0000	[diff] [blame]	413	with a letter \character{r} or \character{R}; such strings are called
				414	\dfn{raw strings}\index{raw string} and use different rules for interpreting
				415	backslash escape sequences. A prefix of \character{u} or \character{U}
				416	makes the string a Unicode string. Unicode strings use the Unicode character
				417	set as defined by the Unicode Consortium and ISO~10646. Some additional
Fred Drake	dea764d	2000-12-19 04:52:03 +0000	[diff] [blame]	418	escape sequences, described below, are available in Unicode strings.
Raymond Hettinger	83dcf5a	2002-08-07 16:53:17 +0000	[diff] [blame]	419	The two prefix characters may be combined; in this case, \character{u} must
				420	appear before \character{r}.
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	421
				422	In triple-quoted strings,
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	423	unescaped newlines and quotes are allowed (and are retained), except
				424	that three unescaped quotes in a row terminate the string. (A
				425	``quote'' is the character used to open the string, i.e., either
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	426	\code{'} or \code{"}.)
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	427
Raymond Hettinger	83dcf5a	2002-08-07 16:53:17 +0000	[diff] [blame]	428	Unless an \character{r} or \character{R} prefix is present, escape
				429	sequences in strings are interpreted according to rules similar
Fred Drake	9079164	2001-07-20 15:33:23 +0000	[diff] [blame]	430	to those used by Standard C. The recognized escape sequences are:
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	431	\index{physical line}
				432	\index{escape sequence}
				433	\index{Standard C}
				434	\index{C}
				435
Fred Drake	3e930ba	2002-09-24 21:08:37 +0000	[diff] [blame]	436	\begin{tableiii}{l|l|c}{code}{Escape Sequence}{Meaning}{Notes}
				437	\lineiii{\e\var{newline}} {Ignored}{}
				438	\lineiii{\e\e} {Backslash (\code{\e})}{}
				439	\lineiii{\e'} {Single quote (\code{'})}{}
				440	\lineiii{\e"} {Double quote (\code{"})}{}
				441	\lineiii{\e a} {\ASCII{} Bell (BEL)}{}
				442	\lineiii{\e b} {\ASCII{} Backspace (BS)}{}
				443	\lineiii{\e f} {\ASCII{} Formfeed (FF)}{}
				444	\lineiii{\e n} {\ASCII{} Linefeed (LF)}{}
				445	\lineiii{\e N\{\var{name}\}}
				446	{Character named \var{name} in the Unicode database (Unicode only)}{}
				447	\lineiii{\e r} {\ASCII{} Carriage Return (CR)}{}
				448	\lineiii{\e t} {\ASCII{} Horizontal Tab (TAB)}{}
				449	\lineiii{\e u\var{xxxx}}
				450	{Character with 16-bit hex value \var{xxxx} (Unicode only)}{(1)}
				451	\lineiii{\e U\var{xxxxxxxx}}
				452	{Character with 32-bit hex value \var{xxxxxxxx} (Unicode only)}{(2)}
				453	\lineiii{\e v} {\ASCII{} Vertical Tab (VT)}{}
Martin v. Löwis	266a436	2004-09-14 07:52:22 +0000	[diff] [blame]	454	\lineiii{\e\var{ooo}} {Character with octal value \var{ooo}}{(3,5)}
				455	\lineiii{\e x\var{hh}} {Character with hex value \var{hh}}{(4,5)}
Fred Drake	3e930ba	2002-09-24 21:08:37 +0000	[diff] [blame]	456	\end{tableiii}
Fred Drake	c37b65e	2001-11-28 07:26:15 +0000	[diff] [blame]	457	\index{ASCII@\ASCII}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	458
Fred Drake	3e930ba	2002-09-24 21:08:37 +0000	[diff] [blame]	459	\noindent
				460	Notes:
				461
				462	\begin{itemize}
				463	\item[(1)]
				464	Individual code units which form parts of a surrogate pair can be
				465	encoded using this escape sequence.
				466	\item[(2)]
				467	Any Unicode character can be encoded this way, but characters
				468	outside the Basic Multilingual Plane (BMP) will be encoded using a
				469	surrogate pair if Python is compiled to use 16-bit code units (the
				470	default). Individual code units which form parts of a surrogate
				471	pair can be encoded using this escape sequence.
				472	\item[(3)]
				473	As in Standard C, up to three octal digits are accepted.
				474	\item[(4)]
				475	Unlike in Standard C, at most two hex digits are accepted.
Martin v. Löwis	266a436	2004-09-14 07:52:22 +0000	[diff] [blame]	476	\item[(5)]
				477	In a string literal, hexadecimal and octal escapes denote the
				478	byte with the given value; it is not necessary that the byte
				479	encodes a character in the source character set. In a Unicode
				480	literal, these escapes denote a Unicode character with the given
				481	value.
Fred Drake	3e930ba	2002-09-24 21:08:37 +0000	[diff] [blame]	482	\end{itemize}
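A few of the escapes in the table, together with notes (3) and (4), can be checked directly. The sketch assumes a current Python interpreter and uses only escapes that behave as the table describes; the dictionary name \code{samples} is illustrative.

```python
samples = {
    "linefeed": "\n",          # one character, ASCII LF
    "octal": "\101",           # octal 101 is 65, the letter A
    "hex": "\x41",             # hex 41 is 65, the letter A
    "octal_limit": "\1011",    # at most three octal digits are consumed
    "hex_limit": "\x411",      # at most two hex digits are consumed
    "joined": "ab\
cd",                           # backslash-newline is ignored
}
lengths = {name: len(value) for name, value in samples.items()}
```

In the two \code{limit} entries the trailing \code{1} is an ordinary character following the escape, so each string has length two.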
				483
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	484
Fred Drake	dea764d	2000-12-19 04:52:03 +0000	[diff] [blame]	485	Unlike Standard \index{unrecognized escape sequence}C,
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	486	all unrecognized escape sequences are left in the string unchanged,
Fred Drake	dea764d	2000-12-19 04:52:03 +0000	[diff] [blame]	487	i.e., \emph{the backslash is left in the string}. (This behavior is
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	488	useful when debugging: if an escape sequence is mistyped, the
Fred Drake	dea764d	2000-12-19 04:52:03 +0000	[diff] [blame]	489	resulting output is more easily recognized as broken.) It is also
				490	important to note that the escape sequences marked as ``(Unicode
				491	only)'' in the table above fall into the category of unrecognized
				492	escapes for non-Unicode string literals.
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	493
Raymond Hettinger	83dcf5a	2002-08-07 16:53:17 +0000	[diff] [blame]	494	When an \character{r} or \character{R} prefix is present, a character
				495	following a backslash is included in the string without change, and \emph{all
Fred Drake	347a625	2001-01-09 21:38:16 +0000	[diff] [blame]	496	backslashes are left in the string}. For example, the string literal
				497	\code{r"\e n"} consists of two characters: a backslash and a lowercase
Raymond Hettinger	83dcf5a	2002-08-07 16:53:17 +0000	[diff] [blame]	498	\character{n}. String quotes can be escaped with a backslash, but the
				499	backslash remains in the string; for example, \code{r"\e""} is a valid string
Fred Drake	347a625	2001-01-09 21:38:16 +0000	[diff] [blame]	500	literal consisting of two characters: a backslash and a double quote;
Fred Drake	0825dc2	2001-07-20 14:32:28 +0000	[diff] [blame]	501	\code{r"\e"} is not a valid string literal (even a raw string cannot
Fred Drake	347a625	2001-01-09 21:38:16 +0000	[diff] [blame]	502	end in an odd number of backslashes). Specifically, \emph{a raw
				503	string cannot end in a single backslash} (since the backslash would
				504	escape the following quote character). Note also that a single
				505	backslash followed by a newline is interpreted as those two characters
				506	as part of the string, \emph{not} as a line continuation.
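
The raw string rules above can be checked with a short sketch. It is
written for a current Python~3 interpreter, where these rules are assumed
to carry over unchanged; \code{is_valid_literal} is a hypothetical helper
introduced only for this illustration.

```python
raw_newline = r"\n"    # two characters: a backslash and a lowercase n
raw_quote = r"\""      # two characters: a backslash and a double quote
raw_continued = r"""a\
b"""                   # four characters: a, backslash, newline, b


def is_valid_literal(source_text):
    """Return True if source_text compiles as a Python expression."""
    try:
        compile(source_text, "<example>", "eval")
    except SyntaxError:
        return False
    return True


print(len(raw_newline), len(raw_quote), len(raw_continued))
# The first source text is r"\" (one trailing backslash); the second is
# r"\\" (two trailing backslashes).
print(is_valid_literal('r"\\"'), is_valid_literal('r"\\\\"'))
```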
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	507
Fred Drake	f7aa164	2002-08-07 13:24:09 +0000	[diff] [blame]	508	When an \character{r} or \character{R} prefix is used in conjunction
				509	with a \character{u} or \character{U} prefix, then the \code{\e uXXXX}
				510	escape sequence is processed while \emph{all other backslashes are
Fred Drake	3e930ba	2002-09-24 21:08:37 +0000	[diff] [blame]	511	left in the string}. For example, the string literal
				512	\code{ur"\e{}u0062\e n"} consists of three Unicode characters: `LATIN
				513	SMALL LETTER B', `REVERSE SOLIDUS', and `LATIN SMALL LETTER N'.
				514	Backslashes can be escaped with a preceding backslash; however, both
				515	remain in the string. As a result, \code{\e uXXXX} escape sequences
				516	are only recognized when the \character{u} is preceded by an odd number of backslashes.
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	517
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	518	\subsection{String literal concatenation\label{string-catenation}}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	519
				520	Multiple adjacent string literals (delimited by whitespace), possibly
				521	using different quoting conventions, are allowed, and their meaning is
				522	the same as their concatenation. Thus, \code{"hello" 'world'} is
				523	equivalent to \code{"helloworld"}. This feature can be used to reduce
				524	the number of backslashes needed, to split long strings conveniently
				525	across long lines, or even to add comments to parts of strings, for
				526	example:
				527
				528	\begin{verbatim}
				529	re.compile("[A-Za-z_]" # letter or underscore
				530	"[A-Za-z0-9_]*" # letter, digit or underscore
				531	)
				532	\end{verbatim}
				533
				534	Note that this feature is defined at the syntactical level, but
				535	implemented at compile time. The `+' operator must be used to
				536	concatenate string expressions at run time. Also note that literal
				537	concatenation can use different quoting styles for each component
				538	(even mixing raw strings and triple quoted strings).
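
A minimal sketch of these points, runnable under a current Python~3
interpreter (the variable names are chosen for illustration):

```python
import re

# Adjacent literals with different quoting styles form one string.
greeting = "hello" 'world'

# Raw, triple-quoted and plain literals can be mixed, with a comment
# attached to each piece.
pattern_text = (r"\d+"      # raw string: backslash, d, plus sign
                """\t"""    # triple-quoted string: one tab character
                'x')        # plain string

identifier = re.compile("[A-Za-z_]"       # letter or underscore
                        "[A-Za-z0-9_]*"   # letter, digit or underscore
                        )

# Literal concatenation does not apply to expressions; the + operator
# is needed to join string values at run time.
prefix = "hello"
joined_at_run_time = prefix + 'world'

print(greeting, repr(pattern_text), joined_at_run_time)
```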
				539
Fred Drake	2ed27d3	2000-11-17 19:05:12 +0000	[diff] [blame]	540
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	541	\subsection{Numeric literals\label{numbers}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	542
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	543	There are four types of numeric literals: plain integers, long
				544	integers, floating point numbers, and imaginary numbers. There are no
				545	complex literals (complex numbers can be formed by adding a real
				546	number and an imaginary number).
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	547	\index{number}
				548	\index{numeric literal}
				549	\index{integer literal}
				550	\index{plain integer literal}
				551	\index{long integer literal}
				552	\index{floating point literal}
				553	\index{hexadecimal literal}
				554	\index{octal literal}
				555	\index{decimal literal}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	556	\index{imaginary literal}
Fred Drake	ed9e453	2002-04-23 20:04:46 +0000	[diff] [blame]	557	\index{complex!literal}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	558
				559	Note that numeric literals do not include a sign; a phrase like
				560	\code{-1} is actually an expression composed of the unary operator
				561	`\code{-}' and the literal \code{1}.
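
One way to observe this is to tokenize the text \code{-1}. The sketch
below uses the standard \module{tokenize} module under a current
Python~3 interpreter, on the assumption that it mirrors the lexical
analyzer closely enough for this purpose; \code{token_strings} is a
hypothetical helper.

```python
import io
import tokenize


def token_strings(source_text):
    """Return (token name, token text) pairs, skipping blank tokens."""
    readline = io.StringIO(source_text).readline
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokenize.generate_tokens(readline)
            if tok.string.strip()]


# The sign and the digit arrive as two separate tokens.
tokens = token_strings("-1\n")
print(tokens)

# Because the sign is an operator, it binds less tightly than **.
negated_square = -2 ** 2
print(negated_square)
```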
				562
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	563
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	564	\subsection{Integer and long integer literals\label{integers}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	565
				566	Integer and long integer literals are described by the following
				567	lexical definitions:
				568
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	569	\begin{productionlist}
				570	\production{longinteger}
				571	{\token{integer} ("l" | "L")}
				572	\production{integer}
				573	{\token{decimalinteger} | \token{octinteger} | \token{hexinteger}}
				574	\production{decimalinteger}
				575	{\token{nonzerodigit} \token{digit}* | "0"}
				576	\production{octinteger}
				577	{"0" \token{octdigit}+}
				578	\production{hexinteger}
				579	{"0" ("x" | "X") \token{hexdigit}+}
				580	\production{nonzerodigit}
				581	{"1"..."9"}
				582	\production{octdigit}
				583	{"0"..."7"}
				584	\production{hexdigit}
				585	{\token{digit} | "a"..."f" | "A"..."F"}
				586	\end{productionlist}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	587
Raymond Hettinger	83dcf5a	2002-08-07 16:53:17 +0000	[diff] [blame]	588	Although both lower case \character{l} and upper case \character{L} are
				589	allowed as a suffix for long integers, it is strongly recommended to always
				590	use \character{L}, since the letter \character{l} looks too much like the
				591	digit \character{1}.
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	592
Guido van Rossum	6c9e130	2003-11-29 23:52:13 +0000	[diff] [blame]	593	Plain integer literals that are above the largest representable plain
				594	integer (e.g., 2147483647 when using 32-bit arithmetic) are accepted
				595	as if they were long integers instead.\footnote{In versions of Python
				596	prior to 2.4, octal and hexadecimal literals in the range just above
				597	the largest representable plain integer but no greater than the largest unsigned
				598	32-bit number (on a machine using 32-bit arithmetic), 4294967295, were
				599	taken as the negative plain integer obtained by subtracting 4294967296
				600	from their unsigned value.} There is no limit for long integer
				601	literals apart from what can be stored in available memory.
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	602
Raymond Hettinger	e701dcb	2003-01-19 13:08:18 +0000	[diff] [blame]	603	Some examples of plain integer literals (first row) and long integer
				604	literals (second and third rows):
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	605
				606	\begin{verbatim}
Guido van Rossum	6c9e130	2003-11-29 23:52:13 +0000	[diff] [blame]	607	7 2147483647 0177
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	608	3L 79228162514264337593543950336L 0377L 0x100000000L
Guido van Rossum	6c9e130	2003-11-29 23:52:13 +0000	[diff] [blame]	609	79228162514264337593543950336 0xdeadbeef
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	610	\end{verbatim}
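
The arithmetic behind these examples and the footnote can be sketched as
follows. The code targets a current Python~3 interpreter, which no longer
accepts the \code{0177} or \code{3L} spellings, so the literals are passed
to \function{int()} as text; \code{pre_2_4_value} is a hypothetical helper
that only models the rule stated in the footnote for 32-bit arithmetic.

```python
WORD_MODULUS = 2 ** 32         # 4294967296
LARGEST_PLAIN = 2 ** 31 - 1    # 2147483647, largest 32-bit plain integer


def pre_2_4_value(unsigned_value):
    """Model the pre-2.4 reading of an octal or hex literal (32-bit)."""
    if LARGEST_PLAIN < unsigned_value < WORD_MODULUS:
        return unsigned_value - WORD_MODULUS
    return unsigned_value


octal_value = int("0177", 8)          # value of the literal 0177
hex_value = int("0xdeadbeef", 16)     # value of the literal 0xdeadbeef
big_value = 79228162514264337593543950336

print(octal_value, hex_value, pre_2_4_value(hex_value))
print(big_value == 2 ** 96)
```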
				611
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	612
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	613	\subsection{Floating point literals\label{floating}}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	614
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	615	Floating point literals are described by the following lexical
				616	definitions:
				617
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	618	\begin{productionlist}
				619	\production{floatnumber}
				620	{\token{pointfloat} | \token{exponentfloat}}
				621	\production{pointfloat}
				622	{[\token{intpart}] \token{fraction} | \token{intpart} "."}
				623	\production{exponentfloat}
Tim Peters	d507dab	2001-08-30 20:51:59 +0000	[diff] [blame]	624	{(\token{intpart} | \token{pointfloat})
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	625	\token{exponent}}
				626	\production{intpart}
Tim Peters	d507dab	2001-08-30 20:51:59 +0000	[diff] [blame]	627	{\token{digit}+}
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	628	\production{fraction}
				629	{"." \token{digit}+}
				630	\production{exponent}
				631	{("e" | "E") ["+" | "-"] \token{digit}+}
				632	\end{productionlist}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	633
Tim Peters	d507dab	2001-08-30 20:51:59 +0000	[diff] [blame]	634	Note that the integer and exponent parts of floating point numbers
				635	can look like octal integers, but are interpreted using radix 10. For
				636	example, \samp{077e010} is legal, and denotes the same number
				637	as \samp{77e10}.
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	638	The allowed range of floating point literals is
				639	implementation-dependent.
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	640	Some examples of floating point literals:
				641
				642	\begin{verbatim}
Tim Peters	d507dab	2001-08-30 20:51:59 +0000	[diff] [blame]	643	3.14 10. .001 1e100 3.14e-10 0e0
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	644	\end{verbatim}
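
The radix-10 rule and the example literals can be checked with a short
sketch, assuming a current Python~3 interpreter, where the same floating
point syntax is accepted (the variable names are illustrative):

```python
# Leading zeros in the integer and exponent parts do not select octal.
leading_zeros = 077e010
plain = 77e10

# The example literals listed above are all floating point numbers.
examples = [3.14, 10., .001, 1e100, 3.14e-10, 0e0]

print(leading_zeros == plain)
print(all(isinstance(value, float) for value in examples))
```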
				645
				646	Note that numeric literals do not include a sign; a phrase like
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	647	\code{-1} is actually an expression composed of the operator
				648	\code{-} and the literal \code{1}.
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	649
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	650
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	651	\subsection{Imaginary literals\label{imaginary}}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	652
				653	Imaginary literals are described by the following lexical definitions:
				654
Fred Drake	cb4638a	2001-07-06 22:49:53 +0000	[diff] [blame]	655	\begin{productionlist}
				656	\production{imagnumber}{(\token{floatnumber} | \token{intpart}) ("j" | "J")}
				657	\end{productionlist}
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	658
Fred Drake	e15956b	2000-04-03 04:51:13 +0000	[diff] [blame]	659	An imaginary literal yields a complex number with a real part of
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	660	0.0. Complex numbers are represented as a pair of floating point
				661	numbers and have the same restrictions on their range. To create a
				662	complex number with a nonzero real part, add a floating point number
Guido van Rossum	7c0240f	1998-07-24 15:36:43 +0000	[diff] [blame]	663	to it, e.g., \code{(3+4j)}. Some examples of imaginary literals:
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	664
				665	\begin{verbatim}
Guido van Rossum	7c0240f	1998-07-24 15:36:43 +0000	[diff] [blame]	666	3.14j 10.j 10j .001j 1e100j 3.14e-10j
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	667	\end{verbatim}
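
A brief sketch of how imaginary literals combine into complex numbers,
runnable under a current Python~3 interpreter (names are illustrative):

```python
imaginary = 4j       # an imaginary literal: real part 0.0
z = 3 + 4j           # an expression: the integer 3 plus the literal 4j

print(imaginary.real, imaginary.imag)
print(z.real, z.imag, abs(z))

# Upper and lower case suffixes, and integer or float spellings of the
# same magnitude, denote the same value.
same_value = (10j == 10.J)
print(same_value)
```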
				668
				669
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	670	\section{Operators\label{operators}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	671
				672	The following tokens are operators:
				673	\index{operators}
				674
				675	\begin{verbatim}
Fred Drake	a7d608d	2001-08-08 05:37:21 +0000	[diff] [blame]	676	+ - * ** / // %
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	677	<< >> & | ^ ~
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	678	< > <= >= == != <>
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	679	\end{verbatim}
				680
Fred Drake	5c07d9b	1998-05-14 19:37:06 +0000	[diff] [blame]	681	The comparison operators \code{<>} and \code{!=} are alternate
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	682	spellings of the same operator. \code{!=} is the preferred spelling;
				683	\code{<>} is obsolescent.
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	684
Fred Drake	f5eae66	2001-06-23 05:26:52 +0000	[diff] [blame]	685
Fred Drake	61c7728	1998-07-28 19:34:22 +0000	[diff] [blame]	686	\section{Delimiters\label{delimiters}}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	687
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	688	The following tokens serve as delimiters in the grammar:
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	689	\index{delimiters}
				690
				691	\begin{verbatim}
Fred Drake	6bd8e84	2004-08-05 21:11:27 +0000	[diff] [blame]	692	( ) [ ] { } @
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	693	, : . ` = ;
Fred Drake	a7d608d	2001-08-08 05:37:21 +0000	[diff] [blame]	694	+= -= *= /= //= %=
				695	&= |= ^= >>= <<= **=
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	696	\end{verbatim}
				697
				698	The period can also occur in floating-point and imaginary literals. A
Fred Drake	e15956b	2000-04-03 04:51:13 +0000	[diff] [blame]	699	sequence of three periods has a special meaning as an ellipsis in slices.
Thomas Wouters	12bba85	2000-08-24 20:06:04 +0000	[diff] [blame]	700	The second half of the list, the augmented assignment operators, serve
				701	lexically as delimiters, but also perform an operation.
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	702
Fred Drake	c37b65e	2001-11-28 07:26:15 +0000	[diff] [blame]	703	The following printing \ASCII{} characters have special meaning as part
Guido van Rossum	60f2f0c	1998-06-15 18:00:50 +0000	[diff] [blame]	704	of other tokens or are otherwise significant to the lexical analyzer:
				705
				706	\begin{verbatim}
				707	' " # \
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	708	\end{verbatim}
				709
				710	The following printing \ASCII{} characters are not used in Python. Their
				711	occurrence outside string literals and comments is an unconditional
				712	error:
Fred Drake	c37b65e	2001-11-28 07:26:15 +0000	[diff] [blame]	713	\index{ASCII@\ASCII}
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	714
				715	\begin{verbatim}
Fred Drake	6bd8e84	2004-08-05 21:11:27 +0000	[diff] [blame]	716	$ ?
Fred Drake	f666917	1998-05-06 19:52:49 +0000	[diff] [blame]	717	\end{verbatim}
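
The following sketch checks this rule with the built-in
\function{compile()} function. It assumes a current Python~3 interpreter,
in which these two characters are likewise unused; \code{compiles} is a
hypothetical helper.

```python
def compiles(source_text):
    """Return True if source_text is accepted as Python source."""
    try:
        compile(source_text, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Outside string literals and comments, the characters are errors.
outside = [compiles("a $ b\n"), compiles("a ? b\n")]

# Inside a string literal or a comment, they are accepted.
inside = [compiles("s = 'a $ b ?'\n"), compiles("x = 1  # $ ?\n")]

print(outside, inside)
```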