\chapter{Introduction}

This reference manual describes the Python programming language.
It is not intended as a tutorial.

While I am trying to be as precise as possible, I chose to use English
rather than formal specifications for everything except syntax and
lexical analysis. This should make the document more understandable
to the average reader, but will leave room for ambiguities.
Consequently, if you were coming from Mars and tried to re-implement
Python from this document alone, you might have to guess things and in
fact you would probably end up implementing quite a different language.
On the other hand, if you are using
Python and wonder what the precise rules about a particular area of
the language are, you should definitely be able to find them here.

It is dangerous to add too many implementation details to a language
reference document: the implementation may change, and other
implementations of the same language may work differently. On the
other hand, there is currently only one Python implementation, and
its particular quirks are sometimes worth mentioning, especially
where the implementation imposes additional limitations. Therefore,
you'll find short ``implementation notes'' sprinkled throughout the
text.

Every Python implementation comes with a number of built-in and
standard modules. These are not documented here, but in the separate
{\em Python Library Reference} document. A few built-in modules are
mentioned when they interact in a significant way with the language
definition.

\section{Notation}

The descriptions of lexical analysis and syntax use a modified BNF
grammar notation. This uses the following style of definition:
\index{BNF}
\index{grammar}
\index{syntax}
\index{notation}

\begin{verbatim}
name:           lc_letter (lc_letter | "_")*
lc_letter:      "a"..."z"
\end{verbatim}

The first line says that a \verb@name@ is an \verb@lc_letter@ followed by
a sequence of zero or more \verb@lc_letter@s and underscores. An
\verb@lc_letter@ in turn is any of the single characters `a' through `z'.
(This rule is actually adhered to for the names defined in lexical and
grammar rules in this document.)

Each rule begins with a name (which is the name defined by the rule)
and a colon. A vertical bar (\verb@|@) is used to separate
alternatives; it is the least binding operator in this notation. A
star (\verb@*@) means zero or more repetitions of the preceding item;
likewise, a plus (\verb@+@) means one or more repetitions, and a
phrase enclosed in square brackets (\verb@[ ]@) means zero or one
occurrences (in other words, the enclosed phrase is optional). The
\verb@*@ and \verb@+@ operators bind as tightly as possible;
parentheses are used for grouping. Literal strings are enclosed in
quotes. White space is only meaningful to separate tokens.
Rules are normally contained on a single line; rules with many
alternatives may instead be split over several lines, with each line
after the first beginning with a vertical bar.
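
As a small illustration of these conventions, consider the following
sketch. The rule names \verb@greeting@ and \verb@target@ are
hypothetical and are not taken from the Python grammar; they only
serve to show how the operators combine:

\begin{verbatim}
greeting:       "hello" [","] target+
target:         "world"
              | "moon" ("base" | "rock")*
              | "mars"
\end{verbatim}

Here the comma is optional, and at least one \verb@target@ must
follow. Because the vertical bar is the least binding operator, each of
the three lines of \verb@target@ is one complete alternative. The star
applies to the whole parenthesized group; without the parentheses,
\verb@"moon" "base" | "rock"*@ would instead be read as a choice
between \verb@"moon" "base"@ and \verb@"rock"*@.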

In lexical definitions (such as the example above), two more conventions
are used: two literal characters separated by three dots mean a choice
of any single character in the given (inclusive) range of \ASCII{}
characters. A phrase between angle brackets (\verb@<...>@) gives an
informal description of the symbol defined; e.g., this could be used
to describe the notion of `control character' if needed.
\index{lexical definitions}
\index{ASCII}
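
The two lexical conventions might be combined as in the following
sketch. All three rule names are made up for this illustration and
need not match any definition given later in this document:

\begin{verbatim}
digit:          "0"..."9"
hex_digit:      digit | "a"..."f" | "A"..."F"
control_char:   <any ASCII control character>
\end{verbatim}

Each range stands for exactly one character, so \verb@hex_digit@
matches a single character from any of the three ranges. The last rule
does not enumerate its characters at all; it relies on the informal
description between the angle brackets.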

Even though the notation used is almost the same, there is a big
difference between the meaning of lexical and syntactic definitions:
a lexical definition operates on the individual characters of the
input source, while a syntax definition operates on the stream of
tokens generated by the lexical analysis. All uses of BNF in the next
chapter (``Lexical Analysis'') are lexical definitions; uses in
subsequent chapters are syntactic definitions.
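
To make the distinction concrete, here is a hypothetical pair of
definitions (the names \verb@digit@, \verb@number@ and \verb@sum@ are
chosen for this sketch only). Read as lexical definitions, the first
two rules describe characters:

\begin{verbatim}
digit:          "0"..."9"
number:         digit+
\end{verbatim}

A \verb@number@ is thus one or more adjacent digit characters. Read as
a syntactic definition, the following rule describes tokens instead:

\begin{verbatim}
sum:            number ("+" number)*
\end{verbatim}

In this rule \verb@number@ and \verb@"+"@ each stand for a whole token
that the lexical analysis has already produced. Assuming a lexical
analysis that discards the blanks between tokens, the inputs
\verb@4+2@ and \verb@4 + 2@ would present the same three tokens to
this rule.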