| \chapter{Introduction\label{introduction}} |
| |
| This reference manual describes the Python programming language. |
| It is not intended as a tutorial. |
| |
| While I am trying to be as precise as possible, I chose to use English |
| rather than formal specifications for everything except syntax and |
| lexical analysis. This should make the document more understandable |
| to the average reader, but will leave room for ambiguities. |
| Consequently, if you were coming from Mars and tried to re-implement |
| Python from this document alone, you might have to guess things and in |
| fact you would probably end up implementing quite a different language. |
| On the other hand, if you are using |
| Python and wonder what the precise rules about a particular area of |
| the language are, you should definitely be able to find them here. |
| If you would like to see a more formal definition of the language, |
| maybe you could volunteer your time --- or invent a cloning machine |
| :-). |
| |
| It is dangerous to add too many implementation details to a language |
| reference document --- the implementation may change, and other |
| implementations of the same language may work differently. On the |
| other hand, there is currently only one Python implementation in |
| widespread use (although alternate implementations exist), and |
| its particular quirks are sometimes worth being mentioned, especially |
| where the implementation imposes additional limitations. Therefore, |
| you'll find short ``implementation notes'' sprinkled throughout the |
| text. |
| |
| Every Python implementation comes with a number of built-in and |
| standard modules. These are not documented here, but in the separate |
| \citetitle[../lib/lib.html]{Python Library Reference} document. A few |
| built-in modules are mentioned when they interact in a significant way |
| with the language definition. |
| |
| |
| \section{Alternate Implementations\label{implementations}} |
| |
| Though there is one Python implementation which is by far the most |
| popular, there are some alternate implementations which are of |
| particular interest to different audiences. |
| |
| Known implementations include: |
| |
| \begin{itemize} |
| \item[CPython] |
| This is the original and most-maintained implementation of Python, |
| written in C. New language features generally appear here first. |
| |
| \item[Jython] |
| Python implemented in Java. This implementation can be used as a |
| scripting language for Java applications, or can be used to create |
| applications using the Java class libraries. It is also often used to |
| create tests for Java libraries. More information can be found at |
| \ulink{the Jython website}{http://www.jython.org/}. |
| |
| \item[Python for .NET] |
| This implementation actually uses the CPython implementation, but is a |
| managed .NET application and makes .NET libraries available. This was |
| created by Brian Lloyd. For more information, see the \ulink{Python |
| for .NET home page}{http://www.zope.org/Members/Brian/PythonNet}. |
| |
| \item[IronPython] |
| An alternate Python for\ .NET. Unlike Python.NET, this is a complete |
| Python implementation that generates IL, and compiles Python code |
| directly to\ .NET assemblies. It was created by Jim Hugunin, the |
| original creator of Jython. For more information, see \ulink{the |
| IronPython website}{http://workspaces.gotdotnet.com/ironpython}. |
| |
| \item[PyPy] |
| An implementation of Python written in Python; even the bytecode |
| interpreter is written in Python. This is executed using CPython as |
| the underlying interpreter. One of the goals of the project is to |
| encourage experimentation with the language itself by making it easier |
| to modify the interpreter (since it is written in Python). Additional |
| information is available on \ulink{the PyPy project's home |
| page}{http://codespeak.net/pypy/}. |
| \end{itemize} |
| |
| Each of these implementations varies in some way from the language as |
| documented in this manual, or introduces specific information beyond |
| what's covered in the standard Python documentation. Please refer to |
| the implementation-specific documentation to determine what else you |
| need to know about the specific implementation you're using. |
| |
| |
| \section{Notation\label{notation}} |
| |
| The descriptions of lexical analysis and syntax use a modified BNF |
| grammar notation. This uses the following style of definition: |
| \index{BNF} |
| \index{grammar} |
| \index{syntax} |
| \index{notation} |
| |
| \begin{productionlist}[*] |
| \production{name}{\token{lc_letter} (\token{lc_letter} | "_")*} |
| \production{lc_letter}{"a"..."z"} |
| \end{productionlist} |
| |
| The first line says that a \code{name} is an \code{lc_letter} followed by |
| a sequence of zero or more \code{lc_letter}s and underscores. An |
| \code{lc_letter} in turn is any of the single characters \character{a} |
| through \character{z}. (This rule is actually adhered to for the |
| names defined in lexical and grammar rules in this document.) |
| |
| Each rule begins with a name (which is the name defined by the rule) |
| and \code{::=}. A vertical bar (\code{|}) is used to separate |
| alternatives; it is the least binding operator in this notation. A |
| star (\code{*}) means zero or more repetitions of the preceding item; |
| likewise, a plus (\code{+}) means one or more repetitions, and a |
| phrase enclosed in square brackets (\code{[ ]}) means zero or one |
| occurrences (in other words, the enclosed phrase is optional). The |
| \code{*} and \code{+} operators bind as tightly as possible; |
| parentheses are used for grouping. Literal strings are enclosed in |
| quotes. White space is only meaningful to separate tokens. |
| Rules are normally contained on a single line; rules with many |
| alternatives may be formatted alternatively with each line after the |
| first beginning with a vertical bar. |
| |
| In lexical definitions (as the example above), two more conventions |
| are used: Two literal characters separated by three dots mean a choice |
| of any single character in the given (inclusive) range of \ASCII{} |
| characters. A phrase between angular brackets (\code{<...>}) gives an |
| informal description of the symbol defined; e.g., this could be used |
| to describe the notion of `control character' if needed. |
| \index{lexical definitions} |
| \index{ASCII@\ASCII} |
| |
| Even though the notation used is almost the same, there is a big |
| difference between the meaning of lexical and syntactic definitions: |
| a lexical definition operates on the individual characters of the |
| input source, while a syntax definition operates on the stream of |
| tokens generated by the lexical analysis. All uses of BNF in the next |
| chapter (``Lexical Analysis'') are lexical definitions; uses in |
| subsequent chapters are syntactic definitions. |