| % XXX Label can't be _ast? |
| % XXX Where should this section/chapter go? |
| \chapter{Abstract Syntax Trees\label{ast}} |
| |
| \sectionauthor{Martin v. L\"owis}{martin@v.loewis.de} |
| |
| \versionadded{2.5} |
| |
| The \code{_ast} module helps Python applications to process |
| trees of the Python abstract syntax grammar. The Python compiler |
| currently provides read-only access to such trees, meaning that |
| applications can only create a tree for a given piece of Python |
| source code; generating byte code from a (potentially modified) |
| tree is not supported. The abstract syntax itself might change with |
| each Python release; this module helps to find out programmatically |
| what the current grammar looks like. |
| |
| An abstract syntax tree can be generated by passing \code{_ast.PyCF_ONLY_AST} |
| as a flag to the \function{compile} built-in function. The result will be a tree |
| of objects whose classes all inherit from \code{_ast.AST}. |
| |
| The actual classes are derived from the \code{Parser/Python.asdl} file, |
| which is reproduced below. There is one class defined for each left-hand |
| side symbol in the abstract grammar (for example, \code{_ast.stmt} or \code{_ast.expr}). |
| In addition, there is one class defined for each constructor on the |
| right-hand side; these classes inherit from the classes for the left-hand |
| side trees. For example, \code{_ast.BinOp} inherits from \code{_ast.expr}. |
| For production rules with alternatives (also known as ``sums''), the left-hand side |
| class is abstract: only instances of specific constructor nodes are ever |
| created. |
| |
| Each concrete class has an attribute \code{_fields} which gives the |
| names of all child nodes. |
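
For instance, \code{_fields} can drive a generic inspection of a node without hard-coding its attribute names. The loop below is only an illustrative sketch, not a complete tree walker.

```python
import _ast

# _fields is available on the class, without creating an instance.
binop_fields = tuple(_ast.BinOp._fields)
print(binop_fields)

tree = compile("a + b", "<example>", "eval", _ast.PyCF_ONLY_AST)
node = tree.body

# Visit every child of the node by name and report the class of each child.
children = {}
for name in node._fields:
    children[name] = type(getattr(node, name)).__name__
    print("%s: %s" % (name, children[name]))
```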
| |
| Each instance of a concrete class has one attribute for each child node, |
| of the type as defined in the grammar. For example, \code{_ast.BinOp} |
| instances have an attribute \code{left} of type \code{_ast.expr}. |
| Instances of \code{_ast.expr} and \code{_ast.stmt} subclasses also |
| have \code{lineno} and \code{col_offset} attributes. The \code{lineno} |
| is the line number in the source text (1-indexed, so the first line is |
| line 1), and the \code{col_offset} is the UTF-8 byte offset of the first |
| token that generated the node. The UTF-8 offset is recorded because the |
| parser uses UTF-8 internally. |
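
The difference between a character offset and a UTF-8 byte offset only shows up when non-ASCII text precedes a node on the same line. The sketch below assumes a current interpreter in which \function{compile} is given a text string. The source line is made up for illustration and contains one character (U+00E9) that occupies two bytes in UTF-8.

```python
import _ast

# One non-ASCII character (U+00E9) appears before the name ``t``.
source = u's = "\u00e9" + t\n'
tree = compile(source, "<example>", "exec", _ast.PyCF_ONLY_AST)

assign = tree.body[0]        # the assignment statement
right = assign.value.right   # the node for the name ``t``

# Offset of ``t`` counted in characters and in UTF-8 bytes.
char_offset = source.index("t")
byte_offset = len(source[:char_offset].encode("utf-8"))

print("statement: line %d, column %d" % (assign.lineno, assign.col_offset))
print("name t: col_offset=%d, characters=%d, bytes=%d"
      % (right.col_offset, char_offset, byte_offset))
```

A tool that maps nodes back to positions in decoded text therefore has to convert \code{col_offset} from bytes to characters first.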
| |
| If an attribute is marked as optional in the grammar (using a |
| question mark), its value might be \code{None}. If an attribute |
| can have zero-or-more values (marked with an asterisk), the |
| values are represented as a Python list. |
| |
| \subsection{Abstract Grammar} |
| |
| The module defines a string constant \code{__version__}, which |
| is the decimal Subversion revision number of the file shown below. |
| |
| The abstract grammar is currently defined as follows: |
| |
| \verbatiminput{../../Parser/Python.asdl} |
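
Because the grammar may differ between releases, an application can inspect the running interpreter rather than rely on the listing above. The helper \code{node_classes} below is a hypothetical name introduced for this sketch. The \function{getattr} default is only a guard for interpreters that do not provide \code{__version__}.

```python
import _ast


def node_classes():
    """Return (name, fields) pairs for every node class exposed by _ast."""
    found = []
    for name in sorted(dir(_ast)):
        obj = getattr(_ast, name)
        if isinstance(obj, type) and issubclass(obj, _ast.AST):
            # Guard against classes that do not define any fields.
            fields = getattr(obj, "_fields", None) or ()
            found.append((name, tuple(fields)))
    return found


# Grammar revision, or None if this interpreter does not expose it.
version = getattr(_ast, "__version__", None)
print(version)

for name, fields in node_classes():
    print("%s%r" % (name, fields))
```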