% XXX Label can't be _ast?
% XXX Where should this section/chapter go?
\chapter{Abstract Syntax Trees\label{ast}}

\sectionauthor{Martin v. L\"owis}{martin@v.loewis.de}

\versionadded{2.5}

The \code{_ast} module helps Python applications process
trees of the Python abstract syntax grammar. The Python compiler
currently provides read-only access to such trees: applications
can only create a tree for a given piece of Python source code;
generating byte code from a (potentially modified) tree is not
supported. The abstract syntax itself might change with each
Python release; this module helps to find out programmatically
what the current grammar looks like.

An abstract syntax tree can be generated by passing \code{_ast.PyCF_ONLY_AST}
as a flag to the built-in \function{compile} function. The result is a tree
of objects whose classes all inherit from \code{_ast.AST}.

The actual classes are derived from the \code{Parser/Python.asdl} file,
which is reproduced below. There is one class defined for each left-hand
side symbol in the abstract grammar (for example, \code{_ast.stmt} or
\code{_ast.expr}). In addition, there is one class defined for each
constructor on the right-hand side; these classes inherit from the classes
for the left-hand side trees. For example, \code{_ast.BinOp} inherits from
\code{_ast.expr}. For production rules with alternatives (also known as
``sums''), the left-hand side class is abstract: only instances of specific
constructor nodes are ever created.

Each concrete class has an attribute \code{_fields} which gives the
names of all child nodes.

Each instance of a concrete class has one attribute for each child node,
of the type defined in the grammar. For example, \code{_ast.BinOp}
instances have an attribute \code{left} of type \code{_ast.expr}.
Instances of \code{_ast.expr} and \code{_ast.stmt} subclasses also
have \code{lineno} and \code{col_offset} attributes. The \code{lineno}
is the line number in the source text (1-indexed, so the first line is
line 1), and the \code{col_offset} is the UTF-8 byte offset of the first
token that generated the node. The UTF-8 offset is recorded because the
parser uses UTF-8 internally.
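
The position attributes can be inspected as follows. This is a sketch
under the assumption that the source is passed as a text string; the
two-line program and the second snippet containing a non-ASCII character
are made up to show the difference between byte and character offsets:

```python
import _ast

source = "x = 1\ny = x + 2\n"
tree = compile(source, "<example>", "exec", _ast.PyCF_ONLY_AST)

second = tree.body[1]          # the assignment on the second line
print(second.lineno, second.col_offset)

value = second.value           # the BinOp node for `x + 2`
print(value.lineno, value.col_offset)

# col_offset counts UTF-8 bytes, so a non-ASCII character earlier on the
# line shifts the offset by more than one.
prefix = "s = 'é'; "
tree2 = compile(prefix + "t = s\n", "<example>", "exec", _ast.PyCF_ONLY_AST)
later = tree2.body[1]
print(later.col_offset, len(prefix), len(prefix.encode("utf-8")))
```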

If an attribute is marked as optional in the grammar (using a
question mark), its value might be \code{None}. If an attribute
can have zero or more values (marked with an asterisk), the
values are represented as a Python list.
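
Both cases can be seen in a small function definition. The sketch assumes
the grammar entries \code{Return(expr? value)} and a \code{stmt*} body on
the module and function nodes; the function \code{f} is purely
illustrative:

```python
import _ast

source = "def f():\n    return\n"
tree = compile(source, "<example>", "exec", _ast.PyCF_ONLY_AST)

# `stmt* body` is represented as a Python list.
print(isinstance(tree.body, list))

func = tree.body[0]
ret = func.body[0]

# `expr? value` is None because the return statement has no expression.
print(ret.value is None)
```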

\section{Abstract Grammar}

The module defines a string constant \code{__version__} which
is the decimal Subversion revision number of the file shown below.

The abstract grammar is currently defined as follows:

\verbatiminput{../../Parser/Python.asdl}