% XXX Label can't be _ast?
% XXX Where should this section/chapter go?
\chapter{Abstract Syntax Trees\label{ast}}

\sectionauthor{Martin v. L\"owis}{martin@v.loewis.de}

\versionadded{2.5}

The \code{_ast} module helps Python applications process
trees of the Python abstract syntax grammar. The Python compiler
currently provides read-only access to such trees, meaning that
applications can only create a tree for a given piece of Python
source code; generating byte code from a (potentially modified)
tree is not supported. The abstract syntax itself might change with
each Python release; this module helps to find out programmatically
what the current grammar looks like.

An abstract syntax tree can be generated by passing \code{_ast.PyCF_ONLY_AST}
as a flag to the built-in \function{compile} function. The result will be a tree
of objects whose classes all inherit from \code{_ast.AST}.

The actual classes are derived from the \code{Parser/Python.asdl} file,
which is reproduced below. There is one class defined for each left-hand
side symbol in the abstract grammar (for example, \code{_ast.stmt} or \code{_ast.expr}).
In addition, there is one class defined for each constructor on the
right-hand side; these classes inherit from the classes for the left-hand
side trees. For example, \code{_ast.BinOp} inherits from \code{_ast.expr}.
For production rules with alternatives (also known as ``sums''), the left-hand side
class is abstract: only instances of specific constructor nodes are ever
created.

Each concrete class has an attribute \code{_fields} which gives the
names of all child nodes.
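
For instance, the child names of \code{_ast.BinOp} can be read from the
class without building a tree. The names match the \code{BinOp}
constructor in the grammar.

```python
import _ast

# _fields is a class attribute, so no instance is needed.
fields = list(_ast.BinOp._fields)

for name in fields:
    print(name)
```

This prints \code{left}, \code{op} and \code{right}, one per line.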

Each instance of a concrete class has one attribute for each child node,
of the type as defined in the grammar. For example, \code{_ast.BinOp}
instances have an attribute \code{left} of type \code{_ast.expr}.
Instances of \code{_ast.expr} and \code{_ast.stmt} subclasses also
have \code{lineno} and \code{col_offset} attributes. The \code{lineno}
is the line number of the source text (1-indexed, so the first line is
line 1) and the \code{col_offset} is the UTF-8 byte offset of the first
token that generated the node. The UTF-8 offset is recorded because the
parser uses UTF-8 internally.

If these attributes are marked as optional in the grammar (using a
question mark), the value might be \code{None}. If the attributes
can have zero or more values (marked with an asterisk), the
values are represented as Python lists.
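
Both cases can be seen in a small function definition. This is a sketch
with a made-up function \code{f}; in the grammar, the value of
\code{Return} is optional and the bodies of \code{Module} and
\code{FunctionDef} are sequences of statements.

```python
import _ast

source = "def f():\n    return\n"
tree = compile(source, "<example>", "exec", _ast.PyCF_ONLY_AST)

func = tree.body[0]   # the FunctionDef node
ret = func.body[0]    # the bare "return" statement

# Optional attribute: a bare return has no value expression.
print(ret.value is None)

# Zero-or-more attribute: statement sequences are Python lists.
print(isinstance(func.body, list))

# An empty program has a Module node with an empty list as its body.
empty = compile("", "<example>", "exec", _ast.PyCF_ONLY_AST)
print(len(empty.body))
```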

\section{Abstract Grammar}

The module defines a string constant \code{__version__} which
is the decimal Subversion revision number of the file shown below.

The abstract grammar is currently defined as follows:

\verbatiminput{../../Parser/Python.asdl}