| :mod:`parser` --- Access Python parse trees | 
 | =========================================== | 
 |  | 
 | .. module:: parser | 
 |    :synopsis: Access parse trees for Python source code. | 
 | .. moduleauthor:: Fred L. Drake, Jr. <fdrake@acm.org> | 
 | .. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> | 
 |  | 
 |  | 
 | .. Copyright 1995 Virginia Polytechnic Institute and State University and Fred | 
 |    L. Drake, Jr.  This copyright notice must be distributed on all copies, but | 
 |    this document otherwise may be distributed as part of the Python | 
 |    distribution.  No fee may be charged for this document in any representation, | 
 |    either on paper or electronically.  This restriction does not affect other | 
 |    elements in a distributed package in any way. | 
 |  | 
 | .. index:: single: parsing; Python source code | 
 |  | 
 | The :mod:`parser` module provides an interface to Python's internal parser and | 
 | byte-code compiler.  The primary purpose for this interface is to allow Python | 
 | code to edit the parse tree of a Python expression and create executable code | 
 | from this.  This is better than trying to parse and modify an arbitrary Python | 
 | code fragment as a string because parsing is performed in a manner identical to | 
 | the code forming the application.  It is also faster. | 
 |  | 
 | .. note:: | 
 |  | 
 |    From Python 2.5 onward, it's much more convenient to cut in at the Abstract | 
 |    Syntax Tree (AST) generation and compilation stage, using the :mod:`ast` | 
 |    module. | 
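
As a rough illustration of the AST route mentioned in the note, the sketch below parses an expression, edits the tree, and compiles the result.  The ``AddOne`` transformer is a hypothetical helper invented for this example, and the code assumes Python 3.8 or later, where numeric literals appear as ``ast.Constant`` nodes.

```python
import ast


class AddOne(ast.NodeTransformer):
    """Hypothetical transformer: replace every integer literal n with n + 1."""

    def visit_Constant(self, node):
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            return ast.copy_location(ast.Constant(value=node.value + 1), node)
        return node


# Parse an expression, edit the tree, then compile and evaluate it.
tree = ast.parse('a + 5', filename='file.py', mode='eval')
tree = ast.fix_missing_locations(AddOne().visit(tree))
code = compile(tree, 'file.py', 'eval')

# The edited expression is equivalent to "a + 6".
result = eval(code, {'a': 5})
```

The tree is never turned back into source text at any point; the built-in ``compile`` accepts the edited AST object directly.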
 |  | 
 | A few points about this module are important for making use of the data | 
 | structures it creates.  This is not a tutorial on editing the parse trees for | 
 | Python code, but some examples of using the :mod:`parser` module are | 
 | presented. | 
 |  | 
 | Most importantly, a good understanding of the Python grammar processed by the | 
 | internal parser is required.  For full information on the language syntax, refer | 
 | to :ref:`reference-index`.  The parser | 
 | itself is created from a grammar specification defined in the file | 
 | :file:`Grammar/Grammar` in the standard Python distribution.  The parse trees | 
 | stored in the ST objects created by this module are the actual output from the | 
 | internal parser when created by the :func:`expr` or :func:`suite` functions, | 
 | described below.  The ST objects created by :func:`sequence2st` faithfully | 
 | simulate those structures.  Be aware that the values of the sequences which are | 
 | considered "correct" will vary from one version of Python to another as the | 
 | formal grammar for the language is revised.  However, transporting code from one | 
 | Python version to another as source text will always allow correct parse trees | 
 | to be created in the target version, with the only restriction being that | 
 | migrating to an older version of the interpreter will not support more recent | 
 | language constructs.  The parse trees are not typically compatible from one | 
 | version to another, whereas source code has always been forward-compatible. | 
 |  | 
 | Each element of the sequences returned by :func:`st2list` or :func:`st2tuple` | 
 | has a simple form.  Sequences representing non-terminal elements in the grammar | 
 | always have a length greater than one.  The first element is an integer which | 
 | identifies a production in the grammar.  These integers are given symbolic names | 
 | in the C header file :file:`Include/graminit.h` and the Python module | 
 | :mod:`symbol`.  Each additional element of the sequence represents a component | 
 | of the production as recognized in the input string: these are always sequences | 
 | which have the same form as the parent.  Note that keywords used to identify | 
 | the parent node type, such as the keyword :keyword:`if` in an | 
 | :const:`if_stmt`, are included in the node tree without any special | 
 | treatment.  For example, the :keyword:`if` keyword | 
 | is represented by the tuple ``(1, 'if')``, where ``1`` is the numeric value | 
 | associated with all :const:`NAME` tokens, including variable and function names | 
 | defined by the user.  In an alternate form returned when line number information | 
 | is requested, the same token might be represented as ``(1, 'if', 12)``, where | 
 | the ``12`` represents the line number at which the terminal symbol was found. | 
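
The point that keywords carry the generic :const:`NAME` token type can be checked without the :mod:`parser` module.  The sketch below uses the separate :mod:`tokenize` module, which is not the internal parser but shares the token numbering defined in :mod:`token`; the sample source string is an assumption made for illustration.

```python
import io
import keyword
import token
import tokenize

source = "if x:\n    pass\n"
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

first = tokens[0]
# The keyword is reported with the generic NAME token type ...
is_name = first.type == token.NAME
# ... and only the keyword module tells it apart from an identifier.
is_kw = keyword.iskeyword(first.string)
# (type, text, line) has the same shape as the three-element form above;
# first.start[0] is the line on which the token starts.
triple = (first.type, first.string, first.start[0])
```

Here ``triple`` has the same shape as the three-element terminal form, with the token type, the source text, and a line number.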
 |  | 
 | Terminal elements are represented in much the same way, but without any child | 
 | elements and with the addition of the source text which was identified.  The example | 
 | of the :keyword:`if` keyword above is representative.  The various types of | 
 | terminal symbols are defined in the C header file :file:`Include/token.h` and | 
 | the Python module :mod:`token`. | 
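
The nested-sequence layout described above can be traversed with a few lines of code.  The following is a minimal sketch, not part of the module: ``iter_terminals`` is a hypothetical helper, and the tree is built by hand with placeholder production numbers (``300`` and ``301``) rather than real values from :mod:`symbol`.

```python
import token


def iter_terminals(node):
    """Yield (token name, source text) for every terminal in a tuple tree."""
    if token.ISTERMINAL(node[0]):
        # Terminal: (token type, source text[, line number])
        yield token.tok_name[node[0]], node[1]
    else:
        # Non-terminal: node[0] is the production number, the rest are children.
        for child in node[1:]:
            yield from iter_terminals(child)


# Hand-built tree; 300 and 301 are placeholder production numbers.
tree = (300,
        (301, (token.NAME, 'a')),
        (token.PLUS, '+'),
        (301, (token.NUMBER, '5')))

terminals = list(iter_terminals(tree))
```

Because terminals and non-terminals share the same "integer first" shape, the only test needed is whether the leading integer falls in the terminal range.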
 |  | 
 | The ST objects are not required to support the functionality of this module, | 
 | but are provided for three purposes: to allow an application to amortize the | 
 | cost of processing complex parse trees, to provide a parse tree representation | 
 | which conserves memory space when compared to the Python list or tuple | 
 | representation, and to ease the creation of additional modules in C which | 
 | manipulate parse trees.  A simple "wrapper" class may be created in Python to | 
 | hide the use of ST objects. | 
 |  | 
 | The :mod:`parser` module defines functions for a few distinct purposes.  The | 
 | most important purposes are to create ST objects and to convert ST objects to | 
 | other representations such as parse trees and compiled code objects, but there | 
 | are also functions which serve to query the type of parse tree represented by an | 
 | ST object. | 
 |  | 
 |  | 
 | .. seealso:: | 
 |  | 
 |    Module :mod:`symbol` | 
 |       Useful constants representing internal nodes of the parse tree. | 
 |  | 
 |    Module :mod:`token` | 
 |       Useful constants representing leaf nodes of the parse tree and functions for | 
 |       testing node values. | 
 |  | 
 |  | 
 | .. _creating-sts: | 
 |  | 
 | Creating ST Objects | 
 | ------------------- | 
 |  | 
 | ST objects may be created from source code or from a parse tree. When creating | 
 | an ST object from source, different functions are used to create the ``'eval'`` | 
 | and ``'exec'`` forms. | 
 |  | 
 |  | 
 | .. function:: expr(source) | 
 |  | 
 |    The :func:`expr` function parses the parameter *source* as if it were an input | 
 |    to ``compile(source, 'file.py', 'eval')``.  If the parse succeeds, an ST object | 
 |    is created to hold the internal parse tree representation; otherwise an | 
 |    appropriate exception is raised. | 
 |  | 
 |  | 
 | .. function:: suite(source) | 
 |  | 
 |    The :func:`suite` function parses the parameter *source* as if it were an input | 
 |    to ``compile(source, 'file.py', 'exec')``.  If the parse succeeds, an ST object | 
 |    is created to hold the internal parse tree representation; otherwise an | 
 |    appropriate exception is raised. | 
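
The difference between the ``'eval'`` and ``'exec'`` forms that :func:`expr` and :func:`suite` mirror can be seen with the built-in ``compile`` alone.  The sketch below is only illustrative; ``accepted_modes`` is a hypothetical helper and does not use the :mod:`parser` module.

```python
def accepted_modes(source):
    """Return the compile() modes, among 'eval' and 'exec', that accept source."""
    modes = []
    for mode in ('eval', 'exec'):
        try:
            compile(source, 'file.py', mode)
        except SyntaxError:
            continue
        modes.append(mode)
    return modes


# An expression is valid in both forms; a statement only in the 'exec' form.
expression_modes = accepted_modes('a + 5')
statement_modes = accepted_modes('a = 5')
```

An assignment is a statement, so it is rejected in the ``'eval'`` form in the same way an expression-only parse would reject it.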
 |  | 
 |  | 
 | .. function:: sequence2st(sequence) | 
 |  | 
 |    This function accepts a parse tree represented as a sequence and builds an | 
 |    internal representation if possible.  If it can validate that the tree conforms | 
 |    to the Python grammar and all nodes are valid node types in the host version of | 
 |    Python, an ST object is created from the internal representation and returned | 
 |    to the caller.  If there is a problem creating the internal representation, or | 
 |    if the tree cannot be validated, a :exc:`ParserError` exception is raised.  An | 
 |    ST object created this way should not be assumed to compile correctly; normal | 
 |    exceptions raised by compilation may still be initiated when the ST object is | 
 |    passed to :func:`compilest`.  This may indicate problems not related to syntax | 
 |    (such as a :exc:`MemoryError` exception), but may also be due to constructs such | 
 |    as the result of parsing ``del f(0)``, which escapes the Python parser but is | 
 |    checked by the bytecode compiler. | 
 |  | 
 |    Sequences representing terminal tokens may be represented as either two-element | 
 |    lists of the form ``(1, 'name')`` or as three-element lists of the form ``(1, | 
 |    'name', 56)``.  If the third element is present, it is assumed to be a valid | 
 |    line number.  The line number may be specified for any subset of the terminal | 
 |    symbols in the input tree. | 
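
Since line numbers are optional on each terminal, a tree can be normalized to the two-element form before further processing.  This is a minimal sketch under the same assumptions as the description above: ``strip_line_info`` is a hypothetical helper, and ``300`` is a placeholder production number, not a real grammar symbol.

```python
import token


def strip_line_info(node):
    """Return a copy of a sequence tree with terminals reduced to (type, text)."""
    if token.ISTERMINAL(node[0]):
        # Drop the optional third element (the line number) if present.
        return tuple(node[:2])
    return (node[0],) + tuple(strip_line_info(child) for child in node[1:])


# Only some terminals carry a line number, which the text above permits.
tree = (300, (token.NAME, 'a', 1), (token.PLUS, '+'), (token.NUMBER, '5', 1))
plain = strip_line_info(tree)
```

The helper accepts lists as well as tuples, because it only relies on indexing and slicing.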
 |  | 
 |  | 
 | .. function:: tuple2st(sequence) | 
 |  | 
 |    This is the same function as :func:`sequence2st`.  This entry point is | 
 |    maintained for backward compatibility. | 
 |  | 
 |  | 
 | .. _converting-sts: | 
 |  | 
 | Converting ST Objects | 
 | --------------------- | 
 |  | 
 | ST objects, regardless of the input used to create them, may be converted to | 
 | parse trees represented as list- or tuple-trees, or may be compiled into | 
 | executable code objects.  Parse trees may be extracted with or without line | 
 | numbering information. | 
 |  | 
 |  | 
 | .. function:: st2list(st, line_info=False, col_info=False) | 
 |  | 
 |    This function accepts an ST object from the caller in *st* and returns a | 
 |    Python list representing the equivalent parse tree.  The resulting list | 
 |    representation can be used for inspection or the creation of a new parse tree in | 
 |    list form.  This function does not fail so long as memory is available to build | 
 |    the list representation.  If the parse tree will only be used for inspection, | 
 |    :func:`st2tuple` should be used instead to reduce memory consumption and | 
 |    fragmentation.  When the list representation is required, this function is | 
 |    significantly faster than retrieving a tuple representation and converting that | 
 |    to nested lists. | 
 |  | 
 |    If *line_info* is true, line number information will be included for all | 
 |    terminal tokens as a third element of the list representing the token.  Note | 
 |    that the line number provided specifies the line on which the token *ends*. | 
 |    This information is omitted if the flag is false or omitted. | 
 |  | 
 |  | 
 | .. function:: st2tuple(st, line_info=False, col_info=False) | 
 |  | 
 |    This function accepts an ST object from the caller in *st* and returns a | 
 |    Python tuple representing the equivalent parse tree.  Other than returning a | 
 |    tuple instead of a list, this function is identical to :func:`st2list`. | 
 |  | 
 |    If *line_info* is true, line number information will be included for all | 
 |    terminal tokens as a third element of the tuple representing the token.  This | 
 |    information is omitted if the flag is false or omitted. | 
 |  | 
 |  | 
 | .. function:: compilest(st, filename='<syntax-tree>') | 
 |  | 
 |    .. index:: | 
 |       builtin: exec | 
 |       builtin: eval | 
 |  | 
 |    The Python byte compiler can be invoked on an ST object to produce code objects | 
 |    which can be used as part of a call to the built-in :func:`exec` or :func:`eval` | 
 |    functions. This function provides the interface to the compiler, passing the | 
 |    internal parse tree from *st* to it, using the source file name | 
 |    specified by the *filename* parameter. The default value supplied for *filename* | 
 |    indicates that the source was an ST object. | 
 |  | 
 |    Compiling an ST object may result in exceptions related to compilation; an | 
 |    example would be a :exc:`SyntaxError` caused by the parse tree for ``del f(0)``: | 
 |    this statement is considered legal within the formal grammar for Python but is | 
 |    not a legal language construct.  The :exc:`SyntaxError` raised for this | 
 |    condition is normally generated by the Python byte-compiler, which is | 
 |    why it can be raised at this point by the :mod:`parser` module.  Most causes of | 
 |    compilation failure can be diagnosed programmatically by inspection of the parse | 
 |    tree. | 
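
The ``del f(0)`` case can be observed with the built-in ``compile``, which raises the same exception class.  The sketch below is an approximation only: ``compile_error_message`` is a hypothetical helper, it does not call :func:`compilest`, and the stage at which an interpreter rejects the statement may differ between Python versions.

```python
def compile_error_message(source):
    """Return the SyntaxError message from compile(), or None on success."""
    try:
        compile(source, '<syntax-tree>', 'exec')
    except SyntaxError as exc:
        return exc.msg
    return None


# Deleting the result of a call is rejected; deleting a plain name is not.
rejected = compile_error_message('del f(0)')
accepted = compile_error_message('del f')
```

An application that compiles ST objects built from untrusted or hand-made sequences would typically wrap the compile step in a similar ``try`` block.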
 |  | 
 |  | 
 | .. _querying-sts: | 
 |  | 
 | Queries on ST Objects | 
 | --------------------- | 
 |  | 
 | Two functions are provided which allow an application to determine if an ST was | 
 | created as an expression or a suite.  Neither of these functions can be used to | 
 | determine if an ST was created from source code via :func:`expr` or | 
 | :func:`suite` or from a parse tree via :func:`sequence2st`. | 
 |  | 
 |  | 
 | .. function:: isexpr(st) | 
 |  | 
 |    .. index:: builtin: compile | 
 |  | 
 |    When *st* represents an ``'eval'`` form, this function returns true, otherwise | 
 |    it returns false.  This is useful, since code objects normally cannot be queried | 
 |    for this information using existing built-in functions.  Note that the code | 
 |    objects created by :func:`compilest` cannot be queried like this either, and | 
 |    are identical to those created by the built-in :func:`compile` function. | 
 |  | 
 |  | 
 | .. function:: issuite(st) | 
 |  | 
 |    This function mirrors :func:`isexpr` in that it reports whether an ST object | 
 |    represents an ``'exec'`` form, commonly known as a "suite."  It is not safe to | 
 |    assume that this function is equivalent to ``not isexpr(st)``, as additional | 
 |    syntactic fragments may be supported in the future. | 
 |  | 
 |  | 
 | .. _st-errors: | 
 |  | 
 | Exceptions and Error Handling | 
 | ----------------------------- | 
 |  | 
 | The parser module defines a single exception, but may also pass other built-in | 
 | exceptions from other portions of the Python runtime environment.  See each | 
 | function for information about the exceptions it can raise. | 
 |  | 
 |  | 
 | .. exception:: ParserError | 
 |  | 
 |    Exception raised when a failure occurs within the parser module.  This is | 
 |    generally produced for validation failures rather than the built-in | 
 |    :exc:`SyntaxError` raised during normal parsing. The exception argument is | 
 |    either a string describing the reason for the failure or a tuple containing a | 
 |    sequence causing the failure from a parse tree passed to :func:`sequence2st` | 
 |    and an explanatory string.  Calls to :func:`sequence2st` need to be able to | 
 |    handle either type of exception argument, while calls to other functions in the | 
 |    module will only need to be aware of the simple string values. | 
 |  | 
 | Note that the functions :func:`compilest`, :func:`expr`, and :func:`suite` may | 
 | raise exceptions which are normally raised by the parsing and compilation | 
 | process.  These include the built-in exceptions :exc:`MemoryError`, | 
 | :exc:`OverflowError`, :exc:`SyntaxError`, and :exc:`SystemError`.  In these | 
 | cases, these exceptions carry all the meaning normally associated with them. | 
 | Refer to the descriptions of each function for detailed information. | 
 |  | 
 |  | 
 | .. _st-objects: | 
 |  | 
 | ST Objects | 
 | ---------- | 
 |  | 
 | Ordering and equality comparisons are supported between ST objects. Pickling of | 
 | ST objects (using the :mod:`pickle` module) is also supported. | 
 |  | 
 |  | 
 | .. data:: STType | 
 |  | 
 |    The type of the objects returned by :func:`expr`, :func:`suite` and | 
 |    :func:`sequence2st`. | 
 |  | 
 | ST objects have the following methods: | 
 |  | 
 |  | 
 | .. method:: ST.compile(filename='<syntax-tree>') | 
 |  | 
 |    Same as ``compilest(st, filename)``. | 
 |  | 
 |  | 
 | .. method:: ST.isexpr() | 
 |  | 
 |    Same as ``isexpr(st)``. | 
 |  | 
 |  | 
 | .. method:: ST.issuite() | 
 |  | 
 |    Same as ``issuite(st)``. | 
 |  | 
 |  | 
 | .. method:: ST.tolist(line_info=False, col_info=False) | 
 |  | 
 |    Same as ``st2list(st, line_info, col_info)``. | 
 |  | 
 |  | 
 | .. method:: ST.totuple(line_info=False, col_info=False) | 
 |  | 
 |    Same as ``st2tuple(st, line_info, col_info)``. | 
 |  | 
 |  | 
 | Example: Emulation of :func:`compile` | 
 | ------------------------------------- | 
 |  | 
 | While many useful operations may take place between parsing and bytecode | 
 | generation, the simplest operation is to do nothing.  For this purpose, using | 
 | the :mod:`parser` module to produce an intermediate data structure is equivalent | 
 | to the code :: | 
 |  | 
 |    >>> code = compile('a + 5', 'file.py', 'eval') | 
 |    >>> a = 5 | 
 |    >>> eval(code) | 
 |    10 | 
 |  | 
 | The equivalent operation using the :mod:`parser` module is somewhat longer, and | 
 | allows the intermediate internal parse tree to be retained as an ST object:: | 
 |  | 
 |    >>> import parser | 
 |    >>> st = parser.expr('a + 5') | 
 |    >>> code = st.compile('file.py') | 
 |    >>> a = 5 | 
 |    >>> eval(code) | 
 |    10 | 
 |  | 
 | An application which needs both ST and code objects can package this code into | 
 | readily available functions:: | 
 |  | 
 |    import parser | 
 |  | 
 |    def load_suite(source_string): | 
 |        st = parser.suite(source_string) | 
 |        return st, st.compile() | 
 |  | 
 |    def load_expression(source_string): | 
 |        st = parser.expr(source_string) | 
 |        return st, st.compile() |