\section{\module{xml.sax} ---
         Support for SAX2 parsers}

\declaremodule{standard}{xml.sax}
\modulesynopsis{Package containing SAX2 base classes and convenience
                functions.}
\moduleauthor{Lars Marius Garshol}{larsga@garshol.priv.no}
\sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org}
\sectionauthor{Martin v. L\"owis}{loewis@informatik.hu-berlin.de}

\versionadded{2.0}


The \module{xml.sax} package provides a number of modules which
implement the Simple API for XML (SAX) interface for Python.  The
package itself provides the SAX exceptions and the convenience
functions that users of the SAX API will need most often.

The convenience functions are:

\begin{funcdesc}{make_parser}{\optional{parser_list}}
  Create and return a SAX \class{XMLReader} object.  The first parser
  found will be used.  If \var{parser_list} is provided, it must be a
  sequence of strings which name modules that have a function named
  \function{create_parser()}.  Modules listed in \var{parser_list}
  will be used before modules in the default list of parsers.
\end{funcdesc}
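
As a minimal sketch of both calling forms of \function{make_parser()},
the following assumes a current Python interpreter in which the bundled
Expat driver module \module{xml.sax.expatreader} is available; the
variable names are illustrative only.

```python
import xml.sax
from xml.sax.xmlreader import XMLReader

# No arguments: the first usable parser from the default list is returned.
default_reader = xml.sax.make_parser()

# With parser_list: the named modules are tried before the default list.
# Each named module must define a create_parser() function.
# Assumption: the bundled Expat driver is importable on this installation.
preferred_reader = xml.sax.make_parser(["xml.sax.expatreader"])

default_is_reader = isinstance(default_reader, XMLReader)
preferred_is_reader = isinstance(preferred_reader, XMLReader)
```

Either call returns an object that implements the \class{XMLReader}
interface, so the rest of an application does not need to know which
driver was selected.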

\begin{funcdesc}{parse}{filename_or_stream, handler\optional{, error_handler}}
  Create a SAX parser and use it to parse a document.  The document,
  passed in as \var{filename_or_stream}, can be a filename or a file
  object.  The \var{handler} parameter needs to be a SAX
  \class{ContentHandler} instance.  If \var{error_handler} is given,
  it must be a SAX \class{ErrorHandler} instance; if omitted,
  \exception{SAXParseException} will be raised on all errors.  There
  is no return value; all work must be done by the \var{handler}
  passed in.
\end{funcdesc}
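
Because \function{parse()} returns nothing, results have to be
collected on the handler.  The sketch below illustrates this with a
hypothetical \class{ElementCounter} handler (not part of the package)
and an in-memory file object standing in for a real file; it assumes
Python 3 syntax.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Hypothetical handler that counts start tags by element name."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.counts = {}

    def startElement(self, name, attrs):
        # Called by the parser once for every opening tag.
        self.counts[name] = self.counts.get(name, 0) + 1


# An in-memory file object stands in for a file opened from disk.
document = io.BytesIO(b"<doc><item/><item/></doc>")
handler = ElementCounter()
xml.sax.parse(document, handler)  # no return value; inspect the handler
counts = handler.counts
```

After the call, \code{counts} maps each element name to the number of
times it was opened.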

\begin{funcdesc}{parseString}{string, handler\optional{, error_handler}}
  Similar to \function{parse()}, but parses from a buffer \var{string}
  received as a parameter.
\end{funcdesc}
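
A short sketch of \function{parseString()} follows.  The
\class{TextCollector} handler is hypothetical, and the buffer is passed
as a byte string on the assumption that the code runs under Python 3.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.chunks = []

    def characters(self, content):
        # The parser may deliver text in several pieces, so collect them.
        self.chunks.append(content)


handler = TextCollector()
xml.sax.parseString(b"<greeting>Hello, SAX</greeting>", handler)
text = "".join(handler.chunks)
```

Joining the pieces at the end is deliberate: a parser is free to split
a run of text across several \method{characters()} calls.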

A typical SAX application uses three kinds of objects: readers,
handlers, and input sources.  ``Reader'' in this context is another term
for parser, i.e., a piece of code that reads the bytes or characters
from the input source and produces a sequence of events.  The events
are then distributed to the handler objects, i.e., the reader invokes a
method on the handler.  A SAX application must therefore obtain a
reader object, create or open the input sources, create the handlers,
and connect these objects together.  As the final step, parsing is
invoked.  During parsing, the reader calls methods on the handler
objects as it encounters events in the input.
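
The steps described above (obtain a reader, create the handler and the
input source, connect them, then parse) can be sketched as follows.
The \class{EventLogger} handler is a hypothetical example class, and the
code assumes Python 3 with the default Expat-based reader.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class EventLogger(ContentHandler):
    """Hypothetical handler that records element events in order."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


# 1. Obtain a reader.
reader = xml.sax.make_parser()

# 2. Create the handler and connect it to the reader.
handler = EventLogger()
reader.setContentHandler(handler)

# 3. Create the input source and attach the data to it.
source = InputSource()
source.setByteStream(io.BytesIO(b"<a><b/></a>"))

# 4. Invoke parsing; the reader now calls back into the handler.
reader.parse(source)
events = handler.events
```

The recorded events arrive in document order, which is what lets a
handler reconstruct nesting without the parser building a tree.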

For these objects, only the interfaces are relevant; they are normally
not instantiated by the application itself.  Since Python does not have
an explicit notion of interface, they are formally introduced as
classes.  The \class{InputSource}, \class{Locator},
\class{AttributesImpl}, and \class{XMLReader} interfaces are defined
in the module \refmodule{xml.sax.xmlreader}.  The handler interfaces
are defined in \refmodule{xml.sax.handler}.  For convenience,
\class{InputSource} (which is often instantiated directly) and the
handler classes are also available from \module{xml.sax}.  These
classes are described below.

In addition to these classes, \module{xml.sax} provides the following
exception classes.

\begin{excclassdesc}{SAXException}{msg\optional{, exception}}
  Encapsulate an XML error or warning.  This class can contain basic
  error or warning information from either the XML parser or the
  application: it can be subclassed to provide additional
  functionality or to add localization.  Note that although the
  handlers defined in the \class{ErrorHandler} interface receive
  instances of this exception, it is not required to actually raise
  the exception; it is also useful as a container for information.

  When instantiated, \var{msg} should be a human-readable description
  of the error.  The optional \var{exception} parameter, if given,
  should be \code{None} or an exception that was caught by the parsing
  code and is being passed along as information.

  This is the base class for the other SAX exception classes.
\end{excclassdesc}

\begin{excclassdesc}{SAXParseException}{msg, exception, locator}
  Subclass of \exception{SAXException} raised on parse errors.
  Instances of this class are passed to the methods of the SAX
  \class{ErrorHandler} interface to provide information about the
  parse error.  This class supports the SAX \class{Locator} interface
  as well as the \class{SAXException} interface.
\end{excclassdesc}
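
Since \exception{SAXParseException} supports the \class{Locator}
interface, the position of an error can be read from the exception
itself.  The following is a minimal sketch, assuming Python 3 and no
custom error handler, so that the parse error is raised to the caller;
the malformed sample document is made up for illustration.

```python
import xml.sax
from xml.sax.handler import ContentHandler

error = None
try:
    # The <open> element is never closed, so the document is not
    # well-formed and the parser reports a fatal error.
    xml.sax.parseString(b"<doc><open></doc>", ContentHandler())
except xml.sax.SAXParseException as exc:
    error = exc
    line = exc.getLineNumber()    # Locator interface
    message = exc.getMessage()    # SAXException interface
```

Catching \exception{SAXException} instead would also work here, since
it is the base class, but the locator methods are only available on the
subclass.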

\begin{excclassdesc}{SAXNotRecognizedException}{msg\optional{, exception}}
  Subclass of \exception{SAXException} raised when a SAX
  \class{XMLReader} is confronted with an unrecognized feature or
  property.  SAX applications and extensions may use this class for
  similar purposes.
\end{excclassdesc}

\begin{excclassdesc}{SAXNotSupportedException}{msg\optional{, exception}}
  Subclass of \exception{SAXException} raised when a SAX
  \class{XMLReader} is asked to enable a feature that is not
  supported, or to set a property to a value that the implementation
  does not support.  SAX applications and extensions may use this
  class for similar purposes.
\end{excclassdesc}
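
As an illustration of the first of these two exceptions, the sketch
below asks a reader about a feature name it cannot know.  The feature
URI is invented for the example, and the behaviour shown assumes the
default Expat-based reader under Python 3.

```python
import xml.sax

reader = xml.sax.make_parser()

try:
    # Hypothetical feature URI that no reader is expected to recognize.
    reader.getFeature("http://example.org/hypothetical-feature")
    recognized = True
except xml.sax.SAXNotRecognizedException:
    recognized = False
```

An application that probes optional features this way can fall back
gracefully instead of failing when a particular reader lacks them.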


\begin{seealso}
  \seetitle[http://www.megginson.com/SAX/]{SAX: The Simple API for
            XML}{This site is the focal point for the definition of
            the SAX API.  It provides a Java implementation and online
            documentation.  Links to implementations and historical
            information are also available.}
\end{seealso}


\subsection{SAXException Objects \label{sax-exception-objects}}

The \class{SAXException} exception class supports the following
methods:

\begin{methoddesc}[SAXException]{getMessage}{}
  Return a human-readable message describing the error condition.
\end{methoddesc}

\begin{methoddesc}[SAXException]{getException}{}
  Return an encapsulated exception object, or \code{None}.
\end{methoddesc}
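
The two methods can be exercised with a small sketch in which
application code wraps a lower-level error in a
\exception{SAXException}.  The message text and the failing conversion
are made up for the example, and Python 3 syntax is assumed.

```python
from xml.sax import SAXException

try:
    try:
        int("not a number")
    except ValueError as cause:
        # Pass the original error along as the encapsulated exception.
        raise SAXException("could not convert attribute value", cause)
except SAXException as exc:
    message = exc.getMessage()
    wrapped = exc.getException()

# Without the optional argument there is nothing to unwrap.
plain = SAXException("no underlying exception").getException()
```

Here \code{wrapped} is the original \exception{ValueError}, while
\code{plain} is \code{None} because no exception was supplied.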