\section{\module{xml.sax} ---
         Support for SAX2 parsers}

\declaremodule{standard}{xml.sax}
\modulesynopsis{Package containing SAX2 base classes and convenience
                functions.}
\moduleauthor{Lars Marius Garshol}{larsga@garshol.priv.no}
\sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org}
\sectionauthor{Martin v. L\"owis}{loewis@informatik.hu-berlin.de}

\versionadded{2.0}


The \module{xml.sax} package provides a number of modules which
implement the Simple API for XML (SAX) interface for Python.  The
package itself provides the SAX exceptions and the convenience
functions most often used by users of the SAX API.

The convenience functions are:

\begin{funcdesc}{make_parser}{\optional{parser_list}}
  Create and return a SAX \class{XMLReader} object.  The first parser
  found will be used.  If \var{parser_list} is provided, it must be a
  sequence of strings which name modules that have a function named
  \function{create_parser()}.  Modules listed in \var{parser_list}
  will be used before modules in the default list of parsers.
\end{funcdesc}
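As a minimal sketch of both call forms, assuming the bundled
\module{xml.sax.expatreader} driver is importable (it is named here
only for illustration; any module providing
\function{create_parser()} could be listed instead):

```python
import xml.sax
from xml.sax.xmlreader import XMLReader

# Use the default list of parser modules.
default_reader = xml.sax.make_parser()

# Ask for a specific driver module first; the default list is still
# consulted if this module cannot be used.
preferred_reader = xml.sax.make_parser(["xml.sax.expatreader"])

# Both results implement the XMLReader interface.
both_are_readers = (isinstance(default_reader, XMLReader)
                    and isinstance(preferred_reader, XMLReader))
```

The returned reader does nothing until handlers are attached and one
of its parsing methods is called.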

\begin{funcdesc}{parse}{filename_or_stream, handler\optional{, error_handler}}
  Create a SAX parser and use it to parse a document.  The document,
  passed in as \var{filename_or_stream}, can be a filename or a file
  object.  The \var{handler} parameter must be a SAX
  \class{ContentHandler} instance.  If \var{error_handler} is given,
  it must be a SAX \class{ErrorHandler} instance; if omitted,
  \exception{SAXParseException} will be raised on all errors.  There
  is no return value; all work must be done by the \var{handler}
  passed in.
\end{funcdesc}
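Because \function{parse()} returns nothing, results have to be
collected on the handler.  The following sketch illustrates this with
a hypothetical \class{ElementCounter} handler and an in-memory file
object standing in for a real file:

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Hypothetical handler that counts start tags."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.count = 0

    def startElement(self, name, attrs):
        # Called once for every opening (or empty-element) tag.
        self.count += 1


# Any object with a read() method can be passed as the document.
stream = io.BytesIO(b"<doc><item/><item/></doc>")
counter = ElementCounter()
xml.sax.parse(stream, counter)
element_count = counter.count
```

After the call, \code{counter.count} holds the number of elements
seen in the document.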

\begin{funcdesc}{parseString}{string, handler\optional{, error_handler}}
  Similar to \function{parse()}, but parses from the buffer
  \var{string} passed as a parameter.
\end{funcdesc}
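A short sketch of \function{parseString()}, using a hypothetical
\class{TextCollector} handler that gathers character data.  The
parser may deliver text in several pieces, so the pieces are joined
afterwards:

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical handler that accumulates character data."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.chunks = []

    def characters(self, content):
        # May be called more than once for one run of text.
        self.chunks.append(content)


collector = TextCollector()
xml.sax.parseString(b"<greeting>hello</greeting>", collector)
text = "".join(collector.chunks)
```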

A typical SAX application uses three kinds of objects: readers,
handlers and input sources.  ``Reader'' in this context is another
term for parser, i.e.\ some piece of code that reads the bytes or
characters from the input source and produces a sequence of events.
The events are then distributed to the handler objects, i.e.\ the
reader invokes a method on the handler.  A SAX application must
therefore obtain a reader object, create or open the input sources,
create the handlers, and connect these objects together.  As the
final step of preparation, the reader is called to parse the input.
During parsing, methods on the handler objects are called based on
structural and syntactic events from the input data.
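The steps above can be sketched as follows.  The
\class{NameRecorder} handler is hypothetical, and an in-memory byte
stream is assumed as the input; a real application would more likely
open a file or URL.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class NameRecorder(ContentHandler):
    """Hypothetical handler that records element names in order."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


# 1. Obtain a reader.
reader = xml.sax.make_parser()

# 2. Create the input source.
source = InputSource()
source.setByteStream(io.BytesIO(b"<a><b/><c/></a>"))

# 3. Create the handler and connect it to the reader.
handler = NameRecorder()
reader.setContentHandler(handler)

# 4. Parse; the reader now calls methods on the handler.
reader.parse(source)
names = handler.names
```

The convenience functions \function{parse()} and
\function{parseString()} perform the same wiring in a single call.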

For these objects, only the interfaces are relevant; they are normally
not instantiated by the application itself.  Since Python does not have
an explicit notion of interface, they are formally introduced as
classes, but applications may use implementations which do not inherit
from the provided classes.  The \class{InputSource}, \class{Locator},
\class{AttributesImpl}, \class{AttributesNSImpl}, and
\class{XMLReader} interfaces are defined in the module
\refmodule{xml.sax.xmlreader}.  The handler interfaces are defined in
\refmodule{xml.sax.handler}.  For convenience, \class{InputSource}
(which is often instantiated directly) and the handler classes are
also available from \module{xml.sax}.  These interfaces are described
below.

In addition to these classes, \module{xml.sax} provides the following
exception classes.

\begin{excclassdesc}{SAXException}{msg\optional{, exception}}
  Encapsulate an XML error or warning.  This class can contain basic
  error or warning information from either the XML parser or the
  application; it can be subclassed to provide additional
  functionality or to add localization.  Note that although the
  handlers defined in the \class{ErrorHandler} interface receive
  instances of this exception, the exception need not actually be
  raised; it is also useful as a container for information.

  When instantiated, \var{msg} should be a human-readable description
  of the error.  The optional \var{exception} parameter, if given,
  should be \code{None} or an exception that was caught by the parsing
  code and is being passed along as information.

  This is the base class for the other SAX exception classes.
\end{excclassdesc}

\begin{excclassdesc}{SAXParseException}{msg, exception, locator}
  Subclass of \exception{SAXException} raised on parse errors.
  Instances of this class are passed to the methods of the SAX
  \class{ErrorHandler} interface to provide information about the
  parse error.  This class supports the SAX \class{Locator} interface
  as well as the \class{SAXException} interface.
\end{excclassdesc}
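Because the exception supports the \class{Locator} interface, the
position of the error can be read from it directly.  A minimal
sketch, assuming no error handler is supplied so that the exception
is raised to the caller:

```python
import xml.sax
from xml.sax.handler import ContentHandler

caught = None
try:
    # The <unclosed> element is never closed, so the document is
    # not well-formed.
    xml.sax.parseString(b"<doc><unclosed></doc>", ContentHandler())
except xml.sax.SAXParseException as err:
    caught = err

# Locator methods report where the problem was detected.
line = caught.getLineNumber()
column = caught.getColumnNumber()
# SAXException methods remain available as well.
message = caught.getMessage()
```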

\begin{excclassdesc}{SAXNotRecognizedException}{msg\optional{, exception}}
  Subclass of \exception{SAXException} raised when a SAX
  \class{XMLReader} is confronted with an unrecognized feature or
  property.  SAX applications and extensions may use this class for
  similar purposes.
\end{excclassdesc}

\begin{excclassdesc}{SAXNotSupportedException}{msg\optional{, exception}}
  Subclass of \exception{SAXException} raised when a SAX
  \class{XMLReader} is asked to enable a feature that is not
  supported, or to set a property to a value that the implementation
  does not support.  SAX applications and extensions may use this
  class for similar purposes.
\end{excclassdesc}
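The difference between the two exceptions can be sketched as
follows.  This assumes that \function{make_parser()} returns the
Expat-based reader, which does not validate; the feature URI in the
first call is made up for the example.

```python
import xml.sax
from xml.sax.handler import feature_validation

reader = xml.sax.make_parser()

# A feature name the reader has never heard of (hypothetical URI).
not_recognized = False
try:
    reader.getFeature("http://example.org/hypothetical-feature")
except xml.sax.SAXNotRecognizedException:
    not_recognized = True

# A standard feature that the Expat-based reader knows about but
# cannot turn on.
not_supported = False
try:
    reader.setFeature(feature_validation, True)
except xml.sax.SAXNotSupportedException:
    not_supported = True
```

Other readers may support validation, in which case the second call
would succeed instead of raising.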


\begin{seealso}
  \seetitle[http://www.saxproject.org/]{SAX: The Simple API for
            XML}{This site is the focal point for the definition of
            the SAX API.  It provides a Java implementation and online
            documentation.  Links to implementations and historical
            information are also available.}
\end{seealso}


\subsection{SAXException Objects \label{sax-exception-objects}}

The \class{SAXException} exception class supports the following
methods:

\begin{methoddesc}[SAXException]{getMessage}{}
  Return a human-readable message describing the error condition.
\end{methoddesc}

\begin{methoddesc}[SAXException]{getException}{}
  Return an encapsulated exception object, or \code{None}.
\end{methoddesc}
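A brief sketch of both methods, wrapping an unrelated exception in a
\class{SAXException}.  The message text and the failing conversion
are invented for the example.

```python
from xml.sax import SAXException

try:
    int("not a number")
except ValueError as cause:
    # Pass the original exception along as information.
    wrapped = SAXException("could not convert attribute value", cause)

message = wrapped.getMessage()
inner = wrapped.getException()

# Without the optional argument there is nothing encapsulated.
plain = SAXException("something went wrong")
plain_inner = plain.getException()
```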