| \section{\module{xml.sax.handler} --- | 
 |          Base classes for SAX handlers} | 
 |  | 
 | \declaremodule{standard}{xml.sax.handler} | 
 | \modulesynopsis{Base classes for SAX event handlers.} | 
 | \sectionauthor{Martin v. L\"owis}{martin@v.loewis.de} | 
 | \moduleauthor{Lars Marius Garshol}{larsga@garshol.priv.no} | 
 |  | 
 | \versionadded{2.0} | 
 |  | 
 |  | 
 | The SAX API defines four kinds of handlers: content handlers, DTD | 
 | handlers, error handlers, and entity resolvers. Applications normally | 
 | only need to implement those interfaces whose events they are | 
 | interested in; they can implement the interfaces in a single object or | 
 | in multiple objects. Handler implementations should inherit from the | 
 | base classes provided in the module \module{xml.sax.handler}, so that all | 
 | methods get default implementations. | 
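As a minimal sketch of implementing several interfaces in a single
object, the class below inherits from both \class{ContentHandler} and
\class{ErrorHandler}.  The class name \class{ElementCounter} and the
sample document are hypothetical, and the code assumes a current
Python 3 interpreter with the default expat-based reader.

```python
import xml.sax
from xml.sax.handler import ContentHandler, ErrorHandler


class ElementCounter(ContentHandler, ErrorHandler):
    """Hypothetical handler: counts elements and collects warnings."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.count = 0
        self.warnings = []

    def startElement(self, name, attrs):
        # Called once per start tag in non-namespace mode.
        self.count += 1

    def warning(self, exception):
        # Store the warning instead of relying on the inherited default.
        self.warnings.append(exception)


counter = ElementCounter()
# The same object is passed as content handler and as error handler.
xml.sax.parseString(b"<doc><item/><item/></doc>", counter, counter)
```

Methods that are not overridden, such as \method{endElement()} or
\method{error()}, keep the default implementations inherited from the
base classes.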
 |  | 
 | \begin{classdesc*}{ContentHandler} | 
 |   This is the main callback interface in SAX, and the one most | 
 |   important to applications. The order of events in this interface | 
 |   mirrors the order of the information in the document. | 
 | \end{classdesc*} | 
 |  | 
 | \begin{classdesc*}{DTDHandler} | 
 |   Handle DTD events. | 
 |  | 
 |   This interface specifies only those DTD events required for basic | 
 |   parsing (notations and unparsed entities). | 
 | \end{classdesc*} | 
 |  | 
 | \begin{classdesc*}{EntityResolver} | 
 |  Basic interface for resolving entities. If you create an object | 
 |  implementing this interface and register it with your parser, the | 
 |  parser will call the \method{resolveEntity()} method of your object to | 
 |  resolve all external entities. | 
 | \end{classdesc*} | 
 |  | 
 | \begin{classdesc*}{ErrorHandler} | 
 |   Interface used by the parser to present error and warning messages | 
 |   to the application.  The methods of this object control whether errors | 
 |   are immediately converted to exceptions or are handled in some other | 
 |   way. | 
 | \end{classdesc*} | 
 |  | 
 | In addition to these classes, \module{xml.sax.handler} provides | 
 | symbolic constants for the feature and property names. | 
 |  | 
 | \begin{datadesc}{feature_namespaces} | 
 |   Value: \code{"http://xml.org/sax/features/namespaces"}\\ | 
 |   true: Perform Namespace processing.\\ | 
 |   false: Optionally do not perform Namespace processing | 
 |          (implies namespace-prefixes; default).\\ | 
 |   access: (parsing) read-only; (not parsing) read/write | 
 | \end{datadesc} | 
 |  | 
 | \begin{datadesc}{feature_namespace_prefixes} | 
 |   Value: \code{"http://xml.org/sax/features/namespace-prefixes"}\\ | 
 |   true: Report the original prefixed names and attributes used for Namespace | 
 |         declarations.\\ | 
 |   false: Do not report attributes used for Namespace declarations, and | 
 |          optionally do not report original prefixed names (default).\\ | 
 |   access: (parsing) read-only; (not parsing) read/write   | 
 | \end{datadesc} | 
 |  | 
 | \begin{datadesc}{feature_string_interning} | 
 |   Value: \code{"http://xml.org/sax/features/string-interning"}\\ | 
 |   true: All element names, prefixes, attribute names, Namespace URIs, and | 
 |         local names are interned using the built-in intern function.\\ | 
 |   false: Names are not necessarily interned, although they may be (default).\\ | 
 |   access: (parsing) read-only; (not parsing) read/write | 
 | \end{datadesc} | 
 |  | 
 | \begin{datadesc}{feature_validation} | 
 |   Value: \code{"http://xml.org/sax/features/validation"}\\ | 
 |   true: Report all validation errors (implies external-general-entities and | 
 |         external-parameter-entities).\\ | 
 |   false: Do not report validation errors.\\ | 
 |   access: (parsing) read-only; (not parsing) read/write | 
 | \end{datadesc} | 
 |  | 
 | \begin{datadesc}{feature_external_ges} | 
 |   Value: \code{"http://xml.org/sax/features/external-general-entities"}\\ | 
 |   true: Include all external general (text) entities.\\ | 
 |   false: Do not include external general entities.\\ | 
 |   access: (parsing) read-only; (not parsing) read/write | 
 | \end{datadesc} | 
 |  | 
 | \begin{datadesc}{feature_external_pes} | 
 |   Value: \code{"http://xml.org/sax/features/external-parameter-entities"}\\ | 
 |   true: Include all external parameter entities, including the external | 
 |         DTD subset.\\ | 
 |   false: Do not include any external parameter entities, even the external | 
 |          DTD subset.\\ | 
 |   access: (parsing) read-only; (not parsing) read/write | 
 | \end{datadesc} | 
 |  | 
 | \begin{datadesc}{all_features} | 
 |   List of all features. | 
 | \end{datadesc} | 
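The feature constants are meant to be passed to the reader's
\method{getFeature()} and \method{setFeature()} methods.  The sketch
below assumes that \function{xml.sax.make_parser()} returns the
expat-based reader shipped with Python; the variable names are
illustrative only.

```python
import xml.sax
from xml.sax.handler import all_features, feature_namespaces

parser = xml.sax.make_parser()

# With the expat-based reader, namespace processing starts out disabled.
default_setting = bool(parser.getFeature(feature_namespaces))

# Features can only be changed while no parse is in progress.
parser.setFeature(feature_namespaces, True)
enabled_setting = bool(parser.getFeature(feature_namespaces))
```

Other readers may reject features they do not support, so production
code would probably guard the \method{setFeature()} call.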
 |  | 
 | \begin{datadesc}{property_lexical_handler} | 
 |   Value: \code{"http://xml.org/sax/properties/lexical-handler"}\\ | 
 |   data type: xml.sax.sax2lib.LexicalHandler (not supported in Python 2)\\ | 
 |   description: An optional extension handler for lexical events like comments.\\ | 
 |   access: read/write | 
 | \end{datadesc} | 
 |  | 
 | \begin{datadesc}{property_declaration_handler} | 
 |   Value: \code{"http://xml.org/sax/properties/declaration-handler"}\\ | 
 |   data type: xml.sax.sax2lib.DeclHandler (not supported in Python 2)\\ | 
 |   description: An optional extension handler for DTD-related events other | 
 |                than notations and unparsed entities.\\ | 
 |   access: read/write | 
 | \end{datadesc} | 
 |  | 
 | \begin{datadesc}{property_dom_node} | 
 |   Value: \code{"http://xml.org/sax/properties/dom-node"}\\ | 
 |   data type: org.w3c.dom.Node (not supported in Python 2) \\ | 
 |   description: When parsing, the current DOM node being visited if this is | 
 |                a DOM iterator; when not parsing, the root DOM node for | 
 |                iteration.\\ | 
 |   access: (parsing) read-only; (not parsing) read/write   | 
 | \end{datadesc} | 
 |  | 
 | \begin{datadesc}{property_xml_string} | 
 |   Value: \code{"http://xml.org/sax/properties/xml-string"}\\ | 
 |   data type: String\\ | 
 |   description: The literal string of characters that was the source for | 
 |                the current event.\\ | 
 |   access: read-only | 
 | \end{datadesc} | 
 |  | 
 | \begin{datadesc}{all_properties} | 
 |   List of all known property names. | 
 | \end{datadesc} | 
 |  | 
 |  | 
 | \subsection{ContentHandler Objects \label{content-handler-objects}} | 
 |  | 
 | Users are expected to subclass \class{ContentHandler} to support their | 
 | application.  The following methods are called by the parser on the | 
 | appropriate events in the input document: | 
 |  | 
 | \begin{methoddesc}[ContentHandler]{setDocumentLocator}{locator} | 
 |   Called by the parser to give the application a locator for locating | 
 |   the origin of document events. | 
 |    | 
 |   SAX parsers are strongly encouraged (though not absolutely required) | 
 |   to supply a locator. A parser that does so must supply the locator to | 
 |   the application by invoking this method before invoking any of the | 
 |   other methods in the \class{ContentHandler} interface. | 
 |    | 
 |   The locator allows the application to determine the end position of | 
 |   any document-related event, even if the parser is not reporting an | 
 |   error. Typically, the application will use this information for | 
 |   reporting its own errors (such as character content that does not | 
 |   match an application's business rules). The information returned by | 
 |   the locator is probably not sufficient for use with a search engine. | 
 |    | 
 |   Note that the locator will return correct information only during | 
 |   the invocation of the events in this interface. The application | 
 |   should not attempt to use it at any other time. | 
 | \end{methoddesc} | 
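A minimal sketch of how an application might keep the locator and
query it while events are being delivered.  The class name
\class{LineRecorder} and the sample document are hypothetical; the
locator is only consulted inside an event callback, as required above.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class LineRecorder(ContentHandler):
    """Hypothetical handler: remembers the line reported for each element."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.locator = None
        self.lines = {}

    def setDocumentLocator(self, locator):
        # Invoked before any other event, if the parser supplies a locator.
        self.locator = locator

    def startElement(self, name, attrs):
        # The locator is only valid while an event is being reported.
        if self.locator is not None:
            self.lines[name] = self.locator.getLineNumber()


recorder = LineRecorder()
xml.sax.parseString(b"<doc>\n  <item/>\n</doc>", recorder)
```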
 |  | 
 | \begin{methoddesc}[ContentHandler]{startDocument}{} | 
 |   Receive notification of the beginning of a document. | 
 |          | 
 |   The SAX parser will invoke this method only once, before any other | 
 |   methods in this interface or in DTDHandler (except for | 
 |   \method{setDocumentLocator()}). | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}[ContentHandler]{endDocument}{} | 
 |   Receive notification of the end of a document. | 
 |          | 
 |   The SAX parser will invoke this method only once, and it will be the | 
 |   last method invoked during the parse. The parser shall not invoke | 
 |   this method until it has either abandoned parsing (because of an | 
 |   unrecoverable error) or reached the end of input. | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}[ContentHandler]{startPrefixMapping}{prefix, uri} | 
 |   Begin the scope of a prefix-URI Namespace mapping. | 
 |          | 
 |   The information from this event is not necessary for normal | 
 |   Namespace processing: the SAX XML reader will automatically replace | 
 |   prefixes for element and attribute names when the | 
 |   \code{feature_namespaces} feature is enabled.  Note that this | 
 |   feature is disabled by default (see \code{feature_namespaces} | 
 |   above), so the application has to enable it before parsing. | 
 |    | 
 |   There are cases, however, when applications need to use prefixes in | 
 |   character data or in attribute values, where they cannot safely be | 
 |   expanded automatically; the \method{startPrefixMapping()} and | 
 |   \method{endPrefixMapping()} events supply the information to the | 
 |   application to expand prefixes in those contexts itself, if | 
 |   necessary. | 
 |    | 
 |   Note that \method{startPrefixMapping()} and | 
 |   \method{endPrefixMapping()} events are not guaranteed to be properly | 
 |   nested relative to each other: all \method{startPrefixMapping()} | 
 |   events will occur before the corresponding \method{startElement()} | 
 |   event, and all \method{endPrefixMapping()} events will occur after | 
 |   the corresponding \method{endElement()} event, but their order is | 
 |   not guaranteed. | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}[ContentHandler]{endPrefixMapping}{prefix} | 
 |   End the scope of a prefix-URI mapping. | 
 |  | 
 |   See \method{startPrefixMapping()} for details. This event will | 
 |   always occur after the corresponding \method{endElement()} event, | 
 |   but the order of \method{endPrefixMapping()} events is not otherwise | 
 |   guaranteed. | 
 | \end{methoddesc} | 
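The following sketch records prefix mapping events so that an
application could expand prefixes found in character data or attribute
values itself.  It assumes the expat-based reader with
\code{feature_namespaces} turned on explicitly; the class name
\class{PrefixTracker} and the namespace URI are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class PrefixTracker(ContentHandler):
    """Hypothetical handler: tracks the prefix mappings currently in scope."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.scopes = {}   # prefix -> stack of URIs, innermost last
        self.events = []

    def startPrefixMapping(self, prefix, uri):
        self.scopes.setdefault(prefix, []).append(uri)
        self.events.append(("start", prefix, uri))

    def endPrefixMapping(self, prefix):
        self.scopes[prefix].pop()
        self.events.append(("end", prefix))


parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)
tracker = PrefixTracker()
parser.setContentHandler(tracker)
parser.parse(io.BytesIO(b'<d:doc xmlns:d="http://example.org/ns"/>'))
```

A per-prefix stack is used because an inner element may redeclare a
prefix that is already in scope.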
 |  | 
 | \begin{methoddesc}[ContentHandler]{startElement}{name, attrs} | 
 |   Signals the start of an element in non-namespace mode. | 
 |  | 
 |   The \var{name} parameter contains the raw XML 1.0 name of the | 
 |   element type as a string and the \var{attrs} parameter holds an | 
 |   object of the \ulink{\class{Attributes} | 
 |   interface}{attributes-objects.html} containing the attributes of the | 
 |   element.  The object passed as \var{attrs} may be re-used by the | 
 |   parser; holding on to a reference to it is not a reliable way to | 
 |   keep a copy of the attributes.  To keep a copy of the attributes, | 
 |   use the \method{copy()} method of the \var{attrs} object. | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}[ContentHandler]{endElement}{name} | 
 |   Signals the end of an element in non-namespace mode. | 
 |  | 
 |   The \var{name} parameter contains the name of the element type, just | 
 |   as with the \method{startElement()} event. | 
 | \end{methoddesc} | 
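A short sketch of the two non-namespace element events, including the
use of \method{copy()} to keep attributes beyond the callback.  The
class name \class{AttributeKeeper} and the sample document are
hypothetical.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class AttributeKeeper(ContentHandler):
    """Hypothetical handler: keeps a private copy of every attribute set."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.saved = []
        self.open_elements = []

    def startElement(self, name, attrs):
        self.open_elements.append(name)
        # copy() gives an object that the parser will not reuse later.
        self.saved.append((name, attrs.copy()))

    def endElement(self, name):
        # End events arrive in the reverse order of the start events.
        self.open_elements.pop()


keeper = AttributeKeeper()
xml.sax.parseString(b'<doc id="1"><item id="2"/></doc>', keeper)
```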
 |  | 
 | \begin{methoddesc}[ContentHandler]{startElementNS}{name, qname, attrs} | 
 |   Signals the start of an element in namespace mode. | 
 |  | 
 |   The \var{name} parameter contains the name of the element type as a | 
 |   \code{(\var{uri}, \var{localname})} tuple, the \var{qname} parameter | 
 |   contains the raw XML 1.0 name used in the source document, and the | 
 |   \var{attrs} parameter holds an instance of the | 
 |   \ulink{\class{AttributesNS} interface}{attributes-ns-objects.html} | 
 |   containing the attributes of the element.  If no namespace is | 
 |   associated with the element, the \var{uri} component of \var{name} | 
 |   will be \code{None}.  The object passed as \var{attrs} may be | 
 |   re-used by the parser; holding on to a reference to it is not a | 
 |   reliable way to keep a copy of the attributes.  To keep a copy of | 
 |   the attributes, use the \method{copy()} method of the \var{attrs} | 
 |   object. | 
 |  | 
 |   Parsers may set the \var{qname} parameter to \code{None}, unless the | 
 |   \code{feature_namespace_prefixes} feature is activated. | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}[ContentHandler]{endElementNS}{name, qname} | 
 |   Signals the end of an element in namespace mode. | 
 |  | 
 |   The \var{name} parameter contains the name of the element type, just | 
 |   as with the \method{startElementNS()} method; the same holds for the | 
 |   \var{qname} parameter. | 
 | \end{methoddesc} | 
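In namespace mode the reader calls \method{startElementNS()} and
\method{endElementNS()} instead of the non-namespace variants.  The
sketch below assumes the expat-based reader with
\code{feature_namespaces} enabled; the class name \class{NSRecorder}
and the namespace URI are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NSRecorder(ContentHandler):
    """Hypothetical handler: records the (uri, localname) pairs it sees."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.started = []
        self.ended = []

    def startElementNS(self, name, qname, attrs):
        uri, localname = name   # uri is None for elements without a namespace
        self.started.append((uri, localname))

    def endElementNS(self, name, qname):
        self.ended.append(name)


parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)
ns_recorder = NSRecorder()
parser.setContentHandler(ns_recorder)
parser.parse(io.BytesIO(
    b'<d:doc xmlns:d="http://example.org/ns"><plain/></d:doc>'))
```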
 |  | 
 | \begin{methoddesc}[ContentHandler]{characters}{content} | 
 |   Receive notification of character data. | 
 |          | 
 |   The Parser will call this method to report each chunk of character | 
 |   data. SAX parsers may return all contiguous character data in a | 
 |   single chunk, or they may split it into several chunks; however, all | 
 |   of the characters in any single event must come from the same | 
 |   external entity so that the Locator provides useful information. | 
 |  | 
 |   \var{content} may be a Unicode string or a byte string; the | 
 |   \code{expat} reader module always produces Unicode strings. | 
 |  | 
 |   \note{The earlier SAX 1 interface provided by the Python | 
 |   XML Special Interest Group used a more Java-like interface for this | 
 |   method.  Since most parsers used from Python did not take advantage | 
 |   of the older interface, the simpler signature was chosen to replace | 
 |   it.  To convert old code to the new interface, use \var{content} | 
 |   instead of slicing content with the old \var{offset} and | 
 |   \var{length} parameters.} | 
 | \end{methoddesc} | 
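Because a parser may deliver contiguous text in several chunks, a
handler should not assume that one \method{characters()} call carries
the complete text of an element.  A minimal sketch that accumulates
chunks and joins them at the end tag follows; the class name
\class{TextCollector} and the \code{title} element are hypothetical.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical handler: collects the text content of <title>."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.chunks = []
        self.text = None

    def startElement(self, name, attrs):
        if name == "title":
            self.chunks = []

    def characters(self, content):
        # May be called once or several times for the same run of text.
        self.chunks.append(content)

    def endElement(self, name):
        if name == "title":
            self.text = "".join(self.chunks)


collector = TextCollector()
xml.sax.parseString(b"<title>Fish &amp; chips</title>", collector)
```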
 |  | 
 | \begin{methoddesc}[ContentHandler]{ignorableWhitespace}{whitespace} | 
 |   Receive notification of ignorable whitespace in element content. | 
 |          | 
 |   Validating parsers must use this method to report each chunk | 
 |   of ignorable whitespace (see the W3C XML 1.0 recommendation, | 
 |   section 2.10); non-validating parsers may also use this method | 
 |   if they are capable of parsing and using content models. | 
 |    | 
 |   SAX parsers may return all contiguous whitespace in a single | 
 |   chunk, or they may split it into several chunks; however, all | 
 |   of the characters in any single event must come from the same | 
 |   external entity, so that the Locator provides useful | 
 |   information. | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}[ContentHandler]{processingInstruction}{target, data} | 
 |   Receive notification of a processing instruction. | 
 |          | 
 |   The Parser will invoke this method once for each processing | 
 |   instruction found: note that processing instructions may occur | 
 |   before or after the main document element. | 
 |  | 
 |   A SAX parser should never report an XML declaration (XML 1.0, | 
 |   section 2.8) or a text declaration (XML 1.0, section 4.3.1) using | 
 |   this method. | 
 | \end{methoddesc} | 
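The sketch below collects processing instructions that appear before
and after the document element, and shows that the XML declaration is
not reported through this method.  The class name \class{PICollector}
and the instruction targets \code{render} and \code{done} are
hypothetical.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PICollector(ContentHandler):
    """Hypothetical handler: records every processing instruction."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.instructions = []

    def processingInstruction(self, target, data):
        self.instructions.append((target, data))


pi_collector = PICollector()
xml.sax.parseString(
    b'<?xml version="1.0"?><?render mode="fast"?><doc/><?done now?>',
    pi_collector)
```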
 |  | 
 | \begin{methoddesc}[ContentHandler]{skippedEntity}{name} | 
 |   Receive notification of a skipped entity. | 
 |          | 
 |   The Parser will invoke this method once for each entity | 
 |   skipped. Non-validating processors may skip entities if they have | 
 |   not seen the declarations (because, for example, the entity was | 
 |   declared in an external DTD subset). All processors may skip | 
 |   external entities, depending on the values of the | 
 |   \code{feature_external_ges} and the | 
 |   \code{feature_external_pes} features. | 
 | \end{methoddesc} | 
 |  | 
 |  | 
 | \subsection{DTDHandler Objects \label{dtd-handler-objects}} | 
 |  | 
 | \class{DTDHandler} instances provide the following methods: | 
 |  | 
 | \begin{methoddesc}[DTDHandler]{notationDecl}{name, publicId, systemId} | 
 |   Handle a notation declaration event. | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}[DTDHandler]{unparsedEntityDecl}{name, publicId, | 
 |                                                    systemId, ndata} | 
 |   Handle an unparsed entity declaration event. | 
 | \end{methoddesc} | 
 |  | 
 |  | 
 | \subsection{EntityResolver Objects \label{entity-resolver-objects}} | 
 |  | 
 | \begin{methoddesc}[EntityResolver]{resolveEntity}{publicId, systemId} | 
 |   Resolve the system identifier of an entity and return either the | 
 |   system identifier to read from as a string, or an | 
 |   \class{InputSource} object to read from. The default implementation | 
 |   returns \var{systemId}. | 
 | \end{methoddesc} | 
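As a sketch of the second return form, the resolver below answers
every request with an \class{InputSource} that wraps an empty byte
stream, so that no external resource is ever fetched.  The class name
\class{EmptyResolver} and the system identifier are hypothetical, and
the direct call at the end only illustrates the return value; in real
use the parser makes that call.

```python
import io
import xml.sax
from xml.sax.handler import EntityResolver
from xml.sax.xmlreader import InputSource


class EmptyResolver(EntityResolver):
    """Hypothetical resolver: substitutes empty content for every entity."""

    def __init__(self):
        self.requested = []

    def resolveEntity(self, publicId, systemId):
        self.requested.append((publicId, systemId))
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(b""))
        return source


resolver = EmptyResolver()
parser = xml.sax.make_parser()
parser.setEntityResolver(resolver)

# Direct call, shown only to illustrate what the parser would receive.
source = resolver.resolveEntity(None, "http://example.org/hypothetical.dtd")
```

Whether the parser consults the resolver at all depends on the
\code{feature_external_ges} and \code{feature_external_pes} settings.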
 |  | 
 |  | 
 | \subsection{ErrorHandler Objects \label{sax-error-handler}} | 
 |  | 
 | Objects with this interface are used to receive error and warning | 
 | information from the \class{XMLReader}.  If you create an object that | 
 | implements this interface and register it with your | 
 | \class{XMLReader}, the parser will call the methods in your object to | 
 | report all warnings and errors. There are three levels of errors | 
 | available: warnings, (possibly) recoverable errors, and unrecoverable | 
 | errors.  All methods take a \exception{SAXParseException} as the only | 
 | parameter.  Errors and warnings may be converted to an exception by | 
 | raising the passed-in exception object. | 
 |  | 
 | \begin{methoddesc}[ErrorHandler]{error}{exception} | 
 |   Called when the parser encounters a recoverable error.  If this method | 
 |   does not raise an exception, parsing may continue, but further document | 
 |   information should not be expected by the application.  Allowing the | 
 |   parser to continue may allow additional errors to be discovered in the | 
 |   input document. | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}[ErrorHandler]{fatalError}{exception} | 
 |   Called when the parser encounters an error it cannot recover from; | 
 |   parsing is expected to terminate when this method returns. | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}[ErrorHandler]{warning}{exception} | 
 |   Called when the parser presents minor warning information to the | 
 |   application.  Parsing is expected to continue when this method returns, | 
 |   and document information will continue to be passed to the application. | 
 |   Raising an exception in this method will cause parsing to end. | 
 | \end{methoddesc} |
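The three methods can be combined in a handler that records everything
it is told and converts only unrecoverable errors into exceptions.
This is a minimal sketch; the class name
\class{CollectingErrorHandler} and the malformed sample document are
hypothetical.

```python
import xml.sax
from xml.sax.handler import ContentHandler, ErrorHandler


class CollectingErrorHandler(ErrorHandler):
    """Hypothetical handler: stores reports, raises only on fatal errors."""

    def __init__(self):
        self.warnings = []
        self.errors = []
        self.fatal = []

    def warning(self, exception):
        self.warnings.append(exception)

    def error(self, exception):
        # Not raising lets the parser continue after a recoverable error.
        self.errors.append(exception)

    def fatalError(self, exception):
        self.fatal.append(exception)
        raise exception


problems = CollectingErrorHandler()
try:
    # The <item> element is never closed, so the document is not well-formed.
    xml.sax.parseString(b"<doc><item></doc>", ContentHandler(), problems)
    well_formed = True
except xml.sax.SAXParseException:
    well_formed = False
```

The stored \exception{SAXParseException} objects carry position
information, for example through \method{getLineNumber()}.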