:mod:`xml.sax` --- Support for SAX2 parsers
===========================================

.. module:: xml.sax
   :synopsis: Package containing SAX2 base classes and convenience functions.

.. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no>
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>

**Source code:** :source:`Lib/xml/sax/__init__.py`

--------------

The :mod:`xml.sax` package provides a number of modules which implement the
Simple API for XML (SAX) interface for Python. The package itself provides the
SAX exceptions and the convenience functions which will be most used by users of
the SAX API.


.. warning::

   The :mod:`xml.sax` module is not secure against maliciously
   constructed data. If you need to parse untrusted or unauthenticated data see
   :ref:`xml-vulnerabilities`.

.. versionchanged:: 3.8

   The SAX parser no longer processes general external entities by default
   to increase security. Before, the parser created network connections
   to fetch remote files or loaded local files from the file
   system for DTD and entities. The feature can be enabled again with method
   :meth:`~xml.sax.xmlreader.XMLReader.setFeature` on the parser object
   and argument :data:`~xml.sax.handler.feature_external_ges`.
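As a minimal sketch of the opt-in described above, the following reads the
feature state on a freshly created parser and then switches it on. It assumes
Python 3.8 or later and the bundled expat driver; the variable names are
illustrative only. Enabling the feature is only reasonable for trusted input.

```python
import xml.sax
from xml.sax.handler import feature_external_ges

parser = xml.sax.make_parser()

# On Python 3.8+ the bundled expat driver starts with the feature disabled.
default_state = parser.getFeature(feature_external_ges)

# Opt back in explicitly; do this only for documents you trust.
parser.setFeature(feature_external_ges, True)
enabled_state = parser.getFeature(feature_external_ges)

print(bool(default_state), bool(enabled_state))
```

Features have to be set before parsing starts, so the call belongs right after
``make_parser()``.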

The convenience functions are:


.. function:: make_parser(parser_list=[])

   Create and return a SAX :class:`~xml.sax.xmlreader.XMLReader` object. The
   first parser found will be used. If *parser_list* is provided, it must be an
   iterable of strings which name modules that have a function named
   :func:`create_parser`. Modules listed in *parser_list* will be used before
   modules in the default list of parsers.

   .. versionchanged:: 3.8
      The *parser_list* argument can be any iterable, not just a list.
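A short sketch of both call styles follows. It assumes the standard expat
driver module ``xml.sax.expatreader`` is available, which is the case in a
regular CPython build; the variable names are illustrative.

```python
import xml.sax
from xml.sax.xmlreader import XMLReader

# No arguments: the first usable parser from the default list is returned.
default_parser = xml.sax.make_parser()

# Explicit list: the named modules are tried before the default list.
explicit_parser = xml.sax.make_parser(["xml.sax.expatreader"])

print(isinstance(default_parser, XMLReader))
print(isinstance(explicit_parser, XMLReader))
```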


.. function:: parse(filename_or_stream, handler, error_handler=handler.ErrorHandler())

   Create a SAX parser and use it to parse a document. The document, passed in as
   *filename_or_stream*, can be a filename or a file object. The *handler*
   parameter needs to be a SAX :class:`~handler.ContentHandler` instance. If
   *error_handler* is given, it must be a SAX :class:`~handler.ErrorHandler`
   instance; if omitted, :exc:`SAXParseException` will be raised on all errors.
   There is no return value; all work must be done by the *handler* passed in.
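Because ``parse()`` returns nothing, results have to be collected on the
handler. The sketch below illustrates this with a hypothetical
``TitleCollector`` handler and an in-memory file object; the class name and the
sample document are made up for the example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class TitleCollector(ContentHandler):
    """Collect the text content of every <title> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._chunks = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._chunks = []

    def characters(self, content):
        # Text may arrive in several pieces, so buffer it until the end tag.
        if self._in_title:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._chunks))
            self._in_title = False


document = io.BytesIO(
    b"<library><book><title>Dune</title></book>"
    b"<book><title>Emma</title></book></library>"
)
handler = TitleCollector()
xml.sax.parse(document, handler)
print(handler.titles)  # ['Dune', 'Emma']
```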


.. function:: parseString(string, handler, error_handler=handler.ErrorHandler())

   Similar to :func:`parse`, but parses from a buffer *string* received as a
   parameter. *string* must be a :class:`str` instance or a
   :term:`bytes-like object`.

   .. versionchanged:: 3.5
      Added support of :class:`str` instances.
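The following sketch feeds the same document to ``parseString()`` once as
:class:`str` and once as :class:`bytes`. ``ElementCounter`` is a hypothetical
handler written for this example.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Count the start tags seen in a document."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


text_handler = ElementCounter()
xml.sax.parseString("<root><a/><b/></root>", text_handler)

bytes_handler = ElementCounter()
xml.sax.parseString(b"<root><a/><b/></root>", bytes_handler)

print(text_handler.count, bytes_handler.count)  # 3 3
```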

A typical SAX application uses three kinds of objects: readers, handlers and
input sources. "Reader" in this context is another term for parser, i.e. some
piece of code that reads the bytes or characters from the input source, and
produces a sequence of events. The events then get distributed to the handler
objects, i.e. the reader invokes a method on the handler. A SAX application
must therefore obtain a reader object, create or open the input sources, create
the handlers, and connect these objects all together. As the final step of
preparation, the reader is called to parse the input. During parsing, methods on
the handler objects are called based on structural and syntactic events from the
input data.
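The steps above can be sketched without the convenience functions, wiring a
reader, an input source and a handler together by hand. ``EventLogger`` and the
sample document are hypothetical names chosen for the illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class EventLogger(ContentHandler):
    """Record the element events delivered by the reader."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


# 1. Obtain a reader.
reader = xml.sax.make_parser()

# 2. Create the input source.
source = InputSource()
source.setByteStream(io.BytesIO(b"<doc><item/></doc>"))

# 3. Create the handler and connect it to the reader.
handler = EventLogger()
reader.setContentHandler(handler)

# 4. Ask the reader to parse; it calls back into the handler.
reader.parse(source)
print(handler.events)
```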

For these objects, only the interfaces are relevant; they are normally not
instantiated by the application itself. Since Python does not have an explicit
notion of interface, they are formally introduced as classes, but applications
may use implementations which do not inherit from the provided classes. The
:class:`~xml.sax.xmlreader.InputSource`, :class:`~xml.sax.xmlreader.Locator`,
:class:`~xml.sax.xmlreader.Attributes`, :class:`~xml.sax.xmlreader.AttributesNS`,
and :class:`~xml.sax.xmlreader.XMLReader` interfaces are defined in the
module :mod:`xml.sax.xmlreader`. The handler interfaces are defined in
:mod:`xml.sax.handler`. For convenience,
:class:`~xml.sax.xmlreader.InputSource` (which is often
instantiated directly) and the handler classes are also available from
:mod:`xml.sax`. These interfaces are described below.

In addition to these classes, :mod:`xml.sax` provides the following exception
classes.


.. exception:: SAXException(msg, exception=None)

   Encapsulate an XML error or warning. This class can contain basic error or
   warning information from either the XML parser or the application: it can be
   subclassed to provide additional functionality or to add localization. Note
   that although the handlers defined in the
   :class:`~xml.sax.handler.ErrorHandler` interface
   receive instances of this exception, it is not required to actually raise the
   exception; it is also useful as a container for information.

   When instantiated, *msg* should be a human-readable description of the error.
   The optional *exception* parameter, if given, should be ``None`` or an exception
   that was caught by the parsing code and is being passed along as information.

   This is the base class for the other SAX exception classes.


.. exception:: SAXParseException(msg, exception, locator)

   Subclass of :exc:`SAXException` raised on parse errors. Instances of this
   class are passed to the methods of the SAX
   :class:`~xml.sax.handler.ErrorHandler` interface to provide information
   about the parse error. This class supports the SAX
   :class:`~xml.sax.xmlreader.Locator` interface as well as the
   :class:`SAXException` interface.
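Since a ``SAXParseException`` also supports the locator interface, the position
of the error can be read from the exception itself. The sketch below parses a
deliberately malformed document with the default error handler; the sample
input is made up for the example.

```python
import xml.sax
from xml.sax.handler import ContentHandler

caught = None
try:
    # <unclosed> is never closed, so the document is not well-formed.
    xml.sax.parseString(b"<root><unclosed></root>", ContentHandler())
except xml.sax.SAXParseException as err:
    caught = err

if caught is not None:
    # Locator-style accessors report where the parser gave up.
    print("line:", caught.getLineNumber())
    print("column:", caught.getColumnNumber())
    # SAXException-style accessor reports what went wrong.
    print("message:", caught.getMessage())
```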


.. exception:: SAXNotRecognizedException(msg, exception=None)

   Subclass of :exc:`SAXException` raised when a SAX
   :class:`~xml.sax.xmlreader.XMLReader` is
   confronted with an unrecognized feature or property. SAX applications and
   extensions may use this class for similar purposes.


.. exception:: SAXNotSupportedException(msg, exception=None)

   Subclass of :exc:`SAXException` raised when a SAX
   :class:`~xml.sax.xmlreader.XMLReader` is asked to
   enable a feature that is not supported, or to set a property to a value that the
   implementation does not support. SAX applications and extensions may use this
   class for similar purposes.
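The difference between the two exceptions can be illustrated with the bundled
expat driver. The sketch assumes that driver is the one returned by
``make_parser()``; other drivers may react differently. The feature URI in the
first call is a made-up name that no parser is expected to know.

```python
import xml.sax
from xml.sax.handler import feature_validation

parser = xml.sax.make_parser()

unrecognized = None
try:
    # A feature name the reader has never heard of.
    parser.getFeature("http://example.invalid/features/unknown")
except xml.sax.SAXNotRecognizedException as err:
    unrecognized = err.getMessage()

unsupported = None
try:
    # A known feature that the expat driver cannot turn on.
    parser.setFeature(feature_validation, True)
except xml.sax.SAXNotSupportedException as err:
    unsupported = err.getMessage()

print(unrecognized)
print(unsupported)
```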


.. seealso::

   `SAX: The Simple API for XML <http://www.saxproject.org/>`_
      This site is the focal point for the definition of the SAX API. It provides a
      Java implementation and online documentation. Links to implementations and
      historical information are also available.

   Module :mod:`xml.sax.handler`
      Definitions of the interfaces for application-provided objects.

   Module :mod:`xml.sax.saxutils`
      Convenience functions for use in SAX applications.

   Module :mod:`xml.sax.xmlreader`
      Definitions of the interfaces for parser-provided objects.


.. _sax-exception-objects:

SAXException Objects
--------------------

The :class:`SAXException` exception class supports the following methods:


.. method:: SAXException.getMessage()

   Return a human-readable message describing the error condition.


.. method:: SAXException.getException()

   Return an encapsulated exception object, or ``None``.
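A minimal sketch of both accessors, using a ``SAXException`` built by
application code around a caught :exc:`ValueError`. The messages are
illustrative.

```python
from xml.sax import SAXException

try:
    int("not a number")
except ValueError as original:
    # Pass the caught exception along as extra information.
    wrapped = SAXException("could not convert attribute value", original)

message = wrapped.getMessage()
cause = wrapped.getException()

# Without an encapsulated exception, getException() returns None.
plain = SAXException("standalone message").getException()

print(message)
print(type(cause).__name__)
print(plain)
```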