:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================

.. module:: xml.etree.ElementTree
   :synopsis: Implementation of the ElementTree API.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>

**Source code:** :source:`Lib/xml/etree/ElementTree.py`

--------------

The :class:`Element` type is a flexible container object, designed to store
hierarchical data structures in memory. The type can be described as a cross
between a list and a dictionary.

Each element has a number of properties associated with it:

* a tag, which is a string identifying what kind of data this element
  represents (the element type, in other words).

* a number of attributes, stored in a Python dictionary.

* a text string.

* an optional tail string.

* a number of child elements, stored in a Python sequence.

To create an element instance, use the :class:`Element` constructor or the
:func:`SubElement` factory function.

The :class:`ElementTree` class can be used to wrap an element structure, and
convert it from and to XML.
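
As a minimal sketch of how these pieces fit together, the following builds a small tree by hand and serializes it. The tag names, the ``id`` attribute and the text values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Root element: a tag plus (initially empty) attributes and children.
catalog = ET.Element("catalog")

# SubElement creates a child and appends it to the parent in one step.
book = ET.SubElement(catalog, "book", id="b1")
book.text = "Dune"          # text between <book> and </book>
book.tail = " and more"     # text after </book>, before the next tag

# ElementTree wraps the root and adds document-level helpers.
tree = ET.ElementTree(catalog)
serialized = ET.tostring(tree.getroot(), encoding="unicode")
print(serialized)
```

The element behaves like a list for its children (``catalog[0]`` is ``book``) and like a dictionary for its attributes (``book.get("id")``).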

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs.

.. versionchanged:: 3.2
   The ElementTree API is updated to 1.3. For more information, see
   `Introducing ElementTree 1.3
   <http://effbot.org/zone/elementtree-13-intro.htm>`_.

.. versionchanged:: 3.3
   This module will use a fast implementation whenever available.
   The :mod:`xml.etree.cElementTree` module is deprecated.


.. _elementtree-functions:

Functions
---------


.. function:: Comment(text=None)

   Comment element factory. This factory function creates a special element
   that will be serialized as an XML comment by the standard serializer. The
   comment string can be either a bytestring or a Unicode string. *text* is a
   string containing the comment string. Returns an element instance
   representing a comment.

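A short illustrative sketch of :func:`Comment`, with invented element names and comment text: the comment element is appended like any other child and is written as ``<!--...-->`` by the standard serializer.

```python
import xml.etree.ElementTree as ET

root = ET.Element("config")
# Comment() returns a special element; append it like any other child.
root.append(ET.Comment("generated file, do not edit"))
ET.SubElement(root, "option", name="debug")

comment_xml = ET.tostring(root, encoding="unicode")
print(comment_xml)
```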

.. function:: dump(elem)

   Writes an element tree or element structure to ``sys.stdout``. This function
   should be used for debugging only.

   The exact output format is implementation dependent. In this version, it's
   written as an ordinary XML file.

   *elem* is an element tree or an individual element.


.. function:: fromstring(text)

   Parses an XML section from a string constant. Same as :func:`XML`. *text*
   is a string containing XML data. Returns an :class:`Element` instance.

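For instance, a one-element document (the ``greeting`` tag and its attribute are made up) can be parsed and inspected like this:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<greeting lang='en'>hello</greeting>")

print(root.tag)          # tag name of the root element
print(root.get("lang"))  # attribute lookup
print(root.text)         # text between the start and end tags
```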

.. function:: fromstringlist(sequence, parser=None)

   Parses an XML document from a sequence of string fragments. *sequence* is a
   list or other sequence containing XML data fragments. *parser* is an
   optional parser instance. If not given, the standard :class:`XMLParser`
   parser is used. Returns an :class:`Element` instance.

   .. versionadded:: 3.2

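The fragments do not have to align with tag boundaries. In this sketch (sample data invented for illustration) the first ``<item>`` start tag is split across two fragments:

```python
import xml.etree.ElementTree as ET

# The document arrives in arbitrary chunks, e.g. from a network read loop.
fragments = ["<root><it", "em>one</item>", "<item>two</item></root>"]

root = ET.fromstringlist(fragments)
texts = [item.text for item in root]
print(texts)
```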

.. function:: iselement(element)

   Checks if an object appears to be a valid element object. *element* is an
   element instance. Returns a true value if this is an element object.


.. function:: iterparse(source, events=None, parser=None)

   Parses an XML section into an element tree incrementally, and reports what's
   going on to the user. *source* is a filename or :term:`file object` containing
   XML data. *events* is a list of events to report back. If omitted, only "end"
   events are reported. *parser* is an optional parser instance. If not
   given, the standard :class:`XMLParser` parser is used. Returns an
   :term:`iterator` providing ``(event, elem)`` pairs.

   .. note::

      :func:`iterparse` only guarantees that it has seen the ">"
      character of a starting tag when it emits a "start" event, so the
      attributes are defined, but the contents of the text and tail attributes
      are undefined at that point. The same applies to the element children;
      they may or may not be present.

      If you need a fully populated element, look for "end" events instead.

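A minimal sketch of incremental parsing, using an in-memory ``io.BytesIO`` as a stand-in for a real file (the ``log``/``entry`` document is invented). Text is read only on "end" events, as the note above recommends:

```python
import io
import xml.etree.ElementTree as ET

source = io.BytesIO(b"<log><entry>a</entry><entry>b</entry></log>")

seen = []
for event, elem in ET.iterparse(source, events=("start", "end")):
    # At "start" only the tag and attributes are reliable;
    # wait for "end" before reading text or children.
    if event == "end" and elem.tag == "entry":
        seen.append(elem.text)
        elem.clear()  # drop processed content to keep memory use low

print(seen)
```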

.. function:: parse(source, parser=None)

   Parses an XML section into an element tree. *source* is a filename or file
   object containing XML data. *parser* is an optional parser instance. If
   not given, the standard :class:`XMLParser` parser is used. Returns an
   :class:`ElementTree` instance.

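Unlike :func:`fromstring`, :func:`parse` returns the :class:`ElementTree` wrapper rather than the root element. A small sketch, again using an in-memory file object with invented content in place of a file on disk:

```python
import io
import xml.etree.ElementTree as ET

xml_file = io.BytesIO(b"<?xml version='1.0'?><inventory><item/></inventory>")

tree = ET.parse(xml_file)   # an ElementTree instance
root = tree.getroot()       # the root Element inside it
print(root.tag, len(root))
```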

.. function:: ProcessingInstruction(target, text=None)

   PI element factory. This factory function creates a special element that
   will be serialized as an XML processing instruction. *target* is a string
   containing the PI target. *text* is a string containing the PI contents, if
   given. Returns an element instance, representing a processing instruction.

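As an illustration, a stylesheet processing instruction can be attached to a tree as shown below. The ``doc`` element and the ``style.xsl`` file name are placeholders.

```python
import xml.etree.ElementTree as ET

root = ET.Element("doc")
pi = ET.ProcessingInstruction("xml-stylesheet",
                              'type="text/xsl" href="style.xsl"')
root.append(pi)

pi_xml = ET.tostring(root, encoding="unicode")
print(pi_xml)
```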

.. function:: register_namespace(prefix, uri)

   Registers a namespace prefix. The registry is global, and any existing
   mapping for either the given prefix or the namespace URI will be removed.
   *prefix* is a namespace prefix. *uri* is a namespace URI. Tags and
   attributes in this namespace will be serialized with the given prefix, if at
   all possible.

   .. versionadded:: 3.2

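A sketch of the effect on serialization. The ``ex`` prefix and the ``http://example.org/ns`` URI are hypothetical. Because the registry is global, the call affects every later serialization in the same process.

```python
import xml.etree.ElementTree as ET

# Without this call the serializer would pick a generated prefix such as ns0.
ET.register_namespace("ex", "http://example.org/ns")

elem = ET.Element("{http://example.org/ns}item")
ns_xml = ET.tostring(elem, encoding="unicode")
print(ns_xml)
```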

.. function:: SubElement(parent, tag, attrib={}, **extra)

   Subelement factory. This function creates an element instance, and appends
   it to an existing element.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *parent* is the parent element. *tag* is
   the subelement name. *attrib* is an optional dictionary, containing element
   attributes. *extra* contains additional attributes, given as keyword
   arguments. Returns an element instance.

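The *attrib* dictionary and the keyword arguments are merged. The dictionary form is useful for attribute names that are not valid keyword arguments, such as ``class``. A brief sketch with invented HTML-like names:

```python
import xml.etree.ElementTree as ET

parent = ET.Element("ul")
# "class" is a Python keyword, so it has to go through the dictionary.
item = ET.SubElement(parent, "li", {"class": "first"}, id="i1")

print(item.attrib)
print(parent[0] is item)
```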

.. function:: tostring(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string. *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an (optionally)
   encoded string containing the XML data.

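The difference between the default encoding and ``encoding="unicode"`` shows up as soon as the content contains non-ASCII characters. In this sketch (sample text invented), the default produces bytes with a character reference, while ``"unicode"`` produces a ``str``:

```python
import xml.etree.ElementTree as ET

elem = ET.Element("note")
elem.text = "caf\u00e9"

as_bytes = ET.tostring(elem)                     # bytes, US-ASCII
as_text = ET.tostring(elem, encoding="unicode")  # str, characters kept as-is

print(as_bytes)
print(as_text)
```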

.. function:: tostringlist(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string. *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns a list of
   (optionally) encoded strings containing the XML data. It does not guarantee
   any specific sequence, except that ``b"".join(tostringlist(element)) ==
   tostring(element)``.

   .. versionadded:: 3.2

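The join invariant can be checked directly. With the default encoding the chunks are ``bytes``, so they are joined with ``b""``; with ``encoding="unicode"`` they are ``str``. The sample document is invented.

```python
import xml.etree.ElementTree as ET

elem = ET.fromstring("<a><b>text</b><c/></a>")

byte_chunks = ET.tostringlist(elem)                      # list of bytes
text_chunks = ET.tostringlist(elem, encoding="unicode")  # list of str

joined_bytes = b"".join(byte_chunks)
joined_text = "".join(text_chunks)
print(joined_text)
```

How the output is split into chunks is an implementation detail; only the joined result is meaningful.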

.. function:: XML(text, parser=None)

   Parses an XML section from a string constant. This function can be used to
   embed "XML literals" in Python code. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns an :class:`Element` instance.


.. function:: XMLID(text, parser=None)

   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element IDs to elements. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns a tuple containing an
   :class:`Element` instance and a dictionary.

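A sketch of :func:`XMLID` on an invented page fragment. The dictionary is keyed by the value of each element's ``id`` attribute; elements without one are simply left out.

```python
import xml.etree.ElementTree as ET

root, ids = ET.XMLID(
    '<page><h1 id="title">Hi</h1><p id="body">text</p><hr/></page>'
)

print(sorted(ids))        # the id values found in the document
print(ids["title"].text)  # direct access to the element with id="title"
```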

.. _elementtree-element-objects:

Element Objects
---------------


.. class:: Element(tag, attrib={}, **extra)

   Element class. This class defines the Element interface, and provides a
   reference implementation of this interface.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *tag* is the element name. *attrib* is
   an optional dictionary, containing element attributes. *extra* contains
   additional attributes, given as keyword arguments.


   .. attribute:: tag

      A string identifying what kind of data this element represents (the
      element type, in other words).


   .. attribute:: text

      The *text* attribute can be used to hold additional data associated with
      the element. As the name implies, this attribute is usually a string but
      may be any application-specific object. If the element is created from
      an XML file, the attribute will contain any text found between the element
      tags.


   .. attribute:: tail

      The *tail* attribute can be used to hold additional data associated with
      the element. This attribute is usually a string but may be any
      application-specific object. If the element is created from an XML file,
      the attribute will contain any text found after the element's end tag and
      before the next tag.


   .. attribute:: attrib

      A dictionary containing the element's attributes. Note that while the
      *attrib* value is always a real mutable Python dictionary, an ElementTree
      implementation may choose to use another internal representation, and
      create the dictionary only if someone asks for it. To take advantage of
      such implementations, use the dictionary methods below whenever possible.

   The following dictionary-like methods work on the element attributes.


   .. method:: clear()

      Resets an element. This function removes all subelements, clears all
      attributes, and sets the text and tail attributes to ``None``.


   .. method:: get(key, default=None)

      Gets the element attribute named *key*.

      Returns the attribute value, or *default* if the attribute was not found.


   .. method:: items()

      Returns the element attributes as a sequence of (name, value) pairs. The
      attributes are returned in an arbitrary order.


   .. method:: keys()

      Returns the element's attribute names as a list. The names are returned
      in an arbitrary order.


   .. method:: set(key, value)

      Set the attribute *key* on the element to *value*.

   The following methods work on the element's children (subelements).


   .. method:: append(subelement)

      Adds the element *subelement* to the end of this element's internal list
      of subelements.


   .. method:: extend(subelements)

      Appends *subelements* from a sequence object with zero or more elements.
      Raises :exc:`AssertionError` if a subelement is not a valid object.

      .. versionadded:: 3.2


   .. method:: find(match)

      Finds the first subelement matching *match*. *match* may be a tag name
      or path. Returns an element instance or ``None``.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Returns a list
      containing all matching elements in document order.


   .. method:: findtext(match, default=None)

      Finds text for the first subelement matching *match*. *match* may be
      a tag name or path. Returns the text content of the first matching
      element, or *default* if no element was found. Note that if the matching
      element has no text content, an empty string is returned.


   .. method:: getchildren()

      .. deprecated:: 3.2
         Use ``list(elem)`` or iteration.


   .. method:: getiterator(tag=None)

      .. deprecated:: 3.2
         Use method :meth:`Element.iter` instead.


   .. method:: insert(index, element)

      Inserts a subelement at the given position in this element.


   .. method:: iter(tag=None)

      Creates a tree :term:`iterator` with the current element as the root.
      The iterator iterates over this element and all elements below it, in
      document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
      elements whose tag equals *tag* are returned from the iterator. If the
      tree structure is modified during iteration, the result is undefined.

      .. versionadded:: 3.2


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Returns an iterable
      yielding all matching elements in document order.

      .. versionadded:: 3.2


   .. method:: itertext()

      Creates a text iterator. The iterator loops over this element and all
      subelements, in document order, and returns all inner text.

      .. versionadded:: 3.2


   .. method:: makeelement(tag, attrib)

      Creates a new element object of the same type as this element. Do not
      call this method; use the :func:`SubElement` factory function instead.


   .. method:: remove(subelement)

      Removes *subelement* from the element. Unlike the find\* methods, this
      method compares elements based on the instance identity, not on tag value
      or contents.

   :class:`Element` objects also support the following sequence type methods
   for working with subelements: :meth:`__delitem__`, :meth:`__getitem__`,
   :meth:`__setitem__`, :meth:`__len__`.

   Caution: Elements with no subelements will test as ``False``. This behavior
   will change in future versions. Use a specific ``len(elem)`` or ``elem is
   None`` test instead. ::

     element = root.find('foo')

     if not element:  # careful!
         print("element not found, or element has no subelements")

     if element is None:
         print("element not found")

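The following sketch exercises the attribute, search, iteration and sequence methods described in this section on a small invented document. The ``library``/``book`` structure is an assumption made for the example only.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<library>"
    "<book year='1965'><title>Dune</title></book>"
    "<book year='1984'><title>Neuromancer</title><note/></book>"
    "</library>"
)

# Dictionary-like attribute access.
first = root.find("book")
first.set("lang", "en")
year = first.get("year")
missing = first.get("isbn", "n/a")    # default when the attribute is absent
names = sorted(first.keys())

# Searching by tag name or path.
titles = [t.text for t in root.findall("book/title")]
empty_note = root.findtext("book/note")                    # found, no text
absent = root.findtext("book/price", default="unknown")    # not found

# Iteration in document order.
tags = [e.tag for e in root.iter()]
all_text = "".join(root.itertext())

# Sequence behaviour and identity-based removal.
second = root[1]
root.remove(first)
remaining = len(root)
```

Note the explicit ``len(root)`` instead of a truth test, in line with the caution above.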

.. _elementtree-elementtree-objects:

ElementTree Objects
-------------------


.. class:: ElementTree(element=None, file=None)

   ElementTree wrapper class. This class represents an entire element
   hierarchy, and adds some extra support for serialization to and from
   standard XML.

   *element* is the root element. The tree is initialized with the contents
   of the XML *file* if given.


   .. method:: _setroot(element)

      Replaces the root element for this tree. This discards the current
      contents of the tree, and replaces it with the given element. Use with
      care. *element* is an element instance.


   .. method:: find(match)

      Finds the first toplevel element matching *match*. *match* may be a tag
      name or path. Same as ``getroot().find(match)``. Returns the first matching
      element, or ``None`` if no element was found.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().findall(match)``. *match* may be a tag name or path. Returns a
      list containing all matching elements, in document order.


   .. method:: findtext(match, default=None)

      Finds the element text for the first toplevel element with given tag.
      Same as ``getroot().findtext(match)``. *match* may be a tag name or path.
      *default* is the value to return if the element was not found. Returns
      the text content of the first matching element, or the default value if no
      element was found. Note that if the element is found, but has no text
      content, this method returns an empty string.


   .. method:: getiterator(tag=None)

      .. deprecated:: 3.2
         Use method :meth:`ElementTree.iter` instead.


   .. method:: getroot()

      Returns the root element for this tree.


   .. method:: iter(tag=None)

      Creates and returns a tree iterator for the root element. The iterator
      loops over all elements in this tree, in document order. *tag* is the tag
      to look for (default is to return all elements).


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().iterfind(match)``. Returns an iterable yielding all matching
      elements in document order.

      .. versionadded:: 3.2


   .. method:: parse(source, parser=None)

      Loads an external XML section into this element tree. *source* is a file
      name or :term:`file object`. *parser* is an optional parser instance.
      If not given, the standard :class:`XMLParser` parser is used. Returns the
      section root element.


   .. method:: write(file, encoding="us-ascii", xml_declaration=None, method="xml")

      Writes the element tree to a file, as XML. *file* is a file name, or a
      :term:`file object` opened for writing. *encoding* [1]_ is the output encoding
      (default is US-ASCII). Use ``encoding="unicode"`` to write a Unicode string.
      *xml_declaration* controls if an XML declaration
      should be added to the file. Use ``False`` for never, ``True`` for always,
      ``None`` for only if not US-ASCII or UTF-8 or Unicode (default is ``None``).
      *method* is either ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``).

This is the XML file that is going to be manipulated::

    <html>
        <head>
            <title>Example page</title>
        </head>
        <body>
            <p>Moved to <a href="http://example.org/">example.org</a>
            or <a href="http://example.com/">example.com</a>.</p>
        </body>
    </html>

Example of changing the attribute "target" of every link in the first paragraph::

    >>> from xml.etree.ElementTree import ElementTree
    >>> tree = ElementTree()
    >>> tree.parse("index.xhtml")
    <Element 'html' at 0xb77e6fac>
    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
    >>> p
    <Element 'p' at 0xb77ec26c>
    >>> links = list(p.iter("a"))   # Returns list of all links
    >>> links
    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
    >>> for i in links:             # Iterates through all found links
    ...     i.attrib["target"] = "blank"
    >>> tree.write("output.xhtml")
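
The *encoding* and *xml_declaration* arguments of :meth:`ElementTree.write` can be tried without touching the file system. This sketch writes to in-memory buffers, which stand in for real files, and uses an invented document:

```python
import io
import xml.etree.ElementTree as ET

tree = ET.ElementTree(ET.fromstring("<html><body><p>hi</p></body></html>"))

# Byte output with an explicit XML declaration.
byte_buffer = io.BytesIO()
tree.write(byte_buffer, encoding="utf-8", xml_declaration=True)
data = byte_buffer.getvalue()

# Text output: encoding="unicode" needs a text-mode file object.
text_buffer = io.StringIO()
tree.write(text_buffer, encoding="unicode")
text = text_buffer.getvalue()

print(data)
print(text)
```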

.. _elementtree-qname-objects:

QName Objects
-------------


.. class:: QName(text_or_uri, tag=None)

   QName wrapper. This can be used to wrap a QName attribute value, in order
   to get proper namespace handling on output. *text_or_uri* is a string
   containing the QName value, in the form ``{uri}local``, or, if the tag argument
   is given, the URI part of a QName. If *tag* is given, the first argument is
   interpreted as a URI, and this argument is interpreted as a local name.
   :class:`QName` instances are opaque.

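A sketch of the attribute-value use case, with a hypothetical namespace URI and invented names. Wrapping the value in :class:`QName` lets the serializer declare the namespace and write the value with a prefix. The prefix itself is chosen by the serializer unless one was registered with :func:`register_namespace`.

```python
import xml.etree.ElementTree as ET

# Two-argument form: URI plus local name.
qname = ET.QName("http://example.org/types", "item")

elem = ET.Element("ref")
elem.set("type", qname)   # attribute value that is itself a qualified name

qname_xml = ET.tostring(elem, encoding="unicode")
print(qname.text)   # the {uri}local form
print(qname_xml)
```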

.. _elementtree-treebuilder-objects:

TreeBuilder Objects
-------------------


.. class:: TreeBuilder(element_factory=None)

   Generic element structure builder. This builder converts a sequence of
   start, data, and end method calls to a well-formed element structure. You
   can use this class to build an element structure using a custom XML parser,
   or a parser for some other XML-like format. The *element_factory* is called
   to create new :class:`Element` instances when given.


   .. method:: close()

      Flushes the builder buffers, and returns the toplevel document
      element. Returns an :class:`Element` instance.


   .. method:: data(data)

      Adds text to the current element. *data* is a string. This should be
      either a bytestring, or a Unicode string.


   .. method:: end(tag)

      Closes the current element. *tag* is the element name. Returns the
      closed element.


   .. method:: start(tag, attrs)

      Opens a new element. *tag* is the element name. *attrs* is a dictionary
      containing element attributes. Returns the opened element.


   In addition, a custom :class:`TreeBuilder` object can provide the
   following method:

   .. method:: doctype(name, pubid, system)

      Handles a doctype declaration. *name* is the doctype name. *pubid* is
      the public identifier. *system* is the system identifier. This method
      does not exist on the default :class:`TreeBuilder` class.

      .. versionadded:: 3.2

574
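
As a minimal sketch of the call sequence described above, a
:class:`TreeBuilder` can be driven by hand, without any parser.  The tag
names, the attribute, and the text below are made up for illustration:

```python
from xml.etree.ElementTree import TreeBuilder, tostring

builder = TreeBuilder()

# Open <doc title="Example">, add one child with text, then close both.
builder.start("doc", {"title": "Example"})
builder.start("item", {})
builder.data("first ")
builder.data("entry")    # a second data() call before end() adds more text
builder.end("item")
builder.end("doc")

root = builder.close()   # the toplevel Element
print(tostring(root, encoding="unicode"))
```

Consecutive ``data()`` calls made before ``end()`` are joined into a single
``text`` value on the element.
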
.. _elementtree-xmlparser-objects:

XMLParser Objects
-----------------


.. class:: XMLParser(html=0, target=None, encoding=None)

   :class:`Element` structure builder for XML source data, based on the expat
   parser.  *html* is a flag for predefined HTML entities; it is not supported
   by the current implementation.  *target* is the target object.  If omitted,
   the builder uses an instance of the standard :class:`TreeBuilder` class.
   *encoding* [1]_ is optional.  If given, the value overrides the encoding
   specified in the XML file.


   .. method:: close()

      Finishes feeding data to the parser.  Returns the result of calling the
      :meth:`close` method of *target*; with the default target, this is the
      toplevel document element.


   .. method:: doctype(name, pubid, system)

      .. deprecated:: 3.2
         Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
         target.


   .. method:: feed(data)

      Feeds data to the parser.  *data* is encoded data.

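
A minimal sketch of the :meth:`feed` and :meth:`close` cycle with the default
target.  The document and the chunk boundaries are invented for illustration,
and the input is passed as ``str`` for brevity:

```python
from xml.etree.ElementTree import XMLParser

parser = XMLParser()     # no target given, so a TreeBuilder is used

# The chunk boundaries are arbitrary; they do not have to align with tags.
for chunk in ("<root><chi", "ld>te", "xt</child></root>"):
    parser.feed(chunk)

root = parser.close()    # the toplevel Element built by the default target
print(root.tag, root[0].tag, root[0].text)
```

Feeding in pieces like this is useful when the source arrives incrementally,
for example when it is read from a socket or from a large file.
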
:meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag, its :meth:`end` method for each closing tag,
and its :meth:`data` method for character data.  :meth:`XMLParser.close`
calls *target*\'s :meth:`close` method.
:class:`XMLParser` can therefore be used for more than building a tree
structure.  This is an example of counting the maximum depth of an XML file::

   >>> from xml.etree.ElementTree import XMLParser
   >>> class MaxDepth:                     # The target object of the parser
   ...     maxDepth = 0
   ...     depth = 0
   ...     def start(self, tag, attrib):   # Called for each opening tag.
   ...         self.depth += 1
   ...         if self.depth > self.maxDepth:
   ...             self.maxDepth = self.depth
   ...     def end(self, tag):             # Called for each closing tag.
   ...         self.depth -= 1
   ...     def data(self, data):
   ...         pass            # We do not need to do anything with data.
   ...     def close(self):    # Called when all data has been parsed.
   ...         return self.maxDepth
   ...
   >>> target = MaxDepth()
   >>> parser = XMLParser(target=target)
   >>> exampleXml = """
   ... <a>
   ...   <b>
   ...   </b>
   ...   <b>
   ...     <c>
   ...       <d>
   ...       </d>
   ...     </c>
   ...   </b>
   ... </a>"""
   >>> parser.feed(exampleXml)
   >>> parser.close()
   4

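
In the same spirit, a target can define the optional ``doctype`` method to
receive the doctype declaration.  The sketch below is an illustration only:
``DoctypeRecorder``, the ``note`` document, and ``note.dtd`` are hypothetical
names, and the target ignores everything except the declaration:

```python
from xml.etree.ElementTree import XMLParser

class DoctypeRecorder:
    """Hypothetical target that only records the doctype declaration."""
    doctype_info = None

    def doctype(self, name, pubid, system):
        # Called once, when the parser has seen the doctype declaration.
        self.doctype_info = (name, pubid, system)

    def start(self, tag, attrib):
        pass                  # Elements are ignored in this sketch.

    def end(self, tag):
        pass

    def data(self, data):
        pass

    def close(self):
        return self.doctype_info

parser = XMLParser(target=DoctypeRecorder())
parser.feed('<!DOCTYPE note SYSTEM "note.dtd"><note>hi</note>')
info = parser.close()         # whatever the target's close() returns
name, pubid, system = info
print(name, system)
```

Because the declaration uses ``SYSTEM`` rather than ``PUBLIC``, there is no
public identifier to report for this document.
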

.. rubric:: Footnotes

.. [1] The encoding string included in XML output should conform to the
   appropriate standards.  For example, "UTF-8" is valid, but "UTF8" is
   not.  See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and http://www.iana.org/assignments/character-sets.