:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================

.. module:: xml.etree.ElementTree
   :synopsis: Implementation of the ElementTree API.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>

**Source code:** :source:`Lib/xml/etree/ElementTree.py`

--------------

The :class:`Element` type is a flexible container object, designed to store
hierarchical data structures in memory. The type can be described as a cross
between a list and a dictionary.

Each element has a number of properties associated with it:

* a tag which is a string identifying what kind of data this element represents
  (the element type, in other words).

* a number of attributes, stored in a Python dictionary.

* a text string.

* an optional tail string.

* a number of child elements, stored in a Python sequence.

To create an element instance, use the :class:`Element` constructor or the
:func:`SubElement` factory function.

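The properties listed above can be seen on a small hand-built tree. This is a
minimal sketch; the tag names, attribute values and the ``ET`` import alias are
illustrative choices rather than anything the module requires.

```python
import xml.etree.ElementTree as ET

# Build <p class="intro">Moved to <a href="...">example.org</a>.</p> by hand.
root = ET.Element("p", {"class": "intro"})   # tag plus attribute dictionary
root.text = "Moved to "                      # text before the first child

link = ET.SubElement(root, "a", href="http://example.org/")  # appended child
link.text = "example.org"
link.tail = "."                              # text after the child's end tag

markup = ET.tostring(root, encoding="unicode")
print(markup)
```

The element behaves like a list for its children (``len(root)``, ``root[0]``)
and exposes its attributes through a dictionary (``root.attrib``).
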
The :class:`ElementTree` class can be used to wrap an element structure, and
convert it from and to XML.

A C implementation of this API is available as :mod:`xml.etree.cElementTree`.

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs. Fredrik Lundh's page is also the location of the development version of
``xml.etree.ElementTree``.

.. versionchanged:: 3.2
   The ElementTree API is updated to 1.3. For more information, see
   `Introducing ElementTree 1.3
   <http://effbot.org/zone/elementtree-13-intro.htm>`_.


.. _elementtree-functions:

Functions
---------


.. function:: Comment(text=None)

   Comment element factory. This factory function creates a special element
   that will be serialized as an XML comment by the standard serializer. The
   comment string can be either a bytestring or a Unicode string. *text* is a
   string containing the comment string. Returns an element instance
   representing a comment.


.. function:: dump(elem)

   Writes an element tree or element structure to ``sys.stdout``. This function
   should be used for debugging only.

   The exact output format is implementation dependent. In this version, it's
   written as an ordinary XML file.

   *elem* is an element tree or an individual element.


.. function:: fromstring(text)

   Parses an XML section from a string constant. Same as :func:`XML`. *text*
   is a string containing XML data. Returns an :class:`Element` instance.


.. function:: fromstringlist(sequence, parser=None)

   Parses an XML document from a sequence of string fragments. *sequence* is a
   list or other sequence containing XML data fragments. *parser* is an
   optional parser instance. If not given, the standard :class:`XMLParser`
   parser is used. Returns an :class:`Element` instance.

   .. versionadded:: 3.2


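As a brief illustration of the two functions above, the sketch below parses
the same kind of document once from a single string and once from a list of
fragments. The sample markup is made up for the example.

```python
import xml.etree.ElementTree as ET

# Parse a complete document held in one string.
root = ET.fromstring("<doc><item id='1'>first</item><item id='2'>second</item></doc>")
print(root.tag, len(root))

# Parse a document that arrives as several string fragments.
fragments = ["<doc><item>", "first", "</item></doc>"]
same = ET.fromstringlist(fragments)
print(same.find("item").text)
```

Both calls return the root :class:`Element`, not an :class:`ElementTree`.
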
.. function:: iselement(element)

   Checks if an object appears to be a valid element object. *element* is an
   element instance. Returns a true value if this is an element object.


.. function:: iterparse(source, events=None, parser=None)

   Parses an XML section into an element tree incrementally, and reports what's
   going on to the user. *source* is a filename or :term:`file object` containing
   XML data. *events* is a list of events to report back. If omitted, only "end"
   events are reported. *parser* is an optional parser instance. If not
   given, the standard :class:`XMLParser` parser is used. Returns an
   :term:`iterator` providing ``(event, elem)`` pairs.

   .. note::

      :func:`iterparse` only guarantees that it has seen the ">"
      character of a starting tag when it emits a "start" event, so the
      attributes are defined, but the contents of the text and tail attributes
      are undefined at that point. The same applies to the element children;
      they may or may not be present.

      If you need a fully populated element, look for "end" events instead.


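The advice in the note can be sketched as follows: request both event types
but read ``text`` only on "end" events. An in-memory ``io.BytesIO`` object
stands in for a real file here, which is an assumption made to keep the
example self-contained.

```python
import io
import xml.etree.ElementTree as ET

# A file object holding a small document (stand-in for a file on disk).
source = io.BytesIO(b"<doc><item>first</item><item>second</item></doc>")

seen = []
for event, elem in ET.iterparse(source, events=("start", "end")):
    # On "start" only the tag and attributes are reliable; text may be missing.
    if event == "end" and elem.tag == "item":
        seen.append(elem.text)   # fully populated at this point

print(seen)
```
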
.. function:: parse(source, parser=None)

   Parses an XML section into an element tree. *source* is a filename or file
   object containing XML data. *parser* is an optional parser instance. If
   not given, the standard :class:`XMLParser` parser is used. Returns an
   :class:`ElementTree` instance.


.. function:: ProcessingInstruction(target, text=None)

   PI element factory. This factory function creates a special element that
   will be serialized as an XML processing instruction. *target* is a string
   containing the PI target. *text* is a string containing the PI contents, if
   given. Returns an element instance, representing a processing instruction.


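The special elements produced by :func:`ProcessingInstruction` and
:func:`Comment` are appended like ordinary children. The following sketch
uses a made-up PI target and comment text to show how they serialize.

```python
import xml.etree.ElementTree as ET

root = ET.Element("root")
# Both factories return element instances that can be appended like any child.
root.append(ET.ProcessingInstruction("pi-target", "some data"))
root.append(ET.Comment("a note for readers"))
ET.SubElement(root, "child")

markup = ET.tostring(root, encoding="unicode")
print(markup)
```
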
.. function:: register_namespace(prefix, uri)

   Registers a namespace prefix. The registry is global, and any existing
   mapping for either the given prefix or the namespace URI will be removed.
   *prefix* is a namespace prefix. *uri* is a namespace URI. Tags and
   attributes in this namespace will be serialized with the given prefix, if at
   all possible.

   .. versionadded:: 3.2


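A short sketch of the effect on serialization, using an invented prefix
``ex`` and an invented namespace URI:

```python
import xml.etree.ElementTree as ET

# Hypothetical prefix and URI, chosen only for this example.
ET.register_namespace("ex", "http://example.org/ns")

# Namespaced tags are written in {uri}local form.
elem = ET.Element("{http://example.org/ns}item")
markup = ET.tostring(elem, encoding="unicode")
print(markup)
```

Because the registry is global, the mapping also affects every later
serialization in the same process.
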
.. function:: SubElement(parent, tag, attrib={}, **extra)

   Subelement factory. This function creates an element instance, and appends
   it to an existing element.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *parent* is the parent element. *tag* is
   the subelement name. *attrib* is an optional dictionary, containing element
   attributes. *extra* contains additional attributes, given as keyword
   arguments. Returns an element instance.


.. function:: tostring(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string. *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an (optionally)
   encoded string containing the XML data.


.. function:: tostringlist(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string. *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns a list of
   (optionally) encoded strings containing the XML data. It does not guarantee
   any specific sequence, except that ``"".join(tostringlist(element)) ==
   tostring(element)``.

   .. versionadded:: 3.2


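The difference between the encoded and the Unicode result, and the join
guarantee of :func:`tostringlist`, can be checked with a small sketch (the
sample element is invented):

```python
import xml.etree.ElementTree as ET

elem = ET.XML("<greeting lang='en'>hello</greeting>")

as_bytes = ET.tostring(elem)                       # default US-ASCII: bytes
as_text = ET.tostring(elem, encoding="unicode")    # Unicode string
parts = ET.tostringlist(elem, encoding="unicode")  # list of string pieces

print(type(as_bytes).__name__, type(as_text).__name__)
print("".join(parts) == as_text)
```
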
.. function:: XML(text, parser=None)

   Parses an XML section from a string constant. This function can be used to
   embed "XML literals" in Python code. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns an :class:`Element` instance.


.. function:: XMLID(text, parser=None)

   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element IDs to elements. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns a tuple containing an
   :class:`Element` instance and a dictionary.


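A minimal sketch of :func:`XMLID`, assuming the elements carry plain ``id``
attributes as in this invented document:

```python
import xml.etree.ElementTree as ET

root, ids = ET.XMLID(
    '<doc><item id="a">first</item><item id="b">second</item></doc>'
)

# The dictionary maps each id value to the element that carries it.
print(sorted(ids))
print(ids["b"].text)
```
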
.. _elementtree-element-objects:

Element Objects
---------------


.. class:: Element(tag, attrib={}, **extra)

   Element class. This class defines the Element interface, and provides a
   reference implementation of this interface.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *tag* is the element name. *attrib* is
   an optional dictionary, containing element attributes. *extra* contains
   additional attributes, given as keyword arguments.


   .. attribute:: tag

      A string identifying what kind of data this element represents (the
      element type, in other words).


   .. attribute:: text

      The *text* attribute can be used to hold additional data associated with
      the element. As the name implies this attribute is usually a string but
      may be any application-specific object. If the element is created from
      an XML file the attribute will contain any text found between the element
      tags.


   .. attribute:: tail

      The *tail* attribute can be used to hold additional data associated with
      the element. This attribute is usually a string but may be any
      application-specific object. If the element is created from an XML file
      the attribute will contain any text found after the element's end tag and
      before the next tag.


   .. attribute:: attrib

      A dictionary containing the element's attributes. Note that while the
      *attrib* value is always a real mutable Python dictionary, an ElementTree
      implementation may choose to use another internal representation, and
      create the dictionary only if someone asks for it. To take advantage of
      such implementations, use the dictionary methods below whenever possible.

   The following dictionary-like methods work on the element attributes.


   .. method:: clear()

      Resets an element. This function removes all subelements, clears all
      attributes, and sets the text and tail attributes to ``None``.


   .. method:: get(key, default=None)

      Gets the element attribute named *key*.

      Returns the attribute value, or *default* if the attribute was not found.


   .. method:: items()

      Returns the element attributes as a sequence of (name, value) pairs. The
      attributes are returned in an arbitrary order.


   .. method:: keys()

      Returns the element's attribute names as a list. The names are returned
      in an arbitrary order.


   .. method:: set(key, value)

      Set the attribute *key* on the element to *value*.

   The following methods work on the element's children (subelements).


   .. method:: append(subelement)

      Adds the element *subelement* to the end of this element's internal list
      of subelements.


   .. method:: extend(subelements)

      Appends *subelements* from a sequence object with zero or more elements.
      Raises :exc:`AssertionError` if a subelement is not a valid object.

      .. versionadded:: 3.2


   .. method:: find(match)

      Finds the first subelement matching *match*. *match* may be a tag name
      or path. Returns an element instance or ``None``.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Returns a list
      containing all matching elements in document order.


   .. method:: findtext(match, default=None)

      Finds text for the first subelement matching *match*. *match* may be
      a tag name or path. Returns the text content of the first matching
      element, or *default* if no element was found. Note that if the matching
      element has no text content an empty string is returned.


   .. method:: getchildren()

      .. deprecated:: 3.2
         Use ``list(elem)`` or iteration.


   .. method:: getiterator(tag=None)

      .. deprecated:: 3.2
         Use method :meth:`Element.iter` instead.


   .. method:: insert(index, element)

      Inserts a subelement at the given position in this element.


   .. method:: iter(tag=None)

      Creates a tree :term:`iterator` with the current element as the root.
      The iterator iterates over this element and all elements below it, in
      document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
      elements whose tag equals *tag* are returned from the iterator. If the
      tree structure is modified during iteration, the result is undefined.


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Returns an iterable
      yielding all matching elements in document order.

      .. versionadded:: 3.2


   .. method:: itertext()

      Creates a text iterator. The iterator loops over this element and all
      subelements, in document order, and returns all inner text.

      .. versionadded:: 3.2


   .. method:: makeelement(tag, attrib)

      Creates a new element object of the same type as this element. Do not
      call this method; use the :func:`SubElement` factory function instead.


   .. method:: remove(subelement)

      Removes *subelement* from the element. Unlike the find\* methods, this
      method compares elements based on the instance identity, not on tag value
      or contents.

   :class:`Element` objects also support the following sequence type methods
   for working with subelements: :meth:`__delitem__`, :meth:`__getitem__`,
   :meth:`__setitem__`, :meth:`__len__`.

   Caution: Elements with no subelements will test as ``False``. This behavior
   will change in future versions. Use a specific ``len(elem)`` or ``elem is
   None`` test instead. ::

     element = root.find('foo')

     if not element:  # careful!
         print("element not found, or element has no subelements")

     if element is None:
         print("element not found")


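The attribute and subelement methods described in this section can be tried
together on a small tree. The markup below is invented for the example, and
the sketch uses an explicit ``is not None`` test, as the caution above
recommends.

```python
import xml.etree.ElementTree as ET

root = ET.XML(
    "<body><p>Moved to <a href='http://example.org/'>example.org</a>"
    " or <a href='http://example.com/'>example.com</a>.</p></body>"
)

p = root.find("p")                          # first matching subelement
links = p.findall("a")                      # list, in document order
hrefs = [a.get("href") for a in links]      # dictionary-like attribute access
links[0].set("target", "blank")

text = "".join(p.itertext())                # all inner text, in document order
first_label = root.findtext("p/a")          # text of the first match
missing = root.findtext("p/b", default="n/a")

found = root.find("p/a") is not None        # explicit test, not truthiness
p.remove(links[1])                          # removal is by identity
print(hrefs, text, first_label, missing, found, len(p))
```

Note that removing an element also removes its ``tail`` text, since the tail
is stored on the element itself.
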
.. _elementtree-elementtree-objects:

ElementTree Objects
-------------------


.. class:: ElementTree(element=None, file=None)

   ElementTree wrapper class. This class represents an entire element
   hierarchy, and adds some extra support for serialization to and from
   standard XML.

   *element* is the root element. The tree is initialized with the contents
   of the XML *file* if given.


   .. method:: _setroot(element)

      Replaces the root element for this tree. This discards the current
      contents of the tree, and replaces it with the given element. Use with
      care. *element* is an element instance.


   .. method:: find(match)

      Finds the first toplevel element matching *match*. *match* may be a tag
      name or path. Same as ``getroot().find(match)``. Returns the first
      matching element, or ``None`` if no element was found.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().findall(match)``. *match* may be a tag name or path. Returns
      a list containing all matching elements, in document order.


   .. method:: findtext(match, default=None)

      Finds the element text for the first toplevel element with given tag.
      Same as ``getroot().findtext(match)``. *match* may be a tag name or path.
      *default* is the value to return if the element was not found. Returns
      the text content of the first matching element, or the default value if
      no element was found. Note that if the element is found, but has no text
      content, this method returns an empty string.


   .. method:: getiterator(tag=None)

      .. deprecated:: 3.2
         Use method :meth:`ElementTree.iter` instead.


   .. method:: getroot()

      Returns the root element for this tree.


   .. method:: iter(tag=None)

      Creates and returns a tree iterator for the root element. The iterator
      loops over all elements in this tree, in document order. *tag* is the tag
      to look for (default is to return all elements).


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().iterfind(match)``. Returns an iterable yielding all matching
      elements in document order.

      .. versionadded:: 3.2


   .. method:: parse(source, parser=None)

      Loads an external XML section into this element tree. *source* is a file
      name or :term:`file object`. *parser* is an optional parser instance.
      If not given, the standard :class:`XMLParser` parser is used. Returns the
      section root element.


   .. method:: write(file, encoding="us-ascii", xml_declaration=None, method="xml")

      Writes the element tree to a file, as XML. *file* is a file name, or a
      :term:`file object` opened for writing. *encoding* [1]_ is the output
      encoding (default is US-ASCII). Use ``encoding="unicode"`` to write a
      Unicode string. *xml_declaration* controls if an XML declaration
      should be added to the file. Use ``False`` for never, ``True`` for
      always, ``None`` for only if not US-ASCII or UTF-8 or Unicode (default is
      ``None``). *method* is either ``"xml"``, ``"html"`` or ``"text"``
      (default is ``"xml"``).

This is the XML file that is going to be manipulated::

    <html>
        <head>
            <title>Example page</title>
        </head>
        <body>
            <p>Moved to <a href="http://example.org/">example.org</a>
            or <a href="http://example.com/">example.com</a>.</p>
        </body>
    </html>

Example of changing the attribute "target" of every link in the first paragraph::

    >>> from xml.etree.ElementTree import ElementTree
    >>> tree = ElementTree()
    >>> tree.parse("index.xhtml")
    <Element 'html' at 0xb77e6fac>
    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
    >>> p
    <Element 'p' at 0xb77ec26c>
    >>> links = list(p.iter("a"))   # Returns list of all links
    >>> links
    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
    >>> for i in links:             # Iterates through all found links
    ...     i.attrib["target"] = "blank"
    >>> tree.write("output.xhtml")

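The same round trip can be sketched without touching the file system. Here
``io.StringIO`` and ``io.BytesIO`` objects stand in for ``index.xhtml`` and
``output.xhtml``; that substitution, and the shortened sample document, are
assumptions made so the example is self-contained.

```python
import io
import xml.etree.ElementTree as ET

# In-memory stand-in for the input file.
source = io.StringIO(
    '<html><body><p>Moved to <a href="http://example.org/">example.org</a>.'
    "</p></body></html>"
)

tree = ET.ElementTree()
root = tree.parse(source)            # returns the root element
for link in tree.iter("a"):          # every <a> element in the tree
    link.set("target", "blank")

# In-memory stand-in for the output file; encoded output needs a bytes sink.
out = io.BytesIO()
tree.write(out, encoding="utf-8", xml_declaration=True)
print(out.getvalue().decode("utf-8"))
```
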
.. _elementtree-qname-objects:

QName Objects
-------------


.. class:: QName(text_or_uri, tag=None)

   QName wrapper. This can be used to wrap a QName attribute value, in order
   to get proper namespace handling on output. *text_or_uri* is a string
   containing the QName value, in the form {uri}local, or, if the tag argument
   is given, the URI part of a QName. If *tag* is given, the first argument is
   interpreted as a URI, and this argument is interpreted as a local name.
   :class:`QName` instances are opaque.


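A minimal sketch of wrapping an attribute value in a :class:`QName`. The
namespace URI and names are invented, and the ``ns0`` prefix in the output is
what the serializer generates when no prefix has been registered for that
URI.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and local name.
kind = ET.QName("http://example.org/types", "kind")
print(kind.text)                      # the {uri}local form

elem = ET.Element("item")
elem.set("type", kind)                # QName used as an attribute value

# On output the value is rewritten as prefix:local and the prefix is declared.
markup = ET.tostring(elem, encoding="unicode")
print(markup)
```
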
521.. _elementtree-treebuilder-objects:
522
523TreeBuilder Objects
524-------------------
525
526
Georg Brandl7f01a132009-09-16 15:58:14 +0000527.. class:: TreeBuilder(element_factory=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000528
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000529 Generic element structure builder. This builder converts a sequence of
530 start, data, and end method calls to a well-formed element structure. You
531 can use this class to build an element structure using a custom XML parser,
532 or a parser for some other XML-like format. The *element_factory* is called
533 to create new :class:`Element` instances when given.
Georg Brandl116aa622007-08-15 14:28:22 +0000534
535
Benjamin Petersone41251e2008-04-25 01:59:09 +0000536 .. method:: close()
Georg Brandl116aa622007-08-15 14:28:22 +0000537
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000538 Flushes the builder buffers, and returns the toplevel document
539 element. Returns an :class:`Element` instance.
Georg Brandl116aa622007-08-15 14:28:22 +0000540
541
Benjamin Petersone41251e2008-04-25 01:59:09 +0000542 .. method:: data(data)
Georg Brandl116aa622007-08-15 14:28:22 +0000543
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000544 Adds text to the current element. *data* is a string. This should be
545 either a bytestring, or a Unicode string.
Georg Brandl116aa622007-08-15 14:28:22 +0000546
547
Benjamin Petersone41251e2008-04-25 01:59:09 +0000548 .. method:: end(tag)
Georg Brandl116aa622007-08-15 14:28:22 +0000549
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000550 Closes the current element. *tag* is the element name. Returns the
551 closed element.
Georg Brandl116aa622007-08-15 14:28:22 +0000552
553
Benjamin Petersone41251e2008-04-25 01:59:09 +0000554 .. method:: start(tag, attrs)
Georg Brandl116aa622007-08-15 14:28:22 +0000555
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000556 Opens a new element. *tag* is the element name. *attrs* is a dictionary
557 containing element attributes. Returns the opened element.
Georg Brandl116aa622007-08-15 14:28:22 +0000558

   In addition, a custom :class:`TreeBuilder` object can provide the
   following method:

   .. method:: doctype(name, pubid, system)

      Handles a doctype declaration.  *name* is the doctype name.  *pubid* is
      the public identifier.  *system* is the system identifier.  This method
      does not exist on the default :class:`TreeBuilder` class.

      .. versionadded:: 3.2

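   A sketch of a custom target that records the doctype declaration might look
   like the following.  The class name ``DoctypeRecorder``, its
   ``doctype_info`` attribute, and the sample document are hypothetical and
   used only for illustration::

      from xml.etree.ElementTree import TreeBuilder, XMLParser

      class DoctypeRecorder(TreeBuilder):
          """TreeBuilder subclass that remembers the doctype declaration."""

          doctype_info = None

          def doctype(self, name, pubid, system):
              # Called by the parser when it reads a <!DOCTYPE ...> declaration.
              self.doctype_info = (name, pubid, system)

      target = DoctypeRecorder()
      parser = XMLParser(target=target)
      parser.feed('<!DOCTYPE note PUBLIC "-//Example//DTD Note//EN" "note.dtd">')
      parser.feed("<note>remember</note>")
      root = parser.close()    # the tree is still built by the base class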

.. _elementtree-xmlparser-objects:

XMLParser Objects
-----------------


.. class:: XMLParser(html=0, target=None, encoding=None)

   :class:`Element` structure builder for XML source data, based on the expat
   parser.  *html* is a flag for predefined HTML entities; it is not supported
   by the current implementation.  *target* is the target object.  If omitted,
   the builder uses an instance of the standard :class:`TreeBuilder` class.
   *encoding* [1]_ is optional.  If given, the value overrides the encoding
   specified in the XML file.

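   A minimal sketch of the *encoding* override, assuming input bytes that
   carry no encoding declaration of their own (the sample document is made up
   for illustration)::

      from xml.etree.ElementTree import XMLParser

      # The byte 0xE9 is "e with acute accent" in ISO-8859-1.  The override
      # tells the parser to decode the input with that encoding.
      parser = XMLParser(encoding="iso-8859-1")
      parser.feed(b"<word>caf\xe9</word>")
      root = parser.close()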

   .. method:: close()

      Finishes feeding data to the parser.  Returns an element structure.


   .. method:: doctype(name, pubid, system)

      .. deprecated:: 3.2
         Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
         target.


   .. method:: feed(data)

      Feeds data to the parser.  *data* is encoded data.

:meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag and its :meth:`end` method for each closing tag;
character data is passed to its :meth:`data` method.  :meth:`XMLParser.close`
calls *target*\'s :meth:`close` method.
:class:`XMLParser` can be used for more than building a tree structure.
This example computes the maximum depth of an XML file::

   >>> from xml.etree.ElementTree import XMLParser
   >>> class MaxDepth:                     # The target object of the parser
   ...     maxDepth = 0
   ...     depth = 0
   ...     def start(self, tag, attrib):   # Called for each opening tag.
   ...         self.depth += 1
   ...         if self.depth > self.maxDepth:
   ...             self.maxDepth = self.depth
   ...     def end(self, tag):             # Called for each closing tag.
   ...         self.depth -= 1
   ...     def data(self, data):
   ...         pass            # We do not need to do anything with data.
   ...     def close(self):    # Called when all data has been parsed.
   ...         return self.maxDepth
   ...
   >>> target = MaxDepth()
   >>> parser = XMLParser(target=target)
   >>> exampleXml = """
   ... <a>
   ...   <b>
   ...   </b>
   ...   <b>
   ...     <c>
   ...       <d>
   ...       </d>
   ...     </c>
   ...   </b>
   ... </a>"""
   >>> parser.feed(exampleXml)
   >>> parser.close()
   4

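When no *target* is given, the same :meth:`feed` and :meth:`close` calls build
an ordinary element tree.  The following sketch feeds a document in several
chunks, as might happen when reading from a socket; the chunk boundaries and
element names are arbitrary and chosen only for illustration::

   from xml.etree.ElementTree import XMLParser

   chunks = [
       "<inventory><item id='1'>ap",
       "ple</item><item id=",
       "'2'>pear</item></inventory>",
   ]

   parser = XMLParser()
   for chunk in chunks:
       parser.feed(chunk)      # chunks may split tags and text anywhere
   root = parser.close()       # the root Element of the finished tree

   names = [item.text for item in root.findall("item")]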

.. rubric:: Footnotes

.. [#] The encoding string included in XML output should conform to the
   appropriate standards.  For example, "UTF-8" is valid, but "UTF8" is
   not.  See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and http://www.iana.org/assignments/character-sets.