:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================

.. module:: xml.etree.ElementTree
   :synopsis: Implementation of the ElementTree API.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>


The :class:`Element` type is a flexible container object, designed to store
hierarchical data structures in memory. The type can be described as a cross
between a list and a dictionary.

Each element has a number of properties associated with it:

* a tag, which is a string identifying what kind of data this element
  represents (the element type, in other words).

* a number of attributes, stored in a Python dictionary.

* a text string.

* an optional tail string.

* a number of child elements, stored in a Python sequence.

To create an element instance, use the :class:`Element` constructor or the
:func:`SubElement` factory function.

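The properties listed above can be explored with a short sketch. The tag and
attribute names used here (``catalog``, ``item``, ``version``, ``id``) are made
up for illustration and are not part of the API.

```python
import xml.etree.ElementTree as ET

# Build a parent element with one attribute, passed as a dictionary.
root = ET.Element("catalog", {"version": "1"})

# SubElement creates a child and appends it to the parent in one step;
# extra keyword arguments become attributes.
item = ET.SubElement(root, "item", id="a1")
item.text = "first"   # text between <item> and </item>
item.tail = "\n"      # text after </item>, before the next tag

# Elements behave like a list of children and expose a dict of attributes.
summary = (root.tag, root.attrib, len(root), root[0].tag, root[0].get("id"))
```

Indexing and ``len()`` act on the child elements, while ``get()`` and
``attrib`` act on the attributes, which is the "cross between a list and a
dictionary" described above.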
The :class:`ElementTree` class can be used to wrap an element structure, and
convert it from and to XML.

A C implementation of this API is available as :mod:`xml.etree.cElementTree`.

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs. Fredrik Lundh's page is also the location of the development version of
xml.etree.ElementTree.

.. versionchanged:: 3.2
   The ElementTree API is updated to 1.3. For more information, see
   `Introducing ElementTree 1.3
   <http://effbot.org/zone/elementtree-13-intro.htm>`_.


.. _elementtree-functions:

Functions
---------


.. function:: Comment(text=None)

   Comment element factory. This factory function creates a special element
   that will be serialized as an XML comment by the standard serializer. The
   comment string can be either a bytestring or a Unicode string. *text* is a
   string containing the comment string. Returns an element instance
   representing a comment.


.. function:: dump(elem)

   Writes an element tree or element structure to sys.stdout. This function
   should be used for debugging only.

   The exact output format is implementation dependent. In this version, it's
   written as an ordinary XML file.

   *elem* is an element tree or an individual element.


.. function:: fromstring(text)

   Parses an XML section from a string constant. Same as :func:`XML`. *text*
   is a string containing XML data. Returns an :class:`Element` instance.


.. function:: fromstringlist(sequence, parser=None)

   Parses an XML document from a sequence of string fragments. *sequence* is a
   list or other sequence containing XML data fragments. *parser* is an
   optional parser instance. If not given, the standard :class:`XMLParser`
   parser is used. Returns an :class:`Element` instance.

   .. versionadded:: 3.2


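As a rough illustration of :func:`fromstring` and :func:`fromstringlist`, the
sketch below parses the same small document once from a single string and once
from fragments. The document content is invented for the example.

```python
import xml.etree.ElementTree as ET

# Parse a complete document held in one string.
doc = ET.fromstring("<root><child name='a'/>text</root>")

# Parse the same document split into arbitrary fragments; the fragments
# do not need to end on tag boundaries.
same = ET.fromstringlist(["<root><child ", "name='a'/>", "text</root>"])

child = same[0]
```

Both calls return the root :class:`Element`, so the results can be compared
child by child.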
.. function:: iselement(element)

   Checks if an object appears to be a valid element object. *element* is an
   element instance. Returns a true value if this is an element object.


.. function:: iterparse(source, events=None, parser=None)

   Parses an XML section into an element tree incrementally, and reports what's
   going on to the user. *source* is a filename or file object containing XML
   data. *events* is a list of events to report back. If omitted, only "end"
   events are reported. *parser* is an optional parser instance. If not
   given, the standard :class:`XMLParser` parser is used. Returns an
   :term:`iterator` providing ``(event, elem)`` pairs.

   .. note::

      :func:`iterparse` only guarantees that it has seen the ">"
      character of a starting tag when it emits a "start" event, so the
      attributes are defined, but the contents of the text and tail attributes
      are undefined at that point. The same applies to the element children;
      they may or may not be present.

      If you need a fully populated element, look for "end" events instead.


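The following sketch shows one way to act on the note above: it requests both
"start" and "end" events but only reads ``elem.text`` on "end", when the
element is fully populated. The in-memory ``io.BytesIO`` source and the
``log``/``entry`` tag names are assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET

# A file object stands in for a file on disk.
source = io.BytesIO(b"<log><entry>one</entry><entry>two</entry></log>")

seen = []
for event, elem in ET.iterparse(source, events=("start", "end")):
    # On "start" only the tag and attributes are reliable; the text
    # is read on "end", once the whole element has been parsed.
    if event == "end" and elem.tag == "entry":
        seen.append(elem.text)
```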
.. function:: parse(source, parser=None)

   Parses an XML section into an element tree. *source* is a filename or file
   object containing XML data. *parser* is an optional parser instance. If
   not given, the standard :class:`XMLParser` parser is used. Returns an
   :class:`ElementTree` instance.


.. function:: ProcessingInstruction(target, text=None)

   PI element factory. This factory function creates a special element that
   will be serialized as an XML processing instruction. *target* is a string
   containing the PI target. *text* is a string containing the PI contents, if
   given. Returns an element instance, representing a processing instruction.


.. function:: register_namespace(prefix, uri)

   Registers a namespace prefix. The registry is global, and any existing
   mapping for either the given prefix or the namespace URI will be removed.
   *prefix* is a namespace prefix. *uri* is a namespace URI. Tags and
   attributes in this namespace will be serialized with the given prefix, if at
   all possible.

   .. versionadded:: 3.2


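A small sketch of the effect of :func:`register_namespace` on serialization
follows. The prefix ``ex`` and the URI ``http://example.org/ns`` are
placeholders chosen for the example. Because the registry is global, the
mapping stays in effect for the rest of the process.

```python
import xml.etree.ElementTree as ET

# Map a preferred prefix to a namespace URI (global, process-wide).
ET.register_namespace("ex", "http://example.org/ns")

# Tags in a namespace are written in {uri}local form.
elem = ET.Element("{http://example.org/ns}item")

# The serializer uses the registered prefix instead of a generated one.
xml_text = ET.tostring(elem, encoding="unicode")
```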
.. function:: SubElement(parent, tag, attrib={}, **extra)

   Subelement factory. This function creates an element instance, and appends
   it to an existing element.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *parent* is the parent element. *tag* is
   the subelement name. *attrib* is an optional dictionary, containing element
   attributes. *extra* contains additional attributes, given as keyword
   arguments. Returns an element instance.


.. function:: tostring(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string. *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an (optionally)
   encoded string containing the XML data.


.. function:: tostringlist(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string. *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns a list of
   (optionally) encoded strings containing the XML data. It does not guarantee
   any specific sequence, except that ``"".join(tostringlist(element)) ==
   tostring(element)``.

   .. versionadded:: 3.2


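To make the difference between the encoded and Unicode results concrete, here
is a minimal sketch. The element content is invented, and the list returned by
:func:`tostringlist` is only relied on through the join identity stated above.

```python
import xml.etree.ElementTree as ET

root = ET.Element("p")
root.text = "caf\u00e9"

# Default encoding (US-ASCII): the result is a bytes object and
# non-ASCII characters are written as character references.
as_bytes = ET.tostring(root)

# encoding="unicode": the result is a str with the character kept as-is.
as_text = ET.tostring(root, encoding="unicode")

# tostringlist returns fragments whose concatenation equals tostring().
parts = ET.tostringlist(root, encoding="unicode")
```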
.. function:: XML(text, parser=None)

   Parses an XML section from a string constant. This function can be used to
   embed "XML literals" in Python code. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns an :class:`Element` instance.


.. function:: XMLID(text, parser=None)

   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element IDs to elements. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns a tuple containing an
   :class:`Element` instance and a dictionary.


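A brief sketch of :func:`XMLID`, assuming elements carry their identifier in
an ``id`` attribute; the document and ID values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# XMLID returns the root element plus a dictionary keyed by "id" values.
root, ids = ET.XMLID('<doc><sec id="intro">Hi</sec><sec id="end"/></doc>')

intro = ids["intro"]
```

The dictionary values are the same element objects that appear in the tree, so
a lookup by ID can replace a search when the IDs are known.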
.. _elementtree-element-objects:

Element Objects
---------------


.. class:: Element(tag, attrib={}, **extra)

   Element class. This class defines the Element interface, and provides a
   reference implementation of this interface.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *tag* is the element name. *attrib* is
   an optional dictionary, containing element attributes. *extra* contains
   additional attributes, given as keyword arguments.


   .. attribute:: tag

      A string identifying what kind of data this element represents (the
      element type, in other words).


   .. attribute:: text

      The *text* attribute can be used to hold additional data associated with
      the element. As the name implies this attribute is usually a string but
      may be any application-specific object. If the element is created from
      an XML file the attribute will contain any text found between the element
      tags.


   .. attribute:: tail

      The *tail* attribute can be used to hold additional data associated with
      the element. This attribute is usually a string but may be any
      application-specific object. If the element is created from an XML file
      the attribute will contain any text found after the element's end tag and
      before the next tag.


   .. attribute:: attrib

      A dictionary containing the element's attributes. Note that while the
      *attrib* value is always a real mutable Python dictionary, an ElementTree
      implementation may choose to use another internal representation, and
      create the dictionary only if someone asks for it. To take advantage of
      such implementations, use the dictionary methods below whenever possible.

   The following dictionary-like methods work on the element attributes.


   .. method:: clear()

      Resets an element. This function removes all subelements, clears all
      attributes, and sets the text and tail attributes to ``None``.


   .. method:: get(key, default=None)

      Gets the element attribute named *key*.

      Returns the attribute value, or *default* if the attribute was not found.


   .. method:: items()

      Returns the element attributes as a sequence of (name, value) pairs. The
      attributes are returned in an arbitrary order.


   .. method:: keys()

      Returns the element's attribute names as a list. The names are returned
      in an arbitrary order.


   .. method:: set(key, value)

      Set the attribute *key* on the element to *value*.

   The following methods work on the element's children (subelements).


   .. method:: append(subelement)

      Adds the element *subelement* to the end of this element's internal list
      of subelements.


   .. method:: extend(subelements)

      Appends *subelements* from a sequence object with zero or more elements.
      Raises :exc:`AssertionError` if a subelement is not a valid object.

      .. versionadded:: 3.2


   .. method:: find(match)

      Finds the first subelement matching *match*. *match* may be a tag name
      or path. Returns an element instance or ``None``.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Returns a list
      containing all matching elements in document order.


   .. method:: findtext(match, default=None)

      Finds text for the first subelement matching *match*. *match* may be
      a tag name or path. Returns the text content of the first matching
      element, or *default* if no element was found. Note that if the matching
      element has no text content an empty string is returned.


   .. method:: getchildren()

      .. deprecated:: 3.2
         Use ``list(elem)`` or iteration.


   .. method:: getiterator(tag=None)

      .. deprecated:: 3.2
         Use method :meth:`Element.iter` instead.


   .. method:: insert(index, element)

      Inserts a subelement at the given position in this element.


   .. method:: iter(tag=None)

      Creates a tree :term:`iterator` with the current element as the root.
      The iterator iterates over this element and all elements below it, in
      document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
      elements whose tag equals *tag* are returned from the iterator. If the
      tree structure is modified during iteration, the result is undefined.


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Returns an iterable
      yielding all matching elements in document order.

      .. versionadded:: 3.2


   .. method:: itertext()

      Creates a text iterator. The iterator loops over this element and all
      subelements, in document order, and returns all inner text.

      .. versionadded:: 3.2


   .. method:: makeelement(tag, attrib)

      Creates a new element object of the same type as this element. Do not
      call this method, use the :func:`SubElement` factory function instead.


   .. method:: remove(subelement)

      Removes *subelement* from the element. Unlike the find\* methods this
      method compares elements based on the instance identity, not on tag value
      or contents.

   :class:`Element` objects also support the following sequence type methods
   for working with subelements: :meth:`__delitem__`, :meth:`__getitem__`,
   :meth:`__setitem__`, :meth:`__len__`.

   Caution: Elements with no subelements will test as ``False``. This behavior
   will change in future versions. Use a specific ``len(elem)`` or ``elem is
   None`` test instead. ::

     element = root.find('foo')

     if not element:  # careful!
         print("element not found, or element has no subelements")

     if element is None:
         print("element not found")


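The search and iteration methods described for :class:`Element` differ mainly
in how deep they look and what they return. The sketch below contrasts them on
a small invented document; the tag names are placeholders.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<root><a>1</a><b><a>2</a></b><empty/></root>")

# find/findall with a plain tag name only look at direct children.
first = root.find("a")
direct = [e.text for e in root.findall("a")]

# iter walks the whole subtree in document order.
nested = [e.text for e in root.iter("a")]

# A missing element is reported as None, so test with "is None".
missing = root.find("nope")

# findtext returns "" for an element without text, and the default
# value only when no element matches at all.
empty_text = root.findtext("empty")
default_text = root.findtext("nope", default="n/a")

# itertext yields the inner text of this element and its subelements.
all_text = "".join(root.itertext())
```

Testing ``missing is None`` rather than ``not missing`` avoids the truth-value
pitfall noted above, since an existing element without children would also
test as false.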
.. _elementtree-elementtree-objects:

ElementTree Objects
-------------------


.. class:: ElementTree(element=None, file=None)

   ElementTree wrapper class. This class represents an entire element
   hierarchy, and adds some extra support for serialization to and from
   standard XML.

   *element* is the root element. The tree is initialized with the contents
   of the XML *file* if given.


   .. method:: _setroot(element)

      Replaces the root element for this tree. This discards the current
      contents of the tree, and replaces it with the given element. Use with
      care. *element* is an element instance.


   .. method:: find(match)

      Finds the first toplevel element matching *match*. *match* may be a tag
      name or path. Same as ``getroot().find(match)``. Returns the first
      matching element, or ``None`` if no element was found.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().findall(match)``. *match* may be a tag name or path. Returns
      a list containing all matching elements, in document order.


   .. method:: findtext(match, default=None)

      Finds the element text for the first toplevel element with given tag.
      Same as ``getroot().findtext(match)``. *match* may be a tag name or path.
      *default* is the value to return if the element was not found. Returns
      the text content of the first matching element, or the default value if
      no element was found. Note that if the element is found, but has no text
      content, this method returns an empty string.


   .. method:: getiterator(tag=None)

      .. deprecated:: 3.2
         Use method :meth:`ElementTree.iter` instead.


   .. method:: getroot()

      Returns the root element for this tree.


   .. method:: iter(tag=None)

      Creates and returns a tree iterator for the root element. The iterator
      loops over all elements in this tree, in document order. *tag* is the tag
      to look for (default is to return all elements).


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().iterfind(match)``. Returns an iterable yielding all matching
      elements in document order.

      .. versionadded:: 3.2


   .. method:: parse(source, parser=None)

      Loads an external XML section into this element tree. *source* is a file
      name or file object. *parser* is an optional parser instance. If not
      given, the standard :class:`XMLParser` parser is used. Returns the
      section root element.


   .. method:: write(file, encoding="us-ascii", xml_declaration=None, method="xml")

      Writes the element tree to a file, as XML. *file* is a file name, or a
      file object opened for writing. *encoding* [1]_ is the output encoding
      (default is US-ASCII). Use ``encoding="unicode"`` to write a Unicode
      string. *xml_declaration* controls whether an XML declaration should be
      added to the file. Use ``False`` for never, ``True`` for always, or
      ``None`` to add it only if the encoding is not US-ASCII, UTF-8 or
      Unicode (default is ``None``). *method* is either ``"xml"``, ``"html"``
      or ``"text"`` (default is ``"xml"``).

This is the XML file that is going to be manipulated::

    <html>
        <head>
            <title>Example page</title>
        </head>
        <body>
            <p>Moved to <a href="http://example.org/">example.org</a>
            or <a href="http://example.com/">example.com</a>.</p>
        </body>
    </html>

Example of changing the attribute "target" of every link in the first paragraph::

    >>> from xml.etree.ElementTree import ElementTree
    >>> tree = ElementTree()
    >>> tree.parse("index.xhtml")
    <Element 'html' at 0xb77e6fac>
    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
    >>> p
    <Element 'p' at 0xb77ec26c>
    >>> links = list(p.iter("a"))   # Returns list of all links
    >>> links
    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
    >>> for i in links:             # Iterates through all found links
    ...     i.attrib["target"] = "blank"
    >>> tree.write("output.xhtml")

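Since both :meth:`ElementTree.parse` and :meth:`ElementTree.write` accept file
objects as well as file names, the same manipulation can be tried without
touching the file system. This is only a sketch: the ``io.BytesIO`` buffers and
the shortened document stand in for ``index.xhtml`` and ``output.xhtml``.

```python
import io
import xml.etree.ElementTree as ET

# In-memory stand-in for the input file.
source = io.BytesIO(
    b'<html><body><p>Moved to '
    b'<a href="http://example.org/">example.org</a></p></body></html>'
)

tree = ET.ElementTree()
root = tree.parse(source)          # returns the root element

# Set target="blank" on every link in the first paragraph of the body.
for link in tree.find("body/p").iter("a"):
    link.set("target", "blank")

# Write to an in-memory file object, forcing an XML declaration.
out = io.BytesIO()
tree.write(out, encoding="utf-8", xml_declaration=True)
result = out.getvalue()
```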
.. _elementtree-qname-objects:

QName Objects
-------------


.. class:: QName(text_or_uri, tag=None)

   QName wrapper. This can be used to wrap a QName attribute value, in order
   to get proper namespace handling on output. *text_or_uri* is a string
   containing the QName value, in the form {uri}local, or, if the tag argument
   is given, the URI part of a QName. If *tag* is given, the first argument is
   interpreted as a URI, and this argument is interpreted as a local name.
   :class:`QName` instances are opaque.


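The two ways of constructing a :class:`QName` can be sketched as follows. The
namespace URI and the names ``item``, ``type`` and ``kind`` are placeholders,
and the ``text`` attribute is used here only to compare the two instances.

```python
import xml.etree.ElementTree as ET

# One argument: the full name in {uri}local form.
q1 = ET.QName("{http://example.org/ns}kind")

# Two arguments: the URI and the local name given separately.
q2 = ET.QName("http://example.org/ns", "kind")

# Wrapping an attribute value lets the serializer emit a prefixed name
# together with the matching namespace declaration.
elem = ET.Element("item")
elem.set("type", q2)
xml_text = ET.tostring(elem, encoding="unicode")
```

The prefix chosen on output depends on the global namespace registry, so the
sketch does not rely on a particular prefix.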
.. _elementtree-treebuilder-objects:

TreeBuilder Objects
-------------------


.. class:: TreeBuilder(element_factory=None)

   Generic element structure builder. This builder converts a sequence of
   start, data, and end method calls to a well-formed element structure. You
   can use this class to build an element structure using a custom XML parser,
   or a parser for some other XML-like format. The *element_factory* is called
   to create new :class:`Element` instances when given.


   .. method:: close()

      Flushes the builder buffers, and returns the toplevel document
      element. Returns an :class:`Element` instance.


   .. method:: data(data)

      Adds text to the current element. *data* is a string. This should be
      either a bytestring, or a Unicode string.


   .. method:: end(tag)

      Closes the current element. *tag* is the element name. Returns the
      closed element.


   .. method:: start(tag, attrs)

      Opens a new element. *tag* is the element name. *attrs* is a dictionary
      containing element attributes. Returns the opened element.


   In addition, a custom :class:`TreeBuilder` object can provide the
   following method:

   .. method:: doctype(name, pubid, system)

      Handles a doctype declaration. *name* is the doctype name. *pubid* is
      the public identifier. *system* is the system identifier. This method
      does not exist on the default :class:`TreeBuilder` class.

      .. versionadded:: 3.2


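The sequence of calls that :class:`TreeBuilder` expects can be sketched by
driving it by hand, as a custom parser would. The tag names, attribute and
text are invented for the example.

```python
import xml.etree.ElementTree as ET

builder = ET.TreeBuilder()

# Each start() must be balanced by an end() with the same tag.
builder.start("root", {"lang": "en"})
builder.start("child", {})
builder.data("hello")              # text for the currently open element
closed = builder.end("child")      # returns the element just closed
builder.end("root")

# close() returns the toplevel element of the finished structure.
root = builder.close()
```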
.. _elementtree-xmlparser-objects:

XMLParser Objects
-----------------


.. class:: XMLParser(html=0, target=None, encoding=None)

   :class:`Element` structure builder for XML source data, based on the expat
   parser. *html* is a flag for predefined HTML entities; it is not supported
   by the current implementation. *target* is the target object. If omitted,
   the builder uses an instance of the standard :class:`TreeBuilder` class.
   *encoding* [1]_ is optional. If given, the value overrides the encoding
   specified in the XML file.


   .. method:: close()

      Finishes feeding data to the parser. Returns an element structure.


   .. method:: doctype(name, pubid, system)

      .. deprecated:: 3.2
         Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
         target.


   .. method:: feed(data)

      Feeds data to the parser. *data* is encoded data.

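:meth:`feed` may be called several times before :meth:`close`, so a document
can be parsed as it arrives. The following is a minimal sketch, assuming the
default target; the chunk boundaries and element names are made up for
illustration.

```python
from xml.etree.ElementTree import XMLParser

# Hypothetical chunks, e.g. as read from a socket or a large file.
chunks = ["<root><item>on", "e</item><item>two</it", "em></root>"]

parser = XMLParser()             # no target given: a TreeBuilder is used
for chunk in chunks:
    parser.feed(chunk)           # chunks may split tags or text anywhere
root = parser.close()            # returns the toplevel Element

items = [item.text for item in root]
print(items)
```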
:meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag, its :meth:`end` method for each closing tag,
and its :meth:`data` method for character data. :meth:`XMLParser.close`
calls *target*\'s :meth:`close` method.
:class:`XMLParser` is therefore useful for more than building a tree structure.
The following example computes the maximum nesting depth of an XML document::

   >>> from xml.etree.ElementTree import XMLParser
   >>> class MaxDepth:                     # The target object of the parser
   ...     maxDepth = 0
   ...     depth = 0
   ...     def start(self, tag, attrib):   # Called for each opening tag.
   ...         self.depth += 1
   ...         if self.depth > self.maxDepth:
   ...             self.maxDepth = self.depth
   ...     def end(self, tag):             # Called for each closing tag.
   ...         self.depth -= 1
   ...     def data(self, data):
   ...         pass            # We do not need to do anything with data.
   ...     def close(self):    # Called when all data has been parsed.
   ...         return self.maxDepth
   ...
   >>> target = MaxDepth()
   >>> parser = XMLParser(target=target)
   >>> exampleXml = """
   ... <a>
   ...   <b>
   ...   </b>
   ...   <b>
   ...     <c>
   ...       <d>
   ...       </d>
   ...     </c>
   ...   </b>
   ... </a>"""
   >>> parser.feed(exampleXml)
   >>> parser.close()
   4


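A target may also define the optional :meth:`doctype` method described for
:class:`TreeBuilder`. The sketch below records the doctype declaration and
ignores everything else; the class name ``DoctypeRecorder`` and the sample
document are illustrative assumptions, not part of the module.

```python
from xml.etree.ElementTree import XMLParser

class DoctypeRecorder:               # hypothetical target, for illustration
    def __init__(self):
        self.info = None

    def doctype(self, name, pubid, system):
        # Called once, when the doctype declaration has been read.
        self.info = (name, pubid, system)

    def start(self, tag, attrib):    # Element content is ignored here.
        pass

    def end(self, tag):
        pass

    def data(self, data):
        pass

    def close(self):                 # The parser's close() returns this value.
        return self.info

parser = XMLParser(target=DoctypeRecorder())
parser.feed('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
            '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
            '<html><body>text</body></html>')
result = parser.close()
print(result)
```

As with ``MaxDepth`` above, the value returned by the target's :meth:`close`
method is what :meth:`XMLParser.close` returns.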
.. rubric:: Footnotes

.. [#] The encoding string included in XML output should conform to the
   appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
   not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and http://www.iana.org/assignments/character-sets.