:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================

.. module:: xml.etree.ElementTree
   :synopsis: Implementation of the ElementTree API.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>

**Source code:** :source:`Lib/xml/etree/ElementTree.py`

--------------

The :class:`Element` type is a flexible container object, designed to store
hierarchical data structures in memory. The type can be described as a cross
between a list and a dictionary.

Each element has a number of properties associated with it:

* a tag, which is a string identifying what kind of data this element
  represents (the element type, in other words).

* a number of attributes, stored in a Python dictionary.

* a text string.

* an optional tail string.

* a number of child elements, stored in a Python sequence.

To create an element instance, use the :class:`Element` constructor or the
:func:`SubElement` factory function.

The :class:`ElementTree` class can be used to wrap an element structure, and
convert it from and to XML.

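As a minimal sketch of the properties listed above, the following builds a
small tree by hand and wraps it in an :class:`ElementTree`. The tag and
attribute names (``catalog``, ``book``, ``title``) are made up for
illustration only.

```python
import xml.etree.ElementTree as ET

# Build a small tree by hand; all tag and attribute names are illustrative.
root = ET.Element("catalog", {"version": "1"})
book = ET.SubElement(root, "book", id="b1")   # extra keyword becomes an attribute
title = ET.SubElement(book, "title")
title.text = "Example title"

# An element behaves like a sequence of its children ...
print(len(root), root[0].tag)
# ... and keeps its attributes in a dictionary.
print(book.get("id"), root.attrib)

# Wrap the root element so the structure can be converted to XML.
tree = ET.ElementTree(root)
xml_text = ET.tostring(tree.getroot(), encoding="unicode")
print(xml_text)
```

Here ``encoding="unicode"`` is used only so that the result prints as text
rather than as an encoded bytestring.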
A C implementation of this API is available as :mod:`xml.etree.cElementTree`.

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs. Fredrik Lundh's page is also the location of the development version of
xml.etree.ElementTree.

.. versionchanged:: 3.2
   The ElementTree API is updated to 1.3. For more information, see
   `Introducing ElementTree 1.3
   <http://effbot.org/zone/elementtree-13-intro.htm>`_.


.. _elementtree-functions:

Functions
---------


.. function:: Comment(text=None)

   Comment element factory. This factory function creates a special element
   that will be serialized as an XML comment by the standard serializer. The
   comment string can be either a bytestring or a Unicode string. *text* is a
   string containing the comment string. Returns an element instance
   representing a comment.


.. function:: dump(elem)

   Writes an element tree or element structure to ``sys.stdout``. This
   function should be used for debugging only.

   The exact output format is implementation dependent. In this version, it's
   written as an ordinary XML file.

   *elem* is an element tree or an individual element.


.. function:: fromstring(text)

   Parses an XML section from a string constant. Same as :func:`XML`. *text*
   is a string containing XML data. Returns an :class:`Element` instance.


.. function:: fromstringlist(sequence, parser=None)

   Parses an XML document from a sequence of string fragments. *sequence* is a
   list or other sequence containing XML data fragments. *parser* is an
   optional parser instance. If not given, the standard :class:`XMLParser`
   parser is used. Returns an :class:`Element` instance.

   .. versionadded:: 3.2

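A short sketch of the two functions above, assuming a tiny made-up document:
:func:`fromstring` parses one string, while :func:`fromstringlist` accepts the
same document split into arbitrary fragments.

```python
import xml.etree.ElementTree as ET

# Parse a complete document held in a single string.
root = ET.fromstring("<greeting lang='en'>hello</greeting>")
print(root.tag, root.get("lang"), root.text)

# The same document delivered as fragments; fragment boundaries need not
# line up with tags or text nodes.
fragments = ["<greeting lang='en'>", "hel", "lo</greeting>"]
same = ET.fromstringlist(fragments)
print(same.text)
```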

.. function:: iselement(element)

   Checks if an object appears to be a valid element object. *element* is an
   element instance. Returns a true value if this is an element object.


.. function:: iterparse(source, events=None, parser=None)

   Parses an XML section into an element tree incrementally, and reports what's
   going on to the user. *source* is a filename or :term:`file object`
   containing XML data. *events* is a list of events to report back. The
   supported events are the strings ``"start"``, ``"end"``, ``"start-ns"``
   and ``"end-ns"`` (the "ns" events are used to get detailed namespace
   information). If *events* is omitted, only ``"end"`` events are reported.
   *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns an :term:`iterator` providing
   ``(event, elem)`` pairs.

   .. note::

      :func:`iterparse` only guarantees that it has seen the ">"
      character of a starting tag when it emits a "start" event, so the
      attributes are defined, but the contents of the text and tail attributes
      are undefined at that point. The same applies to the element children;
      they may or may not be present.

      If you need a fully populated element, look for "end" events instead.

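The following sketch shows one way to use :func:`iterparse` along the lines of
the note above: element content is read only on ``"end"`` events. The input
document and the use of :class:`io.BytesIO` as the file object are
illustrative assumptions.

```python
import io
import xml.etree.ElementTree as ET

# An in-memory file object stands in for a real XML file.
data = b"<log><entry id='1'>first</entry><entry id='2'>second</entry></log>"

seen = []
for event, elem in ET.iterparse(io.BytesIO(data), events=("start", "end")):
    # Text and children are only guaranteed to be complete on "end".
    if event == "end" and elem.tag == "entry":
        seen.append((elem.get("id"), elem.text))
        elem.clear()  # release content that is no longer needed

print(seen)
```

Clearing each element after use is optional; it is a common way to keep memory
use low when processing large inputs.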

.. function:: parse(source, parser=None)

   Parses an XML section into an element tree. *source* is a filename or file
   object containing XML data. *parser* is an optional parser instance. If
   not given, the standard :class:`XMLParser` parser is used. Returns an
   :class:`ElementTree` instance.


.. function:: ProcessingInstruction(target, text=None)

   PI element factory. This factory function creates a special element that
   will be serialized as an XML processing instruction. *target* is a string
   containing the PI target. *text* is a string containing the PI contents, if
   given. Returns an element instance representing a processing instruction.


.. function:: register_namespace(prefix, uri)

   Registers a namespace prefix. The registry is global, and any existing
   mapping for either the given prefix or the namespace URI will be removed.
   *prefix* is a namespace prefix. *uri* is a namespace URI. Tags and
   attributes in this namespace will be serialized with the given prefix, if at
   all possible.

   .. versionadded:: 3.2

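A small sketch of :func:`register_namespace`; the prefix ``ex`` and the URI
``http://example.org/ns`` are placeholders chosen for this example. Because
the registry is global, the mapping affects every later serialization in the
same process.

```python
import xml.etree.ElementTree as ET

# Hypothetical prefix and namespace URI, used only for illustration.
ET.register_namespace("ex", "http://example.org/ns")

# Tags in a namespace are written in {uri}local form.
elem = ET.Element("{http://example.org/ns}item")
text = ET.tostring(elem, encoding="unicode")
print(text)  # the tag is serialized with the registered "ex" prefix
```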

.. function:: SubElement(parent, tag, attrib={}, **extra)

   Subelement factory. This function creates an element instance, and appends
   it to an existing element.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *parent* is the parent element. *tag* is
   the subelement name. *attrib* is an optional dictionary, containing element
   attributes. *extra* contains additional attributes, given as keyword
   arguments. Returns an element instance.


.. function:: tostring(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string. *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an (optionally)
   encoded string containing the XML data.

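To illustrate the *encoding* and *method* parameters, this sketch serializes
one made-up element three ways. With the default US-ASCII encoding the result
is a bytestring and non-ASCII characters are written as character references.

```python
import xml.etree.ElementTree as ET

elem = ET.XML("<p>caf\u00e9 <b>bold</b> tail</p>")

as_bytes = ET.tostring(elem)                      # default: US-ASCII bytes
as_text = ET.tostring(elem, encoding="unicode")   # Unicode string
only_text = ET.tostring(elem, encoding="unicode", method="text")

print(as_bytes)    # b'<p>caf&#233; <b>bold</b> tail</p>'
print(as_text)     # <p>café <b>bold</b> tail</p>
print(only_text)   # café bold tail

# tostringlist() returns the same data as a list of pieces.
pieces = ET.tostringlist(elem, encoding="unicode")
print("".join(pieces) == as_text)
```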

.. function:: tostringlist(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string. *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns a list of
   (optionally) encoded strings containing the XML data. It does not guarantee
   any specific sequence, except that ``"".join(tostringlist(element)) ==
   tostring(element)``.

   .. versionadded:: 3.2


.. function:: XML(text, parser=None)

   Parses an XML section from a string constant. This function can be used to
   embed "XML literals" in Python code. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns an :class:`Element` instance.


.. function:: XMLID(text, parser=None)

   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element IDs to elements. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns a tuple containing an
   :class:`Element` instance and a dictionary.

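A brief sketch of :func:`XMLID` on a made-up document. Elements that carry an
``id`` attribute appear in the returned dictionary, keyed by that value;
elements without one are simply absent from it.

```python
import xml.etree.ElementTree as ET

# Two of the three <sec> elements carry an id attribute.
root, ids = ET.XMLID('<doc><sec id="intro"/><sec id="end"/><sec/></doc>')

print(sorted(ids))               # ['end', 'intro']
print(ids["intro"] is root[0])   # the dictionary holds the tree's own elements
```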

.. _elementtree-element-objects:

Element Objects
---------------


.. class:: Element(tag, attrib={}, **extra)

   Element class. This class defines the Element interface, and provides a
   reference implementation of this interface.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *tag* is the element name. *attrib* is
   an optional dictionary, containing element attributes. *extra* contains
   additional attributes, given as keyword arguments.


   .. attribute:: tag

      A string identifying what kind of data this element represents (the
      element type, in other words).


   .. attribute:: text

      The *text* attribute can be used to hold additional data associated with
      the element. As the name implies, this attribute is usually a string but
      may be any application-specific object. If the element is created from
      an XML file, the attribute will contain any text found between the
      element tags.


   .. attribute:: tail

      The *tail* attribute can be used to hold additional data associated with
      the element. This attribute is usually a string but may be any
      application-specific object. If the element is created from an XML file,
      the attribute will contain any text found after the element's end tag and
      before the next tag.


   .. attribute:: attrib

      A dictionary containing the element's attributes. Note that while the
      *attrib* value is always a real mutable Python dictionary, an ElementTree
      implementation may choose to use another internal representation, and
      create the dictionary only if someone asks for it. To take advantage of
      such implementations, use the dictionary methods below whenever possible.

   The following dictionary-like methods work on the element attributes.


   .. method:: clear()

      Resets an element. This function removes all subelements, clears all
      attributes, and sets the text and tail attributes to ``None``.


   .. method:: get(key, default=None)

      Gets the element attribute named *key*.

      Returns the attribute value, or *default* if the attribute was not found.


   .. method:: items()

      Returns the element attributes as a sequence of (name, value) pairs. The
      attributes are returned in an arbitrary order.


   .. method:: keys()

      Returns the element's attribute names as a list. The names are returned
      in an arbitrary order.


   .. method:: set(key, value)

      Set the attribute *key* on the element to *value*.

   The following methods work on the element's children (subelements).


   .. method:: append(subelement)

      Adds the element *subelement* to the end of this element's internal list
      of subelements.


   .. method:: extend(subelements)

      Appends *subelements* from a sequence object with zero or more elements.
      Raises :exc:`AssertionError` if a subelement is not a valid object.

      .. versionadded:: 3.2


   .. method:: find(match)

      Finds the first subelement matching *match*. *match* may be a tag name
      or path. Returns an element instance or ``None``.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Returns a list
      containing all matching elements in document order.


   .. method:: findtext(match, default=None)

      Finds text for the first subelement matching *match*. *match* may be
      a tag name or path. Returns the text content of the first matching
      element, or *default* if no element was found. Note that if the matching
      element has no text content an empty string is returned.


   .. method:: getchildren()

      .. deprecated:: 3.2
         Use ``list(elem)`` or iteration.


   .. method:: getiterator(tag=None)

      .. deprecated:: 3.2
         Use method :meth:`Element.iter` instead.


   .. method:: insert(index, element)

      Inserts a subelement at the given position in this element.


   .. method:: iter(tag=None)

      Creates a tree :term:`iterator` with the current element as the root.
      The iterator iterates over this element and all elements below it, in
      document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
      elements whose tag equals *tag* are returned from the iterator. If the
      tree structure is modified during iteration, the result is undefined.

      .. versionadded:: 3.2


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Returns an iterable
      yielding all matching elements in document order.

      .. versionadded:: 3.2


   .. method:: itertext()

      Creates a text iterator. The iterator loops over this element and all
      subelements, in document order, and returns all inner text.

      .. versionadded:: 3.2


   .. method:: makeelement(tag, attrib)

      Creates a new element object of the same type as this element. Do not
      call this method; use the :func:`SubElement` factory function instead.


   .. method:: remove(subelement)

      Removes *subelement* from the element. Unlike the find\* methods, this
      method compares elements based on the instance identity, not on tag value
      or contents.

   :class:`Element` objects also support the following sequence type methods
   for working with subelements: :meth:`__delitem__`, :meth:`__getitem__`,
   :meth:`__setitem__`, :meth:`__len__`.

   Caution: Elements with no subelements will test as ``False``. This behavior
   will change in future versions. Use a specific ``len(elem)`` or ``elem is
   None`` test instead. ::

     element = root.find('foo')

     if not element:  # careful!
         print("element not found, or element has no subelements")

     if element is None:
         print("element not found")

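The :class:`Element` attributes and methods described above can be sketched
together on one small, made-up document. The example shows how *text* and
*tail* split mixed content, the dictionary-like attribute methods, and the
search and iteration methods.

```python
import xml.etree.ElementTree as ET

body = ET.XML(
    '<body><p>Moved to <a href="http://example.org/">example.org</a>'
    ' or <a href="http://example.com/">example.com</a>.</p><hr/></body>'
)
p = body.find("p")
first, second = p.findall("a")

# text follows the start tag; tail follows the element's end tag.
print(repr(p.text))       # 'Moved to '
print(repr(first.text))   # 'example.org'
print(repr(first.tail))   # ' or '
print(repr(second.tail))  # '.'

# Dictionary-like access to attributes.
first.set("target", "blank")
print(sorted(first.keys()), second.get("target", "n/a"))

# findtext() returns "" for a match without text, and the default otherwise.
print(body.findtext("p/a"), repr(body.findtext("hr")), body.findtext("missing"))

# iter() walks the subtree; itertext() yields inner text in document order.
tags = [e.tag for e in body.iter()]
inner = "".join(body.itertext())
print(tags)
print(inner)

# An element without children is falsy, so compare against None explicitly.
hr = body.find("hr")
print(hr is not None, len(hr))
```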

.. _elementtree-elementtree-objects:

ElementTree Objects
-------------------


.. class:: ElementTree(element=None, file=None)

   ElementTree wrapper class. This class represents an entire element
   hierarchy, and adds some extra support for serialization to and from
   standard XML.

   *element* is the root element. The tree is initialized with the contents
   of the XML *file* if given.


   .. method:: _setroot(element)

      Replaces the root element for this tree. This discards the current
      contents of the tree, and replaces it with the given element. Use with
      care. *element* is an element instance.


   .. method:: find(match)

      Finds the first toplevel element matching *match*. *match* may be a tag
      name or path. Same as ``getroot().find(match)``. Returns the first
      matching element, or ``None`` if no element was found.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().findall(match)``. *match* may be a tag name or path. Returns
      a list containing all matching elements, in document order.


   .. method:: findtext(match, default=None)

      Finds the element text for the first toplevel element with given tag.
      Same as ``getroot().findtext(match)``. *match* may be a tag name or path.
      *default* is the value to return if the element was not found. Returns
      the text content of the first matching element, or the default value if
      no element was found. Note that if the element is found, but has no text
      content, this method returns an empty string.


   .. method:: getiterator(tag=None)

      .. deprecated:: 3.2
         Use method :meth:`ElementTree.iter` instead.


   .. method:: getroot()

      Returns the root element for this tree.


   .. method:: iter(tag=None)

      Creates and returns a tree iterator for the root element. The iterator
      loops over all elements in this tree, in document order. *tag* is the tag
      to look for (default is to return all elements).


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().iterfind(match)``. Returns an iterable yielding all matching
      elements in document order.

      .. versionadded:: 3.2


   .. method:: parse(source, parser=None)

      Loads an external XML section into this element tree. *source* is a file
      name or :term:`file object`. *parser* is an optional parser instance.
      If not given, the standard :class:`XMLParser` parser is used. Returns the
      section root element.


   .. method:: write(file, encoding="us-ascii", xml_declaration=None, \
                     default_namespace=None, method="xml")

      Writes the element tree to a file, as XML. *file* is a file name, or a
      :term:`file object` opened for writing. *encoding* [1]_ is the output
      encoding (default is US-ASCII). Use ``encoding="unicode"`` to write a
      Unicode string. *xml_declaration* controls if an XML declaration
      should be added to the file. Use ``False`` for never, ``True`` for
      always, and ``None`` to add one only if the encoding is not US-ASCII,
      UTF-8, or Unicode (default is ``None``).
      *default_namespace* sets the default XML namespace (for "xmlns").
      *method* is either ``"xml"``, ``"html"`` or ``"text"`` (default is
      ``"xml"``).

This is the XML file that is going to be manipulated::

    <html>
        <head>
            <title>Example page</title>
        </head>
        <body>
            <p>Moved to <a href="http://example.org/">example.org</a>
            or <a href="http://example.com/">example.com</a>.</p>
        </body>
    </html>

Example of changing the attribute "target" of every link in the first paragraph::

    >>> from xml.etree.ElementTree import ElementTree
    >>> tree = ElementTree()
    >>> tree.parse("index.xhtml")
    <Element 'html' at 0xb77e6fac>
    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
    >>> p
    <Element 'p' at 0xb77ec26c>
    >>> links = list(p.iter("a"))   # Returns list of all links
    >>> links
    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
    >>> for i in links:             # Iterates through all found links
    ...     i.attrib["target"] = "blank"
    >>> tree.write("output.xhtml")

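The same kind of manipulation can be sketched without touching the file
system. This variant assumes in-memory file objects from :mod:`io` in place of
``index.xhtml`` and ``output.xhtml``, and shows how the *encoding* and
*xml_declaration* arguments of :meth:`ElementTree.write` change the output.

```python
import io
import xml.etree.ElementTree as ET

tree = ET.ElementTree(
    ET.XML('<p><a href="http://example.org/">example.org</a></p>')
)
for link in tree.iter("a"):
    link.set("target", "blank")

# encoding="unicode" writes text, so a text-mode file object is needed.
text_buf = io.StringIO()
tree.write(text_buf, encoding="unicode")
print(text_buf.getvalue())

# A byte encoding needs a binary file object; the declaration is forced here.
byte_buf = io.BytesIO()
tree.write(byte_buf, encoding="utf-8", xml_declaration=True)
print(byte_buf.getvalue())
```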
.. _elementtree-qname-objects:

QName Objects
-------------


.. class:: QName(text_or_uri, tag=None)

   QName wrapper. This can be used to wrap a QName attribute value, in order
   to get proper namespace handling on output. *text_or_uri* is a string
   containing the QName value, in the form ``{uri}local``, or, if the *tag*
   argument is given, the URI part of a QName. If *tag* is given, the first
   argument is interpreted as a URI, and this argument is interpreted as a
   local name. :class:`QName` instances are opaque.

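A minimal sketch of both constructor forms, using a placeholder namespace URI.
Wrapping an attribute value in :class:`QName` lets the serializer emit a
prefixed name together with the matching ``xmlns`` declaration; the exact
prefix depends on the namespace registry, so it is not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used only for illustration.
uri = "http://example.org/ns"

q1 = ET.QName("{%s}item" % uri)   # {uri}local form
q2 = ET.QName(uri, "item")        # URI and local name given separately
print(str(q1) == str(q2))

# Use the QName as an attribute value so it is namespace-aware on output.
elem = ET.Element("ref", {"type": q2})
text = ET.tostring(elem, encoding="unicode")
print(text)
```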

.. _elementtree-treebuilder-objects:

TreeBuilder Objects
-------------------


.. class:: TreeBuilder(element_factory=None)

   Generic element structure builder. This builder converts a sequence of
   start, data, and end method calls to a well-formed element structure. You
   can use this class to build an element structure using a custom XML parser,
   or a parser for some other XML-like format. The *element_factory* is called
   to create new :class:`Element` instances when given.


   .. method:: close()

      Flushes the builder buffers, and returns the toplevel document
      element. Returns an :class:`Element` instance.


   .. method:: data(data)

      Adds text to the current element. *data* is a string. This should be
      either a bytestring, or a Unicode string.


   .. method:: end(tag)

      Closes the current element. *tag* is the element name. Returns the
      closed element.


   .. method:: start(tag, attrs)

      Opens a new element. *tag* is the element name. *attrs* is a dictionary
      containing element attributes. Returns the opened element.


   In addition, a custom :class:`TreeBuilder` object can provide the
   following method:

   .. method:: doctype(name, pubid, system)

      Handles a doctype declaration. *name* is the doctype name. *pubid* is
      the public identifier. *system* is the system identifier. This method
      does not exist on the default :class:`TreeBuilder` class.

      .. versionadded:: 3.2

578
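As a minimal sketch of the call sequence described above, the builder can be
driven by hand instead of by a parser. The tag names, attribute, and text used
here are illustrative assumptions, not part of the API:

```python
from xml.etree.ElementTree import TreeBuilder, tostring

builder = TreeBuilder()

# Each start() must be balanced by a matching end().
builder.start("root", {"version": "1"})
builder.start("item", {})
builder.data("hello")          # text for the currently open <item>
builder.end("item")
builder.end("root")

# close() returns the toplevel Element.
root = builder.close()
print(tostring(root, encoding="unicode"))
```

A custom parser for an XML-like format would make the same calls as it
recognizes opening tags, character data, and closing tags.
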
.. _elementtree-xmlparser-objects:

XMLParser Objects
-----------------


.. class:: XMLParser(html=0, target=None, encoding=None)

   :class:`Element` structure builder for XML source data, based on the expat
   parser. *html* is a flag for the predefined HTML entities; it is not
   supported by the current implementation. *target* is the target object. If
   omitted, the builder uses an instance of the standard :class:`TreeBuilder`
   class. *encoding* [1]_ is optional. If given, the value overrides the
   encoding specified in the XML file.


   .. method:: close()

      Finishes feeding data to the parser. Returns the result of the
      *target*\'s :meth:`close` method, which is an element structure for the
      default target.


   .. method:: doctype(name, pubid, system)

      .. deprecated:: 3.2
         Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
         target.


   .. method:: feed(data)

      Feeds data to the parser. *data* is encoded data.

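The following sketch illustrates the feed and close cycle with the default
target. The chunk boundaries and the document content are arbitrary
assumptions chosen to show that input may be split anywhere, even in the
middle of a tag:

```python
from xml.etree.ElementTree import XMLParser

# Hypothetical input arriving in pieces, e.g. from a socket or a file.
chunks = ["<doc><ti", "tle>Hello</title>", "<body>text</body></doc>"]

parser = XMLParser()           # no target given: a TreeBuilder is used
for chunk in chunks:
    parser.feed(chunk)         # may be called any number of times

root = parser.close()          # returns the toplevel Element
print(root.tag, [child.tag for child in root])
```
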
:meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag, its :meth:`end` method for each closing tag,
and its :meth:`data` method for character data. :meth:`XMLParser.close`
calls *target*\'s :meth:`close` method.
:class:`XMLParser` can be used for more than building a tree structure.
This is an example of counting the maximum depth of an XML file::

   >>> from xml.etree.ElementTree import XMLParser
   >>> class MaxDepth:                     # The target object of the parser
   ...     maxDepth = 0
   ...     depth = 0
   ...     def start(self, tag, attrib):   # Called for each opening tag.
   ...         self.depth += 1
   ...         if self.depth > self.maxDepth:
   ...             self.maxDepth = self.depth
   ...     def end(self, tag):             # Called for each closing tag.
   ...         self.depth -= 1
   ...     def data(self, data):
   ...         pass            # We do not need to do anything with data.
   ...     def close(self):    # Called when all data has been parsed.
   ...         return self.maxDepth
   ...
   >>> target = MaxDepth()
   >>> parser = XMLParser(target=target)
   >>> exampleXml = """
   ... <a>
   ...   <b>
   ...   </b>
   ...   <b>
   ...     <c>
   ...       <d>
   ...       </d>
   ...     </c>
   ...   </b>
   ... </a>"""
   >>> parser.feed(exampleXml)
   >>> parser.close()
   4

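A custom target can also receive doctype declarations through the optional
:meth:`doctype` method described for :class:`TreeBuilder`. The sketch below is
one possible arrangement, not the only one: ``DoctypeRecorder`` and its
``doctype_info`` attribute are hypothetical names introduced for illustration,
and the XHTML doctype is just sample input:

```python
from xml.etree.ElementTree import TreeBuilder, XMLParser

class DoctypeRecorder(TreeBuilder):
    """Hypothetical TreeBuilder subclass that remembers the doctype."""

    def __init__(self):
        super().__init__()
        self.doctype_info = None

    def doctype(self, name, pubid, system):
        # Called by the parser when it sees a doctype declaration.
        self.doctype_info = (name, pubid, system)

target = DoctypeRecorder()
parser = XMLParser(target=target)
parser.feed('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
            '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
            '<html></html>')
root = parser.close()          # tree building is inherited from TreeBuilder
print(root.tag, target.doctype_info)
```

Because the subclass only adds :meth:`doctype`, the inherited :meth:`start`,
:meth:`data`, :meth:`end`, and :meth:`close` methods still build the tree.
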

.. rubric:: Footnotes

.. [#] The encoding string included in XML output should conform to the
   appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
   not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and http://www.iana.org/assignments/character-sets.