:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================

.. module:: xml.etree.ElementTree
   :synopsis: Implementation of the ElementTree API.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>

**Source code:** :source:`Lib/xml/etree/ElementTree.py`

--------------

The :class:`Element` type is a flexible container object, designed to store
hierarchical data structures in memory. The type can be described as a cross
between a list and a dictionary.

Each element has a number of properties associated with it:

* a tag, which is a string identifying what kind of data this element
  represents (the element type, in other words).

* a number of attributes, stored in a Python dictionary.

* a text string.

* an optional tail string.

* a number of child elements, stored in a Python sequence.

To create an element instance, use the :class:`Element` constructor or the
:func:`SubElement` factory function.

The :class:`ElementTree` class can be used to wrap an element structure, and
convert it from and to XML.
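
As a minimal sketch of the properties listed above, the following builds a
small tree with the :class:`Element` constructor and the :func:`SubElement`
factory, then wraps it in an :class:`ElementTree`. The tag names, the ``id``
attribute and the text values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Root element with one child; names and values are illustrative only.
catalog = ET.Element("catalog")
book = ET.SubElement(catalog, "book", {"id": "b1"})
book.text = "Dune"        # text between <book> and </book>
book.tail = " and more"   # text after </book>, before the next tag

# Wrap the root so the structure can be handled as a whole document.
tree = ET.ElementTree(catalog)
xml_text = ET.tostring(tree.getroot(), encoding="unicode")
print(xml_text)
```

The child behaves like a list item of its parent (``catalog[0]``), while its
attributes behave like dictionary entries (``book.get("id")``).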

A C implementation of this API is available as :mod:`xml.etree.cElementTree`.

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs. Fredrik Lundh's page is also the location of the development version of
``xml.etree.ElementTree``.

.. versionchanged:: 3.2
   The ElementTree API is updated to 1.3. For more information, see
   `Introducing ElementTree 1.3
   <http://effbot.org/zone/elementtree-13-intro.htm>`_.


.. _elementtree-functions:

Functions
---------


.. function:: Comment(text=None)

   Comment element factory. This factory function creates a special element
   that will be serialized as an XML comment by the standard serializer. The
   comment string can be either a bytestring or a Unicode string. *text* is a
   string containing the comment string. Returns an element instance
   representing a comment.


.. function:: dump(elem)

   Writes an element tree or element structure to ``sys.stdout``. This function
   should be used for debugging only.

   The exact output format is implementation dependent. In this version, it's
   written as an ordinary XML file.

   *elem* is an element tree or an individual element.


.. function:: fromstring(text)

   Parses an XML section from a string constant. Same as :func:`XML`. *text*
   is a string containing XML data. Returns an :class:`Element` instance.


.. function:: fromstringlist(sequence, parser=None)

   Parses an XML document from a sequence of string fragments. *sequence* is a
   list or other sequence containing XML data fragments. *parser* is an
   optional parser instance. If not given, the standard :class:`XMLParser`
   parser is used. Returns an :class:`Element` instance.

   .. versionadded:: 3.2
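
The two parsing functions above can be compared with a short sketch. The
sample document and the way it is split into fragments are arbitrary choices
made for this example.

```python
import xml.etree.ElementTree as ET

# Parse a complete document held in one string.
root_a = ET.fromstring("<root><item>1</item><item>2</item></root>")

# Parse the same document delivered as fragments; the split points are
# arbitrary and may even fall inside a tag.
fragments = ["<root><it", "em>1</item>", "<item>2</item></root>"]
root_b = ET.fromstringlist(fragments)

texts_a = [item.text for item in root_a]
texts_b = [item.text for item in root_b]
print(texts_a, texts_b)
```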


.. function:: iselement(element)

   Checks if an object appears to be a valid element object. *element* is an
   element instance. Returns a true value if this is an element object.


.. function:: iterparse(source, events=None, parser=None)

   Parses an XML section into an element tree incrementally, and reports what's
   going on to the user. *source* is a filename or :term:`file object`
   containing XML data. *events* is a list of events to report back. The
   supported events are the strings ``"start"``, ``"end"``, ``"start-ns"``
   and ``"end-ns"`` (the "ns" events are used to get detailed namespace
   information). If *events* is omitted, only ``"end"`` events are reported.
   *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. *parser* is not supported by
   ``cElementTree``. Returns an :term:`iterator` providing ``(event, elem)``
   pairs.

   .. note::

      :func:`iterparse` only guarantees that it has seen the ">"
      character of a starting tag when it emits a "start" event, so the
      attributes are defined, but the contents of the text and tail attributes
      are undefined at that point. The same applies to the element children;
      they may or may not be present.

      If you need a fully populated element, look for "end" events instead.
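
The advice in the note can be sketched as follows: text is read only on
``"end"`` events, when the element is fully populated. The in-memory
``io.BytesIO`` source and the ``entry`` tag are assumptions made for this
example; a filename works the same way.

```python
import io
import xml.etree.ElementTree as ET

# An in-memory file object stands in for a real file on disk.
source = io.BytesIO(b"<log><entry>first</entry><entry>second</entry></log>")

seen = []
for event, elem in ET.iterparse(source, events=("start", "end")):
    # On "start" only the tag and attributes are reliable, so the text is
    # read on "end", when the element is complete.
    if event == "end" and elem.tag == "entry":
        seen.append(elem.text)
        elem.clear()  # drop content that is no longer needed

print(seen)
```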


.. function:: parse(source, parser=None)

   Parses an XML section into an element tree. *source* is a filename or file
   object containing XML data. *parser* is an optional parser instance. If
   not given, the standard :class:`XMLParser` parser is used. Returns an
   :class:`ElementTree` instance.


.. function:: ProcessingInstruction(target, text=None)

   PI element factory. This factory function creates a special element that
   will be serialized as an XML processing instruction. *target* is a string
   containing the PI target. *text* is a string containing the PI contents, if
   given. Returns an element instance, representing a processing instruction.


.. function:: register_namespace(prefix, uri)

   Registers a namespace prefix. The registry is global, and any existing
   mapping for either the given prefix or the namespace URI will be removed.
   *prefix* is a namespace prefix. *uri* is a namespace URI. Tags and
   attributes in this namespace will be serialized with the given prefix, if at
   all possible.

   .. versionadded:: 3.2
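
A short sketch of the effect on serialization. The prefix ``ex`` and the URI
``http://example.org/ns`` are hypothetical. Because the registry is global,
the registration affects every later serialization in the same process.

```python
import xml.etree.ElementTree as ET

# Hypothetical prefix and namespace URI, chosen for this example.
ET.register_namespace("ex", "http://example.org/ns")

# Tags in a namespace are written in the {uri}local form.
elem = ET.Element("{http://example.org/ns}item")
serialized = ET.tostring(elem, encoding="unicode")
print(serialized)
```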


.. function:: SubElement(parent, tag, attrib={}, **extra)

   Subelement factory. This function creates an element instance, and appends
   it to an existing element.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *parent* is the parent element. *tag* is
   the subelement name. *attrib* is an optional dictionary, containing element
   attributes. *extra* contains additional attributes, given as keyword
   arguments. Returns an element instance.


.. function:: tostring(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string. *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an (optionally)
   encoded string containing the XML data.


.. function:: tostringlist(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string. *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns a list of
   (optionally) encoded strings containing the XML data. It does not guarantee
   any specific sequence, except that ``"".join(tostringlist(element)) ==
   tostring(element)``.

   .. versionadded:: 3.2
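
The difference between encoded and Unicode output can be sketched like this.
The sample element is made up; the only property relied on for
:func:`tostringlist` is that its pieces join to the :func:`tostring` result.

```python
import xml.etree.ElementTree as ET

elem = ET.XML("<greeting lang='en'>hello</greeting>")

as_bytes = ET.tostring(elem)                     # encoded output (bytes)
as_text = ET.tostring(elem, encoding="unicode")  # Unicode output (str)

# The pieces come in no guaranteed grouping, but joining them gives the
# same text as tostring() with the same arguments.
parts = ET.tostringlist(elem, encoding="unicode")
joined = "".join(parts)
print(as_text)
```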


.. function:: XML(text, parser=None)

   Parses an XML section from a string constant. This function can be used to
   embed "XML literals" in Python code. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns an :class:`Element` instance.


.. function:: XMLID(text, parser=None)

   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element IDs to elements. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns a tuple containing an
   :class:`Element` instance and a dictionary.
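
A brief sketch of the returned tuple, assuming the IDs come from ``id``
attributes as in this made-up document. Elements without an ``id`` attribute
do not appear in the dictionary.

```python
import xml.etree.ElementTree as ET

root, ids = ET.XMLID(
    '<doc><p id="intro">Hi</p><p id="end">Bye</p><p>no id</p></doc>'
)

id_names = sorted(ids)          # only elements carrying an id attribute
intro_text = ids["intro"].text  # the dictionary values are the elements
print(id_names, intro_text)
```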


.. _elementtree-element-objects:

Element Objects
---------------


.. class:: Element(tag, attrib={}, **extra)

   Element class. This class defines the Element interface, and provides a
   reference implementation of this interface.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *tag* is the element name. *attrib* is
   an optional dictionary, containing element attributes. *extra* contains
   additional attributes, given as keyword arguments.


   .. attribute:: tag

      A string identifying what kind of data this element represents (the
      element type, in other words).


   .. attribute:: text

      The *text* attribute can be used to hold additional data associated with
      the element. As the name implies, this attribute is usually a string but
      may be any application-specific object. If the element is created from
      an XML file, the attribute will contain any text found between the
      element tags.


   .. attribute:: tail

      The *tail* attribute can be used to hold additional data associated with
      the element. This attribute is usually a string but may be any
      application-specific object. If the element is created from an XML file,
      the attribute will contain any text found after the element's end tag and
      before the next tag.


   .. attribute:: attrib

      A dictionary containing the element's attributes. Note that while the
      *attrib* value is always a real mutable Python dictionary, an ElementTree
      implementation may choose to use another internal representation, and
      create the dictionary only if someone asks for it. To take advantage of
      such implementations, use the dictionary methods below whenever possible.

   The following dictionary-like methods work on the element attributes.


   .. method:: clear()

      Resets an element. This function removes all subelements, clears all
      attributes, and sets the text and tail attributes to ``None``.


   .. method:: get(key, default=None)

      Gets the element attribute named *key*.

      Returns the attribute value, or *default* if the attribute was not found.


   .. method:: items()

      Returns the element attributes as a sequence of (name, value) pairs. The
      attributes are returned in an arbitrary order.


   .. method:: keys()

      Returns the element's attribute names as a list. The names are returned
      in an arbitrary order.


   .. method:: set(key, value)

      Set the attribute *key* on the element to *value*.

   The following methods work on the element's children (subelements).


   .. method:: append(subelement)

      Adds the element *subelement* to the end of this element's internal list
      of subelements.


   .. method:: extend(subelements)

      Appends *subelements* from a sequence object with zero or more elements.
      Raises :exc:`AssertionError` if a subelement is not a valid object.

      .. versionadded:: 3.2


   .. method:: find(match)

      Finds the first subelement matching *match*. *match* may be a tag name
      or path. Returns an element instance or ``None``.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Returns a list
      containing all matching elements in document order.


   .. method:: findtext(match, default=None)

      Finds text for the first subelement matching *match*. *match* may be
      a tag name or path. Returns the text content of the first matching
      element, or *default* if no element was found. Note that if the matching
      element has no text content, an empty string is returned.


   .. method:: getchildren()

      .. deprecated:: 3.2
         Use ``list(elem)`` or iteration.


   .. method:: getiterator(tag=None)

      .. deprecated:: 3.2
         Use method :meth:`Element.iter` instead.


   .. method:: insert(index, element)

      Inserts a subelement at the given position in this element.


   .. method:: iter(tag=None)

      Creates a tree :term:`iterator` with the current element as the root.
      The iterator iterates over this element and all elements below it, in
      document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
      elements whose tag equals *tag* are returned from the iterator. If the
      tree structure is modified during iteration, the result is undefined.

      .. versionadded:: 3.2


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Returns an iterable
      yielding all matching elements in document order.

      .. versionadded:: 3.2


   .. method:: itertext()

      Creates a text iterator. The iterator loops over this element and all
      subelements, in document order, and returns all inner text.

      .. versionadded:: 3.2


   .. method:: makeelement(tag, attrib)

      Creates a new element object of the same type as this element. Do not
      call this method; use the :func:`SubElement` factory function instead.


   .. method:: remove(subelement)

      Removes *subelement* from the element. Unlike the find\* methods, this
      method compares elements based on the instance identity, not on tag value
      or contents.

   :class:`Element` objects also support the following sequence type methods
   for working with subelements: :meth:`__delitem__`, :meth:`__getitem__`,
   :meth:`__setitem__`, :meth:`__len__`.

   Caution: Elements with no subelements will test as ``False``. This behavior
   will change in future versions. Use an explicit ``len(elem)`` or ``elem is
   None`` test instead. ::

     element = root.find('foo')

     if not element:  # careful!
         print("element not found, or element has no subelements")

     if element is None:
         print("element not found")
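
The :class:`Element` methods described in this section can be combined as in
the following sketch. The sample markup is made up for illustration, and the
variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

root = ET.XML(
    "<p>Moved to <a href='http://example.org/'>example.org</a>"
    " or <a href='http://example.com/'>example.com</a>.</p>"
)

# Dictionary-like attribute access.
first_link = root.find("a")                  # first matching child, or None
first_link.set("target", "blank")
href = first_link.get("href")
title = first_link.get("title", "untitled")  # default for a missing attribute
attr_names = sorted(first_link.keys())       # order is arbitrary, so sort

# Searching and iterating.
all_links = root.findall("a")                # list, in document order
link_texts = [a.text for a in root.iter("a")]
full_text = "".join(root.itertext())         # text and tails, document order
missing = root.findtext("img", default="n/a")

# Sequence-like access to the children.
child_count = len(root)
root.remove(all_links[1])                    # matched by identity
remaining = [child.tag for child in root]

# Explicit tests instead of truth-testing an element.
found = first_link is not None
has_children = len(first_link) > 0
```

Here ``first_link`` has no subelements, so ``bool(first_link)`` would be
misleading; the explicit ``is not None`` and ``len()`` tests say what is
actually meant.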


.. _elementtree-elementtree-objects:

ElementTree Objects
-------------------


.. class:: ElementTree(element=None, file=None)

   ElementTree wrapper class. This class represents an entire element
   hierarchy, and adds some extra support for serialization to and from
   standard XML.

   *element* is the root element. The tree is initialized with the contents
   of the XML *file* if given.


   .. method:: _setroot(element)

      Replaces the root element for this tree. This discards the current
      contents of the tree, and replaces it with the given element. Use with
      care. *element* is an element instance.


   .. method:: find(match)

      Finds the first toplevel element matching *match*. *match* may be a tag
      name or path. Same as ``getroot().find(match)``. Returns the first
      matching element, or ``None`` if no element was found.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().findall(match)``. *match* may be a tag name or path. Returns
      a list containing all matching elements, in document order.


   .. method:: findtext(match, default=None)

      Finds the element text for the first toplevel element with given tag.
      Same as ``getroot().findtext(match)``. *match* may be a tag name or path.
      *default* is the value to return if the element was not found. Returns
      the text content of the first matching element, or the default value if
      no element was found. Note that if the element is found, but has no text
      content, this method returns an empty string.


   .. method:: getiterator(tag=None)

      .. deprecated:: 3.2
         Use method :meth:`ElementTree.iter` instead.


   .. method:: getroot()

      Returns the root element for this tree.


   .. method:: iter(tag=None)

      Creates and returns a tree iterator for the root element. The iterator
      loops over all elements in this tree, in document order. *tag* is the tag
      to look for (default is to return all elements).


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().iterfind(match)``. Returns an iterable yielding all matching
      elements in document order.

      .. versionadded:: 3.2


   .. method:: parse(source, parser=None)

      Loads an external XML section into this element tree. *source* is a file
      name or :term:`file object`. *parser* is an optional parser instance.
      If not given, the standard :class:`XMLParser` parser is used. Returns the
      section root element.


   .. method:: write(file, encoding="us-ascii", xml_declaration=None, \
                     default_namespace=None, method="xml")

      Writes the element tree to a file, as XML. *file* is a file name, or a
      :term:`file object` opened for writing. *encoding* [1]_ is the output
      encoding (default is US-ASCII). Use ``encoding="unicode"`` to write a
      Unicode string. *xml_declaration* controls if an XML declaration
      should be added to the file. Use ``False`` for never, ``True`` for
      always, ``None`` for only if not US-ASCII or UTF-8 or Unicode (default is
      ``None``). *default_namespace* sets the default XML namespace (for
      "xmlns"). *method* is either ``"xml"``, ``"html"`` or ``"text"`` (default
      is ``"xml"``).

This is the XML file that is going to be manipulated::

    <html>
        <head>
            <title>Example page</title>
        </head>
        <body>
            <p>Moved to <a href="http://example.org/">example.org</a>
            or <a href="http://example.com/">example.com</a>.</p>
        </body>
    </html>

Example of changing the attribute "target" of every link in the first paragraph::

    >>> from xml.etree.ElementTree import ElementTree
    >>> tree = ElementTree()
    >>> tree.parse("index.xhtml")
    <Element 'html' at 0xb77e6fac>
    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
    >>> p
    <Element 'p' at 0xb77ec26c>
    >>> links = list(p.iter("a"))   # Returns list of all links
    >>> links
    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
    >>> for i in links:             # Iterates through all found links
    ...     i.attrib["target"] = "blank"
    >>> tree.write("output.xhtml")
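
The same manipulation can be tried without files on disk. This sketch assumes
an in-memory ``io.StringIO`` buffer as the *file* argument and uses
``encoding="unicode"`` so that :meth:`ElementTree.write` produces text; the
shortened markup is made up for the example.

```python
import io
import xml.etree.ElementTree as ET

tree = ET.ElementTree(ET.XML("<p><a href='http://example.org/'>x</a></p>"))

# Set the "target" attribute on every link in the tree.
for link in tree.iter("a"):
    link.set("target", "blank")

# Write to a text buffer instead of a named file.
buffer = io.StringIO()
tree.write(buffer, encoding="unicode")
result = buffer.getvalue()
print(result)
```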

.. _elementtree-qname-objects:

QName Objects
-------------


.. class:: QName(text_or_uri, tag=None)

   QName wrapper. This can be used to wrap a QName attribute value, in order
   to get proper namespace handling on output. *text_or_uri* is a string
   containing the QName value, in the form {uri}local, or, if the tag argument
   is given, the URI part of a QName. If *tag* is given, the first argument is
   interpreted as a URI, and this argument is interpreted as a local name.
   :class:`QName` instances are opaque.
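
A small sketch of both constructor forms and of a wrapped attribute value.
The namespace URI and the ``kind`` attribute are hypothetical, and the prefix
chosen on output depends on the serializer and the global namespace registry,
so it is not shown here.

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/ns"  # hypothetical namespace URI

# The two constructor forms describe the same qualified name.
q_single = ET.QName("{%s}item" % NS)
q_pair = ET.QName(NS, "item")
same = str(q_single) == str(q_pair)

# Wrapping an attribute value lets the serializer declare the namespace
# and write the value as prefix:local.
elem = ET.Element("root")
elem.set("kind", q_pair)
serialized = ET.tostring(elem, encoding="unicode")
print(serialized)
```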


.. _elementtree-treebuilder-objects:

TreeBuilder Objects
-------------------


.. class:: TreeBuilder(element_factory=None)

   Generic element structure builder. This builder converts a sequence of
   start, data, and end method calls to a well-formed element structure. You
   can use this class to build an element structure using a custom XML parser,
   or a parser for some other XML-like format. The *element_factory* is called
   to create new :class:`Element` instances when given.


   .. method:: close()

      Flushes the builder buffers, and returns the toplevel document
      element. Returns an :class:`Element` instance.


   .. method:: data(data)

      Adds text to the current element. *data* is a string. This should be
      either a bytestring, or a Unicode string.


   .. method:: end(tag)

      Closes the current element. *tag* is the element name. Returns the
      closed element.


   .. method:: start(tag, attrs)

      Opens a new element. *tag* is the element name. *attrs* is a dictionary
      containing element attributes. Returns the opened element.


   In addition, a custom :class:`TreeBuilder` object can provide the
   following method:

   .. method:: doctype(name, pubid, system)

      Handles a doctype declaration. *name* is the doctype name. *pubid* is
      the public identifier. *system* is the system identifier. This method
      does not exist on the default :class:`TreeBuilder` class.

      .. versionadded:: 3.2
Georg Brandl116aa622007-08-15 14:28:22 +0000578
579
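As a rough sketch of the call sequence described above, the following drives a
:class:`TreeBuilder` by hand, much as a custom parser might. The tag names,
the ``id`` attribute, and the text are made up for illustration.

```python
from xml.etree.ElementTree import TreeBuilder, tostring

builder = TreeBuilder()

# start() opens an element, data() adds text to it, and end() closes it.
builder.start("root", {})
builder.start("item", {"id": "1"})   # hypothetical tag and attribute
builder.data("hello")
builder.end("item")
builder.end("root")

# close() returns the toplevel Element.
root = builder.close()
print(tostring(root, encoding="unicode"))
```

Each ``start`` call must be balanced by a matching ``end`` call before
``close`` is called, since the builder only tracks the elements that are
currently open.
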

.. _elementtree-xmlparser-objects:

XMLParser Objects
-----------------


.. class:: XMLParser(html=0, target=None, encoding=None)

   :class:`Element` structure builder for XML source data, based on the expat
   parser. *html* is a flag for predefined HTML entities; it is not supported
   by the current implementation. *target* is the target object. If omitted,
   the builder uses an instance of the standard :class:`TreeBuilder` class.
   *encoding* [1]_ is optional. If given, the value overrides the encoding
   specified in the XML file.


   .. method:: close()

      Finishes feeding data to the parser. Returns the result of calling the
      *target*\'s :meth:`close` method; with the default target, this is the
      toplevel document element.


   .. method:: doctype(name, pubid, system)

      .. deprecated:: 3.2
         Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
         target.


   .. method:: feed(data)

      Feeds data to the parser. *data* is encoded data.

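The following is a minimal sketch of incremental parsing with the default
target. The input chunks are invented for illustration and are deliberately
split at arbitrary points, including inside a tag.

```python
from xml.etree.ElementTree import XMLParser

parser = XMLParser()  # no target given, so a standard TreeBuilder is used

# Encoded data may arrive in arbitrary pieces (made-up input).
chunks = [b"<root><it", b"em>some ", b"text</item>", b"</root>"]
for chunk in chunks:
    parser.feed(chunk)

# With the default target, close() returns the toplevel Element.
root = parser.close()
print(root.tag, root[0].tag, root[0].text)
```

Feeding data in pieces like this can be useful when the document is read from
a socket or a large file and should not be held in memory as one string.
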
:meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag, its :meth:`end` method for each closing tag,
and its :meth:`data` method for character data. :meth:`XMLParser.close`
calls *target*\'s :meth:`close` method.
:class:`XMLParser` is not limited to building a tree structure.
This is an example of counting the maximum depth of an XML file::

   >>> from xml.etree.ElementTree import XMLParser
   >>> class MaxDepth:                     # The target object of the parser
   ...     maxDepth = 0
   ...     depth = 0
   ...     def start(self, tag, attrib):   # Called for each opening tag.
   ...         self.depth += 1
   ...         if self.depth > self.maxDepth:
   ...             self.maxDepth = self.depth
   ...     def end(self, tag):             # Called for each closing tag.
   ...         self.depth -= 1
   ...     def data(self, data):
   ...         pass            # We do not need to do anything with data.
   ...     def close(self):    # Called when all data has been parsed.
   ...         return self.maxDepth
   ...
   >>> target = MaxDepth()
   >>> parser = XMLParser(target=target)
   >>> exampleXml = """
   ... <a>
   ...   <b>
   ...   </b>
   ...   <b>
   ...     <c>
   ...       <d>
   ...       </d>
   ...     </c>
   ...   </b>
   ... </a>"""
   >>> parser.feed(exampleXml)
   >>> parser.close()
   4


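A target can also define the :meth:`doctype` method described under
:class:`TreeBuilder`. The sketch below assumes a hypothetical
``DoctypeRecorder`` target that ignores the document content and only records
the doctype declaration; the class name and the sample document are
illustrative and not part of the module.

```python
from xml.etree.ElementTree import XMLParser

class DoctypeRecorder:
    """Hypothetical target that only records the doctype declaration."""

    doctype_info = None

    def doctype(self, name, pubid, system):
        # Called once if the document has a doctype declaration.
        self.doctype_info = (name, pubid, system)

    def start(self, tag, attrib):
        pass

    def end(self, tag):
        pass

    def data(self, data):
        pass

    def close(self):
        # Whatever is returned here becomes the result of XMLParser.close().
        return self.doctype_info

parser = XMLParser(target=DoctypeRecorder())
parser.feed(
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    "<html></html>"
)
info = parser.close()
print(info)
```
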

.. rubric:: Footnotes

.. [#] The encoding string included in XML output should conform to the
   appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
   not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and http://www.iana.org/assignments/character-sets.