:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================

.. module:: xml.etree.ElementTree
   :synopsis: Implementation of the ElementTree API.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>

**Source code:** :source:`Lib/xml/etree/ElementTree.py`

--------------

The :class:`Element` type is a flexible container object, designed to store
hierarchical data structures in memory. The type can be described as a cross
between a list and a dictionary.

Each element has a number of properties associated with it:

* a tag which is a string identifying what kind of data this element represents
  (the element type, in other words).

* a number of attributes, stored in a Python dictionary.

* a text string.

* an optional tail string.

* a number of child elements, stored in a Python sequence.

To create an element instance, use the :class:`Element` constructor or the
:func:`SubElement` factory function.

The :class:`ElementTree` class can be used to wrap an element structure, and
convert it from and to XML.
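
The properties listed above can be seen together in a minimal sketch. The tag names, the ``id`` attribute and the text values are invented for illustration only.

```python
import xml.etree.ElementTree as ET

# Build <catalog><book id="b1">Dune</book> and more</catalog> by hand.
root = ET.Element("catalog")
book = ET.SubElement(root, "book", {"id": "b1"})  # created and appended to root
book.text = "Dune"        # text between <book> and </book>
book.tail = " and more"   # text after </book>, before the next tag

# Wrap the structure when file-level operations (parse/write) are needed.
tree = ET.ElementTree(root)

serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```

Here ``text`` and ``tail`` are plain attributes on the element, while children are reached through the sequence interface (``root[0]`` is ``book``).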

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs.

.. versionchanged:: 3.2
   The ElementTree API is updated to 1.3. For more information, see
   `Introducing ElementTree 1.3
   <http://effbot.org/zone/elementtree-13-intro.htm>`_.

.. versionchanged:: 3.3
   This module will use a fast implementation whenever available.
   The :mod:`xml.etree.cElementTree` module is deprecated.


.. _elementtree-xpath:

XPath support
-------------

This module provides limited support for
`XPath expressions <http://www.w3.org/TR/xpath>`_ for locating elements in a
tree. The goal is to support a small subset of the abbreviated syntax; a full
XPath engine is outside the scope of the module.

Example
^^^^^^^

Here's an example that demonstrates some of the XPath capabilities of the
module::

   import xml.etree.ElementTree as ET

   xml = r'''<?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank>1</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank>4</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank>68</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>
   '''

   tree = ET.fromstring(xml)

   # Top-level elements
   tree.findall(".")

   # All 'neighbor' grand-children of 'country' children of the top-level
   # elements
   tree.findall("./country/neighbor")

   # Nodes with name='Singapore' that have a 'year' child
   tree.findall(".//year/..[@name='Singapore']")

   # 'year' nodes that are children of nodes with name='Singapore'
   tree.findall(".//*[@name='Singapore']/year")

   # All 'neighbor' nodes that are the second child of their parent
   tree.findall(".//neighbor[2]")

Supported XPath syntax
^^^^^^^^^^^^^^^^^^^^^^

+-----------------------+------------------------------------------------------+
| Syntax                | Meaning                                              |
+=======================+======================================================+
| ``tag``               | Selects all child elements with the given tag.       |
|                       | For example, ``spam`` selects all child elements     |
|                       | named ``spam``, ``spam/egg`` selects all             |
|                       | grandchildren named ``egg`` in all children named    |
|                       | ``spam``.                                            |
+-----------------------+------------------------------------------------------+
| ``*``                 | Selects all child elements. For example, ``*/egg``   |
|                       | selects all grandchildren named ``egg``.             |
+-----------------------+------------------------------------------------------+
| ``.``                 | Selects the current node. This is mostly useful      |
|                       | at the beginning of the path, to indicate that it's  |
|                       | a relative path.                                     |
+-----------------------+------------------------------------------------------+
| ``//``                | Selects all subelements, on all levels beneath the   |
|                       | current element. For example, ``.//egg`` selects     |
|                       | all ``egg`` elements in the entire tree.             |
+-----------------------+------------------------------------------------------+
| ``..``                | Selects the parent element.                          |
+-----------------------+------------------------------------------------------+
| ``[@attrib]``         | Selects all elements that have the given attribute.  |
+-----------------------+------------------------------------------------------+
| ``[@attrib='value']`` | Selects all elements for which the given attribute   |
|                       | has the given value. The value cannot contain        |
|                       | quotes.                                              |
+-----------------------+------------------------------------------------------+
| ``[tag]``             | Selects all elements that have a child named         |
|                       | ``tag``. Only immediate children are supported.      |
+-----------------------+------------------------------------------------------+
| ``[position]``        | Selects all elements that are located at the given   |
|                       | position. The position can be either an integer      |
|                       | (1 is the first position), the expression ``last()`` |
|                       | (for the last position), or a position relative to   |
|                       | the last position (e.g. ``last()-1``).               |
+-----------------------+------------------------------------------------------+

Predicates (expressions within square brackets) must be preceded by a tag
name, an asterisk, or another predicate. ``position`` predicates must be
preceded by a tag name.
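
The predicate rules above can be tried out with a short sketch. The document and its element names are made up for the illustration; each path places its predicate after a tag name, an asterisk, or another predicate, as required.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<country name='A'><rank>1</rank>"
    "<neighbor name='X'/><neighbor name='Y'/></country>"
    "<country name='B'><neighbor name='Z'/></country>"
    "</data>"
)

# [tag]: countries that have an immediate <rank> child.
with_rank = [c.get("name") for c in root.findall("country[rank]")]

# [@attrib='value'] after an asterisk: any child whose name is 'B'.
named_b = [c.get("name") for c in root.findall("*[@name='B']")]

# [position] after a tag name: the last <neighbor> of each country.
last_neighbors = [n.get("name") for n in root.findall("country/neighbor[last()]")]

# A predicate following another predicate.
named_with_rank = [c.get("name") for c in root.findall("country[@name][rank]")]

print(with_rank, named_b, last_neighbors, named_with_rank)
```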

Reference
---------

.. _elementtree-functions:

Functions
^^^^^^^^^


.. function:: Comment(text=None)

   Comment element factory. This factory function creates a special element
   that will be serialized as an XML comment by the standard serializer. The
   comment string can be either a bytestring or a Unicode string. *text* is a
   string containing the comment string. Returns an element instance
   representing a comment.
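
As a rough illustration of the comment factory, the sketch below appends a comment to a hand-built element and serializes the result. The element names and the comment text are placeholders.

```python
import xml.etree.ElementTree as ET

root = ET.Element("config")
# A comment is an ordinary child as far as the tree is concerned.
root.append(ET.Comment("generated file, do not edit"))
ET.SubElement(root, "option", name="debug")

serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```

The comment element can be told apart from regular elements by its ``tag``, which is the ``Comment`` factory itself rather than a string.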


.. function:: dump(elem)

   Writes an element tree or element structure to ``sys.stdout``. This function
   should be used for debugging only.

   The exact output format is implementation dependent. In this version, it's
   written as an ordinary XML file.

   *elem* is an element tree or an individual element.


.. function:: fromstring(text)

   Parses an XML section from a string constant. Same as :func:`XML`. *text*
   is a string containing XML data. Returns an :class:`Element` instance.


.. function:: fromstringlist(sequence, parser=None)

   Parses an XML document from a sequence of string fragments. *sequence* is a
   list or other sequence containing XML data fragments. *parser* is an
   optional parser instance. If not given, the standard :class:`XMLParser`
   parser is used. Returns an :class:`Element` instance.

   .. versionadded:: 3.2
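
A small sketch comparing the two string parsers described above. The XML content is invented; the point is only that the fragments passed to :func:`fromstringlist` do not have to line up with element boundaries.

```python
import xml.etree.ElementTree as ET

whole = ET.fromstring("<root><item>1</item><item>2</item></root>")

# The same document, delivered in arbitrary pieces.
fragments = ["<root><it", "em>1</item>", "<item>2</item></root>"]
pieced = ET.fromstringlist(fragments)

# Both calls return the root Element of an equivalent tree.
same = ET.tostring(whole) == ET.tostring(pieced)
print(same)
```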


.. function:: iselement(element)

   Checks if an object appears to be a valid element object. *element* is an
   element instance. Returns a true value if this is an element object.


.. function:: iterparse(source, events=None, parser=None)

   Parses an XML section into an element tree incrementally, and reports what's
   going on to the user. *source* is a filename or :term:`file object`
   containing XML data. *events* is a list of events to report back. The
   supported events are the strings ``"start"``, ``"end"``, ``"start-ns"``
   and ``"end-ns"`` (the "ns" events are used to get detailed namespace
   information). If *events* is omitted, only ``"end"`` events are reported.
   *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns an :term:`iterator` providing
   ``(event, elem)`` pairs.

   .. note::

      :func:`iterparse` only guarantees that it has seen the ">"
      character of a starting tag when it emits a "start" event, so the
      attributes are defined, but the contents of the text and tail attributes
      are undefined at that point. The same applies to the element children;
      they may or may not be present.

      If you need a fully populated element, look for "end" events instead.
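
The difference between "start" and "end" events might be sketched as follows. The in-memory ``BytesIO`` source and the ``log``/``entry`` tags are assumptions made for the example; a filename works the same way.

```python
import io
import xml.etree.ElementTree as ET

source = io.BytesIO(b"<log><entry>first</entry><entry>second</entry></log>")

seen = []
for event, elem in ET.iterparse(source, events=("start", "end")):
    if event == "start":
        # Only the tag and the attributes are reliable at this point.
        seen.append(("start", elem.tag))
    else:
        # At "end" the element is fully populated, so .text is safe to read.
        seen.append(("end", elem.tag, elem.text))

ends = [item for item in seen if item[0] == "end"]
print(ends)
```

The example reads ``elem.text`` only on "end" events, following the note above.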


.. function:: parse(source, parser=None)

   Parses an XML section into an element tree. *source* is a filename or file
   object containing XML data. *parser* is an optional parser instance. If
   not given, the standard :class:`XMLParser` parser is used. Returns an
   :class:`ElementTree` instance.


.. function:: ProcessingInstruction(target, text=None)

   PI element factory. This factory function creates a special element that
   will be serialized as an XML processing instruction. *target* is a string
   containing the PI target. *text* is a string containing the PI contents, if
   given. Returns an element instance representing a processing instruction.


.. function:: register_namespace(prefix, uri)

   Registers a namespace prefix. The registry is global, and any existing
   mapping for either the given prefix or the namespace URI will be removed.
   *prefix* is a namespace prefix. *uri* is a namespace URI. Tags and
   attributes in this namespace will be serialized with the given prefix, if at
   all possible.

   .. versionadded:: 3.2
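
A minimal sketch of how a registered prefix shows up on output. The prefix ``ex`` and the URI ``http://example.org/ns`` are made-up values; because the registry is global, the registration affects every later serialization in the process.

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/ns"  # hypothetical namespace URI
ET.register_namespace("ex", NS)

# Namespaced tags are written in the {uri}local form.
root = ET.Element("{%s}root" % NS)
ET.SubElement(root, "{%s}child" % NS)

serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```

Without the registration, the serializer would fall back to an automatically generated prefix.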


.. function:: SubElement(parent, tag, attrib={}, **extra)

   Subelement factory. This function creates an element instance, and appends
   it to an existing element.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *parent* is the parent element. *tag* is
   the subelement name. *attrib* is an optional dictionary, containing element
   attributes. *extra* contains additional attributes, given as keyword
   arguments. Returns an element instance.


.. function:: tostring(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string. *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an (optionally)
   encoded string containing the XML data.
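
The effect of the *encoding* and *method* arguments can be illustrated with a short sketch. The ``greeting`` element and its text are invented for the example.

```python
import xml.etree.ElementTree as ET

elem = ET.Element("greeting")
elem.text = "caf\u00e9"  # contains one non-ASCII character

# Default US-ASCII output is a bytes object; non-ASCII characters are
# written as numeric character references.
as_bytes = ET.tostring(elem)

# encoding="unicode" returns a str and keeps the character as-is.
as_text = ET.tostring(elem, encoding="unicode")

# method="text" drops the markup and keeps only the text content.
as_plain = ET.tostring(elem, encoding="unicode", method="text")

print(as_bytes, as_text, as_plain)
```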


.. function:: tostringlist(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string. *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns a list of
   (optionally) encoded strings containing the XML data. It does not guarantee
   any specific sequence, except that ``"".join(tostringlist(element)) ==
   tostring(element)``.

   .. versionadded:: 3.2


.. function:: XML(text, parser=None)

   Parses an XML section from a string constant. This function can be used to
   embed "XML literals" in Python code. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns an :class:`Element` instance.


.. function:: XMLID(text, parser=None)

   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element IDs to elements. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns a tuple containing an
   :class:`Element` instance and a dictionary.
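
A brief sketch of the tuple returned by :func:`XMLID`. The document and the ``id`` values are placeholders chosen for the example.

```python
import xml.etree.ElementTree as ET

root, by_id = ET.XMLID(
    "<doc>"
    "<section id='intro'>Hello</section>"
    "<section id='end'>Bye</section>"
    "<section>no id here</section>"
    "</doc>"
)

# The dictionary maps each id attribute value to its element.
intro = by_id["intro"]
print(root.tag, sorted(by_id), intro.text)
```

Elements without an ``id`` attribute simply do not appear in the dictionary.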


.. _elementtree-element-objects:

Element Objects
^^^^^^^^^^^^^^^

.. class:: Element(tag, attrib={}, **extra)

   Element class. This class defines the Element interface, and provides a
   reference implementation of this interface.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *tag* is the element name. *attrib* is
   an optional dictionary, containing element attributes. *extra* contains
   additional attributes, given as keyword arguments.


   .. attribute:: tag

      A string identifying what kind of data this element represents (the
      element type, in other words).


   .. attribute:: text

      The *text* attribute can be used to hold additional data associated with
      the element. As the name implies, this attribute is usually a string but
      may be any application-specific object. If the element is created from
      an XML file, the attribute will contain any text found between the element
      tags.


   .. attribute:: tail

      The *tail* attribute can be used to hold additional data associated with
      the element. This attribute is usually a string but may be any
      application-specific object. If the element is created from an XML file,
      the attribute will contain any text found after the element's end tag and
      before the next tag.


   .. attribute:: attrib

      A dictionary containing the element's attributes. Note that while the
      *attrib* value is always a real mutable Python dictionary, an ElementTree
      implementation may choose to use another internal representation, and
      create the dictionary only if someone asks for it. To take advantage of
      such implementations, use the dictionary methods below whenever possible.

   The following dictionary-like methods work on the element attributes.


   .. method:: clear()

      Resets an element. This function removes all subelements, clears all
      attributes, and sets the text and tail attributes to ``None``.


   .. method:: get(key, default=None)

      Gets the element attribute named *key*.

      Returns the attribute value, or *default* if the attribute was not found.


   .. method:: items()

      Returns the element attributes as a sequence of (name, value) pairs. The
      attributes are returned in an arbitrary order.


   .. method:: keys()

      Returns the element's attribute names as a list. The names are returned
      in an arbitrary order.


   .. method:: set(key, value)

      Set the attribute *key* on the element to *value*.

   The following methods work on the element's children (subelements).


   .. method:: append(subelement)

      Adds the element *subelement* to the end of this element's internal list
      of subelements. Raises :exc:`TypeError` if *subelement* is not an
      :class:`Element`.


   .. method:: extend(subelements)

      Appends *subelements* from a sequence object with zero or more elements.
      Raises :exc:`TypeError` if a subelement is not an :class:`Element`.

      .. versionadded:: 3.2


   .. method:: find(match)

      Finds the first subelement matching *match*. *match* may be a tag name
      or a :ref:`path <elementtree-xpath>`. Returns an element instance
      or ``None``.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or
      :ref:`path <elementtree-xpath>`. Returns a list containing all matching
      elements in document order.


   .. method:: findtext(match, default=None)

      Finds text for the first subelement matching *match*. *match* may be
      a tag name or a :ref:`path <elementtree-xpath>`. Returns the text content
      of the first matching element, or *default* if no element was found.
      Note that if the matching element has no text content an empty string
      is returned.


   .. method:: getchildren()

      .. deprecated:: 3.2
         Use ``list(elem)`` or iteration.


   .. method:: getiterator(tag=None)

      .. deprecated:: 3.2
         Use method :meth:`Element.iter` instead.


   .. method:: insert(index, subelement)

      Inserts *subelement* at the given position in this element. Raises
      :exc:`TypeError` if *subelement* is not an :class:`Element`.


   .. method:: iter(tag=None)

      Creates a tree :term:`iterator` with the current element as the root.
      The iterator iterates over this element and all elements below it, in
      document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
      elements whose tag equals *tag* are returned from the iterator. If the
      tree structure is modified during iteration, the result is undefined.

      .. versionadded:: 3.2


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or
      :ref:`path <elementtree-xpath>`. Returns an iterable yielding all
      matching elements in document order.

      .. versionadded:: 3.2


   .. method:: itertext()

      Creates a text iterator. The iterator loops over this element and all
      subelements, in document order, and returns all inner text.

      .. versionadded:: 3.2


   .. method:: makeelement(tag, attrib)

      Creates a new element object of the same type as this element. Do not
      call this method, use the :func:`SubElement` factory function instead.


   .. method:: remove(subelement)

      Removes *subelement* from the element. Unlike the find\* methods this
      method compares elements based on the instance identity, not on tag value
      or contents.

   :class:`Element` objects also support the following sequence type methods
   for working with subelements: :meth:`__delitem__`, :meth:`__getitem__`,
   :meth:`__setitem__`, :meth:`__len__`.

   Caution: Elements with no subelements will test as ``False``. This behavior
   will change in future versions. Use specific ``len(elem)`` or ``elem is
   None`` test instead. ::

      element = root.find('foo')

      if not element:  # careful!
          print("element not found, or element has no subelements")

      if element is None:
          print("element not found")
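
The attribute and child methods of :class:`Element` described in this section can be exercised together in a small sketch. The inventory document, the ``sku`` and ``stock`` attributes and the ``price`` lookup are all invented for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<inventory>"
    "<item sku='a1'><name>bolt</name><note/></item>"
    "<item sku='b2'><name>nut</name></item>"
    "</inventory>"
)

# Dictionary-like attribute access.
first = root.find("item")
first.set("stock", "40")
sku = first.get("sku")
missing = first.get("colour", "n/a")   # default when the attribute is absent
attr_names = sorted(first.keys())

# Searching children.
names = [item.findtext("name") for item in root.findall("item")]
empty_note = first.findtext("note")                      # element exists, no text
no_price = first.findtext("price", default="unknown")    # element not found

# Depth-first iteration over the whole subtree.
tags = [elem.tag for elem in root.iter()]

# remove() works on instance identity, so pass the element object itself.
root.remove(first)
remaining = len(root)

print(sku, missing, attr_names, names, repr(empty_note), no_price, tags, remaining)
```

Note how ``findtext`` distinguishes an existing but empty element (empty string) from a missing one (the *default*).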


.. _elementtree-elementtree-objects:

ElementTree Objects
^^^^^^^^^^^^^^^^^^^


.. class:: ElementTree(element=None, file=None)

   ElementTree wrapper class. This class represents an entire element
   hierarchy, and adds some extra support for serialization to and from
   standard XML.

   *element* is the root element. The tree is initialized with the contents
   of the XML *file* if given.


   .. method:: _setroot(element)

      Replaces the root element for this tree. This discards the current
      contents of the tree, and replaces it with the given element. Use with
      care. *element* is an element instance.


   .. method:: find(match)

      Same as :meth:`Element.find`, starting at the root of the tree.


   .. method:: findall(match)

      Same as :meth:`Element.findall`, starting at the root of the tree.


   .. method:: findtext(match, default=None)

      Same as :meth:`Element.findtext`, starting at the root of the tree.


   .. method:: getiterator(tag=None)

      .. deprecated:: 3.2
         Use method :meth:`ElementTree.iter` instead.


   .. method:: getroot()

      Returns the root element for this tree.


   .. method:: iter(tag=None)

      Creates and returns a tree iterator for the root element. The iterator
      loops over all elements in this tree, in section order. *tag* is the tag
      to look for (default is to return all elements).


   .. method:: iterfind(match)

      Same as :meth:`Element.iterfind`, starting at the root of the tree.

      .. versionadded:: 3.2


   .. method:: parse(source, parser=None)

      Loads an external XML section into this element tree. *source* is a file
      name or :term:`file object`. *parser* is an optional parser instance.
      If not given, the standard :class:`XMLParser` parser is used. Returns the
      section root element.


   .. method:: write(file, encoding="us-ascii", xml_declaration=None, method="xml")

      Writes the element tree to a file, as XML. *file* is a file name, or a
      :term:`file object` opened for writing. *encoding* [1]_ is the output
      encoding (default is US-ASCII). Use ``encoding="unicode"`` to write a
      Unicode string. *xml_declaration* controls if an XML declaration
      should be added to the file. Use ``False`` for never, ``True`` for always,
      ``None`` for only if not US-ASCII or UTF-8 or Unicode (default is
      ``None``). *method* is either ``"xml"``, ``"html"`` or ``"text"``
      (default is ``"xml"``).
Georg Brandl116aa622007-08-15 14:28:22 +0000585
This is the XML file that is going to be manipulated::

    <html>
        <head>
            <title>Example page</title>
        </head>
        <body>
            <p>Moved to <a href="http://example.org/">example.org</a>
            or <a href="http://example.com/">example.com</a>.</p>
        </body>
    </html>

Example of changing the attribute "target" of every link in the first paragraph::

    >>> from xml.etree.ElementTree import ElementTree
    >>> tree = ElementTree()
    >>> tree.parse("index.xhtml")
    <Element 'html' at 0xb77e6fac>
    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
    >>> p
    <Element 'p' at 0xb77ec26c>
    >>> links = list(p.iter("a"))   # Returns list of all links
    >>> links
    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
    >>> for i in links:             # Iterates through all found links
    ...     i.attrib["target"] = "blank"
    >>> tree.write("output.xhtml")

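The *encoding* and *xml_declaration* arguments of :meth:`ElementTree.write`
can be tried out without touching the file system. The following is a minimal
sketch, not part of the original example: the sample markup and the variable
names are assumptions, and in-memory file objects from :mod:`io` stand in for
real files.

```python
import io
from xml.etree.ElementTree import ElementTree

# Hypothetical sample document, parsed from an in-memory file object.
source = io.StringIO('<p>Moved to <a href="http://example.org/">example.org</a>.</p>')
tree = ElementTree()
tree.parse(source)

# Encoded output goes to a binary file object; the declaration is forced on.
binary_out = io.BytesIO()
tree.write(binary_out, encoding="utf-8", xml_declaration=True)

# encoding="unicode" produces text, so the target must accept str.
text_out = io.StringIO()
tree.write(text_out, encoding="unicode")

print(binary_out.getvalue())
print(text_out.getvalue())
```

With ``xml_declaration`` left at ``None``, the second call writes no
declaration, because the output is Unicode text.
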
.. _elementtree-qname-objects:

QName Objects
^^^^^^^^^^^^^


.. class:: QName(text_or_uri, tag=None)

   QName wrapper. This can be used to wrap a QName attribute value, in order
   to get proper namespace handling on output. *text_or_uri* is a string
   containing the QName value, in the form ``{uri}local``, or, if the *tag*
   argument is given, the URI part of a QName. If *tag* is given, the first
   argument is interpreted as a URI, and *tag* is interpreted as a local name.
   :class:`QName` instances are opaque.


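As an illustration of the two calling conventions, the sketch below builds
the same qualified name both ways and uses it as an attribute value. The
namespace URI, the element name and the attribute name are made up for the
example.

```python
from xml.etree.ElementTree import Element, QName, tostring

NS = "http://example.org/ns"  # hypothetical namespace URI

# Both spellings describe the same qualified name.
combined = QName("{%s}ref" % NS)
split = QName(NS, "ref")

# Wrapping an attribute value marks it as a qualified name, so the
# serializer replaces the {uri} part with a declared prefix.
elem = Element("link", {"type": split})
xml = tostring(elem, encoding="unicode")

print(str(combined) == str(split))
print(xml)
```

Without the wrapper, the attribute value would be written out as a plain
string, including the literal ``{uri}`` part.
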
.. _elementtree-treebuilder-objects:

TreeBuilder Objects
^^^^^^^^^^^^^^^^^^^


.. class:: TreeBuilder(element_factory=None)

   Generic element structure builder. This builder converts a sequence of
   start, data, and end method calls to a well-formed element structure. You
   can use this class to build an element structure using a custom XML parser,
   or a parser for some other XML-like format. *element_factory*, when given,
   is called to create new :class:`Element` instances.


   .. method:: close()

      Flushes the builder buffers, and returns the toplevel document
      element. Returns an :class:`Element` instance.


   .. method:: data(data)

      Adds text to the current element. *data* is a string. This should be
      either a bytestring, or a Unicode string.


   .. method:: end(tag)

      Closes the current element. *tag* is the element name. Returns the
      closed element.


   .. method:: start(tag, attrs)

      Opens a new element. *tag* is the element name. *attrs* is a dictionary
      containing element attributes. Returns the opened element.


   In addition, a custom :class:`TreeBuilder` object can provide the
   following method:

   .. method:: doctype(name, pubid, system)

      Handles a doctype declaration. *name* is the doctype name. *pubid* is
      the public identifier. *system* is the system identifier. This method
      does not exist on the default :class:`TreeBuilder` class.

      .. versionadded:: 3.2


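The call sequence that a parser would normally generate can also be issued by
hand. The sketch below drives a default :class:`TreeBuilder` directly; the
tag names and text are invented for the example.

```python
from xml.etree.ElementTree import TreeBuilder, tostring

builder = TreeBuilder()

# Each start() must be balanced by an end(); data() adds text to the
# element that is currently open (or to the tail of the last closed child).
builder.start("greeting", {"lang": "en"})
builder.data("Hello, ")
builder.start("b", {})
builder.data("world")
builder.end("b")
builder.end("greeting")

root = builder.close()  # the toplevel Element
result = tostring(root, encoding="unicode")
print(result)
```
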
.. _elementtree-xmlparser-objects:

XMLParser Objects
^^^^^^^^^^^^^^^^^


.. class:: XMLParser(html=0, target=None, encoding=None)

   :class:`Element` structure builder for XML source data, based on the expat
   parser. *html* is a flag for predefined HTML entities; it is not supported
   by the current implementation. *target* is the target object. If omitted,
   the builder uses an instance of the standard :class:`TreeBuilder` class.
   *encoding* [1]_ is optional. If given, the value overrides the encoding
   specified in the XML file.


   .. method:: close()

      Finishes feeding data to the parser. Returns an element structure.


   .. method:: doctype(name, pubid, system)

      .. deprecated:: 3.2
         Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
         target.


   .. method:: feed(data)

      Feeds data to the parser. *data* is encoded data.

:meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag and its :meth:`end` method for each closing tag;
character data is passed to its :meth:`data` method. :meth:`XMLParser.close`
calls *target*\'s :meth:`close` method.
:class:`XMLParser` can be used for more than building a tree structure.
This is an example of counting the maximum depth of an XML file::

    >>> from xml.etree.ElementTree import XMLParser
    >>> class MaxDepth:                     # The target object of the parser
    ...     maxDepth = 0
    ...     depth = 0
    ...     def start(self, tag, attrib):   # Called for each opening tag.
    ...         self.depth += 1
    ...         if self.depth > self.maxDepth:
    ...             self.maxDepth = self.depth
    ...     def end(self, tag):             # Called for each closing tag.
    ...         self.depth -= 1
    ...     def data(self, data):
    ...         pass            # We do not need to do anything with data.
    ...     def close(self):    # Called when all data has been parsed.
    ...         return self.maxDepth
    ...
    >>> target = MaxDepth()
    >>> parser = XMLParser(target=target)
    >>> exampleXml = """
    ... <a>
    ...   <b>
    ...   </b>
    ...   <b>
    ...     <c>
    ...       <d>
    ...       </d>
    ...     </c>
    ...   </b>
    ... </a>"""
    >>> parser.feed(exampleXml)
    >>> parser.close()
    4

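When no *target* is given, the same :meth:`feed` and :meth:`close` calls build
an ordinary element tree, and the input may arrive in pieces. The sketch below
assumes the data comes in two chunks; the chunk boundaries and element names
are arbitrary.

```python
from xml.etree.ElementTree import XMLParser

# Hypothetical input split at an arbitrary point, as it might arrive
# from a socket or a file read in blocks.
chunks = ["<root><item>1</item>", "<item>2</item></root>"]

parser = XMLParser()        # default target: a standard TreeBuilder
for chunk in chunks:
    parser.feed(chunk)
root = parser.close()       # the toplevel Element built by the target

texts = [item.text for item in root]
print(root.tag, texts)
```
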
Exceptions
^^^^^^^^^^

.. class:: ParseError

   XML parse error, raised by the various parsing methods in this module when
   parsing fails. The string representation of an instance of this exception
   will contain a user-friendly error message. In addition, it will have
   the following attributes available:

   .. attribute:: code

      A numeric error code from the expat parser. See the documentation of
      :mod:`xml.parsers.expat` for the list of error codes and their meanings.

   .. attribute:: position

      A tuple of *line*, *column* numbers, specifying where the error occurred.

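A short sketch of how the two attributes might be used when reporting a
failure. The malformed input is invented for the example, and
``xml.parsers.expat.ErrorString`` is used here to turn the numeric code back
into expat's message.

```python
from xml.etree.ElementTree import ParseError, fromstring
from xml.parsers import expat

try:
    fromstring("<a><b></a>")            # <b> is never closed
except ParseError as exc:
    line, column = exc.position         # where expat gave up
    message = expat.ErrorString(exc.code)
    print("line %d, column %d: %s" % (line, column, message))
```
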
.. rubric:: Footnotes

.. [#] The encoding string included in XML output should conform to the
   appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
   not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and http://www.iana.org/assignments/character-sets.