:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================

.. module:: xml.etree.ElementTree
   :synopsis: Implementation of the ElementTree API.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>


.. versionadded:: 2.5

**Source code:** :source:`Lib/xml/etree/ElementTree.py`

--------------

The :class:`Element` type is a flexible container object, designed to store
hierarchical data structures in memory. The type can be described as a cross
between a list and a dictionary.

Each element has a number of properties associated with it:

* a tag which is a string identifying what kind of data this element represents
  (the element type, in other words).

* a number of attributes, stored in a Python dictionary.

* a text string.

* an optional tail string.

* a number of child elements, stored in a Python sequence.

To create an element instance, use the :class:`Element` constructor or the
:func:`SubElement` factory function.

The :class:`ElementTree` class can be used to wrap an element structure, and
convert it from and to XML.

A C implementation of this API is available as :mod:`xml.etree.cElementTree`.

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs. Fredrik Lundh's page is also the location of the development version of
:mod:`xml.etree.ElementTree`.

.. versionchanged:: 2.7
   The ElementTree API is updated to 1.3. For more information, see
   `Introducing ElementTree 1.3
   <http://effbot.org/zone/elementtree-13-intro.htm>`_.

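The element properties listed in the introduction (tag, attributes, text, tail
and children) can be illustrated with a small sketch. The tag names and the
``class`` attribute below are made up for the illustration; the in-memory
``io.BytesIO`` buffer simply stands in for a file opened for writing.

```python
import io
import xml.etree.ElementTree as ET

# Build <p class="greeting">Hello <b>world</b>!</p> by hand.
p = ET.Element("p", {"class": "greeting"})   # tag plus an attribute dictionary
p.text = "Hello "                            # text before the first child
b = ET.SubElement(p, "b")                    # creates <b> and appends it to p
b.text = "world"
b.tail = "!"                                 # text after </b>, before the next tag

# The same object offers list-like and dictionary-like access.
child_count = len(p)          # number of child elements
first_child_tag = p[0].tag    # children are indexed like a sequence
css_class = p.get("class")    # attributes are looked up by name

# Wrap the structure in an ElementTree and serialize it to a byte buffer.
buffer = io.BytesIO()
ET.ElementTree(p).write(buffer)
serialized = buffer.getvalue()
```

Note that ``tail`` belongs to the child element ``b``, not to its parent: it
holds the text that follows the child's end tag.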

.. _elementtree-functions:

Functions
---------


.. function:: Comment(text=None)

   Comment element factory. This factory function creates a special element
   that will be serialized as an XML comment by the standard serializer. The
   comment string can be either a bytestring or a Unicode string. *text* is a
   string containing the comment string. Returns an element instance
   representing a comment.


.. function:: dump(elem)

   Writes an element tree or element structure to ``sys.stdout``. This function
   should be used for debugging only.

   The exact output format is implementation dependent. In this version, it's
   written as an ordinary XML file.

   *elem* is an element tree or an individual element.


.. function:: fromstring(text)

   Parses an XML section from a string constant. Same as :func:`XML`. *text*
   is a string containing XML data. Returns an :class:`Element` instance.

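A minimal sketch of :func:`fromstring`, using a made-up document with ``doc``
and ``item`` tags. The returned object is the root :class:`Element`, so its
children can be iterated directly.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<doc><item>one</item><item>two</item></doc>")

root_tag = root.tag                        # tag of the outermost element
texts = [item.text for item in root]       # iterate over the direct children
```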

.. function:: fromstringlist(sequence, parser=None)

   Parses an XML document from a sequence of string fragments. *sequence* is a
   list or other sequence containing XML data fragments. *parser* is an
   optional parser instance. If not given, the standard :class:`XMLParser`
   parser is used. Returns an :class:`Element` instance.

   .. versionadded:: 2.7

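As a sketch of :func:`fromstringlist`, the fragments below (invented for the
example) are split at arbitrary points, including in the middle of a text
node. The fragments are fed to the parser in order, so the result should match
parsing the joined string.

```python
import xml.etree.ElementTree as ET

# Fragment boundaries need not line up with element boundaries.
fragments = ["<doc>", "<item>one</item>", "<item>tw", "o</item>", "</doc>"]
root = ET.fromstringlist(fragments)

texts = [item.text for item in root]
```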

.. function:: iselement(element)

   Checks if an object appears to be a valid element object. *element* is an
   element instance. Returns a true value if this is an element object.


.. function:: iterparse(source, events=None, parser=None)

   Parses an XML section into an element tree incrementally, and reports what's
   going on to the user. *source* is a filename or file object containing XML
   data. *events* is a list of events to report back. If omitted, only "end"
   events are reported. *parser* is an optional parser instance. If not
   given, the standard :class:`XMLParser` parser is used. Returns an
   :term:`iterator` providing ``(event, elem)`` pairs.

   .. note::

      :func:`iterparse` only guarantees that it has seen the ">"
      character of a starting tag when it emits a "start" event, so the
      attributes are defined, but the contents of the text and tail attributes
      are undefined at that point. The same applies to the element children;
      they may or may not be present.

      If you need a fully populated element, look for "end" events instead.

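The following sketch shows one way to consume :func:`iterparse` events. The
document is invented for the example, and an ``io.BytesIO`` object stands in
for a real file object. Text is read only on "end" events, in line with the
note above.

```python
import io
import xml.etree.ElementTree as ET

source = io.BytesIO(b"<doc><item>one</item><item>two</item></doc>")

events = []   # (event, tag) pairs in the order they are reported
texts = []    # text of each completed <item>
for event, elem in ET.iterparse(source, events=("start", "end")):
    events.append((event, elem.tag))
    if event == "end" and elem.tag == "item":
        texts.append(elem.text)   # the element is fully populated here
```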

.. function:: parse(source, parser=None)

   Parses an XML section into an element tree. *source* is a filename or file
   object containing XML data. *parser* is an optional parser instance. If
   not given, the standard :class:`XMLParser` parser is used. Returns an
   :class:`ElementTree` instance.


.. function:: ProcessingInstruction(target, text=None)

   PI element factory. This factory function creates a special element that
   will be serialized as an XML processing instruction. *target* is a string
   containing the PI target. *text* is a string containing the PI contents, if
   given. Returns an element instance, representing a processing instruction.

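A short sketch of the :func:`Comment` and :func:`ProcessingInstruction`
factories. The comment text, the ``xml-stylesheet`` target and its contents are
placeholders chosen for the example. Both factories return special elements
that can be appended like any other child.

```python
import xml.etree.ElementTree as ET

root = ET.Element("doc")
root.append(ET.Comment("generated file"))
root.append(ET.ProcessingInstruction("xml-stylesheet", 'href="style.css"'))

# Serialize to see how the special elements are written out.
output = ET.tostring(root)
```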

.. function:: register_namespace(prefix, uri)

   Registers a namespace prefix. The registry is global, and any existing
   mapping for either the given prefix or the namespace URI will be removed.
   *prefix* is a namespace prefix. *uri* is a namespace URI. Tags and
   attributes in this namespace will be serialized with the given prefix, if at
   all possible.

   .. versionadded:: 2.7

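A sketch of :func:`register_namespace`, assuming a hypothetical namespace URI
``http://example.org/ns`` and prefix ``ex``. Because the registry is global,
the mapping affects every later serialization in the same process.

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/ns"        # hypothetical namespace URI
ET.register_namespace("ex", NS)     # hypothetical prefix

# Namespaced tags are written in the {uri}local form.
item = ET.Element("{%s}item" % NS)
output = ET.tostring(item)
```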

.. function:: SubElement(parent, tag, attrib={}, **extra)

   Subelement factory. This function creates an element instance, and appends
   it to an existing element.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *parent* is the parent element. *tag* is
   the subelement name. *attrib* is an optional dictionary, containing element
   attributes. *extra* contains additional attributes, given as keyword
   arguments. Returns an element instance.


.. function:: tostring(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an encoded string
   containing the XML data.


.. function:: tostringlist(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns a list of encoded
   strings containing the XML data. It does not guarantee any specific
   sequence, except that ``"".join(tostringlist(element)) ==
   tostring(element)``.

   .. versionadded:: 2.7

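The effect of the *encoding* and *method* arguments can be sketched as
follows. The element names and the sample text are invented. The join uses a
bytes literal so that the comparison also holds on interpreters where the
encoded output is a bytes object.

```python
import xml.etree.ElementTree as ET

root = ET.Element("doc")
ET.SubElement(root, "item").text = u"caf\xe9"    # text with a non-ASCII character

ascii_output = ET.tostring(root)                     # default US-ASCII encoding
utf8_output = ET.tostring(root, encoding="utf-8")    # explicit output encoding
text_output = ET.tostring(root, encoding="utf-8", method="text")

# tostringlist() yields the same data in pieces.
chunks = ET.tostringlist(root)
rejoined = b"".join(chunks)
```

With the default encoding, characters outside US-ASCII are written as
character references, which is why the first result differs from the second.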

.. function:: XML(text, parser=None)

   Parses an XML section from a string constant. This function can be used to
   embed "XML literals" in Python code. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns an :class:`Element` instance.


.. function:: XMLID(text, parser=None)

   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element ids to elements. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns a tuple containing an
   :class:`Element` instance and a dictionary.

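A sketch of :func:`XMLID` on an invented document whose elements carry ``id``
attributes. The second item of the returned tuple maps each id value to the
element that declares it.

```python
import xml.etree.ElementTree as ET

root, ids = ET.XMLID(
    '<doc><item id="first">one</item><item id="second">two</item></doc>'
)

id_names = sorted(ids)              # the id values that were found
second_text = ids["second"].text    # look up an element by its id
```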

.. _elementtree-element-objects:

Element Objects
---------------


.. class:: Element(tag, attrib={}, **extra)

   Element class. This class defines the Element interface, and provides a
   reference implementation of this interface.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *tag* is the element name. *attrib* is
   an optional dictionary, containing element attributes. *extra* contains
   additional attributes, given as keyword arguments.


   .. attribute:: tag

      A string identifying what kind of data this element represents (the
      element type, in other words).


   .. attribute:: text

      The *text* attribute can be used to hold additional data associated with
      the element. As the name implies this attribute is usually a string but
      may be any application-specific object. If the element is created from
      an XML file the attribute will contain any text found between the element
      tags.


   .. attribute:: tail

      The *tail* attribute can be used to hold additional data associated with
      the element. This attribute is usually a string but may be any
      application-specific object. If the element is created from an XML file
      the attribute will contain any text found after the element's end tag and
      before the next tag.


   .. attribute:: attrib

      A dictionary containing the element's attributes. Note that while the
      *attrib* value is always a real mutable Python dictionary, an ElementTree
      implementation may choose to use another internal representation, and
      create the dictionary only if someone asks for it. To take advantage of
      such implementations, use the dictionary methods below whenever possible.

   The following dictionary-like methods work on the element attributes.


   .. method:: clear()

      Resets an element. This function removes all subelements, clears all
      attributes, and sets the text and tail attributes to ``None``.


   .. method:: get(key, default=None)

      Gets the element attribute named *key*.

      Returns the attribute value, or *default* if the attribute was not found.


   .. method:: items()

      Returns the element attributes as a sequence of (name, value) pairs. The
      attributes are returned in an arbitrary order.


   .. method:: keys()

      Returns the element's attribute names as a list. The names are returned
      in an arbitrary order.


   .. method:: set(key, value)

      Set the attribute *key* on the element to *value*.

   The following methods work on the element's children (subelements).


   .. method:: append(subelement)

      Adds the element *subelement* to the end of this element's internal list
      of subelements.


   .. method:: extend(subelements)

      Appends *subelements* from a sequence object with zero or more elements.
      Raises :exc:`AssertionError` if a subelement is not a valid object.

      .. versionadded:: 2.7


   .. method:: find(match)

      Finds the first subelement matching *match*. *match* may be a tag name
      or path. Returns an element instance or ``None``.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Returns a list
      containing all matching elements in document order.


   .. method:: findtext(match, default=None)

      Finds text for the first subelement matching *match*. *match* may be
      a tag name or path. Returns the text content of the first matching
      element, or *default* if no element was found. Note that if the matching
      element has no text content an empty string is returned.


   .. method:: getchildren()

      .. deprecated:: 2.7
         Use ``list(elem)`` or iteration.


   .. method:: getiterator(tag=None)

      .. deprecated:: 2.7
         Use method :meth:`Element.iter` instead.


   .. method:: insert(index, element)

      Inserts a subelement at the given position in this element.


   .. method:: iter(tag=None)

      Creates a tree :term:`iterator` with the current element as the root.
      The iterator iterates over this element and all elements below it, in
      document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
      elements whose tag equals *tag* are returned from the iterator. If the
      tree structure is modified during iteration, the result is undefined.


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Returns an iterable
      yielding all matching elements in document order.

      .. versionadded:: 2.7


   .. method:: itertext()

      Creates a text iterator. The iterator loops over this element and all
      subelements, in document order, and returns all inner text.

      .. versionadded:: 2.7


   .. method:: makeelement(tag, attrib)

      Creates a new element object of the same type as this element. Do not
      call this method, use the :func:`SubElement` factory function instead.


   .. method:: remove(subelement)

      Removes *subelement* from the element. Unlike the find\* methods this
      method compares elements based on the instance identity, not on tag value
      or contents.

   :class:`Element` objects also support the following sequence type methods
   for working with subelements: :meth:`__delitem__`, :meth:`__getitem__`,
   :meth:`__setitem__`, :meth:`__len__`.

   Caution: Elements with no subelements will test as ``False``. This behavior
   will change in future versions. Use an explicit ``len(elem)`` or ``elem is
   None`` test instead. ::

     element = root.find('foo')

     if not element:  # careful!
         print "element not found, or element has no subelements"

     if element is None:
         print "element not found"

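The search and child-handling methods of :class:`Element` can be sketched
together on a small, invented document (the ``shelf``, ``book``, ``title`` and
``note`` tags are placeholders). The sketch also follows the caution above by
testing the result of :meth:`find` against ``None``.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<shelf>"
    "<book id='1'><title>Alpha</title></book>"
    "<book id='2'><title>Beta</title><note/></book>"
    "</shelf>"
)

first = root.find("book")                        # first direct child named "book"
titles = [b.findtext("title") for b in root.findall("book")]
all_tags = [e.tag for e in root.iter()]          # depth first, starting at root
empty_note = root.findtext("book/note")          # found, but without text content
missing = root.findtext("book/isbn", "n/a")      # not found: default is returned

# An element without children tests as false, so compare against None.
note = root.find("book/note")
note_found = note is not None

# remove() matches by identity, so pass the exact child object.
root.remove(first)
remaining_ids = [b.get("id") for b in root]
```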

.. _elementtree-elementtree-objects:

ElementTree Objects
-------------------


.. class:: ElementTree(element=None, file=None)

   ElementTree wrapper class. This class represents an entire element
   hierarchy, and adds some extra support for serialization to and from
   standard XML.

   *element* is the root element. The tree is initialized with the contents
   of the XML *file* if given.


   .. method:: _setroot(element)

      Replaces the root element for this tree. This discards the current
      contents of the tree, and replaces it with the given element. Use with
      care. *element* is an element instance.


   .. method:: find(match)

      Finds the first toplevel element matching *match*. *match* may be a tag
      name or path. Same as ``getroot().find(match)``. Returns the first
      matching element, or ``None`` if no element was found.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().findall(match)``. *match* may be a tag name or path. Returns
      a list containing all matching elements, in document order.


   .. method:: findtext(match, default=None)

      Finds the element text for the first toplevel element with given tag.
      Same as ``getroot().findtext(match)``. *match* may be a tag name or path.
      *default* is the value to return if the element was not found. Returns
      the text content of the first matching element, or the default value if
      no element was found. Note that if the element is found, but has no text
      content, this method returns an empty string.


   .. method:: getiterator(tag=None)

      .. deprecated:: 2.7
         Use method :meth:`ElementTree.iter` instead.


   .. method:: getroot()

      Returns the root element for this tree.


   .. method:: iter(tag=None)

      Creates and returns a tree iterator for the root element. The iterator
      loops over all elements in this tree, in document order. *tag* is the tag
      to look for (default is to return all elements).


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().iterfind(match)``. Returns an iterable yielding all matching
      elements in document order.

      .. versionadded:: 2.7


   .. method:: parse(source, parser=None)

      Loads an external XML section into this element tree. *source* is a file
      name or file object. *parser* is an optional parser instance. If not
      given, the standard :class:`XMLParser` parser is used. Returns the section
      root element.


   .. method:: write(file, encoding="us-ascii", xml_declaration=None, method="xml")

      Writes the element tree to a file, as XML. *file* is a file name, or a
      file object opened for writing. *encoding* [1]_ is the output encoding
      (default is US-ASCII). *xml_declaration* controls if an XML declaration
      should be added to the file. Use ``False`` for never, ``True`` for always,
      ``None`` for only if not US-ASCII or UTF-8 (default is ``None``). *method*
      is either ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``).

This is the XML file that is going to be manipulated::

    <html>
        <head>
            <title>Example page</title>
        </head>
        <body>
            <p>Moved to <a href="http://example.org/">example.org</a>
            or <a href="http://example.com/">example.com</a>.</p>
        </body>
    </html>

Example of changing the attribute "target" of every link in the first paragraph::

    >>> from xml.etree.ElementTree import ElementTree
    >>> tree = ElementTree()
    >>> tree.parse("index.xhtml")
    <Element 'html' at 0xb77e6fac>
    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
    >>> p
    <Element 'p' at 0xb77ec26c>
    >>> links = list(p.iter("a"))   # Returns list of all links
    >>> links
    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
    >>> for i in links:             # Iterates through all found links
    ...     i.attrib["target"] = "blank"
    >>> tree.write("output.xhtml")

.. _elementtree-qname-objects:

QName Objects
-------------


.. class:: QName(text_or_uri, tag=None)

   QName wrapper. This can be used to wrap a QName attribute value, in order
   to get proper namespace handling on output. *text_or_uri* is a string
   containing the QName value, in the form {uri}local, or, if the tag argument
   is given, the URI part of a QName. If *tag* is given, the first argument is
   interpreted as a URI, and this argument is interpreted as a local name.
   :class:`QName` instances are opaque.

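A sketch of wrapping an attribute value in :class:`QName`, assuming a
hypothetical namespace URI ``http://example.org/types`` and local name
``integer``. On output, the serializer should declare a prefix for the
namespace and write the attribute value with that prefix. The exact prefix is
chosen by the serializer unless one has been registered.

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/types"     # hypothetical namespace URI

value = ET.Element("value")
# Two-argument form: URI part and local name given separately.
value.set("type", ET.QName(NS, "integer"))

output = ET.tostring(value)
```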

.. _elementtree-treebuilder-objects:

TreeBuilder Objects
-------------------


.. class:: TreeBuilder(element_factory=None)

   Generic element structure builder. This builder converts a sequence of
   start, data, and end method calls to a well-formed element structure. You
   can use this class to build an element structure using a custom XML parser,
   or a parser for some other XML-like format. The *element_factory* is called
   to create new :class:`Element` instances when given.


   .. method:: close()

      Flushes the builder buffers, and returns the toplevel document
      element. Returns an :class:`Element` instance.


   .. method:: data(data)

      Adds text to the current element. *data* is a string. This should be
      either a bytestring, or a Unicode string.


   .. method:: end(tag)

      Closes the current element. *tag* is the element name. Returns the
      closed element.


   .. method:: start(tag, attrs)

      Opens a new element. *tag* is the element name. *attrs* is a dictionary
      containing element attributes. Returns the opened element.


   In addition, a custom :class:`TreeBuilder` object can provide the
   following method:

   .. method:: doctype(name, pubid, system)

      Handles a doctype declaration. *name* is the doctype name. *pubid* is
      the public identifier. *system* is the system identifier. This method
      does not exist on the default :class:`TreeBuilder` class.

      .. versionadded:: 2.7

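The start, data and end protocol of :class:`TreeBuilder` can be sketched by
driving a builder by hand, as a custom parser would. The tag names and the
``id`` attribute are invented for the example.

```python
import xml.etree.ElementTree as ET

builder = ET.TreeBuilder()
builder.start("doc", {})                 # open <doc>
builder.start("item", {"id": "1"})       # open <item id="1">
builder.data("hel")                      # text may arrive in several pieces
builder.data("lo")
closed = builder.end("item")             # returns the element just closed
builder.end("doc")
root = builder.close()                   # returns the toplevel element
```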

.. _elementtree-xmlparser-objects:

XMLParser Objects
-----------------


.. class:: XMLParser(html=0, target=None, encoding=None)

   :class:`Element` structure builder for XML source data, based on the expat
   parser. *html* requests predefined HTML entities; this flag is not supported
   by the current implementation. *target* is the target object. If omitted,
   the builder uses an instance of the standard :class:`TreeBuilder` class.
   *encoding* [1]_ is optional. If given, the value overrides the encoding
   specified in the XML file.


   .. method:: close()

      Finishes feeding data to the parser. Returns an element structure.


   .. method:: doctype(name, pubid, system)

      .. deprecated:: 2.7
         Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
         target.


   .. method:: feed(data)

      Feeds data to the parser. *data* is encoded data.

:meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag, its :meth:`end` method for each closing tag,
and data is processed by method :meth:`data`. :meth:`XMLParser.close`
calls *target*\'s method :meth:`close`.
:class:`XMLParser` can be used for more than building a tree structure.
This is an example of counting the maximum depth of an XML file::

    >>> from xml.etree.ElementTree import XMLParser
    >>> class MaxDepth:                     # The target object of the parser
    ...     maxDepth = 0
    ...     depth = 0
    ...     def start(self, tag, attrib):   # Called for each opening tag.
    ...         self.depth += 1
    ...         if self.depth > self.maxDepth:
    ...             self.maxDepth = self.depth
    ...     def end(self, tag):             # Called for each closing tag.
    ...         self.depth -= 1
    ...     def data(self, data):
    ...         pass            # We do not need to do anything with data.
    ...     def close(self):    # Called when all data has been parsed.
    ...         return self.maxDepth
    ...
    >>> target = MaxDepth()
    >>> parser = XMLParser(target=target)
    >>> exampleXml = """
    ... <a>
    ...   <b>
    ...   </b>
    ...   <b>
    ...     <c>
    ...       <d>
    ...       </d>
    ...     </c>
    ...   </b>
    ... </a>"""
    >>> parser.feed(exampleXml)
    >>> parser.close()
    4


.. rubric:: Footnotes

.. [#] The encoding string included in XML output should conform to the
   appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
   not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and http://www.iana.org/assignments/character-sets.