:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================

.. module:: xml.etree.ElementTree
   :synopsis: Implementation of the ElementTree API.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>


.. versionadded:: 2.5

The :class:`Element` type is a flexible container object, designed to store
hierarchical data structures in memory. The type can be described as a cross
between a list and a dictionary.

Each element has a number of properties associated with it:

* a tag, which is a string identifying what kind of data this element
  represents (the element type, in other words).

* a number of attributes, stored in a Python dictionary.

* a text string.

* an optional tail string.

* a number of child elements, stored in a Python sequence.

To create an element instance, use the :class:`Element` constructor or the
:func:`SubElement` factory function.
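
As a minimal sketch of these properties (the tag names, attribute and text
below are made up for illustration), an element can be built and inspected
like this::

   from xml.etree.ElementTree import Element, SubElement

   root = Element("page", {"lang": "en"})   # tag plus attribute dictionary
   title = SubElement(root, "title")        # created and appended to root
   title.text = "Example page"              # text inside <title>...</title>
   title.tail = "\n"                        # text following </title>

   # Children are reached like list items, attributes like dictionary keys.
   first_child = root[0]
   language = root.get("lang")

Here ``first_child`` is the same object as ``title``, and ``language`` holds
the attribute value passed to the constructor.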

The :class:`ElementTree` class can be used to wrap an element structure, and
convert it from and to XML.

A C implementation of this API is available as :mod:`xml.etree.cElementTree`.

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs. Fredrik Lundh's page is also the location of the development version of
:mod:`xml.etree.ElementTree`.

.. versionchanged:: 2.7
   The ElementTree API is updated to 1.3. For more information, see
   `Introducing ElementTree 1.3
   <http://effbot.org/zone/elementtree-13-intro.htm>`_.


.. _elementtree-functions:

Functions
---------


.. function:: Comment(text=None)

   Comment element factory. This factory function creates a special element
   that will be serialized as an XML comment by the standard serializer. The
   comment string can be either a bytestring or a Unicode string. *text* is a
   string containing the comment string. Returns an element instance
   representing a comment.


.. function:: dump(elem)

   Writes an element tree or element structure to ``sys.stdout``. This function
   should be used for debugging only.

   The exact output format is implementation dependent. In this version, it's
   written as an ordinary XML file.

   *elem* is an element tree or an individual element.


.. function:: fromstring(text)

   Parses an XML section from a string constant. Same as :func:`XML`. *text*
   is a string containing XML data. Returns an :class:`Element` instance.


.. function:: fromstringlist(sequence, parser=None)

   Parses an XML document from a sequence of string fragments. *sequence* is a
   list or other sequence containing XML data fragments. *parser* is an
   optional parser instance. If not given, the standard :class:`XMLParser`
   parser is used. Returns an :class:`Element` instance.

   .. versionadded:: 2.7


.. function:: iselement(element)

   Checks if an object appears to be a valid element object. *element* is an
   element instance. Returns a true value if this is an element object.


.. function:: iterparse(source, events=None, parser=None)

   Parses an XML section into an element tree incrementally, and reports what's
   going on to the user. *source* is a filename or file object containing XML
   data. *events* is a list of events to report back. If omitted, only "end"
   events are reported. *parser* is an optional parser instance. If not
   given, the standard :class:`XMLParser` parser is used. Returns an
   :term:`iterator` providing ``(event, elem)`` pairs.

   .. note::

      :func:`iterparse` only guarantees that it has seen the ">"
      character of a starting tag when it emits a "start" event, so the
      attributes are defined, but the contents of the text and tail attributes
      are undefined at that point. The same applies to the element children;
      they may or may not be present.

      If you need a fully populated element, look for "end" events instead.
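
   The following sketch shows one way to act only on "end" events. The
   in-memory ``io.BytesIO`` source and the ``item`` tag are assumptions made
   for this example; a filename works the same way::

      import io
      from xml.etree.ElementTree import iterparse

      source = io.BytesIO(b"<root><item>a</item><item>b</item></root>")

      texts = []
      for event, elem in iterparse(source, events=("start", "end")):
          # Only "end" events guarantee that elem.text and children are set.
          if event == "end" and elem.tag == "item":
              texts.append(elem.text)
              elem.clear()    # optional: drop contents that are done with

   After the loop, ``texts`` holds the text of each ``item`` element in
   document order.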


.. function:: parse(source, parser=None)

   Parses an XML section into an element tree. *source* is a filename or file
   object containing XML data. *parser* is an optional parser instance. If
   not given, the standard :class:`XMLParser` parser is used. Returns an
   :class:`ElementTree` instance.


.. function:: ProcessingInstruction(target, text=None)

   PI element factory. This factory function creates a special element that
   will be serialized as an XML processing instruction. *target* is a string
   containing the PI target. *text* is a string containing the PI contents, if
   given. Returns an element instance, representing a processing instruction.


.. function:: register_namespace(prefix, uri)

   Registers a namespace prefix. The registry is global, and any existing
   mapping for either the given prefix or the namespace URI will be removed.
   *prefix* is a namespace prefix. *uri* is a namespace URI. Tags and
   attributes in this namespace will be serialized with the given prefix, if at
   all possible.
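
   A minimal sketch of the effect on serialization; the ``ex`` prefix and the
   ``http://example.com/ns`` URI are placeholders chosen for this example::

      from xml.etree.ElementTree import Element, register_namespace, tostring

      # The registry is global, so this affects all later serialization.
      register_namespace("ex", "http://example.com/ns")

      elem = Element("{http://example.com/ns}item")
      data = tostring(elem)   # the tag is written with the "ex" prefix

   Without the registration, the serializer would pick a generated prefix
   for the namespace instead.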

   .. versionadded:: 2.7


.. function:: SubElement(parent, tag, attrib={}, **extra)

   Subelement factory. This function creates an element instance, and appends
   it to an existing element.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *parent* is the parent element. *tag* is
   the subelement name. *attrib* is an optional dictionary, containing element
   attributes. *extra* contains additional attributes, given as keyword
   arguments. Returns an element instance.


.. function:: tostring(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an encoded string
   containing the XML data.


.. function:: tostringlist(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns a list of encoded
   strings containing the XML data. It does not guarantee any specific
   sequence, except that ``"".join(tostringlist(element)) ==
   tostring(element)``.
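
   As an illustrative sketch (the sample markup is made up), the two
   functions and the *method* argument can be compared like this::

      from xml.etree.ElementTree import XML, tostring, tostringlist

      elem = XML("<p>Hello <b>big</b> world</p>")

      as_xml = tostring(elem)                   # markup and text
      as_text = tostring(elem, method="text")   # text content only

      # Join with an empty string of the same type as the fragments.
      joined = as_xml[:0].join(tostringlist(elem))

   ``joined`` compares equal to ``as_xml``, whatever the fragment boundaries
   turn out to be.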

   .. versionadded:: 2.7


.. function:: XML(text, parser=None)

   Parses an XML section from a string constant. This function can be used to
   embed "XML literals" in Python code. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns an :class:`Element` instance.


.. function:: XMLID(text, parser=None)

   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element IDs to elements. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns a tuple containing an
   :class:`Element` instance and a dictionary.
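
   A short sketch of the returned pair, using made-up markup whose elements
   carry ``id`` attributes::

      from xml.etree.ElementTree import XMLID

      root, ids = XMLID(
          '<doc><sec id="intro">Hi</sec><sec id="end">Bye</sec></doc>')

      intro = ids["intro"]    # the element whose id attribute is "intro"

   The dictionary values are the same element objects that appear in the
   tree under ``root``.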


.. _elementtree-element-objects:

Element Objects
---------------


.. class:: Element(tag, attrib={}, **extra)

   Element class. This class defines the Element interface, and provides a
   reference implementation of this interface.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *tag* is the element name. *attrib* is
   an optional dictionary, containing element attributes. *extra* contains
   additional attributes, given as keyword arguments.


   .. attribute:: tag

      A string identifying what kind of data this element represents (the
      element type, in other words).


   .. attribute:: text

      The *text* attribute can be used to hold additional data associated with
      the element. As the name implies this attribute is usually a string but
      may be any application-specific object. If the element is created from
      an XML file the attribute will contain any text found between the element
      tags.


   .. attribute:: tail

      The *tail* attribute can be used to hold additional data associated with
      the element. This attribute is usually a string but may be any
      application-specific object. If the element is created from an XML file
      the attribute will contain any text found after the element's end tag and
      before the next tag.
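
      As an illustrative sketch (the sample markup is made up for this
      example), mixed content is split between *text* and *tail* as
      follows::

         from xml.etree.ElementTree import XML

         p = XML("<p>Hello <b>big</b> world</p>")
         b = p[0]

         before = p.text    # "Hello ", text before the first child
         inside = b.text    # "big", text inside <b>...</b>
         after = b.tail     # " world", text after </b>

      The root element ``p`` has nothing after its end tag, so its *tail*
      is ``None``.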


   .. attribute:: attrib

      A dictionary containing the element's attributes. Note that while the
      *attrib* value is always a real mutable Python dictionary, an ElementTree
      implementation may choose to use another internal representation, and
      create the dictionary only if someone asks for it. To take advantage of
      such implementations, use the dictionary methods below whenever possible.

   The following dictionary-like methods work on the element attributes.


   .. method:: clear()

      Resets an element. This function removes all subelements, clears all
      attributes, and sets the text and tail attributes to ``None``.


   .. method:: get(key, default=None)

      Gets the element attribute named *key*.

      Returns the attribute value, or *default* if the attribute was not found.


   .. method:: items()

      Returns the element attributes as a sequence of (name, value) pairs. The
      attributes are returned in an arbitrary order.


   .. method:: keys()

      Returns the element's attribute names as a list. The names are returned
      in an arbitrary order.


   .. method:: set(key, value)

      Set the attribute *key* on the element to *value*.

   The following methods work on the element's children (subelements).


   .. method:: append(subelement)

      Adds the element *subelement* to the end of this element's internal list
      of subelements.


   .. method:: extend(subelements)

      Appends *subelements* from a sequence object with zero or more elements.
      Raises :exc:`AssertionError` if a subelement is not a valid object.

      .. versionadded:: 2.7


   .. method:: find(match)

      Finds the first subelement matching *match*. *match* may be a tag name
      or path. Returns an element instance or ``None``.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Returns a list
      containing all matching elements in document order.


   .. method:: findtext(match, default=None)

      Finds text for the first subelement matching *match*. *match* may be
      a tag name or path. Returns the text content of the first matching
      element, or *default* if no element was found. Note that if the matching
      element has no text content an empty string is returned.
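
      The three lookup methods can be compared in a short sketch; the
      ``lib``, ``book``, ``title`` and ``isbn`` names are invented for this
      example::

         from xml.etree.ElementTree import XML

         root = XML("<lib><book><title>A</title></book>"
                    "<book><title/></book></lib>")

         first = root.find("book/title")              # first match or None
         titles = root.findall("book/title")          # list of all matches
         missing = root.findtext("book/isbn", "n/a")  # no match: default
         second_book = root.findall("book")[1]
         blank = second_book.findtext("title")        # match without text

      Note the difference between ``missing``, which is the supplied
      default, and ``blank``, which is an empty string because the element
      exists but has no text.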


   .. method:: getchildren()

      .. deprecated:: 2.7
         Use ``list(elem)`` or iteration.


   .. method:: getiterator(tag=None)

      .. deprecated:: 2.7
         Use method :meth:`Element.iter` instead.


   .. method:: insert(index, element)

      Inserts a subelement at the given position in this element.


   .. method:: iter(tag=None)

      Creates a tree :term:`iterator` with the current element as the root.
      The iterator iterates over this element and all elements below it, in
      document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
      elements whose tag equals *tag* are returned from the iterator. If the
      tree structure is modified during iteration, the result is undefined.


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Returns an iterable
      yielding all matching elements in document order.

      .. versionadded:: 2.7


   .. method:: itertext()

      Creates a text iterator. The iterator loops over this element and all
      subelements, in document order, and returns all inner text.
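
      A small sketch of :meth:`iter` and :meth:`itertext` side by side,
      using made-up sample markup::

         from xml.etree.ElementTree import XML

         root = XML("<p>Hello <b>big</b> world</p>")

         tags = [elem.tag for elem in root.iter()]     # root, then children
         bold = [elem.tag for elem in root.iter("b")]  # filtered by tag
         text = "".join(root.itertext())               # all inner text

      Here ``text`` contains the text and tail strings of the subelements
      in the order they occur in the document.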

      .. versionadded:: 2.7


   .. method:: makeelement(tag, attrib)

      Creates a new element object of the same type as this element. Do not
      call this method, use the :func:`SubElement` factory function instead.


   .. method:: remove(subelement)

      Removes *subelement* from the element. Unlike the find\* methods this
      method compares elements based on the instance identity, not on tag value
      or contents.

   :class:`Element` objects also support the following sequence type methods
   for working with subelements: :meth:`__delitem__`, :meth:`__getitem__`,
   :meth:`__setitem__`, :meth:`__len__`.

   Caution: Elements with no subelements will test as ``False``. This behavior
   will change in future versions. Use an explicit ``len(elem)`` or ``elem is
   None`` test instead. ::

     element = root.find('foo')

     if not element:  # careful!
         print "element not found, or element has no subelements"

     if element is None:
         print "element not found"


.. _elementtree-elementtree-objects:

ElementTree Objects
-------------------


.. class:: ElementTree(element=None, file=None)

   ElementTree wrapper class. This class represents an entire element
   hierarchy, and adds some extra support for serialization to and from
   standard XML.

   *element* is the root element. The tree is initialized with the contents
   of the XML *file* if given.


   .. method:: _setroot(element)

      Replaces the root element for this tree. This discards the current
      contents of the tree, and replaces it with the given element. Use with
      care. *element* is an element instance.


   .. method:: find(match)

      Finds the first toplevel element matching *match*. *match* may be a tag
      name or path. Same as ``getroot().find(match)``. Returns the first
      matching element, or ``None`` if no element was found.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().findall(match)``. *match* may be a tag name or path. Returns
      a list containing all matching elements, in document order.


   .. method:: findtext(match, default=None)

      Finds the element text for the first toplevel element with given tag.
      Same as ``getroot().findtext(match)``. *match* may be a tag name or path.
      *default* is the value to return if the element was not found. Returns
      the text content of the first matching element, or the default value if
      no element was found. Note that if the element is found, but has no text
      content, this method returns an empty string.


   .. method:: getiterator(tag=None)

      .. deprecated:: 2.7
         Use method :meth:`ElementTree.iter` instead.


   .. method:: getroot()

      Returns the root element for this tree.


   .. method:: iter(tag=None)

      Creates and returns a tree iterator for the root element. The iterator
      loops over all elements in this tree, in document order. *tag* is the tag
      to look for (default is to return all elements).


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().iterfind(match)``. Returns an iterable yielding all matching
      elements in document order.

      .. versionadded:: 2.7


   .. method:: parse(source, parser=None)

      Loads an external XML section into this element tree. *source* is a file
      name or file object. *parser* is an optional parser instance. If not
      given, the standard :class:`XMLParser` parser is used. Returns the
      section root element.


   .. method:: write(file, encoding="us-ascii", xml_declaration=None, method="xml")

      Writes the element tree to a file, as XML. *file* is a file name, or a
      file object opened for writing. *encoding* [1]_ is the output encoding
      (default is US-ASCII). *xml_declaration* controls if an XML declaration
      should be added to the file. Use ``False`` for never, ``True`` for
      always, ``None`` for only if not US-ASCII or UTF-8 (default is ``None``).
      *method* is either ``"xml"``, ``"html"`` or ``"text"`` (default is
      ``"xml"``).

This is the XML file that is going to be manipulated::

    <html>
        <head>
            <title>Example page</title>
        </head>
        <body>
            <p>Moved to <a href="http://example.org/">example.org</a>
            or <a href="http://example.com/">example.com</a>.</p>
        </body>
    </html>

Example of changing the attribute "target" of every link in the first paragraph::

    >>> from xml.etree.ElementTree import ElementTree
    >>> tree = ElementTree()
    >>> tree.parse("index.xhtml")
    <Element 'html' at 0xb77e6fac>
    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
    >>> p
    <Element 'p' at 0xb77ec26c>
    >>> links = list(p.iter("a"))   # Returns list of all links
    >>> links
    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
    >>> for i in links:             # Iterates through all found links
    ...     i.attrib["target"] = "blank"
    >>> tree.write("output.xhtml")

.. _elementtree-qname-objects:

QName Objects
-------------


.. class:: QName(text_or_uri, tag=None)

   QName wrapper. This can be used to wrap a QName attribute value, in order
   to get proper namespace handling on output. *text_or_uri* is a string
   containing the QName value, in the form {uri}local, or, if the tag argument
   is given, the URI part of a QName. If *tag* is given, the first argument is
   interpreted as a URI, and this argument is interpreted as a local name.
   :class:`QName` instances are opaque.
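
   The two ways of calling the constructor can be sketched as follows. The
   URI and local name are placeholders, and the ``text`` attribute read here
   is how the reference implementation stores the combined name, which is
   an implementation detail rather than part of the interface described
   above::

      from xml.etree.ElementTree import QName

      combined = QName("{http://example.com/ns}item")   # {uri}local form
      split = QName("http://example.com/ns", "item")    # URI plus local name

      same = combined.text == split.text

   Both calls describe the same qualified name, so ``same`` is true.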


.. _elementtree-treebuilder-objects:

TreeBuilder Objects
-------------------


.. class:: TreeBuilder(element_factory=None)

   Generic element structure builder. This builder converts a sequence of
   start, data, and end method calls to a well-formed element structure. You
   can use this class to build an element structure using a custom XML parser,
   or a parser for some other XML-like format. The *element_factory* is called
   to create new :class:`Element` instances when given.


   .. method:: close()

      Flushes the builder buffers, and returns the toplevel document
      element. Returns an :class:`Element` instance.


   .. method:: data(data)

      Adds text to the current element. *data* is a string. This should be
      either a bytestring, or a Unicode string.


   .. method:: end(tag)

      Closes the current element. *tag* is the element name. Returns the
      closed element.


   .. method:: start(tag, attrs)

      Opens a new element. *tag* is the element name. *attrs* is a dictionary
      containing element attributes. Returns the opened element.


   In addition, a custom :class:`TreeBuilder` object can provide the
   following method:

   .. method:: doctype(name, pubid, system)

      Handles a doctype declaration. *name* is the doctype name. *pubid* is
      the public identifier. *system* is the system identifier. This method
      does not exist on the default :class:`TreeBuilder` class.

      .. versionadded:: 2.7
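
   As a minimal sketch, the builder can be driven by hand in the same way a
   parser would drive it; the tag, attribute and text values are made up for
   this example::

      from xml.etree.ElementTree import TreeBuilder

      builder = TreeBuilder()
      builder.start("greeting", {"lang": "en"})   # opens <greeting lang="en">
      builder.data("Hello, ")                     # text may arrive in pieces
      builder.data("world")
      builder.end("greeting")                     # closes the element
      root = builder.close()                      # the toplevel element

   The pieces passed to :meth:`data` end up joined in the *text* attribute
   of the element that was open at the time.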


.. _elementtree-xmlparser-objects:

XMLParser Objects
-----------------


.. class:: XMLParser(html=0, target=None, encoding=None)

   :class:`Element` structure builder for XML source data, based on the expat
   parser. *html* are predefined HTML entities. This flag is not supported by
   the current implementation. *target* is the target object. If omitted, the
   builder uses an instance of the standard :class:`TreeBuilder` class.
   *encoding* [1]_ is optional. If given, the value overrides the encoding
   specified in the XML file.


   .. method:: close()

      Finishes feeding data to the parser. Returns an element structure.


   .. method:: doctype(name, pubid, system)

      .. deprecated:: 2.7
         Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
         target.


   .. method:: feed(data)

      Feeds data to the parser. *data* is encoded data.

:meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag, its :meth:`end` method for each closing tag,
and passes character data to its :meth:`data` method. :meth:`XMLParser.close`
calls *target*\'s :meth:`close` method.
:class:`XMLParser` can be used for more than building a tree structure.
This is an example of counting the maximum depth of an XML file::

   >>> from xml.etree.ElementTree import XMLParser
   >>> class MaxDepth:                     # The target object of the parser
   ...     maxDepth = 0
   ...     depth = 0
   ...     def start(self, tag, attrib):   # Called for each opening tag.
   ...         self.depth += 1
   ...         if self.depth > self.maxDepth:
   ...             self.maxDepth = self.depth
   ...     def end(self, tag):             # Called for each closing tag.
   ...         self.depth -= 1
   ...     def data(self, data):
   ...         pass            # We do not need to do anything with data.
   ...     def close(self):    # Called when all data has been parsed.
   ...         return self.maxDepth
   ...
   >>> target = MaxDepth()
   >>> parser = XMLParser(target=target)
   >>> exampleXml = """
   ... <a>
   ...   <b>
   ...   </b>
   ...   <b>
   ...     <c>
   ...       <d>
   ...       </d>
   ...     </c>
   ...   </b>
   ... </a>"""
   >>> parser.feed(exampleXml)
   >>> parser.close()
   4


.. rubric:: Footnotes

.. [#] The encoding string included in XML output should conform to the
   appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
   not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and http://www.iana.org/assignments/character-sets.