:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================

.. module:: xml.etree.ElementTree
   :synopsis: Implementation of the ElementTree API.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>


.. versionadded:: 2.5

**Source code:** :source:`Lib/xml/etree/ElementTree.py`

--------------

The :class:`Element` type is a flexible container object, designed to store
hierarchical data structures in memory. The type can be described as a cross
between a list and a dictionary.


.. warning::

   The :mod:`xml.etree.ElementTree` module is not secure against
   maliciously constructed data. If you need to parse untrusted or
   unauthenticated data, see :ref:`xml-vulnerabilities`.


Each element has a number of properties associated with it:

* a tag, which is a string identifying what kind of data this element
  represents (the element type, in other words).

* a number of attributes, stored in a Python dictionary.

* a text string.

* an optional tail string.

* a number of child elements, stored in a Python sequence.

To create an element instance, use the :class:`Element` constructor or the
:func:`SubElement` factory function.

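The properties listed above can be seen on a small hand-built element. The
following is a minimal sketch, not part of the original documentation; the
element names and values are made up for illustration, and the code is written
so that it should also run on Python 3:

```python
import xml.etree.ElementTree as ET

# Build <country name="Panama">intro<rank>68</rank>after rank</country>
country = ET.Element("country", {"name": "Panama"})
country.text = "intro"            # text before the first child
rank = ET.SubElement(country, "rank")
rank.text = "68"                  # text inside <rank>
rank.tail = "after rank"          # text after </rank>, before the next tag

tag = country.tag                 # the element type
attributes = country.attrib       # a plain dictionary
children = list(country)          # the child elements, in order
serialized = ET.tostring(country)
```

Note that the tail belongs to the child (``rank``), not to the parent, even
though the tail text appears inside the parent's markup.
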
The :class:`ElementTree` class can be used to wrap an element structure, and
convert it from and to XML.

A C implementation of this API is available as :mod:`xml.etree.cElementTree`.

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs. Fredrik Lundh's page is also the location of the development version of
:mod:`xml.etree.ElementTree`.

.. versionchanged:: 2.7
   The ElementTree API is updated to 1.3. For more information, see
   `Introducing ElementTree 1.3
   <http://effbot.org/zone/elementtree-13-intro.htm>`_.

Tutorial
--------

This is a short tutorial for using :mod:`xml.etree.ElementTree` (``ET`` in
short). The goal is to demonstrate some of the building blocks and basic
concepts of the module.

XML tree and elements
^^^^^^^^^^^^^^^^^^^^^

XML is an inherently hierarchical data format, and the most natural way to
represent it is with a tree. ``ET`` has two classes for this purpose -
:class:`ElementTree` represents the whole XML document as a tree, and
:class:`Element` represents a single node in this tree. Interactions with
the whole document (reading and writing to/from files) are usually done
on the :class:`ElementTree` level. Interactions with a single XML element
and its sub-elements are done on the :class:`Element` level.

.. _elementtree-parsing-xml:

Parsing XML
^^^^^^^^^^^

We'll be using the following XML document as the sample data for this section:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank>1</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank>4</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank>68</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

We have a number of ways to import the data. Reading the file from disk::

   import xml.etree.ElementTree as ET
   tree = ET.parse('country_data.xml')
   root = tree.getroot()

Reading the data from a string::

   root = ET.fromstring(country_data_as_string)

:func:`fromstring` parses XML from a string directly into an :class:`Element`,
which is the root element of the parsed tree. Other parsing functions may
create an :class:`ElementTree`. Check the documentation to be sure.

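The difference in return types can be checked directly. This is a small
sketch under the assumption that an in-memory ``io.BytesIO`` object stands in
for the file on disk; the XML snippet is a shortened, made-up version of the
sample data:

```python
import io
import xml.etree.ElementTree as ET

xml_bytes = b"<data><country name='Panama'/></data>"

# fromstring() returns the root Element directly
root_from_string = ET.fromstring(xml_bytes)

# parse() returns an ElementTree; getroot() yields the root Element
tree = ET.parse(io.BytesIO(xml_bytes))
root_from_tree = tree.getroot()
```
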
As an :class:`Element`, ``root`` has a tag and a dictionary of attributes::

   >>> root.tag
   'data'
   >>> root.attrib
   {}

It also has children nodes over which we can iterate::

   >>> for child in root:
   ...   print child.tag, child.attrib
   ...
   country {'name': 'Liechtenstein'}
   country {'name': 'Singapore'}
   country {'name': 'Panama'}

Children are nested, and we can access specific child nodes by index::

   >>> root[0][1].text
   '2008'

Finding interesting elements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:class:`Element` has some useful methods that help iterate recursively over all
the sub-tree below it (its children, their children, and so on). For example,
:meth:`Element.iter`::

   >>> for neighbor in root.iter('neighbor'):
   ...   print neighbor.attrib
   ...
   {'name': 'Austria', 'direction': 'E'}
   {'name': 'Switzerland', 'direction': 'W'}
   {'name': 'Malaysia', 'direction': 'N'}
   {'name': 'Costa Rica', 'direction': 'W'}
   {'name': 'Colombia', 'direction': 'E'}

:meth:`Element.findall` finds only elements with a given tag that are direct
children of the current element. :meth:`Element.find` finds the *first* child
with a particular tag, and :attr:`Element.text` accesses the element's text
content. :meth:`Element.get` accesses the element's attributes::

   >>> for country in root.findall('country'):
   ...   rank = country.find('rank').text
   ...   name = country.get('name')
   ...   print name, rank
   ...
   Liechtenstein 1
   Singapore 4
   Panama 68

More sophisticated specification of which elements to look for is possible by
using :ref:`XPath <elementtree-xpath>`.

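:meth:`Element.find` returns ``None`` when nothing matches, so chaining
``.text`` onto it fails for missing elements. :meth:`Element.findtext` avoids
that, and :meth:`Element.itertext` collects text across a sub-tree. The sketch
below uses a made-up document (the ``note`` and ``capital`` tags are
illustrative only):

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<country name='Panama'>"
    "<rank>68</rank><note/>"
    "<neighbor name='Costa Rica'>west</neighbor>"
    "</country>"
)

rank_text = root.findtext("rank")               # text of the first match
empty_text = root.findtext("note")              # element exists but has no text
missing = root.findtext("capital")              # no such element
fallback = root.findtext("capital", "unknown")  # explicit default

tags = [elem.tag for elem in root.iter()]       # root first, then descendants
all_text = list(root.itertext())                # text in document order
```
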
Modifying an XML File
^^^^^^^^^^^^^^^^^^^^^

:class:`ElementTree` provides a simple way to build XML documents and write
them to files. The :meth:`ElementTree.write` method serves this purpose.

Once created, an :class:`Element` object may be manipulated by directly changing
its fields (such as :attr:`Element.text`), adding and modifying attributes
(:meth:`Element.set` method), as well as adding new children (for example
with :meth:`Element.append`).

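The three kinds of change described above, followed by a write, might look
like the sketch below. It assumes an in-memory ``io.BytesIO`` buffer in place
of a real output file, and the ``region`` attribute is invented for the
example:

```python
import io
import xml.etree.ElementTree as ET

root = ET.fromstring(b"<data><country name='Panama'><rank>68</rank></country></data>")
tree = ET.ElementTree(root)

country = root.find("country")
country.find("rank").text = "69"          # change a field directly
country.set("region", "Central America")  # add an attribute
country.append(ET.Element("year"))        # add a new child

# Serialize the whole tree; a filename would work here as well
buffer = io.BytesIO()
tree.write(buffer)
reparsed = ET.fromstring(buffer.getvalue())
```
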
Let's say we want to add one to each country's rank, and add an ``updated``
attribute to the rank element::

   >>> for rank in root.iter('rank'):
   ...   new_rank = int(rank.text) + 1
   ...   rank.text = str(new_rank)
   ...   rank.set('updated', 'yes')
   ...
   >>> tree.write('output.xml')

Our XML now looks like this:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank updated="yes">69</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

We can remove elements using :meth:`Element.remove`. Let's say we want to
remove all countries with a rank higher than 50::

   >>> for country in root.findall('country'):
   ...   rank = int(country.find('rank').text)
   ...   if rank > 50:
   ...     root.remove(country)
   ...
   >>> tree.write('output.xml')

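The loop above iterates over the list returned by :meth:`Element.findall`
rather than over ``root`` itself. As with Python lists, removing children
while iterating directly over the parent may skip elements, so taking a
snapshot first is the safer pattern. A self-contained sketch with made-up
country names:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<country name='A'><rank>60</rank></country>"
    "<country name='B'><rank>70</rank></country>"
    "<country name='C'><rank>5</rank></country>"
    "</data>"
)

# findall() returns a separate list, so removing from root inside the
# loop does not disturb the iteration
for country in root.findall("country"):
    if int(country.find("rank").text) > 50:
        root.remove(country)

remaining = [country.get("name") for country in root]
```
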
Our XML now looks like this:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
   </data>

Building XML documents
^^^^^^^^^^^^^^^^^^^^^^

The :func:`SubElement` function also provides a convenient way to create new
sub-elements for a given element::

   >>> a = ET.Element('a')
   >>> b = ET.SubElement(a, 'b')
   >>> c = ET.SubElement(a, 'c')
   >>> d = ET.SubElement(c, 'd')
   >>> ET.dump(a)
   <a><b /><c><d /></c></a>

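Attributes can be supplied while building, either as a dictionary or as
keyword arguments, and text is assigned afterwards. The following sketch
rebuilds a fragment of the sample data; it is one possible way to do it, not
the only one:

```python
import xml.etree.ElementTree as ET

data = ET.Element("data")
country = ET.SubElement(data, "country", name="Singapore")
rank = ET.SubElement(country, "rank")
rank.text = "4"
# A dictionary and keyword arguments can be combined
ET.SubElement(country, "neighbor", {"name": "Malaysia"}, direction="N")

serialized = ET.tostring(data)
roundtrip = ET.fromstring(serialized)
```
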
Additional resources
^^^^^^^^^^^^^^^^^^^^

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs.

.. _elementtree-xpath:

XPath support
-------------

This module provides limited support for
`XPath expressions <http://www.w3.org/TR/xpath>`_ for locating elements in a
tree. The goal is to support a small subset of the abbreviated syntax; a full
XPath engine is outside the scope of the module.

Example
^^^^^^^

Here's an example that demonstrates some of the XPath capabilities of the
module. We'll be using the ``countrydata`` XML document from the
:ref:`Parsing XML <elementtree-parsing-xml>` section::

   import xml.etree.ElementTree as ET

   root = ET.fromstring(countrydata)

   # Top-level elements
   root.findall(".")

   # All 'neighbor' grand-children of 'country' children of the top-level
   # elements
   root.findall("./country/neighbor")

   # Nodes with name='Singapore' that have a 'year' child
   root.findall(".//year/..[@name='Singapore']")

   # 'year' nodes that are children of nodes with name='Singapore'
   root.findall(".//*[@name='Singapore']/year")

   # All 'neighbor' nodes that are the second 'neighbor' child of their parent
   root.findall(".//neighbor[2]")

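To try path expressions without a data file, the document can be embedded as
a string. The sketch below uses a shortened, made-up version of the sample
data and a hypothetical ``names`` helper that is not part of the module:

```python
import xml.etree.ElementTree as ET

countrydata = """
<data>
    <country name="Liechtenstein">
        <year>2008</year>
        <neighbor name="Austria"/>
        <neighbor name="Switzerland"/>
    </country>
    <country name="Singapore">
        <year>2011</year>
        <neighbor name="Malaysia"/>
    </country>
</data>
"""
root = ET.fromstring(countrydata)


def names(elements):
    """Hypothetical helper: collect the 'name' attribute of each element."""
    return [elem.get("name") for elem in elements]


all_neighbors = names(root.findall("./country/neighbor"))
singapore_years = [y.text for y in root.findall(".//*[@name='Singapore']/year")]
second_neighbors = names(root.findall(".//neighbor[2]"))
last_country = names(root.findall("./country[last()]"))
```
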
Supported XPath syntax
^^^^^^^^^^^^^^^^^^^^^^

.. tabularcolumns:: |l|L|

+-----------------------+------------------------------------------------------+
| Syntax                | Meaning                                              |
+=======================+======================================================+
| ``tag``               | Selects all child elements with the given tag.       |
|                       | For example, ``spam`` selects all child elements     |
|                       | named ``spam``, ``spam/egg`` selects all             |
|                       | grandchildren named ``egg`` in all children named    |
|                       | ``spam``.                                            |
+-----------------------+------------------------------------------------------+
| ``*``                 | Selects all child elements. For example, ``*/egg``   |
|                       | selects all grandchildren named ``egg``.             |
+-----------------------+------------------------------------------------------+
| ``.``                 | Selects the current node. This is mostly useful      |
|                       | at the beginning of the path, to indicate that it's  |
|                       | a relative path.                                     |
+-----------------------+------------------------------------------------------+
| ``//``                | Selects all subelements, on all levels beneath the   |
|                       | current element. For example, ``.//egg`` selects     |
|                       | all ``egg`` elements in the entire tree.             |
+-----------------------+------------------------------------------------------+
| ``..``                | Selects the parent element.                          |
+-----------------------+------------------------------------------------------+
| ``[@attrib]``         | Selects all elements that have the given attribute.  |
+-----------------------+------------------------------------------------------+
| ``[@attrib='value']`` | Selects all elements for which the given attribute   |
|                       | has the given value. The value cannot contain        |
|                       | quotes.                                              |
+-----------------------+------------------------------------------------------+
| ``[tag]``             | Selects all elements that have a child named         |
|                       | ``tag``. Only immediate children are supported.      |
+-----------------------+------------------------------------------------------+
| ``[position]``        | Selects all elements that are located at the given   |
|                       | position. The position can be either an integer      |
|                       | (1 is the first position), the expression ``last()`` |
|                       | (for the last position), or a position relative to   |
|                       | the last position (e.g. ``last()-1``).               |
+-----------------------+------------------------------------------------------+

Predicates (expressions within square brackets) must be preceded by a tag
name, an asterisk, or another predicate. ``position`` predicates must be
preceded by a tag name.

Reference
---------

.. _elementtree-functions:

Functions
^^^^^^^^^


.. function:: Comment(text=None)

   Comment element factory. This factory function creates a special element
   that will be serialized as an XML comment by the standard serializer. The
   comment string can be either a bytestring or a Unicode string. *text* is a
   string containing the comment string. Returns an element instance
   representing a comment.


.. function:: dump(elem)

   Writes an element tree or element structure to ``sys.stdout``. This
   function should be used for debugging only.

   The exact output format is implementation dependent. In this version, it's
   written as an ordinary XML file.

   *elem* is an element tree or an individual element.


.. function:: fromstring(text)

   Parses an XML section from a string constant. Same as :func:`XML`. *text*
   is a string containing XML data. Returns an :class:`Element` instance.


.. function:: fromstringlist(sequence, parser=None)

   Parses an XML document from a sequence of string fragments. *sequence* is a
   list or other sequence containing XML data fragments. *parser* is an
   optional parser instance. If not given, the standard :class:`XMLParser`
   parser is used. Returns an :class:`Element` instance.

   .. versionadded:: 2.7


.. function:: iselement(element)

   Checks if an object appears to be a valid element object. *element* is an
   element instance. Returns a true value if this is an element object.


.. function:: iterparse(source, events=None, parser=None)

   Parses an XML section into an element tree incrementally, and reports what's
   going on to the user. *source* is a filename or file object containing XML
   data. *events* is a list of events to report back. If omitted, only "end"
   events are reported. *parser* is an optional parser instance. If not
   given, the standard :class:`XMLParser` parser is used. *parser* is not
   supported by ``cElementTree``. Returns an :term:`iterator` providing
   ``(event, elem)`` pairs.

   .. note::

      :func:`iterparse` only guarantees that it has seen the ">"
      character of a starting tag when it emits a "start" event, so the
      attributes are defined, but the contents of the text and tail attributes
      are undefined at that point. The same applies to the element children;
      they may or may not be present.

      If you need a fully populated element, look for "end" events instead.

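The event stream produced by :func:`iterparse` can be inspected with a short
loop. This sketch assumes an in-memory ``io.BytesIO`` source and a made-up
document; following the note above, it reads element text only on "end"
events:

```python
import io
import xml.etree.ElementTree as ET

xml_bytes = b"<data><country name='Panama'><rank>68</rank></country></data>"

seen = []
rank_texts = []
for event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end")):
    seen.append((event, elem.tag))
    # Text is only guaranteed to be complete once the element has ended
    if event == "end" and elem.tag == "rank":
        rank_texts.append(elem.text)
```
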

.. function:: parse(source, parser=None)

   Parses an XML section into an element tree. *source* is a filename or file
   object containing XML data. *parser* is an optional parser instance. If
   not given, the standard :class:`XMLParser` parser is used. Returns an
   :class:`ElementTree` instance.


.. function:: ProcessingInstruction(target, text=None)

   PI element factory. This factory function creates a special element that
   will be serialized as an XML processing instruction. *target* is a string
   containing the PI target. *text* is a string containing the PI contents, if
   given. Returns an element instance, representing a processing instruction.


.. function:: register_namespace(prefix, uri)

   Registers a namespace prefix. The registry is global, and any existing
   mapping for either the given prefix or the namespace URI will be removed.
   *prefix* is a namespace prefix. *uri* is a namespace URI. Tags and
   attributes in this namespace will be serialized with the given prefix, if at
   all possible.

   .. versionadded:: 2.7


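The effect of :func:`register_namespace` shows up at serialization time. In
this sketch the prefix ``ex`` and the URI ``http://example.com/ns`` are
placeholders chosen for illustration; the element is created with the
``{uri}tag`` notation:

```python
import xml.etree.ElementTree as ET

# Placeholder prefix and URI; note that the registry is global
ET.register_namespace("ex", "http://example.com/ns")

item = ET.Element("{http://example.com/ns}item")
serialized = ET.tostring(item)
```

Without the registration, the serializer would be expected to fall back to a
generated prefix instead of ``ex``.
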
.. function:: SubElement(parent, tag, attrib={}, **extra)

   Subelement factory. This function creates an element instance, and appends
   it to an existing element.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *parent* is the parent element. *tag* is
   the subelement name. *attrib* is an optional dictionary, containing element
   attributes. *extra* contains additional attributes, given as keyword
   arguments. Returns an element instance.


.. function:: tostring(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an encoded string
   containing the XML data.


.. function:: tostringlist(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns a list of encoded
   strings containing the XML data. It does not guarantee any specific
   sequence, except that ``"".join(tostringlist(element)) ==
   tostring(element)``.

   .. versionadded:: 2.7


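The encoding argument matters as soon as the data contains non-ASCII
characters. In the sketch below (the ``city`` element is made up), the default
US-ASCII output represents the character as a numeric character reference,
while UTF-8 output encodes it directly. The join uses a bytes separator so the
comparison should also hold on Python 3:

```python
import xml.etree.ElementTree as ET

elem = ET.Element("city")
elem.text = u"Z\u00fcrich"

ascii_bytes = ET.tostring(elem)                   # default US-ASCII output
utf8_bytes = ET.tostring(elem, encoding="utf-8")  # UTF-8 encoded output

# The fragments from tostringlist() join to the same result as tostring()
chunks = ET.tostringlist(elem)
joined = b"".join(chunks)
```
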
.. function:: XML(text, parser=None)

   Parses an XML section from a string constant. This function can be used to
   embed "XML literals" in Python code. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns an :class:`Element` instance.


.. function:: XMLID(text, parser=None)

   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element IDs to elements. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns a tuple containing an
   :class:`Element` instance and a dictionary.


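A short sketch of :func:`XMLID`, assuming the IDs are taken from ``id``
attributes as in this made-up document:

```python
import xml.etree.ElementTree as ET

root, ids = ET.XMLID(
    "<data><item id='a1'>first</item><item id='b2'>second</item></data>"
)

id_names = sorted(ids)        # the collected id values
first_text = ids["a1"].text   # look up an element by its id
```
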
.. _elementtree-element-objects:

Element Objects
^^^^^^^^^^^^^^^

.. class:: Element(tag, attrib={}, **extra)

   Element class. This class defines the Element interface, and provides a
   reference implementation of this interface.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *tag* is the element name. *attrib* is
   an optional dictionary, containing element attributes. *extra* contains
   additional attributes, given as keyword arguments.


   .. attribute:: tag

      A string identifying what kind of data this element represents (the
      element type, in other words).


   .. attribute:: text

      The *text* attribute can be used to hold additional data associated with
      the element. As the name implies, this attribute is usually a string but
      may be any application-specific object. If the element is created from
      an XML file, the attribute will contain any text found between the
      element tags.


   .. attribute:: tail

      The *tail* attribute can be used to hold additional data associated with
      the element. This attribute is usually a string but may be any
      application-specific object. If the element is created from an XML file,
      the attribute will contain any text found after the element's end tag
      and before the next tag.


   .. attribute:: attrib

      A dictionary containing the element's attributes. Note that while the
      *attrib* value is always a real mutable Python dictionary, an ElementTree
      implementation may choose to use another internal representation, and
      create the dictionary only if someone asks for it. To take advantage of
      such implementations, use the dictionary methods below whenever possible.

   The following dictionary-like methods work on the element attributes.


   .. method:: clear()

      Resets an element. This function removes all subelements, clears all
      attributes, and sets the text and tail attributes to ``None``.


   .. method:: get(key, default=None)

      Gets the element attribute named *key*.

      Returns the attribute value, or *default* if the attribute was not found.


   .. method:: items()

      Returns the element attributes as a sequence of (name, value) pairs. The
      attributes are returned in an arbitrary order.


   .. method:: keys()

      Returns the element's attribute names as a list. The names are returned
      in an arbitrary order.


   .. method:: set(key, value)

      Set the attribute *key* on the element to *value*.

   The following methods work on the element's children (subelements).


   .. method:: append(subelement)

      Adds the element *subelement* to the end of this element's internal list
      of subelements.


   .. method:: extend(subelements)

      Appends *subelements* from a sequence object with zero or more elements.
      Raises :exc:`AssertionError` if a subelement is not a valid object.

      .. versionadded:: 2.7


   .. method:: find(match)

      Finds the first subelement matching *match*. *match* may be a tag name
      or path. Returns an element instance or ``None``.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Returns a list
      containing all matching elements in document order.


   .. method:: findtext(match, default=None)

      Finds text for the first subelement matching *match*. *match* may be
      a tag name or path. Returns the text content of the first matching
      element, or *default* if no element was found. Note that if the matching
      element has no text content, an empty string is returned.


   .. method:: getchildren()

      .. deprecated:: 2.7
         Use ``list(elem)`` or iteration.


   .. method:: getiterator(tag=None)

      .. deprecated:: 2.7
         Use method :meth:`Element.iter` instead.


   .. method:: insert(index, element)

      Inserts a subelement at the given position in this element.


   .. method:: iter(tag=None)

      Creates a tree :term:`iterator` with the current element as the root.
      The iterator iterates over this element and all elements below it, in
      document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
      elements whose tag equals *tag* are returned from the iterator. If the
      tree structure is modified during iteration, the result is undefined.

      .. versionadded:: 2.7


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Returns an iterable
      yielding all matching elements in document order.

      .. versionadded:: 2.7


   .. method:: itertext()

      Creates a text iterator. The iterator loops over this element and all
      subelements, in document order, and returns all inner text.

      .. versionadded:: 2.7


Florent Xicluna3e8c1892010-03-11 14:36:19 +0000670 .. method:: makeelement(tag, attrib)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000671
Florent Xicluna583302c2010-03-13 17:56:19 +0000672 Creates a new element object of the same type as this element. Do not
673 call this method, use the :func:`SubElement` factory function instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000674
675
Florent Xicluna3e8c1892010-03-11 14:36:19 +0000676 .. method:: remove(subelement)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000677
Florent Xicluna583302c2010-03-13 17:56:19 +0000678 Removes *subelement* from the element. Unlike the find\* methods this
679 method compares elements based on the instance identity, not on tag value
Florent Xicluna3e8c1892010-03-11 14:36:19 +0000680 or contents.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000681
   :class:`Element` objects also support the following sequence type methods
   for working with subelements: :meth:`__delitem__`, :meth:`__getitem__`,
   :meth:`__setitem__`, :meth:`__len__`.

   Caution: Elements with no subelements will test as ``False``. This behavior
   will change in future versions. Use an explicit ``len(elem)`` or ``elem is
   None`` test instead. ::

     element = root.find('foo')

     if not element:  # careful!
         print "element not found, or element has no subelements"

     if element is None:
         print "element not found"


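The following is a minimal sketch of the methods described above, not an
authoritative recipe. The sample document and all variable names are
assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
root = ET.fromstring(
    "<doc><p>Hello <b>big</b> world</p><p>Bye</p><note/></doc>"
)

# iterfind() yields the matching subelements in document order.
paragraphs = list(root.iterfind("p"))

# itertext() walks an element and its descendants, yielding each text
# and tail string found inside it.
first_text = "".join(paragraphs[0].itertext())

# remove() compares by identity, so pass the exact child object.
note = root.find("note")
root.remove(note)

# A childless element tests as false, so compare against None to tell
# "found but empty" apart from "not found".
empty = ET.fromstring("<doc><empty/></doc>").find("empty")
found_but_empty = empty is not None and len(empty) == 0
```

Because ``remove()`` works on identity, an element with the same tag and
contents that was built separately would not be removed by this call.
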
.. _elementtree-elementtree-objects:

ElementTree Objects
^^^^^^^^^^^^^^^^^^^


.. class:: ElementTree(element=None, file=None)

   ElementTree wrapper class. This class represents an entire element
   hierarchy, and adds some extra support for serialization to and from
   standard XML.

   *element* is the root element. If *file* is given, the tree is initialized
   with the contents of that XML file.


   .. method:: _setroot(element)

      Replaces the root element for this tree. This discards the current
      contents of the tree and replaces them with the given element. Use with
      care. *element* is an element instance.


   .. method:: find(match)

      Same as :meth:`Element.find`, starting at the root of the tree.


   .. method:: findall(match)

      Same as :meth:`Element.findall`, starting at the root of the tree.


   .. method:: findtext(match, default=None)

      Same as :meth:`Element.findtext`, starting at the root of the tree.


   .. method:: getiterator(tag=None)

      .. deprecated:: 2.7
         Use method :meth:`ElementTree.iter` instead.


   .. method:: getroot()

      Returns the root element for this tree.


   .. method:: iter(tag=None)

      Creates and returns a tree iterator for the root element. The iterator
      loops over all elements in this tree, in document order. *tag* is the tag
      to look for (default is to return all elements).


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().iterfind(match)``. Returns an iterable yielding all matching
      elements in document order.

      .. versionadded:: 2.7


   .. method:: parse(source, parser=None)

      Loads an external XML document into this element tree. *source* is a file
      name or file object. *parser* is an optional parser instance. If not
      given, the standard :class:`XMLParser` parser is used. Returns the
      document root element.


   .. method:: write(file, encoding="us-ascii", xml_declaration=None, \
                     default_namespace=None, method="xml")

      Writes the element tree to a file, as XML. *file* is a file name, or a
      file object opened for writing. *encoding* [1]_ is the output encoding
      (default is US-ASCII). *xml_declaration* controls if an XML declaration
      should be added to the file. Use ``False`` for never, ``True`` for
      always, ``None`` for only if not US-ASCII or UTF-8 (default is ``None``).
      *default_namespace* sets the default XML namespace (for "xmlns").
      *method* is either ``"xml"``, ``"html"`` or ``"text"`` (default is
      ``"xml"``).

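As a hedged sketch of the query methods above, the following parses a small
document from an in-memory file object rather than from disk. The document
contents and variable names are assumptions chosen for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical document held in memory; parse() accepts any file object.
source = io.BytesIO(
    b"<html><head><title>Example page</title></head>"
    b"<body><p>One</p><p>Two</p></body></html>"
)

tree = ET.ElementTree()
root = tree.parse(source)  # parse() returns the document root element

# The find* methods on the tree start at the root element.
title = tree.findtext("head/title")
missing = tree.findtext("head/meta", default="n/a")
paragraphs = [p.text for p in tree.iterfind("body/p")]

# iter() without a tag visits every element in document order.
all_tags = [elem.tag for elem in tree.iter()]
```
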
This is the XML file that is going to be manipulated::

    <html>
        <head>
            <title>Example page</title>
        </head>
        <body>
            <p>Moved to <a href="http://example.org/">example.org</a>
            or <a href="http://example.com/">example.com</a>.</p>
        </body>
    </html>

Example of changing the attribute "target" of every link in the first paragraph::

    >>> from xml.etree.ElementTree import ElementTree
    >>> tree = ElementTree()
    >>> tree.parse("index.xhtml")
    <Element 'html' at 0xb77e6fac>
    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
    >>> p
    <Element 'p' at 0xb77ec26c>
    >>> links = list(p.iter("a"))   # Returns list of all links
    >>> links
    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
    >>> for i in links:             # Iterates through all found links
    ...     i.attrib["target"] = "blank"
    ...
    >>> tree.write("output.xhtml")

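The *encoding* and *xml_declaration* arguments of :meth:`ElementTree.write`
can be tried without touching the file system. The sketch below writes to an
in-memory buffer; the element names and the choice of UTF-8 are assumptions
made for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Build a small tree by hand (hypothetical content).
root = ET.Element("html")
body = ET.SubElement(root, "body")
link = ET.SubElement(body, "a", {"href": "http://example.org/"})
link.text = "example.org"

tree = ET.ElementTree(root)

# write() accepts a file object opened for writing; a BytesIO buffer
# stands in for a real file here.
buf = io.BytesIO()
result = tree.write(buf, encoding="utf-8", xml_declaration=True)
output = buf.getvalue()
```

The serialized bytes end up in ``output``; ``write()`` itself has no useful
return value, so ``result`` is ``None``.
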
.. _elementtree-qname-objects:

QName Objects
^^^^^^^^^^^^^


.. class:: QName(text_or_uri, tag=None)

   QName wrapper. This can be used to wrap a QName attribute value, in order
   to get proper namespace handling on output. *text_or_uri* is a string
   containing the QName value, in the form ``{uri}local``, or, if the *tag*
   argument is given, the URI part of a QName. If *tag* is given, the first
   argument is interpreted as a URI, and *tag* is interpreted as a local name.
   :class:`QName` instances are opaque.


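A short sketch of both constructor forms follows. The namespace URI and the
``item`` and ``kind`` names are hypothetical, and the ``ns0`` prefix mentioned
below is whatever the serializer generates for an unregistered namespace, not
something the caller selects.

```python
import xml.etree.ElementTree as ET

TYPES_NS = "http://example.org/types"  # hypothetical namespace URI

# Two equivalent ways to describe the same qualified name.
qname_a = ET.QName("{%s}string" % TYPES_NS)
qname_b = ET.QName(TYPES_NS, "string")

# Wrapping an attribute value in QName lets the serializer replace the
# {uri}local form with a prefixed name and declare the namespace.
item = ET.Element("item")
item.set("kind", qname_b)
xml_bytes = ET.tostring(item)
```

With a plain string value such as ``"{http://example.org/types}string"``, the
attribute would be written out verbatim, without a prefix or an ``xmlns``
declaration.
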
.. _elementtree-treebuilder-objects:

TreeBuilder Objects
^^^^^^^^^^^^^^^^^^^


.. class:: TreeBuilder(element_factory=None)

   Generic element structure builder. This builder converts a sequence of
   start, data, and end method calls to a well-formed element structure. You
   can use this class to build an element structure using a custom XML parser,
   or a parser for some other XML-like format. The *element_factory*, when
   given, is called to create new :class:`Element` instances.


   .. method:: close()

      Flushes the builder buffers, and returns the toplevel document
      element. Returns an :class:`Element` instance.


   .. method:: data(data)

      Adds text to the current element. *data* is a string. This should be
      either a bytestring, or a Unicode string.


   .. method:: end(tag)

      Closes the current element. *tag* is the element name. Returns the
      closed element.


   .. method:: start(tag, attrs)

      Opens a new element. *tag* is the element name. *attrs* is a dictionary
      containing element attributes. Returns the opened element.


   In addition, a custom :class:`TreeBuilder` object can provide the
   following method:

   .. method:: doctype(name, pubid, system)

      Handles a doctype declaration. *name* is the doctype name. *pubid* is
      the public identifier. *system* is the system identifier. This method
      does not exist on the default :class:`TreeBuilder` class.

      .. versionadded:: 2.7


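The start, data, and end calls can also be driven by hand, which is roughly
what a custom parser would do. This is a minimal sketch; the tag names and
text are assumptions for illustration.

```python
from xml.etree.ElementTree import TreeBuilder

builder = TreeBuilder()

# Each start() must be balanced by an end(); data() adds text to the
# element that is currently open.
builder.start("greeting", {"lang": "en"})
builder.data("Hello, ")
builder.start("name", {})
builder.data("world")
builder.end("name")
builder.end("greeting")

# close() returns the toplevel element that was built.
root = builder.close()
```
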
.. _elementtree-xmlparser-objects:

XMLParser Objects
^^^^^^^^^^^^^^^^^


.. class:: XMLParser(html=0, target=None, encoding=None)

   :class:`Element` structure builder for XML source data, based on the expat
   parser. *html* are predefined HTML entities. This flag is not supported by
   the current implementation. *target* is the target object. If omitted, the
   builder uses an instance of the standard :class:`TreeBuilder` class.
   *encoding* [1]_ is optional. If given, the value overrides the encoding
   specified in the XML file.


   .. method:: close()

      Finishes feeding data to the parser. Returns an element structure.


   .. method:: doctype(name, pubid, system)

      .. deprecated:: 2.7
         Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
         target.


   .. method:: feed(data)

      Feeds data to the parser. *data* is encoded data.

:meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag, its :meth:`end` method for each closing tag,
and its :meth:`data` method for the character data in between.
:meth:`XMLParser.close` calls *target*\'s :meth:`close` method.
:class:`XMLParser` can therefore be used for more than building a tree
structure. This is an example of counting the maximum depth of an XML file::

    >>> from xml.etree.ElementTree import XMLParser
    >>> class MaxDepth:                     # The target object of the parser
    ...     maxDepth = 0
    ...     depth = 0
    ...     def start(self, tag, attrib):   # Called for each opening tag.
    ...         self.depth += 1
    ...         if self.depth > self.maxDepth:
    ...             self.maxDepth = self.depth
    ...     def end(self, tag):             # Called for each closing tag.
    ...         self.depth -= 1
    ...     def data(self, data):
    ...         pass            # We do not need to do anything with data.
    ...     def close(self):    # Called when all data has been parsed.
    ...         return self.maxDepth
    ...
    >>> target = MaxDepth()
    >>> parser = XMLParser(target=target)
    >>> exampleXml = """
    ... <a>
    ...   <b>
    ...   </b>
    ...   <b>
    ...     <c>
    ...       <d>
    ...       </d>
    ...     </c>
    ...   </b>
    ... </a>"""
    >>> parser.feed(exampleXml)
    >>> parser.close()
    4

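When no *target* is given, the same feed and close calls build an ordinary
element tree, and the input does not have to arrive in one piece. The sketch
below feeds a document in arbitrary chunks; the chunk boundaries and element
names are assumptions chosen for illustration.

```python
from xml.etree.ElementTree import XMLParser

# With no target, the parser builds a tree using the standard TreeBuilder.
parser = XMLParser()

# Chunks may split the document anywhere, even in the middle of a tag.
for chunk in ("<root><chi", "ld>some te", "xt</child><child/></root>"):
    parser.feed(chunk)

# close() finishes parsing and returns the toplevel element.
root = parser.close()
child_tags = [child.tag for child in root]
```
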

.. rubric:: Footnotes

.. [#] The encoding string included in XML output should conform to the
   appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
   not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and http://www.iana.org/assignments/character-sets.