:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================

.. module:: xml.etree.ElementTree
   :synopsis: Implementation of the ElementTree API.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>


.. versionadded:: 2.5

**Source code:** :source:`Lib/xml/etree/ElementTree.py`

--------------

The :class:`Element` type is a flexible container object, designed to store
hierarchical data structures in memory. The type can be described as a cross
between a list and a dictionary.

Each element has a number of properties associated with it:

* a tag, which is a string identifying what kind of data this element
  represents (the element type, in other words).

* a number of attributes, stored in a Python dictionary.

* a text string.

* an optional tail string.

* a number of child elements, stored in a Python sequence.

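The properties listed above can be inspected directly on a parsed element.
The following is a minimal sketch; the sample markup and variable names are
made up for illustration, and the code avoids ``print`` so that it runs
unchanged on Python 2.7 and later.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample element; the tag and attribute names are made up.
para = ET.fromstring('<p class="intro">Hello <b>big</b> world</p>')

tag = para.tag              # the element type
attributes = para.attrib    # a plain dictionary of attributes
text = para.text            # text before the first child
child_count = len(para)     # children behave like a sequence
bold = para[0]              # first child, accessed by index
bold_tail = bold.tail       # text that follows the child's end tag
```
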
To create an element instance, use the :class:`Element` constructor or the
:func:`SubElement` factory function.

The :class:`ElementTree` class can be used to wrap an element structure, and
convert it from and to XML.

A C implementation of this API is available as :mod:`xml.etree.cElementTree`.

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs. Fredrik Lundh's page is also the location of the development version of
:mod:`xml.etree.ElementTree`.

.. versionchanged:: 2.7
   The ElementTree API is updated to 1.3. For more information, see
   `Introducing ElementTree 1.3
   <http://effbot.org/zone/elementtree-13-intro.htm>`_.

Tutorial
--------

This is a short tutorial for using :mod:`xml.etree.ElementTree` (``ET`` in
short). The goal is to demonstrate some of the building blocks and basic
concepts of the module.

XML tree and elements
^^^^^^^^^^^^^^^^^^^^^

XML is an inherently hierarchical data format, and the most natural way to
represent it is with a tree. ``ET`` has two classes for this purpose:
:class:`ElementTree` represents the whole XML document as a tree, and
:class:`Element` represents a single node in this tree. Interactions with
the whole document (reading and writing to/from files) are usually done
on the :class:`ElementTree` level. Interactions with a single XML element
and its sub-elements are done on the :class:`Element` level.

.. _elementtree-parsing-xml:

Parsing XML
^^^^^^^^^^^

We'll be using the following XML document as the sample data for this section:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank>1</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank>4</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank>68</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

We have a number of ways to import the data. Reading the file from disk::

   import xml.etree.ElementTree as ET
   tree = ET.parse('country_data.xml')
   root = tree.getroot()

Reading the data from a string::

   root = ET.fromstring(country_data_as_string)

:func:`fromstring` parses XML from a string directly into an :class:`Element`,
which is the root element of the parsed tree. Other parsing functions, such as
:func:`parse`, return an :class:`ElementTree` instead. Check each function's
documentation to be sure.

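The difference in return types can be checked with a short sketch. The sample
bytes are made up, and an in-memory ``io.BytesIO`` buffer stands in for a file
on disk.

```python
import io
import xml.etree.ElementTree as ET

xml_bytes = b'<data><country name="Panama"/></data>'  # made-up sample

# parse() takes a filename or file object and returns an ElementTree.
tree = ET.parse(io.BytesIO(xml_bytes))
root_from_tree = tree.getroot()

# fromstring() returns the root Element directly.
root_from_string = ET.fromstring(xml_bytes)
```
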
As an :class:`Element`, ``root`` has a tag and a dictionary of attributes::

   >>> root.tag
   'data'
   >>> root.attrib
   {}

It also has child nodes over which we can iterate::

   >>> for child in root:
   ...     print child.tag, child.attrib
   ...
   country {'name': 'Liechtenstein'}
   country {'name': 'Singapore'}
   country {'name': 'Panama'}

Children are nested, and we can access specific child nodes by index::

   >>> root[0][1].text
   '2008'

Finding interesting elements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:class:`Element` has some useful methods that help iterate recursively over
the entire sub-tree below it (its children, their children, and so on). For
example, :meth:`Element.iter`::

   >>> for neighbor in root.iter('neighbor'):
   ...     print neighbor.attrib
   ...
   {'name': 'Austria', 'direction': 'E'}
   {'name': 'Switzerland', 'direction': 'W'}
   {'name': 'Malaysia', 'direction': 'N'}
   {'name': 'Costa Rica', 'direction': 'W'}
   {'name': 'Colombia', 'direction': 'E'}

:meth:`Element.findall` finds only elements with a given tag that are direct
children of the current element. :meth:`Element.find` finds the *first* child
with a particular tag, and :attr:`Element.text` accesses the element's text
content. :meth:`Element.get` accesses the element's attributes::

   >>> for country in root.findall('country'):
   ...     rank = country.find('rank').text
   ...     name = country.get('name')
   ...     print name, rank
   ...
   Liechtenstein 1
   Singapore 4
   Panama 68

More sophisticated specification of which elements to look for is possible by
using :ref:`XPath <elementtree-xpath>`.

Modifying an XML File
^^^^^^^^^^^^^^^^^^^^^

:class:`ElementTree` provides a simple way to build XML documents and write
them to files. The :meth:`ElementTree.write` method serves this purpose.

Once created, an :class:`Element` object may be manipulated by directly changing
its fields (such as :attr:`Element.text`), adding and modifying attributes
(:meth:`Element.set` method), as well as adding new children (for example
with :meth:`Element.append`).

Let's say we want to add one to each country's rank, and add an ``updated``
attribute to the rank element::

   >>> for rank in root.iter('rank'):
   ...     new_rank = int(rank.text) + 1
   ...     rank.text = str(new_rank)
   ...     rank.set('updated', 'yes')
   ...
   >>> tree.write('output.xml')

Our XML now looks like this:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank updated="yes">69</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

We can remove elements using :meth:`Element.remove`. Let's say we want to
remove all countries with a rank higher than 50::

   >>> for country in root.findall('country'):
   ...     rank = int(country.find('rank').text)
   ...     if rank > 50:
   ...         root.remove(country)
   ...
   >>> tree.write('output.xml')

Our XML now looks like this:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
   </data>

Building XML documents
^^^^^^^^^^^^^^^^^^^^^^

The :func:`SubElement` function also provides a convenient way to create new
sub-elements for a given element::

   >>> a = ET.Element('a')
   >>> b = ET.SubElement(a, 'b')
   >>> c = ET.SubElement(a, 'c')
   >>> d = ET.SubElement(c, 'd')
   >>> ET.dump(a)
   <a><b /><c><d /></c></a>

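Attributes and text can be supplied while a document is being built. The
sketch below is one way to do it, assuming made-up element and attribute
names; attributes are passed either as a dictionary or as keyword arguments,
and the text is assigned afterwards.

```python
import xml.etree.ElementTree as ET

# Hypothetical data; element and attribute names are for illustration only.
country = ET.Element('country', {'name': 'Panama'})
rank = ET.SubElement(country, 'rank', updated='yes')
rank.text = '68'
ET.SubElement(country, 'neighbor', name='Colombia', direction='E')

# Serialize the finished structure, then parse it back to inspect it.
serialized = ET.tostring(country)
reparsed = ET.fromstring(serialized)
```
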
Additional resources
^^^^^^^^^^^^^^^^^^^^

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs.

.. _elementtree-xpath:

XPath support
-------------

This module provides limited support for
`XPath expressions <http://www.w3.org/TR/xpath>`_ for locating elements in a
tree. The goal is to support a small subset of the abbreviated syntax; a full
XPath engine is outside the scope of the module.

Example
^^^^^^^

Here's an example that demonstrates some of the XPath capabilities of the
module. We'll be using the ``countrydata`` XML document from the
:ref:`Parsing XML <elementtree-parsing-xml>` section::

   import xml.etree.ElementTree as ET

   root = ET.fromstring(countrydata)

   # Top-level elements
   root.findall(".")

   # All 'neighbor' grand-children of 'country' children of the top-level
   # elements
   root.findall("./country/neighbor")

   # Nodes with name='Singapore' that have a 'year' child
   root.findall(".//year/..[@name='Singapore']")

   # 'year' nodes that are children of nodes with name='Singapore'
   root.findall(".//*[@name='Singapore']/year")

   # All 'neighbor' nodes that are the second child of their parent
   root.findall(".//neighbor[2]")

Supported XPath syntax
^^^^^^^^^^^^^^^^^^^^^^

+-----------------------+------------------------------------------------------+
| Syntax                | Meaning                                              |
+=======================+======================================================+
| ``tag``               | Selects all child elements with the given tag.       |
|                       | For example, ``spam`` selects all child elements     |
|                       | named ``spam``, ``spam/egg`` selects all             |
|                       | grandchildren named ``egg`` in all children named    |
|                       | ``spam``.                                            |
+-----------------------+------------------------------------------------------+
| ``*``                 | Selects all child elements. For example, ``*/egg``   |
|                       | selects all grandchildren named ``egg``.             |
+-----------------------+------------------------------------------------------+
| ``.``                 | Selects the current node. This is mostly useful      |
|                       | at the beginning of the path, to indicate that it's  |
|                       | a relative path.                                     |
+-----------------------+------------------------------------------------------+
| ``//``                | Selects all subelements, on all levels beneath the   |
|                       | current element. For example, ``.//egg`` selects     |
|                       | all ``egg`` elements in the entire tree.             |
+-----------------------+------------------------------------------------------+
| ``..``                | Selects the parent element.                          |
+-----------------------+------------------------------------------------------+
| ``[@attrib]``         | Selects all elements that have the given attribute.  |
+-----------------------+------------------------------------------------------+
| ``[@attrib='value']`` | Selects all elements for which the given attribute   |
|                       | has the given value. The value cannot contain        |
|                       | quotes.                                              |
+-----------------------+------------------------------------------------------+
| ``[tag]``             | Selects all elements that have a child named         |
|                       | ``tag``. Only immediate children are supported.      |
+-----------------------+------------------------------------------------------+
| ``[position]``        | Selects all elements that are located at the given   |
|                       | position. The position can be either an integer      |
|                       | (1 is the first position), the expression ``last()`` |
|                       | (for the last position), or a position relative to   |
|                       | the last position (e.g. ``last()-1``).               |
+-----------------------+------------------------------------------------------+

Predicates (expressions within square brackets) must be preceded by a tag
name, an asterisk, or another predicate. ``position`` predicates must be
preceded by a tag name.

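The predicate forms in the table can be tried on a small document. This is a
sketch only; the ``menu`` markup and the variable names are made up for the
purpose.

```python
import xml.etree.ElementTree as ET

# Made-up document used only for this sketch.
root = ET.fromstring(
    '<menu>'
    '<dish name="soup"><price>4</price></dish>'
    '<dish name="stew"><price>9</price><note/></dish>'
    '<dish><price>7</price></dish>'
    '</menu>'
)

with_name = root.findall("dish[@name]")          # has a name attribute
stew = root.findall("dish[@name='stew']")        # attribute equals a value
with_note = root.findall("dish[note]")           # has a <note> child
second = root.findall("dish[2]")                 # positions start at 1
last = root.findall("dish[last()]")              # last <dish> child
prices = [p.text for p in root.findall(".//price")]  # any depth
```

Each predicate follows a tag name, as required by the rule above.
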
Reference
---------

.. _elementtree-functions:

Functions
^^^^^^^^^


.. function:: Comment(text=None)

   Comment element factory. This factory function creates a special element
   that will be serialized as an XML comment by the standard serializer. The
   comment string can be either a bytestring or a Unicode string. *text* is a
   string containing the comment string. Returns an element instance
   representing a comment.


.. function:: dump(elem)

   Writes an element tree or element structure to ``sys.stdout``. This function
   should be used for debugging only.

   The exact output format is implementation dependent. In this version, it's
   written as an ordinary XML file.

   *elem* is an element tree or an individual element.


.. function:: fromstring(text)

   Parses an XML section from a string constant. Same as :func:`XML`. *text*
   is a string containing XML data. Returns an :class:`Element` instance.


.. function:: fromstringlist(sequence, parser=None)

   Parses an XML document from a sequence of string fragments. *sequence* is a
   list or other sequence containing XML data fragments. *parser* is an
   optional parser instance. If not given, the standard :class:`XMLParser`
   parser is used. Returns an :class:`Element` instance.

   .. versionadded:: 2.7


.. function:: iselement(element)

   Checks if an object appears to be a valid element object. *element* is an
   element instance. Returns a true value if this is an element object.


.. function:: iterparse(source, events=None, parser=None)

   Parses an XML section into an element tree incrementally, and reports what's
   going on to the user. *source* is a filename or file object containing XML
   data. *events* is a list of events to report back. If omitted, only "end"
   events are reported. *parser* is an optional parser instance. If not
   given, the standard :class:`XMLParser` parser is used. *parser* is not
   supported by ``cElementTree``. Returns an :term:`iterator` providing
   ``(event, elem)`` pairs.

   .. note::

      :func:`iterparse` only guarantees that it has seen the ">"
      character of a starting tag when it emits a "start" event, so the
      attributes are defined, but the contents of the text and tail attributes
      are undefined at that point. The same applies to the element children;
      they may or may not be present.

      If you need a fully populated element, look for "end" events instead.


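A minimal sketch of incremental parsing follows, assuming a made-up input held
in an in-memory buffer. It records the tag at each "start" event and reads the
text only at "end" events, in line with the note above.

```python
import io
import xml.etree.ElementTree as ET

# Made-up input; a filename or any binary file object could be used instead.
source = io.BytesIO(
    b'<log><entry id="1">a</entry><entry id="2">b</entry></log>'
)

seen = []
for event, elem in ET.iterparse(source, events=("start", "end")):
    if event == "start":
        # Only the tag and attributes are reliable at this point.
        seen.append(("start", elem.tag))
    else:
        # At "end" the text and children are fully populated.
        seen.append(("end", elem.tag, elem.text))
```
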
.. function:: parse(source, parser=None)

   Parses an XML section into an element tree. *source* is a filename or file
   object containing XML data. *parser* is an optional parser instance. If
   not given, the standard :class:`XMLParser` parser is used. Returns an
   :class:`ElementTree` instance.


.. function:: ProcessingInstruction(target, text=None)

   PI element factory. This factory function creates a special element that
   will be serialized as an XML processing instruction. *target* is a string
   containing the PI target. *text* is a string containing the PI contents, if
   given. Returns an element instance representing a processing instruction.


.. function:: register_namespace(prefix, uri)

   Registers a namespace prefix. The registry is global, and any existing
   mapping for either the given prefix or the namespace URI will be removed.
   *prefix* is a namespace prefix. *uri* is a namespace URI. Tags and
   attributes in this namespace will be serialized with the given prefix, if at
   all possible.

   .. versionadded:: 2.7


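The effect of a registered prefix shows up at serialization time. In this
sketch the prefix ``ex`` and the URI are hypothetical, and tags are written in
the ``{uri}local`` form used for qualified names.

```python
import xml.etree.ElementTree as ET

# Hypothetical prefix and URI, chosen only for this sketch.
NS = 'http://example.com/ns'
ET.register_namespace('ex', NS)

# Qualified names use the "{uri}local" form.
root = ET.Element('{%s}root' % NS)
ET.SubElement(root, '{%s}item' % NS)

serialized = ET.tostring(root).decode('ascii')
```

Because the registry is global, the mapping stays in effect for later
serialization calls in the same process.
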
.. function:: SubElement(parent, tag, attrib={}, **extra)

   Subelement factory. This function creates an element instance, and appends
   it to an existing element.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *parent* is the parent element. *tag* is
   the subelement name. *attrib* is an optional dictionary, containing element
   attributes. *extra* contains additional attributes, given as keyword
   arguments. Returns an element instance.


.. function:: tostring(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an encoded string
   containing the XML data.


.. function:: tostringlist(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns a list of encoded
   strings containing the XML data. It does not guarantee any specific
   sequence, except that ``"".join(tostringlist(element)) ==
   tostring(element)``.

   .. versionadded:: 2.7


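The join guarantee can be checked with a short sketch on a made-up element.
The fragments are encoded strings, so a ``b""`` separator is used here to keep
the code working on Python 3 as well, where encoded output is ``bytes``.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<a><b>text</b><c/></a>')  # made-up sample

whole = ET.tostring(root)
parts = ET.tostringlist(root)

# The fragments are encoded strings, so join them with an encoded separator.
rejoined = b"".join(parts)
```
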
.. function:: XML(text, parser=None)

   Parses an XML section from a string constant. This function can be used to
   embed "XML literals" in Python code. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns an :class:`Element` instance.


.. function:: XMLID(text, parser=None)

   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element IDs to elements. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns a tuple containing an
   :class:`Element` instance and a dictionary.


.. _elementtree-element-objects:

Element Objects
^^^^^^^^^^^^^^^

.. class:: Element(tag, attrib={}, **extra)

   Element class. This class defines the Element interface, and provides a
   reference implementation of this interface.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *tag* is the element name. *attrib* is
   an optional dictionary, containing element attributes. *extra* contains
   additional attributes, given as keyword arguments.


   .. attribute:: tag

      A string identifying what kind of data this element represents (the
      element type, in other words).


   .. attribute:: text

      The *text* attribute can be used to hold additional data associated with
      the element. As the name implies, this attribute is usually a string but
      may be any application-specific object. If the element is created from
      an XML file, the attribute will contain any text found between the
      element tags.


   .. attribute:: tail

      The *tail* attribute can be used to hold additional data associated with
      the element. This attribute is usually a string but may be any
      application-specific object. If the element is created from an XML file,
      the attribute will contain any text found after the element's end tag
      and before the next tag.


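How text is split between *text* and *tail* is easiest to see on a fragment
with text scattered around nested tags. The fragment and the ``layout``
dictionary below are made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Made-up fragment with text before, between, and after nested tags.
a = ET.fromstring('<a><b>1<c>2<d/>3</c></b>4</a>')
b = a[0]
c = b[0]
d = c[0]

# Map each tag to its (text, tail) pair.
layout = {
    'a': (a.text, a.tail),
    'b': (b.text, b.tail),
    'c': (c.text, c.tail),
    'd': (d.text, d.tail),
}
```

Here ``'3'`` belongs to the tail of ``d`` because it follows that element's end
tag, and ``'4'`` belongs to the tail of ``b`` for the same reason.
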
   .. attribute:: attrib

      A dictionary containing the element's attributes. Note that while the
      *attrib* value is always a real mutable Python dictionary, an ElementTree
      implementation may choose to use another internal representation, and
      create the dictionary only if someone asks for it. To take advantage of
      such implementations, use the dictionary methods below whenever possible.

   The following dictionary-like methods work on the element attributes.


   .. method:: clear()

      Resets an element. This function removes all subelements, clears all
      attributes, and sets the text and tail attributes to ``None``.


   .. method:: get(key, default=None)

      Gets the element attribute named *key*.

      Returns the attribute value, or *default* if the attribute was not found.


   .. method:: items()

      Returns the element attributes as a sequence of (name, value) pairs. The
      attributes are returned in an arbitrary order.


   .. method:: keys()

      Returns the element's attribute names as a list. The names are returned
      in an arbitrary order.


   .. method:: set(key, value)

      Set the attribute *key* on the element to *value*.

   The following methods work on the element's children (subelements).


   .. method:: append(subelement)

      Adds the element *subelement* to the end of this element's internal list
      of subelements.


   .. method:: extend(subelements)

      Appends *subelements* from a sequence object with zero or more elements.
      Raises :exc:`AssertionError` if a subelement is not a valid object.

      .. versionadded:: 2.7


   .. method:: find(match)

      Finds the first subelement matching *match*. *match* may be a tag name
      or path. Returns an element instance or ``None``.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Returns a list
      containing all matching elements in document order.


   .. method:: findtext(match, default=None)

      Finds text for the first subelement matching *match*. *match* may be
      a tag name or path. Returns the text content of the first matching
      element, or *default* if no element was found. Note that if the matching
      element has no text content an empty string is returned.


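The difference between a missing element and an element without text is the
part of ``findtext`` that is easiest to get wrong. A short sketch on a made-up
document:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<r><a>x</a><b/></r>')  # made-up sample

present = root.findtext('a')           # text of the first <a>
empty = root.findtext('b')             # <b/> exists but has no text
missing = root.findtext('c')           # no <c> child, default applies
fallback = root.findtext('c', 'n/a')   # explicit default
```
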
   .. method:: getchildren()

      .. deprecated:: 2.7
         Use ``list(elem)`` or iteration.


   .. method:: getiterator(tag=None)

      .. deprecated:: 2.7
         Use method :meth:`Element.iter` instead.


   .. method:: insert(index, element)

      Inserts a subelement at the given position in this element.


   .. method:: iter(tag=None)

      Creates a tree :term:`iterator` with the current element as the root.
      The iterator iterates over this element and all elements below it, in
      document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
      elements whose tag equals *tag* are returned from the iterator. If the
      tree structure is modified during iteration, the result is undefined.

      .. versionadded:: 2.7


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Returns an iterable
      yielding all matching elements in document order.

      .. versionadded:: 2.7


   .. method:: itertext()

      Creates a text iterator. The iterator loops over this element and all
      subelements, in document order, and returns all inner text.

      .. versionadded:: 2.7


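The two iterators can be compared on one small fragment. This is a sketch with
made-up markup: ``iter`` walks elements, starting with the element itself,
while ``itertext`` walks the text pieces in document order.

```python
import xml.etree.ElementTree as ET

# Made-up fragment mixing text and inline elements.
para = ET.fromstring('<p>Hello <b>brave</b> new <i>world</i>!</p>')

tags = [elem.tag for elem in para.iter()]        # the element itself comes first
only_b = [elem.text for elem in para.iter('b')]  # filtered by tag
plain = ''.join(para.itertext())                 # text and tails, in order
```
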
Florent Xicluna3e8c1892010-03-11 14:36:19 +0000660 .. method:: makeelement(tag, attrib)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000661
Florent Xicluna583302c2010-03-13 17:56:19 +0000662 Creates a new element object of the same type as this element. Do not
663 call this method, use the :func:`SubElement` factory function instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000664
665
Florent Xicluna3e8c1892010-03-11 14:36:19 +0000666 .. method:: remove(subelement)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000667
Florent Xicluna583302c2010-03-13 17:56:19 +0000668 Removes *subelement* from the element. Unlike the find\* methods this
669 method compares elements based on the instance identity, not on tag value
Florent Xicluna3e8c1892010-03-11 14:36:19 +0000670 or contents.

   :class:`Element` objects also support the following sequence type methods
   for working with subelements: :meth:`__delitem__`, :meth:`__getitem__`,
   :meth:`__setitem__`, :meth:`__len__`.

   Caution: Elements with no subelements will test as ``False``. This behavior
   will change in future versions. Use an explicit ``len(elem)`` or ``elem is
   None`` test instead. ::

     element = root.find('foo')

     if not element:  # careful!
         print "element not found, or element has no subelements"

     if element is None:
         print "element not found"
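
The identity-based behavior of :meth:`remove` and the sequence methods listed
above can be sketched as follows. This is a minimal illustration, not part of
the reference; the sample document and the variable names are assumptions
chosen for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the first two items are equal in tag and text.
root = ET.fromstring("<list><item>a</item><item>a</item><item>b</item></list>")
first, second, third = list(root)

# remove() matches by identity: only the object passed in is removed,
# even though `first` and `second` look the same.
root.remove(second)
remaining = list(root)

# Sequence protocol on the element itself.
count = len(root)    # number of direct subelements
last = root[-1]      # indexing returns a subelement
del root[0]          # deletes the first remaining subelement

# itertext() walks the element and its subelements in document order.
text = "".join(root.itertext())
```

After these steps only the third item is left in ``root``, so the joined text
is just that item's text.
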


.. _elementtree-elementtree-objects:

ElementTree Objects
^^^^^^^^^^^^^^^^^^^


.. class:: ElementTree(element=None, file=None)

   ElementTree wrapper class. This class represents an entire element
   hierarchy, and adds some extra support for serialization to and from
   standard XML.

   *element* is the root element. The tree is initialized with the contents
   of the XML *file* if given.


   .. method:: _setroot(element)

      Replaces the root element for this tree. This discards the current
      contents of the tree, and replaces them with the given element. Use
      with care. *element* is an element instance.


   .. method:: find(match)

      Same as :meth:`Element.find`, starting at the root of the tree.


   .. method:: findall(match)

      Same as :meth:`Element.findall`, starting at the root of the tree.


   .. method:: findtext(match, default=None)

      Same as :meth:`Element.findtext`, starting at the root of the tree.


   .. method:: getiterator(tag=None)

      .. deprecated:: 2.7
         Use :meth:`ElementTree.iter` instead.


   .. method:: getroot()

      Returns the root element for this tree.


   .. method:: iter(tag=None)

      Creates and returns a tree iterator for the root element. The iterator
      loops over all elements in this tree, in document order. *tag* is the tag
      to look for (default is to return all elements).


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().iterfind(match)``. Returns an iterable yielding all matching
      elements in document order.

      .. versionadded:: 2.7


   .. method:: parse(source, parser=None)

      Loads an external XML document into this element tree. *source* is a
      file name or file object. *parser* is an optional parser instance. If
      not given, the standard :class:`XMLParser` parser is used. Returns the
      document's root element.


   .. method:: write(file, encoding="us-ascii", xml_declaration=None, \
                     default_namespace=None, method="xml")

      Writes the element tree to a file, as XML. *file* is a file name, or a
      file object opened for writing. *encoding* [1]_ is the output encoding
      (default is US-ASCII). *xml_declaration* controls whether an XML
      declaration should be added to the file. Use ``False`` for never,
      ``True`` for always, ``None`` for only if not US-ASCII or UTF-8 (default
      is ``None``). *default_namespace* sets the default XML namespace (for
      "xmlns"). *method* is either ``"xml"``, ``"html"`` or ``"text"``
      (default is ``"xml"``).
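
The query methods of :class:`ElementTree` described above can be compared side
by side in a short sketch. The sample document and variable names are
hypothetical and serve only to show what each call returns.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, used only for illustration.
root = ET.fromstring(
    "<library>"
    "<book id='1'><title>Dune</title></book>"
    "<book id='2'><title>Emma</title></book>"
    "</library>"
)
tree = ET.ElementTree(root)

first_book = tree.find("book")                     # first match, or None
all_books = tree.findall("book")                   # list of all matches
first_title = tree.findtext("book/title")          # text of the first match
missing = tree.findtext("book/isbn", default="")   # default when no match
tags = [elem.tag for elem in tree.iter()]          # all elements, document order
ids = [book.get("id") for book in tree.iterfind("book")]
```

All paths are resolved relative to the root element, which is why ``"book"``
rather than ``"library/book"`` is used here.
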

This is the XML file that is going to be manipulated::

    <html>
        <head>
            <title>Example page</title>
        </head>
        <body>
            <p>Moved to <a href="http://example.org/">example.org</a>
            or <a href="http://example.com/">example.com</a>.</p>
        </body>
    </html>

Example of changing the attribute "target" of every link in the first
paragraph::

    >>> from xml.etree.ElementTree import ElementTree
    >>> tree = ElementTree()
    >>> tree.parse("index.xhtml")
    <Element 'html' at 0xb77e6fac>
    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
    >>> p
    <Element 'p' at 0xb77ec26c>
    >>> links = list(p.iter("a"))   # Returns list of all links
    >>> links
    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
    >>> for i in links:             # Iterates through all found links
    ...     i.attrib["target"] = "blank"
    >>> tree.write("output.xhtml")
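
The same steps can also be tried without files on disk, because both
:meth:`parse` and :meth:`write` accept file objects. The sketch below assumes
in-memory ``io.BytesIO`` buffers and a shortened version of the page; the
buffer names are illustrative only.

```python
import io
from xml.etree.ElementTree import ElementTree

# Shortened, hypothetical stand-in for the example page.
source = io.BytesIO(
    b"<html><body>"
    b'<p>Moved to <a href="http://example.org/">example.org</a>.</p>'
    b"</body></html>"
)

tree = ElementTree()
root = tree.parse(source)          # parse() returns the root element

for link in tree.find("body/p").iter("a"):
    link.set("target", "blank")    # same effect as link.attrib["target"] = ...

output = io.BytesIO()
tree.write(output, encoding="utf-8", xml_declaration=True)
result = output.getvalue()         # serialized document as encoded bytes
```

Passing ``xml_declaration=True`` forces the declaration, which would otherwise
be left out for UTF-8 output.
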

.. _elementtree-qname-objects:

QName Objects
^^^^^^^^^^^^^


.. class:: QName(text_or_uri, tag=None)

   QName wrapper. This can be used to wrap a QName attribute value, in order
   to get proper namespace handling on output. *text_or_uri* is a string
   containing the QName value, in the form ``{uri}local``, or, if the *tag*
   argument is given, the URI part of a QName. If *tag* is given, the first
   argument is interpreted as a URI, and *tag* is interpreted as a local name.
   :class:`QName` instances are opaque.
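
As a rough illustration of the two calling conventions, the sketch below wraps
an attribute value in a :class:`QName` and serializes the element. The
namespace URI and the attribute name are made up for the example, and the
prefix chosen by the serializer is an implementation detail that should not be
relied on.

```python
import xml.etree.ElementTree as ET

NS = "http://example.com/ns"    # hypothetical namespace URI

# Two spellings of the same qualified name.
one_arg = ET.QName("{%s}string" % NS)   # full "{uri}local" form
two_arg = ET.QName(NS, "string")        # URI and local name given separately

item = ET.Element("item")
item.set("kind", two_arg)       # attribute value wrapped in a QName

# On output the value is written as "prefix:string", together with a
# matching xmlns declaration for the generated prefix.
xml = ET.tostring(item)
```

Without the wrapper, the attribute value would be written out literally as
``{http://example.com/ns}string``.
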


.. _elementtree-treebuilder-objects:

TreeBuilder Objects
^^^^^^^^^^^^^^^^^^^


.. class:: TreeBuilder(element_factory=None)

   Generic element structure builder. This builder converts a sequence of
   start, data, and end method calls to a well-formed element structure. You
   can use this class to build an element structure using a custom XML parser,
   or a parser for some other XML-like format. If given, *element_factory* is
   called to create new :class:`Element` instances.


   .. method:: close()

      Flushes the builder buffers, and returns the toplevel document
      element. Returns an :class:`Element` instance.


   .. method:: data(data)

      Adds text to the current element. *data* is a string. This should be
      either a bytestring, or a Unicode string.


   .. method:: end(tag)

      Closes the current element. *tag* is the element name. Returns the
      closed element.


   .. method:: start(tag, attrs)

      Opens a new element. *tag* is the element name. *attrs* is a dictionary
      containing element attributes. Returns the opened element.


   In addition, a custom :class:`TreeBuilder` object can provide the
   following method:

   .. method:: doctype(name, pubid, system)

      Handles a doctype declaration. *name* is the doctype name. *pubid* is
      the public identifier. *system* is the system identifier. This method
      does not exist on the default :class:`TreeBuilder` class.

      .. versionadded:: 2.7
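
To make the start, data, and end protocol concrete, the following sketch
drives a :class:`TreeBuilder` by hand, the way a custom parser might. The tag
names and text are invented for the example.

```python
from xml.etree.ElementTree import TreeBuilder

builder = TreeBuilder()

builder.start("greeting", {"lang": "en"})   # opens <greeting lang="en">
builder.data("Hello, ")                     # text of <greeting>
builder.start("b", {})                      # opens nested <b>
builder.data("world")                       # text of <b>
builder.end("b")                            # closes <b>
builder.data("!")                           # text after </b>: tail of <b>
builder.end("greeting")                     # closes <greeting>

root = builder.close()                      # the finished toplevel element
```

Text passed to :meth:`data` after an element has been closed ends up in that
element's tail rather than in the parent's text.
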


.. _elementtree-xmlparser-objects:

XMLParser Objects
^^^^^^^^^^^^^^^^^


.. class:: XMLParser(html=0, target=None, encoding=None)

   :class:`Element` structure builder for XML source data, based on the expat
   parser. *html* is a flag for predefined HTML entities; it is not supported
   by the current implementation. *target* is the target object. If omitted,
   the builder uses an instance of the standard :class:`TreeBuilder` class.
   *encoding* [1]_ is optional. If given, the value overrides the encoding
   specified in the XML file.


   .. method:: close()

      Finishes feeding data to the parser. Returns an element structure.


   .. method:: doctype(name, pubid, system)

      .. deprecated:: 2.7
         Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
         target.


   .. method:: feed(data)

      Feeds data to the parser. *data* is encoded data.
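
One possible way to follow the advice in the deprecation note is to subclass
:class:`TreeBuilder`, add a ``doctype`` method, and pass an instance as the
parser's *target*. The class name ``DoctypeRecorder`` and its ``recorded``
attribute are hypothetical; this is a sketch, not the only way to do it.

```python
from xml.etree.ElementTree import TreeBuilder, XMLParser

class DoctypeRecorder(TreeBuilder):
    """TreeBuilder that remembers the doctype declaration it was given."""

    recorded = None  # replaced by doctype() if the document declares one

    def doctype(self, name, pubid, system):
        self.recorded = (name, pubid, system)

target = DoctypeRecorder()
parser = XMLParser(target=target)
parser.feed(
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    "<html><body/></html>"
)
root = parser.close()   # the tree is still built by the inherited methods
```

Because the subclass keeps the inherited ``start``, ``data``, ``end`` and
``close`` methods, the parser still returns an ordinary element tree.
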

:meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag, its :meth:`end` method for each closing tag,
and its :meth:`data` method for character data. :meth:`XMLParser.close`
calls *target*\'s :meth:`close` method and returns its result.
:class:`XMLParser` can therefore be used for more than building a tree
structure. This is an example of counting the maximum depth of an XML file::

    >>> from xml.etree.ElementTree import XMLParser
    >>> class MaxDepth:                     # The target object of the parser
    ...     maxDepth = 0
    ...     depth = 0
    ...     def start(self, tag, attrib):   # Called for each opening tag.
    ...         self.depth += 1
    ...         if self.depth > self.maxDepth:
    ...             self.maxDepth = self.depth
    ...     def end(self, tag):             # Called for each closing tag.
    ...         self.depth -= 1
    ...     def data(self, data):
    ...         pass            # We do not need to do anything with data.
    ...     def close(self):    # Called when all data has been parsed.
    ...         return self.maxDepth
    ...
    >>> target = MaxDepth()
    >>> parser = XMLParser(target=target)
    >>> exampleXml = """
    ... <a>
    ...   <b>
    ...   </b>
    ...   <b>
    ...     <c>
    ...       <d>
    ...       </d>
    ...     </c>
    ...   </b>
    ... </a>"""
    >>> parser.feed(exampleXml)
    >>> parser.close()
    4
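
When no *target* is given, the same :meth:`feed` and :meth:`close` calls build
an ordinary element tree, and the input may arrive in pieces. The sketch below
assumes a list of hypothetical chunks standing in for data read from a socket
or file.

```python
from xml.etree.ElementTree import XMLParser

# Hypothetical chunks; note that tags and text are split mid-token.
chunks = ["<log><entry>sta", "rted</entry><ent", "ry>done</entry></log>"]

parser = XMLParser()        # no target: a standard TreeBuilder is used
for chunk in chunks:
    parser.feed(chunk)      # may be called any number of times
root = parser.close()       # returns the root element built by the target

entries = [entry.text for entry in root]
```

The parser buffers incomplete tokens internally, so chunk boundaries do not
have to line up with element boundaries.
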


.. rubric:: Footnotes

.. [#] The encoding string included in XML output should conform to the
   appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
   not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and http://www.iana.org/assignments/character-sets.