:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================

.. module:: xml.etree.ElementTree
   :synopsis: Implementation of the ElementTree API.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>

The :mod:`xml.etree.ElementTree` module implements a simple and efficient API
for parsing and creating XML data.

.. versionchanged:: 3.3
   This module will use a fast implementation whenever available.
   The :mod:`xml.etree.cElementTree` module is deprecated.


.. warning::

   The :mod:`xml.etree.ElementTree` module is not secure against
   maliciously constructed data. If you need to parse untrusted or
   unauthenticated data see :ref:`xml-vulnerabilities`.

Tutorial
--------

This is a short tutorial for using :mod:`xml.etree.ElementTree` (``ET`` in
short). The goal is to demonstrate some of the building blocks and basic
concepts of the module.

XML tree and elements
^^^^^^^^^^^^^^^^^^^^^

XML is an inherently hierarchical data format, and the most natural way to
represent it is with a tree. ``ET`` has two classes for this purpose:
:class:`ElementTree` represents the whole XML document as a tree, and
:class:`Element` represents a single node in this tree. Interactions with
the whole document (reading and writing to/from files) are usually done
on the :class:`ElementTree` level. Interactions with a single XML element
and its sub-elements are done on the :class:`Element` level.

.. _elementtree-parsing-xml:

Parsing XML
^^^^^^^^^^^

We'll be using the following XML document as the sample data for this section:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank>1</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank>4</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank>68</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

We can import this data by reading from a file::

   import xml.etree.ElementTree as ET
   tree = ET.parse('country_data.xml')
   root = tree.getroot()

Or directly from a string::

   root = ET.fromstring(country_data_as_string)

:func:`fromstring` parses XML from a string directly into an :class:`Element`,
which is the root element of the parsed tree. Other parsing functions may
create an :class:`ElementTree`. Check each function's documentation to see
which of the two it returns.
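
As a rough illustration of that difference, the sketch below parses the same
markup both ways. The ``io.StringIO`` file object and the shortened sample
string are stand-ins chosen for this example so that it runs without
``country_data.xml`` on disk:

```python
import io
import xml.etree.ElementTree as ET

# Shortened sample document (an assumption made for this example).
xml_text = "<data><country name='Panama'/></data>"

# fromstring() hands back the root Element directly.
root_from_string = ET.fromstring(xml_text)

# parse() takes a filename or file object and returns an ElementTree;
# getroot() is needed to reach the root Element.
tree = ET.parse(io.StringIO(xml_text))
root_from_file = tree.getroot()

print(type(tree).__name__, root_from_string.tag, root_from_file.tag)
```

Both roots describe the same document; only the wrapper object differs.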

As an :class:`Element`, ``root`` has a tag and a dictionary of attributes::

   >>> root.tag
   'data'
   >>> root.attrib
   {}

It also has child nodes over which we can iterate::

   >>> for child in root:
   ...     print(child.tag, child.attrib)
   ...
   country {'name': 'Liechtenstein'}
   country {'name': 'Singapore'}
   country {'name': 'Panama'}

Children are nested, and we can access specific child nodes by index::

   >>> root[0][1].text
   '2008'

Incremental parsing
^^^^^^^^^^^^^^^^^^^

It's possible to parse XML incrementally (i.e. not the whole document at once).
The most powerful tool for doing this is :class:`IncrementalParser`. It does
not require a blocking read to obtain the XML data, and is instead fed with
data incrementally with :meth:`IncrementalParser.data_received` calls. To get
the parsed XML elements, call :meth:`IncrementalParser.events`. Here's an
example::

   >>> incparser = ET.IncrementalParser(['start', 'end'])
   >>> incparser.data_received('<mytag>sometext')
   >>> list(incparser.events())
   [('start', <Element 'mytag' at 0x7fba3f2a8688>)]
   >>> incparser.data_received(' more text</mytag>')
   >>> for event, elem in incparser.events():
   ...     print(event)
   ...     print(elem.tag, 'text=', elem.text)
   ...
   end
   mytag text= sometext more text

The obvious use case is applications that operate in an asynchronous fashion
where the XML data is being received from a socket or read incrementally from
some storage device. In such cases, blocking reads are unacceptable.

Because it's so flexible, :class:`IncrementalParser` can be inconvenient
to use for simpler use cases. If you don't mind your application blocking on
reading XML data but would still like to have incremental parsing capabilities,
take a look at :func:`iterparse`. It can be useful when you're reading a large
XML document and don't want to hold it wholly in memory.
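
A minimal sketch of the :func:`iterparse` approach follows. It assumes the
document is a flat list of ``country`` records and that each record can be
discarded once it has been processed; the ``io.BytesIO`` source is only a
stand-in for a large file:

```python
import io
import xml.etree.ElementTree as ET

# Stand-in for a large file on disk (assumption made for this example).
source = io.BytesIO(
    b"<data>"
    b"<country name='Liechtenstein'><rank>1</rank></country>"
    b"<country name='Singapore'><rank>4</rank></country>"
    b"</data>"
)

ranks = {}
for event, elem in ET.iterparse(source, events=("end",)):
    if elem.tag == "country":
        # At the "end" event the element and its children are complete.
        ranks[elem.get("name")] = int(elem.findtext("rank"))
        # Drop the children, text and attributes we no longer need.
        elem.clear()

print(ranks)
```

Calling :meth:`Element.clear` on each finished record is what keeps memory use
low here; whether it is safe depends on not needing the record again later.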

Finding interesting elements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:class:`Element` has some useful methods that help iterate recursively over all
the sub-tree below it (its children, their children, and so on). For example,
:meth:`Element.iter`::

   >>> for neighbor in root.iter('neighbor'):
   ...     print(neighbor.attrib)
   ...
   {'name': 'Austria', 'direction': 'E'}
   {'name': 'Switzerland', 'direction': 'W'}
   {'name': 'Malaysia', 'direction': 'N'}
   {'name': 'Costa Rica', 'direction': 'W'}
   {'name': 'Colombia', 'direction': 'E'}

:meth:`Element.findall` finds only elements with the given tag that are direct
children of the current element. :meth:`Element.find` finds the *first* child
with a particular tag, and :attr:`Element.text` accesses the element's text
content. :meth:`Element.get` accesses the element's attributes::

   >>> for country in root.findall('country'):
   ...     rank = country.find('rank').text
   ...     name = country.get('name')
   ...     print(name, rank)
   ...
   Liechtenstein 1
   Singapore 4
   Panama 68

More sophisticated specification of which elements to look for is possible by
using :ref:`XPath <elementtree-xpath>`.
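
:meth:`Element.find` returns ``None`` when nothing matches, so chaining
``.text`` onto it fails for incomplete records. The sketch below shows one way
to handle that with :meth:`Element.findtext` and the *default* argument of
:meth:`Element.get`; the ``Atlantis`` record and the ``code`` attribute are
made up for the example:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<country name='Liechtenstein'><rank>1</rank></country>"
    "<country name='Atlantis'/>"  # hypothetical record with no <rank> child
    "</data>"
)

results = []
for country in root.findall("country"):
    # findtext() returns the default instead of raising when <rank> is absent.
    rank = country.findtext("rank", default="n/a")
    # get() likewise falls back to a default for a missing attribute.
    code = country.get("code", "unknown")
    results.append((country.get("name"), rank, code))

print(results)
```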

Modifying an XML File
^^^^^^^^^^^^^^^^^^^^^

:class:`ElementTree` provides a simple way to build XML documents and write
them to files. The :meth:`ElementTree.write` method serves this purpose.

Once created, an :class:`Element` object may be manipulated by directly changing
its fields (such as :attr:`Element.text`), adding and modifying attributes
(:meth:`Element.set` method), as well as adding new children (for example
with :meth:`Element.append`).
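
The three kinds of change can be sketched on a single element as follows. The
``continent`` attribute is invented for the example and is not part of the
sample data:

```python
import xml.etree.ElementTree as ET

country = ET.fromstring("<country name='Panama'><rank>68</rank></country>")

# Change a field directly.
country.find("rank").text = "69"

# Add (or overwrite) an attribute; "continent" is a made-up attribute.
country.set("continent", "Americas")

# Add a new child at the end of the element's children.
year = ET.Element("year")
year.text = "2011"
country.append(year)

print([child.tag for child in country], country.attrib)
```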

Let's say we want to add one to each country's rank, and add an ``updated``
attribute to the rank element::

   >>> for rank in root.iter('rank'):
   ...     new_rank = int(rank.text) + 1
   ...     rank.text = str(new_rank)
   ...     rank.set('updated', 'yes')
   ...
   >>> tree.write('output.xml')

Our XML now looks like this:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank updated="yes">69</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

We can remove elements using :meth:`Element.remove`. Let's say we want to
remove all countries with a rank higher than 50::

   >>> for country in root.findall('country'):
   ...     rank = int(country.find('rank').text)
   ...     if rank > 50:
   ...         root.remove(country)
   ...
   >>> tree.write('output.xml')

Our XML now looks like this:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
   </data>

Building XML documents
^^^^^^^^^^^^^^^^^^^^^^

The :func:`SubElement` function provides a convenient way to create new
sub-elements for a given element::

   >>> a = ET.Element('a')
   >>> b = ET.SubElement(a, 'b')
   >>> c = ET.SubElement(a, 'c')
   >>> d = ET.SubElement(c, 'd')
   >>> ET.dump(a)
   <a><b /><c><d /></c></a>
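
A slightly fuller sketch, building one record shaped like the sample data and
serializing it to a string instead of dumping it. Attributes are passed both as
keyword arguments and as a dictionary, which :func:`SubElement` accepts
together:

```python
import xml.etree.ElementTree as ET

data = ET.Element("data")

# Attributes given as keyword arguments.
country = ET.SubElement(data, "country", name="Singapore")

# Text content is assigned through the text attribute.
rank = ET.SubElement(country, "rank")
rank.text = "4"

# Attributes given as a dictionary plus a keyword argument.
ET.SubElement(country, "neighbor", {"name": "Malaysia"}, direction="N")

# encoding="unicode" asks for a str rather than bytes.
text = ET.tostring(data, encoding="unicode")
print(text)
```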

Additional resources
^^^^^^^^^^^^^^^^^^^^

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs.


.. _elementtree-xpath:

XPath support
-------------

This module provides limited support for
`XPath expressions <http://www.w3.org/TR/xpath>`_ for locating elements in a
tree. The goal is to support a small subset of the abbreviated syntax; a full
XPath engine is outside the scope of the module.

Example
^^^^^^^

Here's an example that demonstrates some of the XPath capabilities of the
module. We'll be using the XML document from the
:ref:`Parsing XML <elementtree-parsing-xml>` section, assumed here to be
stored in the string ``countrydata``::

   import xml.etree.ElementTree as ET

   root = ET.fromstring(countrydata)

   # Top-level elements
   root.findall(".")

   # All 'neighbor' grand-children of 'country' children of the top-level
   # elements
   root.findall("./country/neighbor")

   # Nodes with name='Singapore' that have a 'year' child
   root.findall(".//year/..[@name='Singapore']")

   # 'year' nodes that are children of nodes with name='Singapore'
   root.findall(".//*[@name='Singapore']/year")

   # All 'neighbor' nodes that are the second child of their parent
   root.findall(".//neighbor[2]")

Supported XPath syntax
^^^^^^^^^^^^^^^^^^^^^^

.. tabularcolumns:: |l|L|

+-----------------------+------------------------------------------------------+
| Syntax                | Meaning                                              |
+=======================+======================================================+
| ``tag``               | Selects all child elements with the given tag.       |
|                       | For example, ``spam`` selects all child elements     |
|                       | named ``spam``, ``spam/egg`` selects all             |
|                       | grandchildren named ``egg`` in all children named    |
|                       | ``spam``.                                            |
+-----------------------+------------------------------------------------------+
| ``*``                 | Selects all child elements. For example, ``*/egg``   |
|                       | selects all grandchildren named ``egg``.             |
+-----------------------+------------------------------------------------------+
| ``.``                 | Selects the current node. This is mostly useful      |
|                       | at the beginning of the path, to indicate that it's  |
|                       | a relative path.                                     |
+-----------------------+------------------------------------------------------+
| ``//``                | Selects all subelements, on all levels beneath the   |
|                       | current element. For example, ``.//egg`` selects     |
|                       | all ``egg`` elements in the entire tree.             |
+-----------------------+------------------------------------------------------+
| ``..``                | Selects the parent element. Returns ``None`` if the  |
|                       | path attempts to reach the ancestors of the start    |
|                       | element (the element ``find`` was called on).        |
+-----------------------+------------------------------------------------------+
| ``[@attrib]``         | Selects all elements that have the given attribute.  |
+-----------------------+------------------------------------------------------+
| ``[@attrib='value']`` | Selects all elements for which the given attribute   |
|                       | has the given value. The value cannot contain        |
|                       | quotes.                                              |
+-----------------------+------------------------------------------------------+
| ``[tag]``             | Selects all elements that have a child named         |
|                       | ``tag``. Only immediate children are supported.      |
+-----------------------+------------------------------------------------------+
| ``[position]``        | Selects all elements that are located at the given   |
|                       | position. The position can be either an integer      |
|                       | (1 is the first position), the expression ``last()`` |
|                       | (for the last position), or a position relative to   |
|                       | the last position (e.g. ``last()-1``).               |
+-----------------------+------------------------------------------------------+

Predicates (expressions within square brackets) must be preceded by a tag
name, an asterisk, or another predicate. ``position`` predicates must be
preceded by a tag name.
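
The predicate rules can be tried out on a small document. This is only a
sketch; the trimmed-down markup below is an assumption made to keep the
expected matches easy to follow:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<country name='Liechtenstein'>"
    "<neighbor name='Austria'/><neighbor name='Switzerland'/>"
    "</country>"
    "<country name='Singapore'><neighbor name='Malaysia'/></country>"
    "</data>"
)

# [@attrib='value'] preceded by a tag name.
singapore = root.findall("./country[@name='Singapore']")

# [tag] preceded by an asterisk: any child that has a <neighbor> child.
with_neighbor = root.findall("./*[neighbor]")

# Position predicates preceded by a tag name.
last_neighbors = root.findall("./country/neighbor[last()]")
second_neighbors = root.findall(".//neighbor[2]")

print([e.get("name") for e in last_neighbors])
```

Positions are counted among the siblings that share the tag, so
``neighbor[2]`` matches nothing under a country with a single neighbor.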

Reference
---------

.. _elementtree-functions:

Functions
^^^^^^^^^


.. function:: Comment(text=None)

   Comment element factory. This factory function creates a special element
   that will be serialized as an XML comment by the standard serializer. The
   comment string can be either a bytestring or a Unicode string. *text* is a
   string containing the comment string. Returns an element instance
   representing a comment.


.. function:: dump(elem)

   Writes an element tree or element structure to ``sys.stdout``. This function
   should be used for debugging only.

   The exact output format is implementation dependent. In this version, it's
   written as an ordinary XML file.

   *elem* is an element tree or an individual element.


.. function:: fromstring(text)

   Parses an XML section from a string constant. Same as :func:`XML`. *text*
   is a string containing XML data. Returns an :class:`Element` instance.


.. function:: fromstringlist(sequence, parser=None)

   Parses an XML document from a sequence of string fragments. *sequence* is a
   list or other sequence containing XML data fragments. *parser* is an
   optional parser instance. If not given, the standard :class:`XMLParser`
   parser is used. Returns an :class:`Element` instance.

   .. versionadded:: 3.2
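
A short sketch of :func:`fromstringlist`, assuming the fragments arrive as
arbitrary slices of one document. The split points below are deliberately
placed in the middle of a tag to show that fragments need not be well formed
on their own:

```python
import xml.etree.ElementTree as ET

# Fragments of a single document, split at arbitrary positions.
chunks = ["<data><ra", "nk>1</rank>", "</data>"]

root = ET.fromstringlist(chunks)
print(root.tag, root.findtext("rank"))
```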


.. function:: iselement(element)

   Checks if an object appears to be a valid element object. *element* is an
   element instance. Returns a true value if this is an element object.


.. function:: iterparse(source, events=None, parser=None)

   Parses an XML section into an element tree incrementally, and reports what's
   going on to the user. *source* is a filename or :term:`file object`
   containing XML data. *events* is a sequence of events to report back. The
   supported events are the strings ``"start"``, ``"end"``, ``"start-ns"``
   and ``"end-ns"`` (the "ns" events are used to get detailed namespace
   information). If *events* is omitted, only ``"end"`` events are reported.
   *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. *parser* can only use the default
   :class:`TreeBuilder` as a target. Returns an :term:`iterator` providing
   ``(event, elem)`` pairs.

   Note that while :func:`iterparse` builds the tree incrementally, it issues
   blocking reads on *source* (or the file it names). As such, it's unsuitable
   for asynchronous applications where blocking reads can't be made. For fully
   asynchronous parsing, see :class:`IncrementalParser`.

   .. note::

      :func:`iterparse` only guarantees that it has seen the ">"
      character of a starting tag when it emits a "start" event, so the
      attributes are defined, but the contents of the text and tail attributes
      are undefined at that point. The same applies to the element children;
      they may or may not be present.

      If you need a fully populated element, look for "end" events instead.
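
The event stream can be inspected with a sketch like the one below. The
namespace URI and tag names are invented for the example. One detail not
spelled out above: for ``"start-ns"`` events the second item of the pair is a
``(prefix, uri)`` tuple rather than an element, which the code accounts for:

```python
import io
import xml.etree.ElementTree as ET

# Made-up document with one namespace declaration.
source = io.BytesIO(
    b"<root xmlns:x='http://example.com/x'><x:item>1</x:item></root>"
)

seen = []
for event, payload in ET.iterparse(source, events=("start-ns", "start", "end")):
    if event == "start-ns":
        # payload is a (prefix, uri) tuple for namespace events.
        seen.append((event, payload))
    else:
        # payload is an Element for "start" and "end" events.
        seen.append((event, payload.tag))

for entry in seen:
    print(entry)
```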

.. function:: parse(source, parser=None)

   Parses an XML section into an element tree. *source* is a filename or file
   object containing XML data. *parser* is an optional parser instance. If
   not given, the standard :class:`XMLParser` parser is used. Returns an
   :class:`ElementTree` instance.


.. function:: ProcessingInstruction(target, text=None)

   PI element factory. This factory function creates a special element that
   will be serialized as an XML processing instruction. *target* is a string
   containing the PI target. *text* is a string containing the PI contents, if
   given. Returns an element instance, representing a processing instruction.
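
As a small sketch, the special elements created by :func:`Comment` and
:func:`ProcessingInstruction` are appended like ordinary children and show up
in the serialized output. The comment text and the PI target and contents are
placeholders:

```python
import xml.etree.ElementTree as ET

root = ET.Element("root")

# Both factories return element-like objects that can be appended.
root.append(ET.Comment("generated file"))
root.append(ET.ProcessingInstruction("target", "key=value"))

text = ET.tostring(root, encoding="unicode")
print(text)
```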


.. function:: register_namespace(prefix, uri)

   Registers a namespace prefix. The registry is global, and any existing
   mapping for either the given prefix or the namespace URI will be removed.
   *prefix* is a namespace prefix. *uri* is a namespace URI. Tags and
   attributes in this namespace will be serialized with the given prefix, if at
   all possible.

   .. versionadded:: 3.2
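
A brief sketch of the effect on serialization. The ``ex`` prefix and the
``http://example.com/ns`` URI are placeholders, and because the registry is
global the registration affects every later serialization in the process:

```python
import xml.etree.ElementTree as ET

# Placeholder prefix and URI; the mapping is stored in a global registry.
ET.register_namespace("ex", "http://example.com/ns")

# Namespaced tags are written as {uri}localname.
elem = ET.Element("{http://example.com/ns}item")

text = ET.tostring(elem, encoding="unicode")
print(text)
```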


.. function:: SubElement(parent, tag, attrib={}, **extra)

   Subelement factory. This function creates an element instance, and appends
   it to an existing element.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *parent* is the parent element. *tag* is
   the subelement name. *attrib* is an optional dictionary, containing element
   attributes. *extra* contains additional attributes, given as keyword
   arguments. Returns an element instance.


.. function:: tostring(element, encoding="us-ascii", method="xml", *, \
                       short_empty_elements=True)

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string (otherwise, a bytestring is generated). *method*
   is either ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``).
   *short_empty_elements* has the same meaning as in :meth:`ElementTree.write`.
   Returns an (optionally) encoded string containing the XML data.

   .. versionadded:: 3.4
      The *short_empty_elements* parameter.
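
The effect of *encoding* and *short_empty_elements* can be seen in a sketch
like this one, which serializes the same two-element tree three ways:

```python
import xml.etree.ElementTree as ET

elem = ET.Element("a")
ET.SubElement(elem, "b")

# Default encoding: the result is a bytestring.
as_bytes = ET.tostring(elem)

# encoding="unicode": the result is a str.
as_text = ET.tostring(elem, encoding="unicode")

# Empty elements written with separate start and end tags.
expanded = ET.tostring(elem, encoding="unicode", short_empty_elements=False)

print(as_bytes, as_text, expanded)
```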


.. function:: tostringlist(element, encoding="us-ascii", method="xml", *, \
                           short_empty_elements=True)

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string (otherwise, a bytestring is generated). *method*
   is either ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``).
   *short_empty_elements* has the same meaning as in :meth:`ElementTree.write`.
   Returns a list of (optionally) encoded strings containing the XML data.
   It does not guarantee any specific sequence, except that
   ``b"".join(tostringlist(element)) == tostring(element)``.

   .. versionadded:: 3.2

   .. versionadded:: 3.4
      The *short_empty_elements* parameter.
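
A minimal check of that guarantee. With the default encoding the pieces are
bytestrings, so they are joined with ``b""``; with ``encoding="unicode"`` they
are ``str`` and are joined with ``""``:

```python
import xml.etree.ElementTree as ET

elem = ET.Element("a")
ET.SubElement(elem, "b").text = "hello"

# Default encoding: a list of bytestrings.
byte_parts = ET.tostringlist(elem)
joined_bytes = b"".join(byte_parts)

# encoding="unicode": a list of str.
text_parts = ET.tostringlist(elem, encoding="unicode")
joined_text = "".join(text_parts)

print(joined_bytes, joined_text)
```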


.. function:: XML(text, parser=None)

   Parses an XML section from a string constant. This function can be used to
   embed "XML literals" in Python code. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns an :class:`Element` instance.


.. function:: XMLID(text, parser=None)

   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element IDs to elements. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns a tuple containing an
   :class:`Element` instance and a dictionary.
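
A small sketch of :func:`XMLID`. The document and its ``id`` values are made
up; the dictionary is keyed by the value of each element's ``id`` attribute:

```python
import xml.etree.ElementTree as ET

# Made-up document whose elements carry id attributes.
root, ids = ET.XMLID("<doc><p id='intro'>Hi</p><p id='end'>Bye</p></doc>")

print(root.tag, sorted(ids), ids["intro"].text)
```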


.. _elementtree-element-objects:

Element Objects
^^^^^^^^^^^^^^^

.. class:: Element(tag, attrib={}, **extra)

   Element class. This class defines the Element interface, and provides a
   reference implementation of this interface.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *tag* is the element name. *attrib* is
   an optional dictionary, containing element attributes. *extra* contains
   additional attributes, given as keyword arguments.


   .. attribute:: tag

      A string identifying what kind of data this element represents (the
      element type, in other words).


   .. attribute:: text

      The *text* attribute can be used to hold additional data associated with
      the element. As the name implies this attribute is usually a string but
      may be any application-specific object. If the element is created from
      an XML file the attribute will contain any text found between the element
      tags.


   .. attribute:: tail

      The *tail* attribute can be used to hold additional data associated with
      the element. This attribute is usually a string but may be any
      application-specific object. If the element is created from an XML file
      the attribute will contain any text found after the element's end tag and
      before the next tag.


   .. attribute:: attrib

      A dictionary containing the element's attributes. Note that while the
      *attrib* value is always a real mutable Python dictionary, an ElementTree
      implementation may choose to use another internal representation, and
      create the dictionary only if someone asks for it. To take advantage of
      such implementations, use the dictionary methods below whenever possible.

   The following dictionary-like methods work on the element attributes.


   .. method:: clear()

      Resets an element. This function removes all subelements, clears all
      attributes, and sets the text and tail attributes to ``None``.


   .. method:: get(key, default=None)

      Gets the element attribute named *key*.

      Returns the attribute value, or *default* if the attribute was not found.


   .. method:: items()

      Returns the element attributes as a sequence of (name, value) pairs. The
      attributes are returned in an arbitrary order.


   .. method:: keys()

      Returns the element's attribute names as a list. The names are returned
      in an arbitrary order.


   .. method:: set(key, value)

      Set the attribute *key* on the element to *value*.

   The following methods work on the element's children (subelements).
609
610
611 .. method:: append(subelement)
612
Eli Bendersky396e8fc2012-03-23 14:24:20 +0200613 Adds the element *subelement* to the end of this element's internal list
614 of subelements. Raises :exc:`TypeError` if *subelement* is not an
615 :class:`Element`.
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000616
617
618 .. method:: extend(subelements)
Georg Brandl116aa622007-08-15 14:28:22 +0000619
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000620 Appends *subelements* from a sequence object with zero or more elements.
Eli Bendersky396e8fc2012-03-23 14:24:20 +0200621 Raises :exc:`TypeError` if a subelement is not an :class:`Element`.
Georg Brandl116aa622007-08-15 14:28:22 +0000622
Ezio Melottif8754a62010-03-21 07:16:43 +0000623 .. versionadded:: 3.2
Georg Brandl116aa622007-08-15 14:28:22 +0000624
Georg Brandl116aa622007-08-15 14:28:22 +0000625
Eli Bendersky737b1732012-05-29 06:02:56 +0300626 .. method:: find(match, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000627
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000628 Finds the first subelement matching *match*. *match* may be a tag name
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200629 or a :ref:`path <elementtree-xpath>`. Returns an element instance
Eli Bendersky737b1732012-05-29 06:02:56 +0300630 or ``None``. *namespaces* is an optional mapping from namespace prefix
631 to full name.
Georg Brandl116aa622007-08-15 14:28:22 +0000632
Georg Brandl116aa622007-08-15 14:28:22 +0000633
Eli Bendersky737b1732012-05-29 06:02:56 +0300634 .. method:: findall(match, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000635
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200636 Finds all matching subelements, by tag name or
637 :ref:`path <elementtree-xpath>`. Returns a list containing all matching
Eli Bendersky737b1732012-05-29 06:02:56 +0300638 elements in document order. *namespaces* is an optional mapping from
639 namespace prefix to full name.
Georg Brandl116aa622007-08-15 14:28:22 +0000640
Georg Brandl116aa622007-08-15 14:28:22 +0000641
Eli Bendersky737b1732012-05-29 06:02:56 +0300642 .. method:: findtext(match, default=None, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000643
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000644 Finds text for the first subelement matching *match*. *match* may be
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200645 a tag name or a :ref:`path <elementtree-xpath>`. Returns the text content
646 of the first matching element, or *default* if no element was found.
647 Note that if the matching element has no text content an empty string
Eli Bendersky737b1732012-05-29 06:02:56 +0300648 is returned. *namespaces* is an optional mapping from namespace prefix
649 to full name.
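
The three lookup methods differ mainly in what they return when nothing
matches. The sketch below assumes a small made-up document; the namespace URI
and the ``x`` prefix are illustrative only:

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the namespace URI and prefix are illustrative only.
doc = ET.fromstring(
    '<root xmlns:x="http://example.org/ns">'
    '<x:item>first</x:item><x:item>second</x:item><empty/>'
    '</root>'
)
ns = {"x": "http://example.org/ns"}     # prefix -> full namespace name

first = doc.find("x:item", ns)          # first match, or None
all_items = doc.findall("x:item", ns)   # list of matches in document order
missing = doc.find("absent")            # no match -> None

text_first = doc.findtext("x:item", namespaces=ns)
text_empty = doc.findtext("empty")                    # match without text -> ''
text_missing = doc.findtext("absent", default="n/a")  # no match -> default
```

The prefix in the *namespaces* mapping does not have to be the one used in the
document itself; only the full name it maps to is compared.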
Georg Brandl116aa622007-08-15 14:28:22 +0000650
Georg Brandl116aa622007-08-15 14:28:22 +0000651
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000652 .. method:: getchildren()
Georg Brandl116aa622007-08-15 14:28:22 +0000653
Georg Brandl67b21b72010-08-17 15:07:14 +0000654 .. deprecated:: 3.2
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000655 Use ``list(elem)`` or iteration.
Georg Brandl116aa622007-08-15 14:28:22 +0000656
Georg Brandl116aa622007-08-15 14:28:22 +0000657
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000658 .. method:: getiterator(tag=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000659
Georg Brandl67b21b72010-08-17 15:07:14 +0000660 .. deprecated:: 3.2
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000661 Use method :meth:`Element.iter` instead.
Georg Brandl116aa622007-08-15 14:28:22 +0000662
Georg Brandl116aa622007-08-15 14:28:22 +0000663
Eli Bendersky396e8fc2012-03-23 14:24:20 +0200664 .. method:: insert(index, subelement)
Georg Brandl116aa622007-08-15 14:28:22 +0000665
Eli Bendersky396e8fc2012-03-23 14:24:20 +0200666 Inserts *subelement* at the given position in this element. Raises
667 :exc:`TypeError` if *subelement* is not an :class:`Element`.
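
A short sketch of the three ways to add children described above, using
made-up tag names, together with the :exc:`TypeError` raised for a
non-element argument:

```python
import xml.etree.ElementTree as ET

root = ET.Element("root")
root.append(ET.Element("b"))                     # add one child at the end
root.extend([ET.Element("c"), ET.Element("d")])  # add several children
root.insert(0, ET.Element("a"))                  # add at a given position

order = [child.tag for child in root]

# Anything that is not an Element is rejected.
try:
    root.append("not an element")
    rejected = False
except TypeError:
    rejected = True
```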
Georg Brandl116aa622007-08-15 14:28:22 +0000668
Georg Brandl116aa622007-08-15 14:28:22 +0000669
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000670 .. method:: iter(tag=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000671
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000672 Creates a tree :term:`iterator` with the current element as the root.
673 The iterator iterates over this element and all elements below it, in
674 document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
675 elements whose tag equals *tag* are returned from the iterator. If the
676 tree structure is modified during iteration, the result is undefined.
Georg Brandl116aa622007-08-15 14:28:22 +0000677
Ezio Melotti138fc892011-10-10 00:02:03 +0300678 .. versionadded:: 3.2
679
Georg Brandl116aa622007-08-15 14:28:22 +0000680
Eli Bendersky737b1732012-05-29 06:02:56 +0300681 .. method:: iterfind(match, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000682
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200683 Finds all matching subelements, by tag name or
684 :ref:`path <elementtree-xpath>`. Returns an iterable yielding all
Eli Bendersky737b1732012-05-29 06:02:56 +0300685 matching elements in document order. *namespaces* is an optional mapping
686 from namespace prefix to full name.
687
Georg Brandl116aa622007-08-15 14:28:22 +0000688
Ezio Melottif8754a62010-03-21 07:16:43 +0000689 .. versionadded:: 3.2
Georg Brandl116aa622007-08-15 14:28:22 +0000690
Georg Brandl116aa622007-08-15 14:28:22 +0000691
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000692 .. method:: itertext()
Georg Brandl116aa622007-08-15 14:28:22 +0000693
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000694 Creates a text iterator. The iterator loops over this element and all
695 subelements, in document order, and returns all inner text.
Georg Brandl116aa622007-08-15 14:28:22 +0000696
Ezio Melottif8754a62010-03-21 07:16:43 +0000697 .. versionadded:: 3.2
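
To illustrate the difference between :meth:`iter` and :meth:`itertext`, this
sketch parses a paragraph similar to the one in the example document later in
this section and walks it both ways:

```python
import xml.etree.ElementTree as ET

p = ET.fromstring(
    '<p>Moved to <a href="http://example.org/">example.org</a>'
    ' or <a href="http://example.com/">example.com</a>.</p>'
)

# iter() yields the element itself first, then its descendants.
tags = [elem.tag for elem in p.iter()]

# Passing a tag restricts the iteration to matching elements.
hrefs = [a.get("href") for a in p.iter("a")]

# itertext() yields text and tail strings in document order.
text = "".join(p.itertext())
```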
Georg Brandl116aa622007-08-15 14:28:22 +0000698
699
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000700 .. method:: makeelement(tag, attrib)
Georg Brandl116aa622007-08-15 14:28:22 +0000701
      Creates a new element object of the same type as this element.  Do not
      call this method; use the :func:`SubElement` factory function instead.
Georg Brandl116aa622007-08-15 14:28:22 +0000704
705
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000706 .. method:: remove(subelement)
Georg Brandl116aa622007-08-15 14:28:22 +0000707
      Removes *subelement* from the element.  Unlike the find\* methods, this
      method compares elements based on instance identity, not on tag value
      or contents.
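
The identity rule can be seen in a small sketch with two children that look
the same. The :exc:`ValueError` for an element that is not a child is what
CPython raises in practice; it is not stated in the text above, so treat it as
an assumption:

```python
import xml.etree.ElementTree as ET

root = ET.Element("root")
first = ET.SubElement(root, "item")
second = ET.SubElement(root, "item")   # same tag, same (empty) contents

root.remove(second)                    # removes exactly this object
remaining_is_first = root[0] is first

# An element that merely looks the same is not a child of root.
lookalike = ET.Element("item")
try:
    root.remove(lookalike)
    raised = False
except ValueError:                     # assumption: observed CPython behavior
    raised = True
```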
Georg Brandl116aa622007-08-15 14:28:22 +0000711
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000712 :class:`Element` objects also support the following sequence type methods
713 for working with subelements: :meth:`__delitem__`, :meth:`__getitem__`,
714 :meth:`__setitem__`, :meth:`__len__`.
Georg Brandl116aa622007-08-15 14:28:22 +0000715
   Caution: Elements with no subelements will test as ``False``.  This behavior
   will change in future versions.  Use an explicit ``len(elem)`` or ``elem is
   None`` test instead. ::
Georg Brandl116aa622007-08-15 14:28:22 +0000719
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000720 element = root.find('foo')
Georg Brandl116aa622007-08-15 14:28:22 +0000721
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000722 if not element: # careful!
723 print("element not found, or element has no subelements")
Georg Brandl116aa622007-08-15 14:28:22 +0000724
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000725 if element is None:
726 print("element not found")
Georg Brandl116aa622007-08-15 14:28:22 +0000727
728
729.. _elementtree-elementtree-objects:
730
731ElementTree Objects
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200732^^^^^^^^^^^^^^^^^^^
Georg Brandl116aa622007-08-15 14:28:22 +0000733
734
Georg Brandl7f01a132009-09-16 15:58:14 +0000735.. class:: ElementTree(element=None, file=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000736
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000737 ElementTree wrapper class. This class represents an entire element
738 hierarchy, and adds some extra support for serialization to and from
739 standard XML.
Georg Brandl116aa622007-08-15 14:28:22 +0000740
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000741 *element* is the root element. The tree is initialized with the contents
742 of the XML *file* if given.
Georg Brandl116aa622007-08-15 14:28:22 +0000743
744
Benjamin Petersone41251e2008-04-25 01:59:09 +0000745 .. method:: _setroot(element)
Georg Brandl116aa622007-08-15 14:28:22 +0000746
Benjamin Petersone41251e2008-04-25 01:59:09 +0000747 Replaces the root element for this tree. This discards the current
748 contents of the tree, and replaces it with the given element. Use with
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000749 care. *element* is an element instance.
Georg Brandl116aa622007-08-15 14:28:22 +0000750
751
Eli Bendersky737b1732012-05-29 06:02:56 +0300752 .. method:: find(match, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000753
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200754 Same as :meth:`Element.find`, starting at the root of the tree.
Georg Brandl116aa622007-08-15 14:28:22 +0000755
756
Eli Bendersky737b1732012-05-29 06:02:56 +0300757 .. method:: findall(match, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000758
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200759 Same as :meth:`Element.findall`, starting at the root of the tree.
Georg Brandl116aa622007-08-15 14:28:22 +0000760
761
Eli Bendersky737b1732012-05-29 06:02:56 +0300762 .. method:: findtext(match, default=None, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000763
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200764 Same as :meth:`Element.findtext`, starting at the root of the tree.
Georg Brandl116aa622007-08-15 14:28:22 +0000765
766
Georg Brandl7f01a132009-09-16 15:58:14 +0000767 .. method:: getiterator(tag=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000768
Georg Brandl67b21b72010-08-17 15:07:14 +0000769 .. deprecated:: 3.2
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000770 Use method :meth:`ElementTree.iter` instead.
Georg Brandl116aa622007-08-15 14:28:22 +0000771
772
Benjamin Petersone41251e2008-04-25 01:59:09 +0000773 .. method:: getroot()
Florent Xiclunac17f1722010-08-08 19:48:29 +0000774
Benjamin Petersone41251e2008-04-25 01:59:09 +0000775 Returns the root element for this tree.
Georg Brandl116aa622007-08-15 14:28:22 +0000776
777
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000778 .. method:: iter(tag=None)
779
      Creates and returns a tree iterator for the root element.  The iterator
      loops over all elements in this tree, in document order.  *tag* is the tag
      to look for (default is to return all elements).
783
784
Eli Bendersky737b1732012-05-29 06:02:56 +0300785 .. method:: iterfind(match, namespaces=None)
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000786
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200787 Same as :meth:`Element.iterfind`, starting at the root of the tree.
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000788
Ezio Melottif8754a62010-03-21 07:16:43 +0000789 .. versionadded:: 3.2
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000790
791
Georg Brandl7f01a132009-09-16 15:58:14 +0000792 .. method:: parse(source, parser=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000793
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000794 Loads an external XML section into this element tree. *source* is a file
Antoine Pitrou11cb9612010-09-15 11:11:28 +0000795 name or :term:`file object`. *parser* is an optional parser instance.
Eli Bendersky52467b12012-06-01 07:13:08 +0300796 If not given, the standard :class:`XMLParser` parser is used. Returns the
797 section root element.
Georg Brandl116aa622007-08-15 14:28:22 +0000798
799
Eli Benderskyf96cf912012-07-15 06:19:44 +0300800 .. method:: write(file, encoding="us-ascii", xml_declaration=None, \
Serhiy Storchaka9e189f02013-01-13 22:24:27 +0200801 default_namespace=None, method="xml", *, \
Eli Benderskye9af8272013-01-13 06:27:51 -0800802 short_empty_elements=True)
Georg Brandl116aa622007-08-15 14:28:22 +0000803
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000804 Writes the element tree to a file, as XML. *file* is a file name, or a
Eli Benderskyf96cf912012-07-15 06:19:44 +0300805 :term:`file object` opened for writing. *encoding* [1]_ is the output
806 encoding (default is US-ASCII).
      *xml_declaration* controls whether an XML declaration is added to the
      file.  Use ``False`` for never, ``True`` for always, or ``None`` to add
      one only if the encoding is not US-ASCII, UTF-8 or Unicode (default is
      ``None``).
Serhiy Storchaka03530b92013-01-13 21:58:04 +0200810 *default_namespace* sets the default XML namespace (for "xmlns").
Eli Benderskyf96cf912012-07-15 06:19:44 +0300811 *method* is either ``"xml"``, ``"html"`` or ``"text"`` (default is
812 ``"xml"``).
      The keyword-only *short_empty_elements* parameter controls the formatting
      of elements that contain no content.  If ``True`` (the default), they are
      emitted as a single self-closed tag; otherwise they are emitted as a pair
      of start/end tags.
Eli Benderskyf96cf912012-07-15 06:19:44 +0300817
818 The output is either a string (:class:`str`) or binary (:class:`bytes`).
819 This is controlled by the *encoding* argument. If *encoding* is
820 ``"unicode"``, the output is a string; otherwise, it's binary. Note that
821 this may conflict with the type of *file* if it's an open
822 :term:`file object`; make sure you do not try to write a string to a
823 binary stream and vice versa.
824
Eli Benderskya9a2ef52013-01-13 06:04:43 -0800825 .. versionadded:: 3.4
826 The *short_empty_elements* parameter.
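
The interaction between *encoding* and the type of *file* can be sketched with
in-memory streams. The tree below is made up for illustration:
``encoding="unicode"`` needs a text stream, while any other encoding needs a
binary one.

```python
import io
import xml.etree.ElementTree as ET

root = ET.Element("root")
ET.SubElement(root, "empty")
tree = ET.ElementTree(root)

# Text output: pair encoding="unicode" with a text stream.
text_out = io.StringIO()
tree.write(text_out, encoding="unicode")
as_text = text_out.getvalue()

# Binary output: pair a real encoding with a binary stream.
bytes_out = io.BytesIO()
tree.write(bytes_out, encoding="utf-8", xml_declaration=True,
           short_empty_elements=False)
as_bytes = bytes_out.getvalue()
```

With ``short_empty_elements=False`` the empty element is written as a
start/end pair rather than a self-closed tag.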
827
Georg Brandl116aa622007-08-15 14:28:22 +0000828
Christian Heimesd8654cf2007-12-02 15:22:16 +0000829This is the XML file that is going to be manipulated::
830
831 <html>
832 <head>
833 <title>Example page</title>
834 </head>
835 <body>
Georg Brandl48310cd2009-01-03 21:18:54 +0000836 <p>Moved to <a href="http://example.org/">example.org</a>
Christian Heimesd8654cf2007-12-02 15:22:16 +0000837 or <a href="http://example.com/">example.com</a>.</p>
838 </body>
839 </html>
840
Example of changing the attribute "target" of every link in the first paragraph::
842
843 >>> from xml.etree.ElementTree import ElementTree
844 >>> tree = ElementTree()
845 >>> tree.parse("index.xhtml")
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000846 <Element 'html' at 0xb77e6fac>
Christian Heimesd8654cf2007-12-02 15:22:16 +0000847 >>> p = tree.find("body/p") # Finds first occurrence of tag p in body
848 >>> p
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000849 <Element 'p' at 0xb77ec26c>
850 >>> links = list(p.iter("a")) # Returns list of all links
Christian Heimesd8654cf2007-12-02 15:22:16 +0000851 >>> links
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000852 [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
   >>> for i in links:             # Iterates through all found links
   ...     i.attrib["target"] = "blank"
   ...
   >>> tree.write("output.xhtml")
Georg Brandl116aa622007-08-15 14:28:22 +0000856
857.. _elementtree-qname-objects:
858
859QName Objects
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200860^^^^^^^^^^^^^
Georg Brandl116aa622007-08-15 14:28:22 +0000861
862
Georg Brandl7f01a132009-09-16 15:58:14 +0000863.. class:: QName(text_or_uri, tag=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000864
   QName wrapper.  This can be used to wrap a QName attribute value, in order
   to get proper namespace handling on output.  *text_or_uri* is a string
   containing the QName value, in the form ``{uri}local``, or, if the *tag*
   argument is given, the URI part of a QName.  If *tag* is given, the first
   argument is interpreted as a URI, and *tag* is interpreted as a local name.
   :class:`QName` instances are opaque.
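
A brief sketch of the two constructor forms, using a made-up namespace URI.
Both forms describe the same qualified name, and an element created with a
:class:`QName` tag is serialized with a namespace declaration:

```python
import xml.etree.ElementTree as ET

uri = "http://example.org/ns"           # hypothetical namespace URI

combined = ET.QName("{http://example.org/ns}item")  # single {uri}local string
split = ET.QName(uri, "item")                       # URI and local name

same_name = str(combined) == str(split)

# A QName can be used as an element tag; the serializer picks a prefix.
elem = ET.Element(split)
xml = ET.tostring(elem, encoding="unicode")
```

The generated prefix is chosen by the serializer, so code should not depend on
its exact spelling.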
Georg Brandl116aa622007-08-15 14:28:22 +0000871
872
Antoine Pitrou5b235d02013-04-18 19:37:06 +0200873IncrementalParser Objects
874^^^^^^^^^^^^^^^^^^^^^^^^^
875
Antoine Pitrou5b235d02013-04-18 19:37:06 +0200876.. class:: IncrementalParser(events=None, parser=None)
877
878 An incremental, event-driven parser suitable for non-blocking applications.
Eli Benderskyfb625442013-05-19 09:09:24 -0700879 *events* is a sequence of events to report back. The supported events are
880 the strings ``"start"``, ``"end"``, ``"start-ns"`` and ``"end-ns"`` (the "ns"
Antoine Pitrou5b235d02013-04-18 19:37:06 +0200881 events are used to get detailed namespace information). If *events* is
882 omitted, only ``"end"`` events are reported. *parser* is an optional
883 parser instance. If not given, the standard :class:`XMLParser` parser is
Eli Benderskyc4216ab2013-08-03 18:55:10 -0700884 used. *parser* can only use the default :class:`TreeBuilder` as a target.
Antoine Pitrou5b235d02013-04-18 19:37:06 +0200885
886 .. method:: data_received(data)
887
888 Feed the given bytes data to the incremental parser.
889
890 .. method:: eof_received()
891
892 Signal the incremental parser that the data stream is terminated.
893
894 .. method:: events()
895
896 Iterate over the events which have been encountered in the data fed
897 to the parser. This method yields ``(event, elem)`` pairs, where
898 *event* is a string representing the type of event (e.g. ``"end"``)
Eli Bendersky3bdead12013-04-20 09:06:27 -0700899 and *elem* is the encountered :class:`Element` object. Events
900 provided in a previous call to :meth:`events` will not be yielded
901 again.
Antoine Pitrou5b235d02013-04-18 19:37:06 +0200902
903 .. note::
904
905 :class:`IncrementalParser` only guarantees that it has seen the ">"
906 character of a starting tag when it emits a "start" event, so the
907 attributes are defined, but the contents of the text and tail attributes
908 are undefined at that point. The same applies to the element children;
909 they may or may not be present.
910
911 If you need a fully populated element, look for "end" events instead.
912
913 .. versionadded:: 3.4
914
915
Georg Brandl116aa622007-08-15 14:28:22 +0000916.. _elementtree-treebuilder-objects:
917
918TreeBuilder Objects
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200919^^^^^^^^^^^^^^^^^^^
Georg Brandl116aa622007-08-15 14:28:22 +0000920
921
Georg Brandl7f01a132009-09-16 15:58:14 +0000922.. class:: TreeBuilder(element_factory=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000923
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000924 Generic element structure builder. This builder converts a sequence of
925 start, data, and end method calls to a well-formed element structure. You
926 can use this class to build an element structure using a custom XML parser,
Eli Bendersky48d358b2012-05-30 17:57:50 +0300927 or a parser for some other XML-like format. *element_factory*, when given,
928 must be a callable accepting two positional arguments: a tag and
929 a dict of attributes. It is expected to return a new element instance.
Georg Brandl116aa622007-08-15 14:28:22 +0000930
Benjamin Petersone41251e2008-04-25 01:59:09 +0000931 .. method:: close()
Georg Brandl116aa622007-08-15 14:28:22 +0000932
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000933 Flushes the builder buffers, and returns the toplevel document
934 element. Returns an :class:`Element` instance.
Georg Brandl116aa622007-08-15 14:28:22 +0000935
936
Benjamin Petersone41251e2008-04-25 01:59:09 +0000937 .. method:: data(data)
Georg Brandl116aa622007-08-15 14:28:22 +0000938
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000939 Adds text to the current element. *data* is a string. This should be
940 either a bytestring, or a Unicode string.
Georg Brandl116aa622007-08-15 14:28:22 +0000941
942
Benjamin Petersone41251e2008-04-25 01:59:09 +0000943 .. method:: end(tag)
Georg Brandl116aa622007-08-15 14:28:22 +0000944
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000945 Closes the current element. *tag* is the element name. Returns the
946 closed element.
Georg Brandl116aa622007-08-15 14:28:22 +0000947
948
Benjamin Petersone41251e2008-04-25 01:59:09 +0000949 .. method:: start(tag, attrs)
Georg Brandl116aa622007-08-15 14:28:22 +0000950
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000951 Opens a new element. *tag* is the element name. *attrs* is a dictionary
952 containing element attributes. Returns the opened element.
Georg Brandl116aa622007-08-15 14:28:22 +0000953
954
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000955 In addition, a custom :class:`TreeBuilder` object can provide the
956 following method:
Georg Brandl116aa622007-08-15 14:28:22 +0000957
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000958 .. method:: doctype(name, pubid, system)
959
960 Handles a doctype declaration. *name* is the doctype name. *pubid* is
961 the public identifier. *system* is the system identifier. This method
962 does not exist on the default :class:`TreeBuilder` class.
963
Ezio Melottif8754a62010-03-21 07:16:43 +0000964 .. versionadded:: 3.2
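
A minimal sketch of driving a :class:`TreeBuilder` by hand, as a custom parser
would. The tag names and text are made up for illustration:

```python
import xml.etree.ElementTree as ET

builder = ET.TreeBuilder()

builder.start("greeting", {"lang": "en"})  # open <greeting lang="en">
builder.data("Hello, ")                    # text of <greeting>
builder.start("b", {})                     # open nested <b>
builder.data("world")                      # text of <b>
builder.end("b")                           # close </b>
builder.data("!")                          # becomes the tail of <b>
builder.end("greeting")                    # close </greeting>

root = builder.close()                     # the toplevel Element
xml = ET.tostring(root, encoding="unicode")
```

Data delivered after an element has been closed is stored as that element's
tail, which is why the exclamation mark ends up after ``</b>``.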
Georg Brandl116aa622007-08-15 14:28:22 +0000965
966
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000967.. _elementtree-xmlparser-objects:
Georg Brandl116aa622007-08-15 14:28:22 +0000968
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000969XMLParser Objects
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200970^^^^^^^^^^^^^^^^^
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000971
972
973.. class:: XMLParser(html=0, target=None, encoding=None)
974
   :class:`Element` structure builder for XML source data, based on the expat
   parser.  *html* is a flag for predefined HTML entities; it is not supported
   by the current implementation.  *target* is the target object.  If omitted,
   the builder uses an instance of the standard :class:`TreeBuilder` class.
   *encoding* [1]_ is optional.  If given, the value overrides the encoding
   specified in the XML file.
Georg Brandl116aa622007-08-15 14:28:22 +0000981
982
Benjamin Petersone41251e2008-04-25 01:59:09 +0000983 .. method:: close()
Georg Brandl116aa622007-08-15 14:28:22 +0000984
      Finishes feeding data to the parser.  Returns the result of calling the
      ``close()`` method of the *target* passed during construction; by default,
      this is the toplevel document element.
Georg Brandl116aa622007-08-15 14:28:22 +0000988
989
Benjamin Petersone41251e2008-04-25 01:59:09 +0000990 .. method:: doctype(name, pubid, system)
Georg Brandl116aa622007-08-15 14:28:22 +0000991
Georg Brandl67b21b72010-08-17 15:07:14 +0000992 .. deprecated:: 3.2
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000993 Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
994 target.
Georg Brandl116aa622007-08-15 14:28:22 +0000995
996
Benjamin Petersone41251e2008-04-25 01:59:09 +0000997 .. method:: feed(data)
Georg Brandl116aa622007-08-15 14:28:22 +0000998
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000999 Feeds data to the parser. *data* is encoded data.
Georg Brandl116aa622007-08-15 14:28:22 +00001000
:meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag and its :meth:`end` method for each closing tag;
character data is passed to its :meth:`data` method.  :meth:`XMLParser.close`
calls *target*\'s :meth:`close` method.
:class:`XMLParser` can therefore be used for more than building a tree structure.
This is an example of counting the maximum depth of an XML file::
1007
Florent Xiclunaf15351d2010-03-13 23:24:31 +00001008 >>> from xml.etree.ElementTree import XMLParser
Christian Heimesd8654cf2007-12-02 15:22:16 +00001009 >>> class MaxDepth: # The target object of the parser
1010 ... maxDepth = 0
1011 ... depth = 0
1012 ... def start(self, tag, attrib): # Called for each opening tag.
Georg Brandl48310cd2009-01-03 21:18:54 +00001013 ... self.depth += 1
Christian Heimesd8654cf2007-12-02 15:22:16 +00001014 ... if self.depth > self.maxDepth:
1015 ... self.maxDepth = self.depth
1016 ... def end(self, tag): # Called for each closing tag.
1017 ... self.depth -= 1
Georg Brandl48310cd2009-01-03 21:18:54 +00001018 ... def data(self, data):
Christian Heimesd8654cf2007-12-02 15:22:16 +00001019 ... pass # We do not need to do anything with data.
1020 ... def close(self): # Called when all data has been parsed.
1021 ... return self.maxDepth
Georg Brandl48310cd2009-01-03 21:18:54 +00001022 ...
Christian Heimesd8654cf2007-12-02 15:22:16 +00001023 >>> target = MaxDepth()
Florent Xiclunaf15351d2010-03-13 23:24:31 +00001024 >>> parser = XMLParser(target=target)
Christian Heimesd8654cf2007-12-02 15:22:16 +00001025 >>> exampleXml = """
1026 ... <a>
1027 ... <b>
1028 ... </b>
1029 ... <b>
1030 ... <c>
1031 ... <d>
1032 ... </d>
1033 ... </c>
1034 ... </b>
1035 ... </a>"""
1036 >>> parser.feed(exampleXml)
1037 >>> parser.close()
1038 4
Christian Heimesb186d002008-03-18 15:15:01 +00001039
Eli Bendersky5b77d812012-03-16 08:20:05 +02001040Exceptions
Eli Bendersky3a4875e2012-03-26 20:43:32 +02001041^^^^^^^^^^
Eli Bendersky5b77d812012-03-16 08:20:05 +02001042
1043.. class:: ParseError
1044
1045 XML parse error, raised by the various parsing methods in this module when
1046 parsing fails. The string representation of an instance of this exception
1047 will contain a user-friendly error message. In addition, it will have
1048 the following attributes available:
1049
1050 .. attribute:: code
1051
1052 A numeric error code from the expat parser. See the documentation of
1053 :mod:`xml.parsers.expat` for the list of error codes and their meanings.
1054
1055 .. attribute:: position
1056
1057 A tuple of *line*, *column* numbers, specifying where the error occurred.
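
As a sketch of how these attributes might be used, the following parses a
deliberately malformed document (made up for illustration) and inspects the
resulting exception:

```python
import xml.etree.ElementTree as ET

# <b> is never closed, so the closing </a> on line 3 is a mismatch.
broken = "<a>\n  <b>\n</a>"

try:
    ET.fromstring(broken)
    error = None
except ET.ParseError as exc:
    error = exc
    line, column = exc.position   # where the parser gave up
    message = str(exc)            # user-friendly description
    code = exc.code               # numeric expat error code
```

The numeric code can be looked up in :mod:`xml.parsers.expat` when a program
needs to react to specific failures rather than just report them.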
Christian Heimesb186d002008-03-18 15:15:01 +00001058
1059.. rubric:: Footnotes
1060
1061.. [#] The encoding string included in XML output should conform to the
Florent Xiclunaf15351d2010-03-13 23:24:31 +00001062 appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
1063 not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
Benjamin Petersonad3d5c22009-02-26 03:38:59 +00001064 and http://www.iana.org/assignments/character-sets.