:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================

.. module:: xml.etree.ElementTree
   :synopsis: Implementation of the ElementTree API.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>

The :mod:`xml.etree.ElementTree` module implements a simple and efficient API
for parsing and creating XML data.

.. versionchanged:: 3.3
   This module will use a fast implementation whenever available.
   The :mod:`xml.etree.cElementTree` module is deprecated.


.. warning::

   The :mod:`xml.etree.ElementTree` module is not secure against
   maliciously constructed data. If you need to parse untrusted or
   unauthenticated data see :ref:`xml-vulnerabilities`.

Tutorial
--------

This is a short tutorial for using :mod:`xml.etree.ElementTree` (``ET`` in
short). The goal is to demonstrate some of the building blocks and basic
concepts of the module.

XML tree and elements
^^^^^^^^^^^^^^^^^^^^^

XML is an inherently hierarchical data format, and the most natural way to
represent it is with a tree. ``ET`` has two classes for this purpose:
:class:`ElementTree` represents the whole XML document as a tree, and
:class:`Element` represents a single node in this tree. Interactions with
the whole document (reading and writing to/from files) are usually done
on the :class:`ElementTree` level. Interactions with a single XML element
and its sub-elements are done on the :class:`Element` level.

.. _elementtree-parsing-xml:

Parsing XML
^^^^^^^^^^^

We'll be using the following XML document as the sample data for this section:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank>1</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank>4</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank>68</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

We can import this data by reading from a file::

   import xml.etree.ElementTree as ET
   tree = ET.parse('country_data.xml')
   root = tree.getroot()

Or directly from a string::

   root = ET.fromstring(country_data_as_string)

:func:`fromstring` parses XML from a string directly into an :class:`Element`,
which is the root element of the parsed tree. Other parsing functions, such as
:func:`parse`, create an :class:`ElementTree` instead. Check each function's
documentation to be sure.

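The difference between the two return types can be checked directly. The
following is a minimal sketch, not part of the original tutorial: it uses a
made-up two-country excerpt of the sample data and an in-memory
``io.StringIO`` object in place of ``country_data.xml``.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical excerpt of the sample document, kept short for illustration.
xml_text = '<data><country name="Liechtenstein"/><country name="Singapore"/></data>'

# parse() accepts a filename or a file object and returns an ElementTree.
tree = ET.parse(io.StringIO(xml_text))
root_from_file = tree.getroot()

# fromstring() skips the ElementTree wrapper and returns the root Element.
root_from_string = ET.fromstring(xml_text)

print(type(tree).__name__)              # ElementTree
print(type(root_from_string).__name__)  # Element
```

Either way we end up with the same root element; only :func:`parse` gives us
the :class:`ElementTree` wrapper that is needed for writing back to a file.
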
As an :class:`Element`, ``root`` has a tag and a dictionary of attributes::

   >>> root.tag
   'data'
   >>> root.attrib
   {}

It also has children nodes over which we can iterate::

   >>> for child in root:
   ...     print(child.tag, child.attrib)
   ...
   country {'name': 'Liechtenstein'}
   country {'name': 'Singapore'}
   country {'name': 'Panama'}

Children are nested, and we can access specific child nodes by index::

   >>> root[0][1].text
   '2008'

Pull API for non-blocking parsing
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Most parsing functions provided by this module require the whole document to
be read at once before returning any result. It is possible to use an
:class:`XMLParser` and feed data into it incrementally, but it's a push API that
calls methods on a callback target, which is too low-level and inconvenient for
most needs. Sometimes what the user really wants is to be able to parse XML
incrementally, without blocking operations, while enjoying the convenience of
fully constructed :class:`Element` objects.

The most powerful tool for doing this is :class:`XMLPullParser`. It does not
require a blocking read to obtain the XML data, and is instead fed with data
incrementally with :meth:`XMLPullParser.feed` calls. To get the parsed XML
elements, call :meth:`XMLPullParser.read_events`. Here's an example::

   >>> parser = ET.XMLPullParser(['start', 'end'])
   >>> parser.feed('<mytag>sometext')
   >>> list(parser.read_events())
   [('start', <Element 'mytag' at 0x7fa66db2be58>)]
   >>> parser.feed(' more text</mytag>')
   >>> for event, elem in parser.read_events():
   ...     print(event)
   ...     print(elem.tag, 'text=', elem.text)
   ...
   end
   mytag text= sometext more text

The obvious use case is applications that operate in a non-blocking fashion
where the XML data is being received from a socket or read incrementally from
some storage device. In such cases, blocking reads are unacceptable.

Because it's so flexible, :class:`XMLPullParser` can be inconvenient to use for
simpler use-cases. If you don't mind your application blocking on reading XML
data but would still like to have incremental parsing capabilities, take a look
at :func:`iterparse`. It can be useful when you're reading a large XML document
and don't want to hold it wholly in memory.

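As a rough sketch of the :func:`iterparse` approach mentioned above, the
following loop collects one value per ``country`` element and then clears the
element so that its children can be released. The document content is made up,
and an ``io.BytesIO`` object stands in for a large file on disk.

```python
import io
import xml.etree.ElementTree as ET

# Stand-in for a large file; the content is invented for this example.
source = io.BytesIO(
    b"<data>"
    b"<country name='Liechtenstein'><rank>1</rank></country>"
    b"<country name='Singapore'><rank>4</rank></country>"
    b"</data>"
)

ranks = {}
# With the default events, only 'end' events are reported, so every
# element is fully populated by the time the loop sees it.
for event, elem in ET.iterparse(source):
    if elem.tag == 'country':
        ranks[elem.get('name')] = int(elem.find('rank').text)
        elem.clear()  # discard children and attributes we no longer need

print(ranks)
```

Whether clearing elements is worthwhile depends on the size of the document;
for small inputs a plain :func:`parse` call is simpler.
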
Finding interesting elements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:class:`Element` has some useful methods that help iterate recursively over all
the sub-tree below it (its children, their children, and so on). For example,
:meth:`Element.iter`::

   >>> for neighbor in root.iter('neighbor'):
   ...     print(neighbor.attrib)
   ...
   {'name': 'Austria', 'direction': 'E'}
   {'name': 'Switzerland', 'direction': 'W'}
   {'name': 'Malaysia', 'direction': 'N'}
   {'name': 'Costa Rica', 'direction': 'W'}
   {'name': 'Colombia', 'direction': 'E'}

:meth:`Element.findall` finds only elements with a tag which are direct
children of the current element. :meth:`Element.find` finds the *first* child
with a particular tag, and :attr:`Element.text` accesses the element's text
content. :meth:`Element.get` accesses the element's attributes::

   >>> for country in root.findall('country'):
   ...     rank = country.find('rank').text
   ...     name = country.get('name')
   ...     print(name, rank)
   ...
   Liechtenstein 1
   Singapore 4
   Panama 68

More sophisticated specification of which elements to look for is possible by
using :ref:`XPath <elementtree-xpath>`.

Modifying an XML File
^^^^^^^^^^^^^^^^^^^^^

:class:`ElementTree` provides a simple way to build XML documents and write
them to files. The :meth:`ElementTree.write` method serves this purpose.

Once created, an :class:`Element` object may be manipulated by directly changing
its fields (such as :attr:`Element.text`), adding and modifying attributes
(:meth:`Element.set` method), as well as adding new children (for example
with :meth:`Element.append`).

Let's say we want to add one to each country's rank, and add an ``updated``
attribute to the rank element::

   >>> for rank in root.iter('rank'):
   ...     new_rank = int(rank.text) + 1
   ...     rank.text = str(new_rank)
   ...     rank.set('updated', 'yes')
   ...
   >>> tree.write('output.xml')

Our XML now looks like this:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank updated="yes">69</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

We can remove elements using :meth:`Element.remove`. Let's say we want to
remove all countries with a rank higher than 50::

   >>> for country in root.findall('country'):
   ...     rank = int(country.find('rank').text)
   ...     if rank > 50:
   ...         root.remove(country)
   ...
   >>> tree.write('output.xml')

Our XML now looks like this:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
   </data>

Building XML documents
^^^^^^^^^^^^^^^^^^^^^^

The :func:`SubElement` function provides a convenient way to create new
sub-elements for a given element::

   >>> a = ET.Element('a')
   >>> b = ET.SubElement(a, 'b')
   >>> c = ET.SubElement(a, 'c')
   >>> d = ET.SubElement(c, 'd')
   >>> ET.dump(a)
   <a><b /><c><d /></c></a>

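A slightly larger sketch, assuming we want attributes and text as well as
nested tags, might look like this. The element names and values are invented
for illustration; attributes are passed both as a dictionary and as keyword
arguments to show that the two can be combined.

```python
import xml.etree.ElementTree as ET

# Build a small tree from scratch; names and values are illustrative only.
root = ET.Element('data')
country = ET.SubElement(root, 'country', name='Liechtenstein')
rank = ET.SubElement(country, 'rank')
rank.text = '1'
# Attributes from the dictionary and from keyword arguments are merged.
ET.SubElement(country, 'neighbor', {'name': 'Austria'}, direction='E')

# Serialize to a str rather than to bytes.
xml_text = ET.tostring(root, encoding='unicode')
print(xml_text)
```

The order in which attributes are serialized can differ between Python
versions, so code that compares serialized output should not rely on it.
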
Additional resources
^^^^^^^^^^^^^^^^^^^^

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs.


.. _elementtree-xpath:

XPath support
-------------

This module provides limited support for
`XPath expressions <http://www.w3.org/TR/xpath>`_ for locating elements in a
tree. The goal is to support a small subset of the abbreviated syntax; a full
XPath engine is outside the scope of the module.

Example
^^^^^^^

Here's an example that demonstrates some of the XPath capabilities of the
module. We'll be using the ``countrydata`` XML document from the
:ref:`Parsing XML <elementtree-parsing-xml>` section::

   import xml.etree.ElementTree as ET

   root = ET.fromstring(countrydata)

   # Top-level elements
   root.findall(".")

   # All 'neighbor' grand-children of 'country' children of the top-level
   # elements
   root.findall("./country/neighbor")

   # Nodes with name='Singapore' that have a 'year' child
   root.findall(".//year/..[@name='Singapore']")

   # 'year' nodes that are children of nodes with name='Singapore'
   root.findall(".//*[@name='Singapore']/year")

   # All 'neighbor' nodes that are the second child of their parent
   root.findall(".//neighbor[2]")

Supported XPath syntax
^^^^^^^^^^^^^^^^^^^^^^

.. tabularcolumns:: |l|L|

+-----------------------+------------------------------------------------------+
| Syntax                | Meaning                                              |
+=======================+======================================================+
| ``tag``               | Selects all child elements with the given tag.       |
|                       | For example, ``spam`` selects all child elements     |
|                       | named ``spam``, ``spam/egg`` selects all             |
|                       | grandchildren named ``egg`` in all children named    |
|                       | ``spam``.                                            |
+-----------------------+------------------------------------------------------+
| ``*``                 | Selects all child elements. For example, ``*/egg``   |
|                       | selects all grandchildren named ``egg``.             |
+-----------------------+------------------------------------------------------+
| ``.``                 | Selects the current node. This is mostly useful      |
|                       | at the beginning of the path, to indicate that it's  |
|                       | a relative path.                                     |
+-----------------------+------------------------------------------------------+
| ``//``                | Selects all subelements, on all levels beneath the   |
|                       | current element. For example, ``.//egg`` selects     |
|                       | all ``egg`` elements in the entire tree.             |
+-----------------------+------------------------------------------------------+
| ``..``                | Selects the parent element. Returns ``None`` if the  |
|                       | path attempts to reach the ancestors of the start    |
|                       | element (the element ``find`` was called on).        |
+-----------------------+------------------------------------------------------+
| ``[@attrib]``         | Selects all elements that have the given attribute.  |
+-----------------------+------------------------------------------------------+
| ``[@attrib='value']`` | Selects all elements for which the given attribute   |
|                       | has the given value. The value cannot contain        |
|                       | quotes.                                              |
+-----------------------+------------------------------------------------------+
| ``[tag]``             | Selects all elements that have a child named         |
|                       | ``tag``. Only immediate children are supported.      |
+-----------------------+------------------------------------------------------+
| ``[position]``        | Selects all elements that are located at the given   |
|                       | position. The position can be either an integer      |
|                       | (1 is the first position), the expression ``last()`` |
|                       | (for the last position), or a position relative to   |
|                       | the last position (e.g. ``last()-1``).               |
+-----------------------+------------------------------------------------------+

Predicates (expressions within square brackets) must be preceded by a tag
name, an asterisk, or another predicate. ``position`` predicates must be
preceded by a tag name.

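To make the predicate forms concrete, here is a small self-contained sketch.
It uses a shortened, hypothetical version of the country data rather than the
full sample document, and exercises the ``[@attrib='value']``, ``[tag]`` and
``[position]`` predicates, each preceded by a tag name.

```python
import xml.etree.ElementTree as ET

# A shortened, hypothetical version of the sample document.
root = ET.fromstring("""
<data>
    <country name="Liechtenstein">
        <year>2008</year>
        <neighbor name="Austria"/>
        <neighbor name="Switzerland"/>
    </country>
    <country name="Singapore">
        <year>2011</year>
        <neighbor name="Malaysia"/>
    </country>
</data>
""")

# [@attrib='value']: the country whose name attribute is 'Singapore'
singapore = root.find("country[@name='Singapore']")

# [tag]: countries that have at least one 'neighbor' child
with_neighbors = [c.get('name') for c in root.findall("country[neighbor]")]

# [position]: the second, and the last, 'neighbor' under each parent
second = [n.get('name') for n in root.findall(".//neighbor[2]")]
last = [n.get('name') for n in root.findall(".//neighbor[last()]")]

print(singapore.find('year').text, with_neighbors, second, last)
```

Note that positions are counted per parent: Singapore has only one
``neighbor``, so it contributes nothing to ``second`` but does contribute to
``last``.
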
Reference
---------

.. _elementtree-functions:

Functions
^^^^^^^^^


.. function:: Comment(text=None)

   Comment element factory. This factory function creates a special element
   that will be serialized as an XML comment by the standard serializer. The
   comment string can be either a bytestring or a Unicode string. *text* is a
   string containing the comment string. Returns an element instance
   representing a comment.


.. function:: dump(elem)

   Writes an element tree or element structure to ``sys.stdout``. This function
   should be used for debugging only.

   The exact output format is implementation dependent. In this version, it's
   written as an ordinary XML file.

   *elem* is an element tree or an individual element.


.. function:: fromstring(text)

   Parses an XML section from a string constant. Same as :func:`XML`. *text*
   is a string containing XML data. Returns an :class:`Element` instance.


.. function:: fromstringlist(sequence, parser=None)

   Parses an XML document from a sequence of string fragments. *sequence* is a
   list or other sequence containing XML data fragments. *parser* is an
   optional parser instance. If not given, the standard :class:`XMLParser`
   parser is used. Returns an :class:`Element` instance.

   .. versionadded:: 3.2


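As an illustration of :func:`fromstringlist`, the sketch below parses a
document that has been split into fragments at arbitrary points, including in
the middle of a tag. The fragments themselves are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments, split at arbitrary points (even inside a tag name).
chunks = ['<data><coun', 'try name="Panama">', '<rank>68</rank>', '</country></data>']

root = ET.fromstringlist(chunks)

# Parsing the joined text with fromstring() yields an equivalent tree.
same = ET.fromstring(''.join(chunks))
print(root[0].get('name'), root[0].find('rank').text)
```

This is mainly a convenience when the data is already available in pieces;
the fragments do not have to be joined into one string first.
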
.. function:: iselement(element)

   Checks if an object appears to be a valid element object. *element* is an
   element instance. Returns a true value if this is an element object.


.. function:: iterparse(source, events=None, parser=None)

   Parses an XML section into an element tree incrementally, and reports what's
   going on to the user. *source* is a filename or :term:`file object`
   containing XML data. *events* is a sequence of events to report back. The
   supported events are the strings ``"start"``, ``"end"``, ``"start-ns"`` and
   ``"end-ns"`` (the "ns" events are used to get detailed namespace
   information). If *events* is omitted, only ``"end"`` events are reported.
   *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. *parser* must be a subclass of
   :class:`XMLParser` and can only use the default :class:`TreeBuilder` as a
   target. Returns an :term:`iterator` providing ``(event, elem)`` pairs.

   Note that while :func:`iterparse` builds the tree incrementally, it issues
   blocking reads on *source* (or the file it names). As such, it's unsuitable
   for applications where blocking reads can't be made. For fully non-blocking
   parsing, see :class:`XMLPullParser`.

   .. note::

      :func:`iterparse` only guarantees that it has seen the ">" character of a
      starting tag when it emits a "start" event, so the attributes are defined,
      but the contents of the text and tail attributes are undefined at that
      point. The same applies to the element children; they may or may not be
      present.

      If you need a fully populated element, look for "end" events instead.

   .. deprecated:: 3.4
      The *parser* argument.

.. function:: parse(source, parser=None)

   Parses an XML section into an element tree. *source* is a filename or file
   object containing XML data. *parser* is an optional parser instance. If
   not given, the standard :class:`XMLParser` parser is used. Returns an
   :class:`ElementTree` instance.


.. function:: ProcessingInstruction(target, text=None)

   PI element factory. This factory function creates a special element that
   will be serialized as an XML processing instruction. *target* is a string
   containing the PI target. *text* is a string containing the PI contents, if
   given. Returns an element instance, representing a processing instruction.


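The two factory functions :func:`Comment` and :func:`ProcessingInstruction`
can be sketched together. In the example below the comment text, the PI
target ``app`` and its contents are all invented; the special elements are
appended like any other child and show up in the serialized output.

```python
import xml.etree.ElementTree as ET

root = ET.Element('data')
# Both factories return special elements that can be appended like
# ordinary children; the texts used here are illustrative only.
root.append(ET.Comment('generated for illustration'))
root.append(ET.ProcessingInstruction('app', 'mode="demo"'))
ET.SubElement(root, 'country', name='Panama')

xml_text = ET.tostring(root, encoding='unicode')
print(xml_text)
```

In the tree, such elements can be recognized by their ``tag``, which is the
factory function itself (for example ``root[0].tag is ET.Comment``) rather
than a string.
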
.. function:: register_namespace(prefix, uri)

   Registers a namespace prefix. The registry is global, and any existing
   mapping for either the given prefix or the namespace URI will be removed.
   *prefix* is a namespace prefix. *uri* is a namespace uri. Tags and
   attributes in this namespace will be serialized with the given prefix, if at
   all possible.

   .. versionadded:: 3.2


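A minimal sketch of :func:`register_namespace` follows. The prefix ``geo``
and the URI are made up for the example; tags in a namespace are written in
the ``{uri}local`` form when creating elements.

```python
import xml.etree.ElementTree as ET

# The prefix and URI below are hypothetical. The registry is global, so
# this call affects every later serialization in the process.
ET.register_namespace('geo', 'http://example.com/geo')

root = ET.Element('{http://example.com/geo}country')
xml_text = ET.tostring(root, encoding='unicode')
print(xml_text)  # <geo:country xmlns:geo="http://example.com/geo" />
```

Without the registration the serializer would still produce valid XML, but
with an automatically generated prefix instead of ``geo``.
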
.. function:: SubElement(parent, tag, attrib={}, **extra)

   Subelement factory. This function creates an element instance, and appends
   it to an existing element.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *parent* is the parent element. *tag* is
   the subelement name. *attrib* is an optional dictionary, containing element
   attributes. *extra* contains additional attributes, given as keyword
   arguments. Returns an element instance.


.. function:: tostring(element, encoding="us-ascii", method="xml", *, \
                       short_empty_elements=True)

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string (otherwise, a bytestring is generated). *method*
   is either ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``).
   *short_empty_elements* has the same meaning as in :meth:`ElementTree.write`.
   Returns an (optionally) encoded string containing the XML data.

   .. versionadded:: 3.4
      The *short_empty_elements* parameter.


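The effect of the *encoding* argument of :func:`tostring` is easiest to see
with a non-ASCII character. The following sketch uses an invented ``city``
element; it compares the default US-ASCII output, the ``"unicode"`` output
and UTF-8 output.

```python
import xml.etree.ElementTree as ET

# An invented element with a non-ASCII attribute value.
elem = ET.Element('city', name='Zürich')

as_bytes = ET.tostring(elem)                     # default encoding: US-ASCII
as_text = ET.tostring(elem, encoding='unicode')  # a str, nothing is encoded
as_utf8 = ET.tostring(elem, encoding='utf-8')    # bytes, encoded as UTF-8

print(type(as_bytes).__name__, type(as_text).__name__)  # bytes str
print(as_bytes)
```

With the default encoding, characters outside US-ASCII are written as numeric
character references, so the output remains valid XML in any ASCII-compatible
encoding.
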
.. function:: tostringlist(element, encoding="us-ascii", method="xml", *, \
                           short_empty_elements=True)

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string (otherwise, a bytestring is generated). *method*
   is either ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``).
   *short_empty_elements* has the same meaning as in :meth:`ElementTree.write`.
   Returns a list of (optionally) encoded strings containing the XML data.
   It does not guarantee any specific sequence, except that
   ``b"".join(tostringlist(element)) == tostring(element)``.

   .. versionadded:: 3.2

   .. versionadded:: 3.4
      The *short_empty_elements* parameter.


.. function:: XML(text, parser=None)

   Parses an XML section from a string constant. This function can be used to
   embed "XML literals" in Python code. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns an :class:`Element` instance.


.. function:: XMLID(text, parser=None)

   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element ids to elements. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns a tuple containing an
   :class:`Element` instance and a dictionary.


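A short sketch of :func:`XMLID`, assuming a made-up document in which some
elements carry an ``id`` attribute and one does not:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two elements have an id attribute, one has none.
root, ids = ET.XMLID('<data><country id="li"/><country id="sg"/><rank/></data>')

# The dictionary maps each id value to the element that carries it.
print(sorted(ids))
print(ids['sg'] is root[1])
```

Elements without an ``id`` attribute simply do not appear in the dictionary.
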
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000536.. _elementtree-element-objects:
Georg Brandl116aa622007-08-15 14:28:22 +0000537
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000538Element Objects
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200539^^^^^^^^^^^^^^^
Georg Brandl116aa622007-08-15 14:28:22 +0000540
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000541.. class:: Element(tag, attrib={}, **extra)
Georg Brandl116aa622007-08-15 14:28:22 +0000542
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000543 Element class. This class defines the Element interface, and provides a
544 reference implementation of this interface.
Georg Brandl116aa622007-08-15 14:28:22 +0000545
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000546 The element name, attribute names, and attribute values can be either
547 bytestrings or Unicode strings. *tag* is the element name. *attrib* is
548 an optional dictionary, containing element attributes. *extra* contains
549 additional attributes, given as keyword arguments.
Georg Brandl116aa622007-08-15 14:28:22 +0000550
551
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000552 .. attribute:: tag
Georg Brandl116aa622007-08-15 14:28:22 +0000553
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000554 A string identifying what kind of data this element represents (the
555 element type, in other words).
Georg Brandl116aa622007-08-15 14:28:22 +0000556
557
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000558 .. attribute:: text
Georg Brandl116aa622007-08-15 14:28:22 +0000559
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000560 The *text* attribute can be used to hold additional data associated with
561 the element. As the name implies this attribute is usually a string but
562 may be any application-specific object. If the element is created from
563 an XML file the attribute will contain any text found between the element
564 tags.
Georg Brandl116aa622007-08-15 14:28:22 +0000565
566
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000567 .. attribute:: tail
Georg Brandl116aa622007-08-15 14:28:22 +0000568
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000569 The *tail* attribute can be used to hold additional data associated with
570 the element. This attribute is usually a string but may be any
571 application-specific object. If the element is created from an XML file
572 the attribute will contain any text found after the element's end tag and
573 before the next tag.
Georg Brandl116aa622007-08-15 14:28:22 +0000574
Georg Brandl116aa622007-08-15 14:28:22 +0000575
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000576 .. attribute:: attrib
Georg Brandl116aa622007-08-15 14:28:22 +0000577
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000578 A dictionary containing the element's attributes. Note that while the
579 *attrib* value is always a real mutable Python dictionary, an ElementTree
580 implementation may choose to use another internal representation, and
581 create the dictionary only if someone asks for it. To take advantage of
582 such implementations, use the dictionary methods below whenever possible.
Georg Brandl116aa622007-08-15 14:28:22 +0000583
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000584 The following dictionary-like methods work on the element attributes.
Georg Brandl116aa622007-08-15 14:28:22 +0000585
586
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000587 .. method:: clear()
Georg Brandl116aa622007-08-15 14:28:22 +0000588
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000589 Resets an element. This function removes all subelements, clears all
Eli Bendersky323a43a2012-10-09 06:46:33 -0700590 attributes, and sets the text and tail attributes to ``None``.
Georg Brandl116aa622007-08-15 14:28:22 +0000591
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000592
593 .. method:: get(key, default=None)
594
595 Gets the element attribute named *key*.
596
597 Returns the attribute value, or *default* if the attribute was not found.
598
599
600 .. method:: items()
601
602 Returns the element attributes as a sequence of (name, value) pairs. The
603 attributes are returned in an arbitrary order.
604
605
606 .. method:: keys()
607
      Returns the element's attribute names as a list.  The names are returned
      in an arbitrary order.
610
611
612 .. method:: set(key, value)
613
614 Set the attribute *key* on the element to *value*.
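A minimal sketch of the dictionary-like methods described above. The ``a``
element and its attribute values are made up for illustration; the results are
sorted because the order of names is not guaranteed by the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical element used only for illustration.
link = ET.Element("a", {"href": "http://example.org/"})

link.set("target", "blank")          # add or overwrite an attribute
href = link.get("href")              # value of an existing attribute
missing = link.get("rel", "none")    # default for a missing attribute
names = sorted(link.keys())          # sorted, since order is arbitrary
pairs = sorted(link.items())         # (name, value) pairs
```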
615
616 The following methods work on the element's children (subelements).
617
618
619 .. method:: append(subelement)
620
Eli Bendersky396e8fc2012-03-23 14:24:20 +0200621 Adds the element *subelement* to the end of this element's internal list
622 of subelements. Raises :exc:`TypeError` if *subelement* is not an
623 :class:`Element`.
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000624
625
626 .. method:: extend(subelements)
Georg Brandl116aa622007-08-15 14:28:22 +0000627
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000628 Appends *subelements* from a sequence object with zero or more elements.
Eli Bendersky396e8fc2012-03-23 14:24:20 +0200629 Raises :exc:`TypeError` if a subelement is not an :class:`Element`.
Georg Brandl116aa622007-08-15 14:28:22 +0000630
Ezio Melottif8754a62010-03-21 07:16:43 +0000631 .. versionadded:: 3.2
Georg Brandl116aa622007-08-15 14:28:22 +0000632
Georg Brandl116aa622007-08-15 14:28:22 +0000633
Eli Bendersky737b1732012-05-29 06:02:56 +0300634 .. method:: find(match, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000635
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000636 Finds the first subelement matching *match*. *match* may be a tag name
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200637 or a :ref:`path <elementtree-xpath>`. Returns an element instance
Eli Bendersky737b1732012-05-29 06:02:56 +0300638 or ``None``. *namespaces* is an optional mapping from namespace prefix
639 to full name.
Georg Brandl116aa622007-08-15 14:28:22 +0000640
Georg Brandl116aa622007-08-15 14:28:22 +0000641
Eli Bendersky737b1732012-05-29 06:02:56 +0300642 .. method:: findall(match, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000643
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200644 Finds all matching subelements, by tag name or
645 :ref:`path <elementtree-xpath>`. Returns a list containing all matching
Eli Bendersky737b1732012-05-29 06:02:56 +0300646 elements in document order. *namespaces* is an optional mapping from
647 namespace prefix to full name.
Georg Brandl116aa622007-08-15 14:28:22 +0000648
Georg Brandl116aa622007-08-15 14:28:22 +0000649
Eli Bendersky737b1732012-05-29 06:02:56 +0300650 .. method:: findtext(match, default=None, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000651
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000652 Finds text for the first subelement matching *match*. *match* may be
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200653 a tag name or a :ref:`path <elementtree-xpath>`. Returns the text content
654 of the first matching element, or *default* if no element was found.
655 Note that if the matching element has no text content an empty string
Eli Bendersky737b1732012-05-29 06:02:56 +0300656 is returned. *namespaces* is an optional mapping from namespace prefix
657 to full name.
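The following sketch contrasts :meth:`find`, :meth:`findall` and
:meth:`findtext` on a small document. The document, the ``x`` prefix and the
namespace URI are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<root xmlns:x='http://example.com/ns'>"
    "<item>first</item><item>second</item><empty/>"
    "<x:item>namespaced</x:item>"
    "</root>"
)
ns = {"x": "http://example.com/ns"}   # prefix -> full namespace name

first = doc.find("item")                            # first match, or None
all_items = doc.findall("item")                     # list, document order
text = doc.findtext("item")                         # text of the first match
empty_text = doc.findtext("empty")                  # element without text
fallback = doc.findtext("missing", default="n/a")   # no such element
namespaced = doc.findtext("x:item", namespaces=ns)
not_found = doc.find("missing")
```

Note the difference between ``empty_text`` (an empty string, because the
element exists but has no text) and ``fallback`` (the *default*, because no
element matched).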
Georg Brandl116aa622007-08-15 14:28:22 +0000658
Georg Brandl116aa622007-08-15 14:28:22 +0000659
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000660 .. method:: getchildren()
Georg Brandl116aa622007-08-15 14:28:22 +0000661
Georg Brandl67b21b72010-08-17 15:07:14 +0000662 .. deprecated:: 3.2
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000663 Use ``list(elem)`` or iteration.
Georg Brandl116aa622007-08-15 14:28:22 +0000664
Georg Brandl116aa622007-08-15 14:28:22 +0000665
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000666 .. method:: getiterator(tag=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000667
Georg Brandl67b21b72010-08-17 15:07:14 +0000668 .. deprecated:: 3.2
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000669 Use method :meth:`Element.iter` instead.
Georg Brandl116aa622007-08-15 14:28:22 +0000670
Georg Brandl116aa622007-08-15 14:28:22 +0000671
Eli Bendersky396e8fc2012-03-23 14:24:20 +0200672 .. method:: insert(index, subelement)
Georg Brandl116aa622007-08-15 14:28:22 +0000673
Eli Bendersky396e8fc2012-03-23 14:24:20 +0200674 Inserts *subelement* at the given position in this element. Raises
675 :exc:`TypeError` if *subelement* is not an :class:`Element`.
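As an illustration, the three methods for adding children (:meth:`append`,
:meth:`extend` and :meth:`insert`) can be combined as follows. The tag names
are arbitrary.

```python
import xml.etree.ElementTree as ET

root = ET.Element("list")
root.append(ET.Element("b"))                       # add at the end
root.extend([ET.Element("c"), ET.Element("d")])    # add several at once
root.insert(0, ET.Element("a"))                    # add at a given position
tags = [child.tag for child in root]

# Anything that is not an Element is rejected.
try:
    root.append("not an element")
except TypeError:
    rejected = True
else:
    rejected = False
```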
Georg Brandl116aa622007-08-15 14:28:22 +0000676
Georg Brandl116aa622007-08-15 14:28:22 +0000677
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000678 .. method:: iter(tag=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000679
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000680 Creates a tree :term:`iterator` with the current element as the root.
681 The iterator iterates over this element and all elements below it, in
682 document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
683 elements whose tag equals *tag* are returned from the iterator. If the
684 tree structure is modified during iteration, the result is undefined.
Georg Brandl116aa622007-08-15 14:28:22 +0000685
Ezio Melotti138fc892011-10-10 00:02:03 +0300686 .. versionadded:: 3.2
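A short sketch of depth-first iteration with and without a *tag* filter, using
a made-up document:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<a><b><c/></b><b/></a>")

# Without a tag, the iterator yields the element itself and all descendants.
all_tags = [elem.tag for elem in root.iter()]

# With a tag, only matching elements are yielded, at any depth.
b_count = sum(1 for _ in root.iter("b"))
```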
687
Georg Brandl116aa622007-08-15 14:28:22 +0000688
Eli Bendersky737b1732012-05-29 06:02:56 +0300689 .. method:: iterfind(match, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000690
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200691 Finds all matching subelements, by tag name or
692 :ref:`path <elementtree-xpath>`. Returns an iterable yielding all
Eli Bendersky737b1732012-05-29 06:02:56 +0300693 matching elements in document order. *namespaces* is an optional mapping
694 from namespace prefix to full name.
695
Georg Brandl116aa622007-08-15 14:28:22 +0000696
Ezio Melottif8754a62010-03-21 07:16:43 +0000697 .. versionadded:: 3.2
Georg Brandl116aa622007-08-15 14:28:22 +0000698
Georg Brandl116aa622007-08-15 14:28:22 +0000699
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000700 .. method:: itertext()
Georg Brandl116aa622007-08-15 14:28:22 +0000701
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000702 Creates a text iterator. The iterator loops over this element and all
703 subelements, in document order, and returns all inner text.
Georg Brandl116aa622007-08-15 14:28:22 +0000704
Ezio Melottif8754a62010-03-21 07:16:43 +0000705 .. versionadded:: 3.2
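The sketch below shows how :meth:`itertext` interleaves each element's *text*
with the *tail* of its children, so that joining the pieces gives the visible
text of the fragment. The sample markup is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

para = ET.fromstring(
    '<p>Moved to <a href="http://example.org/">example.org</a> or elsewhere.</p>'
)

pieces = list(para.itertext())   # text and tail strings, in document order
full_text = "".join(pieces)
link_tail = para[0].tail         # text after </a> and before the next tag
```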
Georg Brandl116aa622007-08-15 14:28:22 +0000706
707
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000708 .. method:: makeelement(tag, attrib)
Georg Brandl116aa622007-08-15 14:28:22 +0000709
      Creates a new element object of the same type as this element.  Do not
      call this method; use the :func:`SubElement` factory function instead.
Georg Brandl116aa622007-08-15 14:28:22 +0000712
713
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000714 .. method:: remove(subelement)
Georg Brandl116aa622007-08-15 14:28:22 +0000715
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000716 Removes *subelement* from the element. Unlike the find\* methods this
717 method compares elements based on the instance identity, not on tag value
718 or contents.
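To illustrate the identity comparison: in this sketch an element that merely
looks like a child cannot be removed. The :exc:`ValueError` raised in that case
is the behavior of the current CPython implementation and is not stated in the
text above, so treat it as an assumption.

```python
import xml.etree.ElementTree as ET

root = ET.Element("root")
first = ET.SubElement(root, "item")
second = ET.SubElement(root, "item")

root.remove(second)              # removes exactly this instance
remaining = list(root)

look_alike = ET.Element("item")  # same tag, but never added to root
try:
    root.remove(look_alike)
except ValueError:
    not_a_child = True
else:
    not_a_child = False
```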
Georg Brandl116aa622007-08-15 14:28:22 +0000719
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000720 :class:`Element` objects also support the following sequence type methods
Serhiy Storchaka15e65902013-08-29 10:28:44 +0300721 for working with subelements: :meth:`~object.__delitem__`,
722 :meth:`~object.__getitem__`, :meth:`~object.__setitem__`,
723 :meth:`~object.__len__`.
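A brief sketch of the sequence protocol on a made-up element with three
children:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<r><a/><b/><c/></r>")

count = len(root)             # number of direct children
second = root[1].tag          # index access
root[0] = ET.Element("z")     # replace a child
del root[2]                   # delete a child
tags = [child.tag for child in root]
```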
Georg Brandl116aa622007-08-15 14:28:22 +0000724
   Caution: Elements with no subelements will test as ``False``.  This behavior
   will change in future versions.  Use an explicit ``len(elem)`` or
   ``elem is None`` test instead. ::

     element = root.find('foo')

     if not element:  # careful!
         print("element not found, or element has no subelements")

     if element is None:
         print("element not found")
Georg Brandl116aa622007-08-15 14:28:22 +0000736
737
738.. _elementtree-elementtree-objects:
739
740ElementTree Objects
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200741^^^^^^^^^^^^^^^^^^^
Georg Brandl116aa622007-08-15 14:28:22 +0000742
743
Georg Brandl7f01a132009-09-16 15:58:14 +0000744.. class:: ElementTree(element=None, file=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000745
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000746 ElementTree wrapper class. This class represents an entire element
747 hierarchy, and adds some extra support for serialization to and from
748 standard XML.
Georg Brandl116aa622007-08-15 14:28:22 +0000749
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000750 *element* is the root element. The tree is initialized with the contents
751 of the XML *file* if given.
Georg Brandl116aa622007-08-15 14:28:22 +0000752
753
Benjamin Petersone41251e2008-04-25 01:59:09 +0000754 .. method:: _setroot(element)
Georg Brandl116aa622007-08-15 14:28:22 +0000755
Benjamin Petersone41251e2008-04-25 01:59:09 +0000756 Replaces the root element for this tree. This discards the current
757 contents of the tree, and replaces it with the given element. Use with
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000758 care. *element* is an element instance.
Georg Brandl116aa622007-08-15 14:28:22 +0000759
760
Eli Bendersky737b1732012-05-29 06:02:56 +0300761 .. method:: find(match, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000762
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200763 Same as :meth:`Element.find`, starting at the root of the tree.
Georg Brandl116aa622007-08-15 14:28:22 +0000764
765
Eli Bendersky737b1732012-05-29 06:02:56 +0300766 .. method:: findall(match, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000767
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200768 Same as :meth:`Element.findall`, starting at the root of the tree.
Georg Brandl116aa622007-08-15 14:28:22 +0000769
770
Eli Bendersky737b1732012-05-29 06:02:56 +0300771 .. method:: findtext(match, default=None, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000772
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200773 Same as :meth:`Element.findtext`, starting at the root of the tree.
Georg Brandl116aa622007-08-15 14:28:22 +0000774
775
Georg Brandl7f01a132009-09-16 15:58:14 +0000776 .. method:: getiterator(tag=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000777
Georg Brandl67b21b72010-08-17 15:07:14 +0000778 .. deprecated:: 3.2
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000779 Use method :meth:`ElementTree.iter` instead.
Georg Brandl116aa622007-08-15 14:28:22 +0000780
781
Benjamin Petersone41251e2008-04-25 01:59:09 +0000782 .. method:: getroot()
Florent Xiclunac17f1722010-08-08 19:48:29 +0000783
Benjamin Petersone41251e2008-04-25 01:59:09 +0000784 Returns the root element for this tree.
Georg Brandl116aa622007-08-15 14:28:22 +0000785
786
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000787 .. method:: iter(tag=None)
788
      Creates and returns a tree iterator for the root element.  The iterator
      loops over all elements in this tree, in document order.  *tag* is the tag
      to look for (default is to return all elements).
792
793
Eli Bendersky737b1732012-05-29 06:02:56 +0300794 .. method:: iterfind(match, namespaces=None)
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000795
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200796 Same as :meth:`Element.iterfind`, starting at the root of the tree.
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000797
Ezio Melottif8754a62010-03-21 07:16:43 +0000798 .. versionadded:: 3.2
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000799
800
Georg Brandl7f01a132009-09-16 15:58:14 +0000801 .. method:: parse(source, parser=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000802
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000803 Loads an external XML section into this element tree. *source* is a file
Antoine Pitrou11cb9612010-09-15 11:11:28 +0000804 name or :term:`file object`. *parser* is an optional parser instance.
Eli Bendersky52467b12012-06-01 07:13:08 +0300805 If not given, the standard :class:`XMLParser` parser is used. Returns the
806 section root element.
Georg Brandl116aa622007-08-15 14:28:22 +0000807
808
Eli Benderskyf96cf912012-07-15 06:19:44 +0300809 .. method:: write(file, encoding="us-ascii", xml_declaration=None, \
Serhiy Storchaka9e189f02013-01-13 22:24:27 +0200810 default_namespace=None, method="xml", *, \
Eli Benderskye9af8272013-01-13 06:27:51 -0800811 short_empty_elements=True)
Georg Brandl116aa622007-08-15 14:28:22 +0000812
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000813 Writes the element tree to a file, as XML. *file* is a file name, or a
Eli Benderskyf96cf912012-07-15 06:19:44 +0300814 :term:`file object` opened for writing. *encoding* [1]_ is the output
815 encoding (default is US-ASCII).
      *xml_declaration* controls whether an XML declaration is added to the
      file.  Use ``False`` for never, ``True`` for always, or ``None`` to add
      one only if the encoding is not US-ASCII, UTF-8, or Unicode (default is
      ``None``).
Serhiy Storchaka03530b92013-01-13 21:58:04 +0200819 *default_namespace* sets the default XML namespace (for "xmlns").
Eli Benderskyf96cf912012-07-15 06:19:44 +0300820 *method* is either ``"xml"``, ``"html"`` or ``"text"`` (default is
821 ``"xml"``).
      The keyword-only *short_empty_elements* parameter controls the formatting
      of elements that contain no content.  If ``True`` (the default), they are
      emitted as a single self-closed tag, otherwise they are emitted as a pair
      of start/end tags.
Eli Benderskyf96cf912012-07-15 06:19:44 +0300826
827 The output is either a string (:class:`str`) or binary (:class:`bytes`).
828 This is controlled by the *encoding* argument. If *encoding* is
829 ``"unicode"``, the output is a string; otherwise, it's binary. Note that
830 this may conflict with the type of *file* if it's an open
831 :term:`file object`; make sure you do not try to write a string to a
832 binary stream and vice versa.
833
Eli Benderskya9a2ef52013-01-13 06:04:43 -0800834 .. versionadded:: 3.4
835 The *short_empty_elements* parameter.
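The following sketch shows how the *encoding* argument of :meth:`write`
determines whether a text or a binary stream is needed. In-memory streams from
:mod:`io` stand in for real files, and the tiny ``doc`` tree is an assumption
for the example.

```python
import io
import xml.etree.ElementTree as ET

root = ET.Element("doc")
ET.SubElement(root, "empty")
tree = ET.ElementTree(root)

# encoding="unicode" produces str, so the target must be a text stream.
text_out = io.StringIO()
tree.write(text_out, encoding="unicode")

# Any other encoding produces bytes, so the target must be a binary stream.
bytes_out = io.BytesIO()
tree.write(bytes_out, encoding="utf-8", xml_declaration=True,
           short_empty_elements=False)

as_text = text_out.getvalue()
as_bytes = bytes_out.getvalue()
```

With *short_empty_elements* left at its default, the empty element is written
as a self-closed tag; with ``False`` it is written as a start/end pair.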
836
Georg Brandl116aa622007-08-15 14:28:22 +0000837
Christian Heimesd8654cf2007-12-02 15:22:16 +0000838This is the XML file that is going to be manipulated::
839
    <html>
        <head>
            <title>Example page</title>
        </head>
        <body>
            <p>Moved to <a href="http://example.org/">example.org</a>
            or <a href="http://example.com/">example.com</a>.</p>
        </body>
    </html>
849
Example of changing the attribute "target" of every link in the first paragraph::

    >>> from xml.etree.ElementTree import ElementTree
    >>> tree = ElementTree()
    >>> tree.parse("index.xhtml")
    <Element 'html' at 0xb77e6fac>
    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
    >>> p
    <Element 'p' at 0xb77ec26c>
    >>> links = list(p.iter("a"))   # Returns list of all links
    >>> links
    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
    >>> for i in links:             # Iterates through all found links
    ...     i.attrib["target"] = "blank"
    >>> tree.write("output.xhtml")
Georg Brandl116aa622007-08-15 14:28:22 +0000865
866.. _elementtree-qname-objects:
867
868QName Objects
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200869^^^^^^^^^^^^^
Georg Brandl116aa622007-08-15 14:28:22 +0000870
871
Georg Brandl7f01a132009-09-16 15:58:14 +0000872.. class:: QName(text_or_uri, tag=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000873
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000874 QName wrapper. This can be used to wrap a QName attribute value, in order
875 to get proper namespace handling on output. *text_or_uri* is a string
   containing the QName value, in the form ``{uri}local``, or, if the *tag*
   argument is given, the URI part of a QName.  If *tag* is given, the first
   argument is interpreted as a URI, and this argument is interpreted as a
   local name.
879 :class:`QName` instances are opaque.
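A minimal sketch of both constructor forms, and of wrapping an attribute value
so that the serializer declares the namespace. The namespace URI and the
``kind`` attribute are hypothetical. The prefix chosen on output is generated
by the serializer, so the example does not rely on it.

```python
import xml.etree.ElementTree as ET

NS = "http://example.com/ns"   # hypothetical namespace URI

braced = ET.QName("{%s}special" % NS)   # single-argument form: {uri}local
split = ET.QName(NS, "special")         # two-argument form: URI, local name

elem = ET.Element("item")
elem.set("kind", split)                 # QName used as an attribute value
xml_text = ET.tostring(elem, encoding="unicode")
```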
Georg Brandl116aa622007-08-15 14:28:22 +0000880
881
Antoine Pitrou5b235d02013-04-18 19:37:06 +0200882
Georg Brandl116aa622007-08-15 14:28:22 +0000883.. _elementtree-treebuilder-objects:
884
885TreeBuilder Objects
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200886^^^^^^^^^^^^^^^^^^^
Georg Brandl116aa622007-08-15 14:28:22 +0000887
888
Georg Brandl7f01a132009-09-16 15:58:14 +0000889.. class:: TreeBuilder(element_factory=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000890
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000891 Generic element structure builder. This builder converts a sequence of
892 start, data, and end method calls to a well-formed element structure. You
893 can use this class to build an element structure using a custom XML parser,
Eli Bendersky48d358b2012-05-30 17:57:50 +0300894 or a parser for some other XML-like format. *element_factory*, when given,
895 must be a callable accepting two positional arguments: a tag and
896 a dict of attributes. It is expected to return a new element instance.
Georg Brandl116aa622007-08-15 14:28:22 +0000897
Benjamin Petersone41251e2008-04-25 01:59:09 +0000898 .. method:: close()
Georg Brandl116aa622007-08-15 14:28:22 +0000899
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000900 Flushes the builder buffers, and returns the toplevel document
901 element. Returns an :class:`Element` instance.
Georg Brandl116aa622007-08-15 14:28:22 +0000902
903
Benjamin Petersone41251e2008-04-25 01:59:09 +0000904 .. method:: data(data)
Georg Brandl116aa622007-08-15 14:28:22 +0000905
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000906 Adds text to the current element. *data* is a string. This should be
907 either a bytestring, or a Unicode string.
Georg Brandl116aa622007-08-15 14:28:22 +0000908
909
Benjamin Petersone41251e2008-04-25 01:59:09 +0000910 .. method:: end(tag)
Georg Brandl116aa622007-08-15 14:28:22 +0000911
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000912 Closes the current element. *tag* is the element name. Returns the
913 closed element.
Georg Brandl116aa622007-08-15 14:28:22 +0000914
915
Benjamin Petersone41251e2008-04-25 01:59:09 +0000916 .. method:: start(tag, attrs)
Georg Brandl116aa622007-08-15 14:28:22 +0000917
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000918 Opens a new element. *tag* is the element name. *attrs* is a dictionary
919 containing element attributes. Returns the opened element.
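As a sketch, the builder can be driven by hand with the same sequence of calls
a parser would make. The tag, attribute and text are invented for the example.

```python
import xml.etree.ElementTree as ET

builder = ET.TreeBuilder()
builder.start("greeting", {"lang": "en"})   # opens <greeting lang="en">
builder.data("Hello, ")                     # text may arrive in pieces
builder.data("world")
builder.end("greeting")                     # closes the element
root = builder.close()                      # the toplevel Element
```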
Georg Brandl116aa622007-08-15 14:28:22 +0000920
921
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000922 In addition, a custom :class:`TreeBuilder` object can provide the
923 following method:
Georg Brandl116aa622007-08-15 14:28:22 +0000924
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000925 .. method:: doctype(name, pubid, system)
926
927 Handles a doctype declaration. *name* is the doctype name. *pubid* is
928 the public identifier. *system* is the system identifier. This method
929 does not exist on the default :class:`TreeBuilder` class.
930
Ezio Melottif8754a62010-03-21 07:16:43 +0000931 .. versionadded:: 3.2
Georg Brandl116aa622007-08-15 14:28:22 +0000932
933
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000934.. _elementtree-xmlparser-objects:
Georg Brandl116aa622007-08-15 14:28:22 +0000935
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000936XMLParser Objects
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200937^^^^^^^^^^^^^^^^^
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000938
939
940.. class:: XMLParser(html=0, target=None, encoding=None)
941
Eli Benderskyb5869342013-08-30 05:51:20 -0700942 This class is the low-level building block of the module. It uses
943 :mod:`xml.parsers.expat` for efficient, event-based parsing of XML. It can
   be fed XML data incrementally with the :meth:`feed` method, and parsing events
945 are translated to a push API - by invoking callbacks on the *target* object.
946 If *target* is omitted, the standard :class:`TreeBuilder` is used. The
947 *html* argument was historically used for backwards compatibility and is now
948 deprecated. If *encoding* [1]_ is given, the value overrides the encoding
Eli Bendersky52467b12012-06-01 07:13:08 +0300949 specified in the XML file.
Georg Brandl116aa622007-08-15 14:28:22 +0000950
Eli Benderskyb5869342013-08-30 05:51:20 -0700951 .. deprecated:: 3.4
952 The *html* argument.
Georg Brandl116aa622007-08-15 14:28:22 +0000953
Benjamin Petersone41251e2008-04-25 01:59:09 +0000954 .. method:: close()
Georg Brandl116aa622007-08-15 14:28:22 +0000955
Eli Benderskybfd78372013-08-24 15:11:44 -0700956 Finishes feeding data to the parser. Returns the result of calling the
Eli Benderskybf8ab772013-08-25 15:27:36 -0700957 ``close()`` method of the *target* passed during construction; by default,
958 this is the toplevel document element.
Georg Brandl116aa622007-08-15 14:28:22 +0000959
960
Benjamin Petersone41251e2008-04-25 01:59:09 +0000961 .. method:: doctype(name, pubid, system)
Georg Brandl116aa622007-08-15 14:28:22 +0000962
Georg Brandl67b21b72010-08-17 15:07:14 +0000963 .. deprecated:: 3.2
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000964 Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
965 target.
Georg Brandl116aa622007-08-15 14:28:22 +0000966
967
Benjamin Petersone41251e2008-04-25 01:59:09 +0000968 .. method:: feed(data)
Georg Brandl116aa622007-08-15 14:28:22 +0000969
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000970 Feeds data to the parser. *data* is encoded data.
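A small sketch of incremental feeding with the default target. The chunk
boundaries are deliberately placed in the middle of tags to show that the
input may be split anywhere; the document itself is made up.

```python
import xml.etree.ElementTree as ET

parser = ET.XMLParser()   # no target given: a standard TreeBuilder is used
for chunk in ("<root><ch", "ild>text</chi", "ld></root>"):
    parser.feed(chunk)
root = parser.close()     # result of the target's close() method
```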
Georg Brandl116aa622007-08-15 14:28:22 +0000971
   :meth:`XMLParser.feed` calls *target*\'s ``start(tag, attrs_dict)`` method
   for each opening tag, its ``end(tag)`` method for each closing tag, and its
   ``data(data)`` method for character data.  :meth:`XMLParser.close` calls
   *target*\'s ``close()`` method.  :class:`XMLParser` is not limited to
   building a tree structure.  This example computes the maximum depth of an
   XML file::
Christian Heimesd8654cf2007-12-02 15:22:16 +0000978
      >>> from xml.etree.ElementTree import XMLParser
      >>> class MaxDepth:                     # The target object of the parser
      ...     maxDepth = 0
      ...     depth = 0
      ...     def start(self, tag, attrib):   # Called for each opening tag.
      ...         self.depth += 1
      ...         if self.depth > self.maxDepth:
      ...             self.maxDepth = self.depth
      ...     def end(self, tag):             # Called for each closing tag.
      ...         self.depth -= 1
      ...     def data(self, data):
      ...         pass            # We do not need to do anything with data.
      ...     def close(self):    # Called when all data has been parsed.
      ...         return self.maxDepth
      ...
      >>> target = MaxDepth()
      >>> parser = XMLParser(target=target)
      >>> exampleXml = """
      ... <a>
      ...   <b>
      ...   </b>
      ...   <b>
      ...     <c>
      ...       <d>
      ...       </d>
      ...     </c>
      ...   </b>
      ... </a>"""
      >>> parser.feed(exampleXml)
      >>> parser.close()
      4
Christian Heimesb186d002008-03-18 15:15:01 +00001010
Eli Benderskyb5869342013-08-30 05:51:20 -07001011
1012.. _elementtree-xmlpullparser-objects:
1013
1014XMLPullParser Objects
1015^^^^^^^^^^^^^^^^^^^^^
1016
1017.. class:: XMLPullParser(events=None)
1018
Eli Bendersky2c68e302013-08-31 07:37:23 -07001019 A pull parser suitable for non-blocking applications. Its input-side API is
1020 similar to that of :class:`XMLParser`, but instead of pushing calls to a
1021 callback target, :class:`XMLPullParser` collects an internal list of parsing
1022 events and lets the user read from it. *events* is a sequence of events to
1023 report back. The supported events are the strings ``"start"``, ``"end"``,
1024 ``"start-ns"`` and ``"end-ns"`` (the "ns" events are used to get detailed
1025 namespace information). If *events* is omitted, only ``"end"`` events are
1026 reported.
Eli Benderskyb5869342013-08-30 05:51:20 -07001027
1028 .. method:: feed(data)
1029
1030 Feed the given bytes data to the parser.
1031
1032 .. method:: close()
1033
Nick Coghlan4cc2afa2013-09-28 23:50:35 +10001034 Signal the parser that the data stream is terminated. Unlike
1035 :meth:`XMLParser.close`, this method always returns :const:`None`.
1036 Any events not yet retrieved when the parser is closed can still be
1037 read with :meth:`read_events`.
Eli Benderskyb5869342013-08-30 05:51:20 -07001038
1039 .. method:: read_events()
1040
1041 Iterate over the events which have been encountered in the data fed to the
1042 parser. This method yields ``(event, elem)`` pairs, where *event* is a
1043 string representing the type of event (e.g. ``"end"``) and *elem* is the
Nick Coghlan4cc2afa2013-09-28 23:50:35 +10001044 encountered :class:`Element` object.
1045
1046 Events provided in a previous call to :meth:`read_events` will not be
1047 yielded again. As events are consumed from the internal queue only as
1048 they are retrieved from the iterator, multiple readers calling
1049 :meth:`read_events` in parallel will have unpredictable results.
Eli Benderskyb5869342013-08-30 05:51:20 -07001050
1051 .. note::
1052
1053 :class:`XMLPullParser` only guarantees that it has seen the ">"
1054 character of a starting tag when it emits a "start" event, so the
1055 attributes are defined, but the contents of the text and tail attributes
1056 are undefined at that point. The same applies to the element children;
1057 they may or may not be present.
1058
1059 If you need a fully populated element, look for "end" events instead.
1060
1061 .. versionadded:: 3.4
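The sketch below feeds a made-up document in two parts and drains the event
queue after each part. How many events are available after a given
:meth:`feed` call depends on how much input the underlying parser has
processed, so the example only relies on the combined sequence once
:meth:`close` has been called.

```python
import xml.etree.ElementTree as ET

parser = ET.XMLPullParser(events=("start", "end"))
events = []

parser.feed("<root><item>one</item>")
# Drain whatever is available so far; the parser never blocks.
events.extend((event, elem.tag) for event, elem in parser.read_events())

parser.feed("<item>two</item></root>")
parser.close()
# Events still queued at close time can be read afterwards.
events.extend((event, elem.tag) for event, elem in parser.read_events())
```

Because events already returned are not yielded again, ``events`` ends up with
each event exactly once, in document order.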
1062
Eli Bendersky5b77d812012-03-16 08:20:05 +02001063Exceptions
Eli Bendersky3a4875e2012-03-26 20:43:32 +02001064^^^^^^^^^^
Eli Bendersky5b77d812012-03-16 08:20:05 +02001065
1066.. class:: ParseError
1067
1068 XML parse error, raised by the various parsing methods in this module when
1069 parsing fails. The string representation of an instance of this exception
1070 will contain a user-friendly error message. In addition, it will have
1071 the following attributes available:
1072
1073 .. attribute:: code
1074
1075 A numeric error code from the expat parser. See the documentation of
1076 :mod:`xml.parsers.expat` for the list of error codes and their meanings.
1077
1078 .. attribute:: position
1079
1080 A tuple of *line*, *column* numbers, specifying where the error occurred.
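A short sketch of inspecting a :class:`ParseError`. The malformed input is
invented, and the lookup of the numeric code in
``xml.parsers.expat.errors.messages`` is one possible way to turn it back into
a description.

```python
import xml.etree.ElementTree as ET
from xml.parsers.expat import errors

try:
    ET.fromstring("<root><unclosed></root>")   # deliberately malformed
except ET.ParseError as exc:
    message = str(exc)                         # user-friendly message
    description = errors.messages[exc.code]    # expat's text for the code
    line, column = exc.position                # where the error occurred
```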
Christian Heimesb186d002008-03-18 15:15:01 +00001081
1082.. rubric:: Footnotes
1083
1084.. [#] The encoding string included in XML output should conform to the
Florent Xiclunaf15351d2010-03-13 23:24:31 +00001085 appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
1086 not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
Benjamin Petersonad3d5c22009-02-26 03:38:59 +00001087 and http://www.iana.org/assignments/character-sets.