:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================

.. module:: xml.etree.ElementTree
   :synopsis: Implementation of the ElementTree API.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>

The :mod:`xml.etree.ElementTree` module implements a simple and efficient API
for parsing and creating XML data.

.. versionchanged:: 3.3
   This module will use a fast implementation whenever available.
   The :mod:`xml.etree.cElementTree` module is deprecated.


.. warning::

   The :mod:`xml.etree.ElementTree` module is not secure against
   maliciously constructed data. If you need to parse untrusted or
   unauthenticated data see :ref:`xml-vulnerabilities`.

Tutorial
--------

This is a short tutorial for using :mod:`xml.etree.ElementTree` (``ET`` in
short). The goal is to demonstrate some of the building blocks and basic
concepts of the module.

XML tree and elements
^^^^^^^^^^^^^^^^^^^^^

XML is an inherently hierarchical data format, and the most natural way to
represent it is with a tree. ``ET`` has two classes for this purpose -
:class:`ElementTree` represents the whole XML document as a tree, and
:class:`Element` represents a single node in this tree. Interactions with
the whole document (reading and writing to/from files) are usually done
on the :class:`ElementTree` level. Interactions with a single XML element
and its sub-elements are done on the :class:`Element` level.

.. _elementtree-parsing-xml:

Parsing XML
^^^^^^^^^^^

We'll be using the following XML document as the sample data for this section:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank>1</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank>4</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank>68</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

We can import this data by reading from a file::

   import xml.etree.ElementTree as ET
   tree = ET.parse('country_data.xml')
   root = tree.getroot()

Or directly from a string::

   root = ET.fromstring(country_data_as_string)

:func:`fromstring` parses XML from a string directly into an :class:`Element`,
which is the root element of the parsed tree. Other parsing functions may
create an :class:`ElementTree`. Check the documentation to be sure.

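The difference in return types can be checked with a short sketch. The
one-country document below is a made-up stand-in for the sample data, and
``io.StringIO`` is used here only so that the example does not depend on a
file on disk:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical minimal document, used only for this illustration.
xml_text = "<data><country name='Panama'/></data>"

# parse() accepts a filename or a file object and returns an ElementTree.
tree = ET.parse(io.StringIO(xml_text))
root_from_file = tree.getroot()

# fromstring() takes the text itself and returns the root Element directly.
root_from_string = ET.fromstring(xml_text)

print(type(tree).__name__)
print(type(root_from_string).__name__)
print(root_from_file.tag == root_from_string.tag)
```

In both cases the root :class:`Element` ends up with the same tag; only the
wrapper object differs.
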
As an :class:`Element`, ``root`` has a tag and a dictionary of attributes::

   >>> root.tag
   'data'
   >>> root.attrib
   {}

It also has child nodes over which we can iterate::

   >>> for child in root:
   ...     print(child.tag, child.attrib)
   ...
   country {'name': 'Liechtenstein'}
   country {'name': 'Singapore'}
   country {'name': 'Panama'}

Children are nested, and we can access specific child nodes by index::

   >>> root[0][1].text
   '2008'


.. note::

   Not all elements of the XML input will end up as elements of the
   parsed tree. Currently, this module skips over any XML comments,
   processing instructions, and document type declarations in the
   input. Nevertheless, trees built using this module's API rather
   than parsing from XML text can have comments and processing
   instructions in them; they will be included when generating XML
   output. A document type declaration may be accessed by passing a
   custom :class:`TreeBuilder` instance to the :class:`XMLParser`
   constructor.


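As a small sketch of the behavior described in the note, the following
compares a parsed tree with one built through the API. The element names and
comment text are made up for the illustration:

```python
import xml.etree.ElementTree as ET

# Parsing: the comment in the input does not become a node in the tree.
parsed = ET.fromstring("<root><!-- dropped on parse --><item/></root>")
parsed_tags = [child.tag for child in parsed]

# Building: a comment created with ET.Comment is kept and serialized.
built = ET.Element("root")
built.append(ET.Comment(" kept on output "))
ET.SubElement(built, "item")
serialized = ET.tostring(built, encoding="unicode")

print(parsed_tags)
print(serialized)
```
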
.. _elementtree-pull-parsing:

Pull API for non-blocking parsing
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Most parsing functions provided by this module require the whole document
to be read at once before returning any result. It is possible to use an
:class:`XMLParser` and feed data into it incrementally, but it is a push API that
calls methods on a callback target, which is too low-level and inconvenient for
most needs. Sometimes what the user really wants is to be able to parse XML
incrementally, without blocking operations, while enjoying the convenience of
fully constructed :class:`Element` objects.

The most powerful tool for doing this is :class:`XMLPullParser`. It does not
require a blocking read to obtain the XML data, and is instead fed with data
incrementally with :meth:`XMLPullParser.feed` calls. To get the parsed XML
elements, call :meth:`XMLPullParser.read_events`. Here is an example::

   >>> parser = ET.XMLPullParser(['start', 'end'])
   >>> parser.feed('<mytag>sometext')
   >>> list(parser.read_events())
   [('start', <Element 'mytag' at 0x7fa66db2be58>)]
   >>> parser.feed(' more text</mytag>')
   >>> for event, elem in parser.read_events():
   ...     print(event)
   ...     print(elem.tag, 'text=', elem.text)
   ...
   end
   mytag text= sometext more text

The obvious use case is applications that operate in a non-blocking fashion
where the XML data is being received from a socket or read incrementally from
some storage device. In such cases, blocking reads are unacceptable.

Because it's so flexible, :class:`XMLPullParser` can be inconvenient to use for
simpler use-cases. If you don't mind your application blocking on reading XML
data but would still like to have incremental parsing capabilities, take a look
at :func:`iterparse`. It can be useful when you're reading a large XML document
and don't want to hold it wholly in memory.

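A minimal sketch of that :func:`iterparse` pattern follows. The in-memory
``io.BytesIO`` source stands in for a large file, and calling
:meth:`Element.clear` after each country is one common way to limit memory
use, not the only one:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical input; in practice this would be a large file on disk.
source = io.BytesIO(
    b"<data>"
    b"<country name='Liechtenstein'><rank>1</rank></country>"
    b"<country name='Singapore'><rank>4</rank></country>"
    b"</data>"
)

ranks = {}
# With no *events* argument, only "end" events are reported, so each
# element is fully populated when it is yielded.
for event, elem in ET.iterparse(source):
    if elem.tag == "country":
        ranks[elem.get("name")] = int(elem.find("rank").text)
        # Drop the element's children, text and attributes once it has
        # been processed, so they do not accumulate in memory.
        elem.clear()

print(ranks)
```

Note that the attributes are read before :meth:`Element.clear` is called,
because clearing an element also removes its attributes.
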
Finding interesting elements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:class:`Element` has some useful methods that help iterate recursively over all
the sub-tree below it (its children, their children, and so on). For example,
:meth:`Element.iter`::

   >>> for neighbor in root.iter('neighbor'):
   ...     print(neighbor.attrib)
   ...
   {'name': 'Austria', 'direction': 'E'}
   {'name': 'Switzerland', 'direction': 'W'}
   {'name': 'Malaysia', 'direction': 'N'}
   {'name': 'Costa Rica', 'direction': 'W'}
   {'name': 'Colombia', 'direction': 'E'}

:meth:`Element.findall` finds only elements with a tag which are direct
children of the current element. :meth:`Element.find` finds the *first* child
with a particular tag, and :attr:`Element.text` accesses the element's text
content. :meth:`Element.get` accesses the element's attributes::

   >>> for country in root.findall('country'):
   ...     rank = country.find('rank').text
   ...     name = country.get('name')
   ...     print(name, rank)
   ...
   Liechtenstein 1
   Singapore 4
   Panama 68

More sophisticated specification of which elements to look for is possible by
using :ref:`XPath <elementtree-xpath>`.

Modifying an XML File
^^^^^^^^^^^^^^^^^^^^^

:class:`ElementTree` provides a simple way to build XML documents and write
them to files. The :meth:`ElementTree.write` method serves this purpose.

Once created, an :class:`Element` object may be manipulated by directly changing
its fields (such as :attr:`Element.text`), adding and modifying attributes
(:meth:`Element.set` method), as well as adding new children (for example
with :meth:`Element.append`).

Let's say we want to add one to each country's rank, and add an ``updated``
attribute to the rank element::

   >>> for rank in root.iter('rank'):
   ...     new_rank = int(rank.text) + 1
   ...     rank.text = str(new_rank)
   ...     rank.set('updated', 'yes')
   ...
   >>> tree.write('output.xml')

Our XML now looks like this:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank updated="yes">69</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

We can remove elements using :meth:`Element.remove`. Let's say we want to
remove all countries with a rank higher than 50::

   >>> for country in root.findall('country'):
   ...     rank = int(country.find('rank').text)
   ...     if rank > 50:
   ...         root.remove(country)
   ...
   >>> tree.write('output.xml')

Our XML now looks like this:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
   </data>

Building XML documents
^^^^^^^^^^^^^^^^^^^^^^

The :func:`SubElement` function also provides a convenient way to create new
sub-elements for a given element::

   >>> a = ET.Element('a')
   >>> b = ET.SubElement(a, 'b')
   >>> c = ET.SubElement(a, 'c')
   >>> d = ET.SubElement(c, 'd')
   >>> ET.dump(a)
   <a><b /><c><d /></c></a>

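Attributes and text can be set while building as well. The following sketch
reuses the tag names from the tutorial's sample data purely as an
illustration, and passes attributes once as a dictionary and once as a keyword
argument:

```python
import xml.etree.ElementTree as ET

root = ET.Element("data")
# Attributes may be given as a dictionary or as keyword arguments.
country = ET.SubElement(root, "country", {"name": "Panama"})
rank = ET.SubElement(country, "rank", updated="yes")
rank.text = "69"

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```
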
Parsing XML with Namespaces
^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the XML input has `namespaces
<https://en.wikipedia.org/wiki/XML_namespace>`__, tags and attributes
with prefixes in the form ``prefix:sometag`` get expanded to
``{uri}sometag`` where the *prefix* is replaced by the full *URI*.
Also, if there is a `default namespace
<http://www.w3.org/TR/2006/REC-xml-names-20060816/#defaulting>`__,
that full URI gets prepended to all of the non-prefixed tags.

Here is an XML example that incorporates two namespaces, one with the
prefix "fictional" and the other serving as the default namespace:

.. code-block:: xml

   <?xml version="1.0"?>
   <actors xmlns:fictional="http://characters.example.com"
           xmlns="http://people.example.com">
       <actor>
           <name>John Cleese</name>
           <fictional:character>Lancelot</fictional:character>
           <fictional:character>Archie Leach</fictional:character>
       </actor>
       <actor>
           <name>Eric Idle</name>
           <fictional:character>Sir Robin</fictional:character>
           <fictional:character>Gunther</fictional:character>
           <fictional:character>Commander Clement</fictional:character>
       </actor>
   </actors>

One way to search and explore this XML example is to manually add the
URI to every tag or attribute in the XPath of a
:meth:`~Element.find` or :meth:`~Element.findall`::

   root = ET.fromstring(xml_text)
   for actor in root.findall('{http://people.example.com}actor'):
       name = actor.find('{http://people.example.com}name')
       print(name.text)
       for char in actor.findall('{http://characters.example.com}character'):
           print(' |-->', char.text)

A better way to search the namespaced XML example is to create a
dictionary with your own prefixes and use those in the search functions::

   ns = {'real_person': 'http://people.example.com',
         'role': 'http://characters.example.com'}

   for actor in root.findall('real_person:actor', ns):
       name = actor.find('real_person:name', ns)
       print(name.text)
       for char in actor.findall('role:character', ns):
           print(' |-->', char.text)

These two approaches both output::

   John Cleese
    |--> Lancelot
    |--> Archie Leach
   Eric Idle
    |--> Sir Robin
    |--> Gunther
    |--> Commander Clement


Additional resources
^^^^^^^^^^^^^^^^^^^^

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs.


.. _elementtree-xpath:

XPath support
-------------

This module provides limited support for
`XPath expressions <http://www.w3.org/TR/xpath>`_ for locating elements in a
tree. The goal is to support a small subset of the abbreviated syntax; a full
XPath engine is outside the scope of the module.

Example
^^^^^^^

Here's an example that demonstrates some of the XPath capabilities of the
module. We'll be using the ``countrydata`` XML document from the
:ref:`Parsing XML <elementtree-parsing-xml>` section::

   import xml.etree.ElementTree as ET

   root = ET.fromstring(countrydata)

   # Top-level elements
   root.findall(".")

   # All 'neighbor' grand-children of 'country' children of the top-level
   # elements
   root.findall("./country/neighbor")

   # Nodes with name='Singapore' that have a 'year' child
   root.findall(".//year/..[@name='Singapore']")

   # 'year' nodes that are children of nodes with name='Singapore'
   root.findall(".//*[@name='Singapore']/year")

   # All 'neighbor' nodes that are the second 'neighbor' child of their
   # parent
   root.findall(".//neighbor[2]")

Supported XPath syntax
^^^^^^^^^^^^^^^^^^^^^^

.. tabularcolumns:: |l|L|

+-----------------------+------------------------------------------------------+
| Syntax                | Meaning                                              |
+=======================+======================================================+
| ``tag``               | Selects all child elements with the given tag.       |
|                       | For example, ``spam`` selects all child elements     |
|                       | named ``spam``, and ``spam/egg`` selects all         |
|                       | grandchildren named ``egg`` in all children named    |
|                       | ``spam``.                                            |
+-----------------------+------------------------------------------------------+
| ``*``                 | Selects all child elements. For example, ``*/egg``   |
|                       | selects all grandchildren named ``egg``.             |
+-----------------------+------------------------------------------------------+
| ``.``                 | Selects the current node. This is mostly useful      |
|                       | at the beginning of the path, to indicate that it's  |
|                       | a relative path.                                     |
+-----------------------+------------------------------------------------------+
| ``//``                | Selects all subelements, on all levels beneath the   |
|                       | current element. For example, ``.//egg`` selects     |
|                       | all ``egg`` elements in the entire tree.             |
+-----------------------+------------------------------------------------------+
| ``..``                | Selects the parent element. Returns ``None`` if the  |
|                       | path attempts to reach the ancestors of the start    |
|                       | element (the element ``find`` was called on).        |
+-----------------------+------------------------------------------------------+
| ``[@attrib]``         | Selects all elements that have the given attribute.  |
+-----------------------+------------------------------------------------------+
| ``[@attrib='value']`` | Selects all elements for which the given attribute   |
|                       | has the given value. The value cannot contain        |
|                       | quotes.                                              |
+-----------------------+------------------------------------------------------+
| ``[tag]``             | Selects all elements that have a child named         |
|                       | ``tag``. Only immediate children are supported.      |
+-----------------------+------------------------------------------------------+
| ``[tag='text']``      | Selects all elements that have a child named         |
|                       | ``tag`` whose complete text content, including       |
|                       | descendants, equals the given ``text``.              |
+-----------------------+------------------------------------------------------+
| ``[position]``        | Selects all elements that are located at the given   |
|                       | position. The position can be either an integer      |
|                       | (1 is the first position), the expression ``last()`` |
|                       | (for the last position), or a position relative to   |
|                       | the last position (e.g. ``last()-1``).               |
+-----------------------+------------------------------------------------------+

Predicates (expressions within square brackets) must be preceded by a tag
name, an asterisk, or another predicate. ``position`` predicates must be
preceded by a tag name.

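The predicates in the table can be tried out with a short sketch. The
document below is a trimmed-down, illustrative version of the country data,
and the variable names are arbitrary:

```python
import xml.etree.ElementTree as ET

# A trimmed-down version of the country data from the tutorial.
countrydata = """
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>
"""
root = ET.fromstring(countrydata)

# [@attrib='value']: countries selected by attribute value.
by_attr = [c.get("name") for c in root.findall("./country[@name='Singapore']")]

# [tag='text']: countries selected by the text of a child element.
by_child_text = [c.get("name") for c in root.findall("./country[rank='1']")]

# [position]: the last 'neighbor' of each country.
last_neighbors = [n.get("name") for n in root.findall("./country/neighbor[last()]")]

# '..': parents of every 'neighbor' whose direction is 'W'.
parents = [p.get("name") for p in root.findall(".//neighbor[@direction='W']/..")]

print(by_attr, by_child_text, last_neighbors, parents)
```

Each path follows the rule above: every predicate is preceded by a tag name,
and the position predicate in particular follows ``neighbor``.
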
Reference
---------

.. _elementtree-functions:

Functions
^^^^^^^^^


.. function:: Comment(text=None)

   Comment element factory. This factory function creates a special element
   that will be serialized as an XML comment by the standard serializer. The
   comment string can be either a bytestring or a Unicode string. *text* is a
   string containing the comment string. Returns an element instance
   representing a comment.

   Note that :class:`XMLParser` skips over comments in the input
   instead of creating comment objects for them. An :class:`ElementTree` will
   only contain comment nodes if they have been inserted into
   the tree using one of the :class:`Element` methods.

.. function:: dump(elem)

   Writes an element tree or element structure to ``sys.stdout``. This function
   should be used for debugging only.

   The exact output format is implementation dependent. In this version, it's
   written as an ordinary XML file.

   *elem* is an element tree or an individual element.


.. function:: fromstring(text)

   Parses an XML section from a string constant. Same as :func:`XML`. *text*
   is a string containing XML data. Returns an :class:`Element` instance.


.. function:: fromstringlist(sequence, parser=None)

   Parses an XML document from a sequence of string fragments. *sequence* is a
   list or other sequence containing XML data fragments. *parser* is an
   optional parser instance. If not given, the standard :class:`XMLParser`
   parser is used. Returns an :class:`Element` instance.

   .. versionadded:: 3.2


.. function:: iselement(element)

   Checks if an object appears to be a valid element object. *element* is an
   element instance. Returns a true value if this is an element object.


.. function:: iterparse(source, events=None, parser=None)

   Parses an XML section into an element tree incrementally, and reports what's
   going on to the user. *source* is a filename or :term:`file object`
   containing XML data. *events* is a sequence of events to report back. The
   supported events are the strings ``"start"``, ``"end"``, ``"start-ns"`` and
   ``"end-ns"`` (the "ns" events are used to get detailed namespace
   information). If *events* is omitted, only ``"end"`` events are reported.
   *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. *parser* must be a subclass of
   :class:`XMLParser` and can only use the default :class:`TreeBuilder` as a
   target. Returns an :term:`iterator` providing ``(event, elem)`` pairs.

   Note that while :func:`iterparse` builds the tree incrementally, it issues
   blocking reads on *source* (or the file it names). As such, it's unsuitable
   for applications where blocking reads can't be made. For fully non-blocking
   parsing, see :class:`XMLPullParser`.

   .. note::

      :func:`iterparse` only guarantees that it has seen the ">" character of a
      starting tag when it emits a "start" event, so the attributes are defined,
      but the contents of the text and tail attributes are undefined at that
      point. The same applies to the element children; they may or may not be
      present.

      If you need a fully populated element, look for "end" events instead.

   .. deprecated:: 3.4
      The *parser* argument.

.. function:: parse(source, parser=None)

   Parses an XML section into an element tree. *source* is a filename or file
   object containing XML data. *parser* is an optional parser instance. If
   not given, the standard :class:`XMLParser` parser is used. Returns an
   :class:`ElementTree` instance.


.. function:: ProcessingInstruction(target, text=None)

   PI element factory. This factory function creates a special element that
   will be serialized as an XML processing instruction. *target* is a string
   containing the PI target. *text* is a string containing the PI contents, if
   given. Returns an element instance, representing a processing instruction.

   Note that :class:`XMLParser` skips over processing instructions
   in the input instead of creating processing instruction objects for them.
   An :class:`ElementTree` will only contain processing instruction nodes if
   they have been inserted into the tree using one of the
   :class:`Element` methods.

.. function:: register_namespace(prefix, uri)

   Registers a namespace prefix. The registry is global, and any existing
   mapping for either the given prefix or the namespace URI will be removed.
   *prefix* is a namespace prefix. *uri* is a namespace URI. Tags and
   attributes in this namespace will be serialized with the given prefix, if at
   all possible.

   .. versionadded:: 3.2


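The effect of :func:`register_namespace` on serialization can be sketched as
follows. The prefix ``people`` and the URI are arbitrary choices for this
example (the URI is borrowed from the namespace tutorial above):

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and prefix, chosen only for this example.
URI = "http://people.example.com"
ET.register_namespace("people", URI)

# Tags are written in the expanded {uri}name form ...
actor = ET.Element("{%s}actor" % URI)
ET.SubElement(actor, "{%s}name" % URI).text = "John Cleese"

# ... and serialized with the registered prefix.
xml_text = ET.tostring(actor, encoding="unicode")
print(xml_text)
```

Because the registry is global, the mapping stays in effect for every later
serialization in the same process.
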
.. function:: SubElement(parent, tag, attrib={}, **extra)

   Subelement factory. This function creates an element instance, and appends
   it to an existing element.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *parent* is the parent element. *tag* is
   the subelement name. *attrib* is an optional dictionary, containing element
   attributes. *extra* contains additional attributes, given as keyword
   arguments. Returns an element instance.


.. function:: tostring(element, encoding="us-ascii", method="xml", *, \
                       short_empty_elements=True)

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string (otherwise, a bytestring is generated). *method*
   is either ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``).
   *short_empty_elements* has the same meaning as in :meth:`ElementTree.write`.
   Returns an (optionally) encoded string containing the XML data.

   .. versionadded:: 3.4
      The *short_empty_elements* parameter.


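A brief sketch of how the *encoding* and *short_empty_elements* arguments
change the result of :func:`tostring`. The two-element tree is made up for
the illustration:

```python
import xml.etree.ElementTree as ET

elem = ET.Element("a")
ET.SubElement(elem, "b")

# Default encoding is US-ASCII, so the result is a bytestring.
as_bytes = ET.tostring(elem)

# encoding="unicode" produces a str instead.
as_text = ET.tostring(elem, encoding="unicode")

# Empty elements can be written with an explicit end tag.
long_form = ET.tostring(elem, encoding="unicode", short_empty_elements=False)

# The fragments from tostringlist() join to the same bytes as tostring().
joined = b"".join(ET.tostringlist(elem))

print(as_bytes, as_text, long_form)
```
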
.. function:: tostringlist(element, encoding="us-ascii", method="xml", *, \
                           short_empty_elements=True)

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to
   generate a Unicode string (otherwise, a bytestring is generated). *method*
   is either ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``).
   *short_empty_elements* has the same meaning as in :meth:`ElementTree.write`.
   Returns a list of (optionally) encoded strings containing the XML data.
   It does not guarantee any specific sequence, except that
   ``b"".join(tostringlist(element)) == tostring(element)``.

   .. versionadded:: 3.2

   .. versionadded:: 3.4
      The *short_empty_elements* parameter.


614.. function:: XML(text, parser=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000615
616 Parses an XML section from a string constant. This function can be used to
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000617 embed "XML literals" in Python code. *text* is a string containing XML
618 data. *parser* is an optional parser instance. If not given, the standard
619 :class:`XMLParser` parser is used. Returns an :class:`Element` instance.
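As a minimal sketch of an "XML literal", the following parses a short string with :func:`XML` and reads a few values back. The document content is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Parse an XML literal embedded directly in the source code.
doc = ET.XML('<config version="2"><item>alpha</item><item>beta</item></config>')

# The result is the root Element, not an ElementTree.
names = [item.text for item in doc.findall("item")]
version = doc.get("version")
```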
Georg Brandl116aa622007-08-15 14:28:22 +0000620
621
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000622.. function:: XMLID(text, parser=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000623
   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element ids to elements. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns a tuple containing an
   :class:`Element` instance and a dictionary.
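A short sketch of :func:`XMLID`, assuming a document whose elements carry plain ``id`` attributes (the ids and tag names are illustrative):

```python
import xml.etree.ElementTree as ET

source = '<doc><section id="intro">Hi</section><section id="end">Bye</section></doc>'

# XMLID returns the root element and a dict keyed by each element's "id" attribute.
root, ids = ET.XMLID(source)

intro = ids["intro"]
```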
Georg Brandl116aa622007-08-15 14:28:22 +0000629
630
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000631.. _elementtree-element-objects:
Georg Brandl116aa622007-08-15 14:28:22 +0000632
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000633Element Objects
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200634^^^^^^^^^^^^^^^
Georg Brandl116aa622007-08-15 14:28:22 +0000635
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000636.. class:: Element(tag, attrib={}, **extra)
Georg Brandl116aa622007-08-15 14:28:22 +0000637
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000638 Element class. This class defines the Element interface, and provides a
639 reference implementation of this interface.
Georg Brandl116aa622007-08-15 14:28:22 +0000640
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000641 The element name, attribute names, and attribute values can be either
642 bytestrings or Unicode strings. *tag* is the element name. *attrib* is
643 an optional dictionary, containing element attributes. *extra* contains
644 additional attributes, given as keyword arguments.
Georg Brandl116aa622007-08-15 14:28:22 +0000645
646
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000647 .. attribute:: tag
Georg Brandl116aa622007-08-15 14:28:22 +0000648
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000649 A string identifying what kind of data this element represents (the
650 element type, in other words).
Georg Brandl116aa622007-08-15 14:28:22 +0000651
652
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000653 .. attribute:: text
Georg Brandl116aa622007-08-15 14:28:22 +0000654
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000655 The *text* attribute can be used to hold additional data associated with
656 the element. As the name implies this attribute is usually a string but
657 may be any application-specific object. If the element is created from
658 an XML file the attribute will contain any text found between the element
659 tags.
Georg Brandl116aa622007-08-15 14:28:22 +0000660
661
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000662 .. attribute:: tail
Georg Brandl116aa622007-08-15 14:28:22 +0000663
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000664 The *tail* attribute can be used to hold additional data associated with
665 the element. This attribute is usually a string but may be any
666 application-specific object. If the element is created from an XML file
667 the attribute will contain any text found after the element's end tag and
668 before the next tag.
Georg Brandl116aa622007-08-15 14:28:22 +0000669
Georg Brandl116aa622007-08-15 14:28:22 +0000670
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000671 .. attribute:: attrib
Georg Brandl116aa622007-08-15 14:28:22 +0000672
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000673 A dictionary containing the element's attributes. Note that while the
674 *attrib* value is always a real mutable Python dictionary, an ElementTree
675 implementation may choose to use another internal representation, and
676 create the dictionary only if someone asks for it. To take advantage of
677 such implementations, use the dictionary methods below whenever possible.
Georg Brandl116aa622007-08-15 14:28:22 +0000678
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000679 The following dictionary-like methods work on the element attributes.
Georg Brandl116aa622007-08-15 14:28:22 +0000680
681
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000682 .. method:: clear()
Georg Brandl116aa622007-08-15 14:28:22 +0000683
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000684 Resets an element. This function removes all subelements, clears all
Eli Bendersky323a43a2012-10-09 06:46:33 -0700685 attributes, and sets the text and tail attributes to ``None``.
Georg Brandl116aa622007-08-15 14:28:22 +0000686
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000687
688 .. method:: get(key, default=None)
689
690 Gets the element attribute named *key*.
691
692 Returns the attribute value, or *default* if the attribute was not found.
693
694
695 .. method:: items()
696
697 Returns the element attributes as a sequence of (name, value) pairs. The
698 attributes are returned in an arbitrary order.
699
700
701 .. method:: keys()
702
      Returns the element's attribute names as a list. The names are returned
      in an arbitrary order.
705
706
707 .. method:: set(key, value)
708
709 Set the attribute *key* on the element to *value*.
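The dictionary-like attribute methods can be exercised as follows. This is a sketch with made-up attribute names; results are sorted before comparison so that it does not depend on attribute ordering.

```python
import xml.etree.ElementTree as ET

link = ET.Element("a", {"href": "http://example.org/"})

# set() adds or overwrites an attribute.
link.set("target", "blank")

href = link.get("href")                  # existing attribute
missing = link.get("rel", "n/a")         # falls back to the default
names = sorted(link.keys())
pairs = sorted(link.items())

# clear() drops attributes, children, text and tail in one call.
link.clear()
```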
710
711 The following methods work on the element's children (subelements).
712
713
714 .. method:: append(subelement)
715
Eli Bendersky396e8fc2012-03-23 14:24:20 +0200716 Adds the element *subelement* to the end of this element's internal list
717 of subelements. Raises :exc:`TypeError` if *subelement* is not an
718 :class:`Element`.
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000719
720
721 .. method:: extend(subelements)
Georg Brandl116aa622007-08-15 14:28:22 +0000722
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000723 Appends *subelements* from a sequence object with zero or more elements.
Eli Bendersky396e8fc2012-03-23 14:24:20 +0200724 Raises :exc:`TypeError` if a subelement is not an :class:`Element`.
Georg Brandl116aa622007-08-15 14:28:22 +0000725
Ezio Melottif8754a62010-03-21 07:16:43 +0000726 .. versionadded:: 3.2
Georg Brandl116aa622007-08-15 14:28:22 +0000727
Georg Brandl116aa622007-08-15 14:28:22 +0000728
Eli Bendersky737b1732012-05-29 06:02:56 +0300729 .. method:: find(match, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000730
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000731 Finds the first subelement matching *match*. *match* may be a tag name
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200732 or a :ref:`path <elementtree-xpath>`. Returns an element instance
Eli Bendersky737b1732012-05-29 06:02:56 +0300733 or ``None``. *namespaces* is an optional mapping from namespace prefix
734 to full name.
Georg Brandl116aa622007-08-15 14:28:22 +0000735
Georg Brandl116aa622007-08-15 14:28:22 +0000736
Eli Bendersky737b1732012-05-29 06:02:56 +0300737 .. method:: findall(match, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000738
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200739 Finds all matching subelements, by tag name or
740 :ref:`path <elementtree-xpath>`. Returns a list containing all matching
Eli Bendersky737b1732012-05-29 06:02:56 +0300741 elements in document order. *namespaces* is an optional mapping from
742 namespace prefix to full name.
Georg Brandl116aa622007-08-15 14:28:22 +0000743
Georg Brandl116aa622007-08-15 14:28:22 +0000744
Eli Bendersky737b1732012-05-29 06:02:56 +0300745 .. method:: findtext(match, default=None, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000746
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000747 Finds text for the first subelement matching *match*. *match* may be
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200748 a tag name or a :ref:`path <elementtree-xpath>`. Returns the text content
749 of the first matching element, or *default* if no element was found.
750 Note that if the matching element has no text content an empty string
Eli Bendersky737b1732012-05-29 06:02:56 +0300751 is returned. *namespaces* is an optional mapping from namespace prefix
752 to full name.
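The three lookup methods differ mainly in what they return. The sketch below uses a hypothetical namespace URI and prefix to show the *namespaces* mapping as well.

```python
import xml.etree.ElementTree as ET

root = ET.XML(
    '<root xmlns:p="http://example.org/ns">'
    "<p:item>one</p:item><p:item>two</p:item><empty/>"
    "</root>"
)
ns = {"p": "http://example.org/ns"}   # prefix -> full namespace name

first = root.find("p:item", ns)               # first match, or None
all_items = root.findall("p:item", ns)        # list, in document order
empty_text = root.findtext("empty")           # element exists but has no text
fallback = root.findtext("missing", default="n/a")
not_found = root.find("missing")
```

The prefix in the mapping only has to be consistent with the search path; it does not need to match the prefix used in the document itself.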
Georg Brandl116aa622007-08-15 14:28:22 +0000753
Georg Brandl116aa622007-08-15 14:28:22 +0000754
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000755 .. method:: getchildren()
Georg Brandl116aa622007-08-15 14:28:22 +0000756
Georg Brandl67b21b72010-08-17 15:07:14 +0000757 .. deprecated:: 3.2
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000758 Use ``list(elem)`` or iteration.
Georg Brandl116aa622007-08-15 14:28:22 +0000759
Georg Brandl116aa622007-08-15 14:28:22 +0000760
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000761 .. method:: getiterator(tag=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000762
Georg Brandl67b21b72010-08-17 15:07:14 +0000763 .. deprecated:: 3.2
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000764 Use method :meth:`Element.iter` instead.
Georg Brandl116aa622007-08-15 14:28:22 +0000765
Georg Brandl116aa622007-08-15 14:28:22 +0000766
Eli Bendersky396e8fc2012-03-23 14:24:20 +0200767 .. method:: insert(index, subelement)
Georg Brandl116aa622007-08-15 14:28:22 +0000768
Eli Bendersky396e8fc2012-03-23 14:24:20 +0200769 Inserts *subelement* at the given position in this element. Raises
770 :exc:`TypeError` if *subelement* is not an :class:`Element`.
Georg Brandl116aa622007-08-15 14:28:22 +0000771
Georg Brandl116aa622007-08-15 14:28:22 +0000772
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000773 .. method:: iter(tag=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000774
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000775 Creates a tree :term:`iterator` with the current element as the root.
776 The iterator iterates over this element and all elements below it, in
777 document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
778 elements whose tag equals *tag* are returned from the iterator. If the
779 tree structure is modified during iteration, the result is undefined.
Georg Brandl116aa622007-08-15 14:28:22 +0000780
Ezio Melotti138fc892011-10-10 00:02:03 +0300781 .. versionadded:: 3.2
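A small sketch of the traversal order produced by :meth:`iter`, using an illustrative nested document:

```python
import xml.etree.ElementTree as ET

root = ET.XML("<a><b><c/></b><b/></a>")

# Without a tag, iter() yields the element itself and every descendant.
all_tags = [elem.tag for elem in root.iter()]

# With a tag, only matching elements are yielded.
b_count = sum(1 for _ in root.iter("b"))
```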
782
Georg Brandl116aa622007-08-15 14:28:22 +0000783
Eli Bendersky737b1732012-05-29 06:02:56 +0300784 .. method:: iterfind(match, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000785
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200786 Finds all matching subelements, by tag name or
787 :ref:`path <elementtree-xpath>`. Returns an iterable yielding all
Eli Bendersky737b1732012-05-29 06:02:56 +0300788 matching elements in document order. *namespaces* is an optional mapping
789 from namespace prefix to full name.
790
Georg Brandl116aa622007-08-15 14:28:22 +0000791
Ezio Melottif8754a62010-03-21 07:16:43 +0000792 .. versionadded:: 3.2
Georg Brandl116aa622007-08-15 14:28:22 +0000793
Georg Brandl116aa622007-08-15 14:28:22 +0000794
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000795 .. method:: itertext()
Georg Brandl116aa622007-08-15 14:28:22 +0000796
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000797 Creates a text iterator. The iterator loops over this element and all
798 subelements, in document order, and returns all inner text.
Georg Brandl116aa622007-08-15 14:28:22 +0000799
Ezio Melottif8754a62010-03-21 07:16:43 +0000800 .. versionadded:: 3.2
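The following sketch shows how :meth:`itertext` interleaves the ``text`` and ``tail`` attributes described earlier. The markup is a shortened version of the sample page used in this document.

```python
import xml.etree.ElementTree as ET

p = ET.XML("<p>Moved to <a>example.org</a> or <a>example.com</a>.</p>")

first_link = p[0]
# text is what follows the start tag; tail is what follows the end tag.
link_text, link_tail = first_link.text, first_link.tail

# itertext() walks text and tail values in document order.
flat = "".join(p.itertext())
```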
Georg Brandl116aa622007-08-15 14:28:22 +0000801
802
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000803 .. method:: makeelement(tag, attrib)
Georg Brandl116aa622007-08-15 14:28:22 +0000804
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000805 Creates a new element object of the same type as this element. Do not
806 call this method, use the :func:`SubElement` factory function instead.
Georg Brandl116aa622007-08-15 14:28:22 +0000807
808
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000809 .. method:: remove(subelement)
Georg Brandl116aa622007-08-15 14:28:22 +0000810
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000811 Removes *subelement* from the element. Unlike the find\* methods this
812 method compares elements based on the instance identity, not on tag value
813 or contents.
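Because :meth:`remove` works on instance identity, two children with the same tag are still distinct. A minimal sketch:

```python
import xml.etree.ElementTree as ET

root = ET.XML("<r><x/><x/></r>")
first, second = root[0], root[1]

# Only the exact object passed in is removed, even though both are <x/>.
root.remove(second)

remaining = list(root)
```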
Georg Brandl116aa622007-08-15 14:28:22 +0000814
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000815 :class:`Element` objects also support the following sequence type methods
Serhiy Storchaka15e65902013-08-29 10:28:44 +0300816 for working with subelements: :meth:`~object.__delitem__`,
817 :meth:`~object.__getitem__`, :meth:`~object.__setitem__`,
818 :meth:`~object.__len__`.
Georg Brandl116aa622007-08-15 14:28:22 +0000819
   Caution: Elements with no subelements will test as ``False``. This behavior
   will change in future versions. Use an explicit ``len(elem)`` or ``elem is
   None`` test instead. ::

     element = root.find('foo')

     if not element:  # careful!
         print("element not found, or element has no subelements")

     if element is None:
         print("element not found")
Georg Brandl116aa622007-08-15 14:28:22 +0000831
832
833.. _elementtree-elementtree-objects:
834
835ElementTree Objects
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200836^^^^^^^^^^^^^^^^^^^
Georg Brandl116aa622007-08-15 14:28:22 +0000837
838
Georg Brandl7f01a132009-09-16 15:58:14 +0000839.. class:: ElementTree(element=None, file=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000840
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000841 ElementTree wrapper class. This class represents an entire element
842 hierarchy, and adds some extra support for serialization to and from
843 standard XML.
Georg Brandl116aa622007-08-15 14:28:22 +0000844
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000845 *element* is the root element. The tree is initialized with the contents
846 of the XML *file* if given.
Georg Brandl116aa622007-08-15 14:28:22 +0000847
848
Benjamin Petersone41251e2008-04-25 01:59:09 +0000849 .. method:: _setroot(element)
Georg Brandl116aa622007-08-15 14:28:22 +0000850
Benjamin Petersone41251e2008-04-25 01:59:09 +0000851 Replaces the root element for this tree. This discards the current
852 contents of the tree, and replaces it with the given element. Use with
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000853 care. *element* is an element instance.
Georg Brandl116aa622007-08-15 14:28:22 +0000854
855
Eli Bendersky737b1732012-05-29 06:02:56 +0300856 .. method:: find(match, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000857
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200858 Same as :meth:`Element.find`, starting at the root of the tree.
Georg Brandl116aa622007-08-15 14:28:22 +0000859
860
Eli Bendersky737b1732012-05-29 06:02:56 +0300861 .. method:: findall(match, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000862
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200863 Same as :meth:`Element.findall`, starting at the root of the tree.
Georg Brandl116aa622007-08-15 14:28:22 +0000864
865
Eli Bendersky737b1732012-05-29 06:02:56 +0300866 .. method:: findtext(match, default=None, namespaces=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000867
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200868 Same as :meth:`Element.findtext`, starting at the root of the tree.
Georg Brandl116aa622007-08-15 14:28:22 +0000869
870
Georg Brandl7f01a132009-09-16 15:58:14 +0000871 .. method:: getiterator(tag=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000872
Georg Brandl67b21b72010-08-17 15:07:14 +0000873 .. deprecated:: 3.2
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000874 Use method :meth:`ElementTree.iter` instead.
Georg Brandl116aa622007-08-15 14:28:22 +0000875
876
Benjamin Petersone41251e2008-04-25 01:59:09 +0000877 .. method:: getroot()
Florent Xiclunac17f1722010-08-08 19:48:29 +0000878
Benjamin Petersone41251e2008-04-25 01:59:09 +0000879 Returns the root element for this tree.
Georg Brandl116aa622007-08-15 14:28:22 +0000880
881
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000882 .. method:: iter(tag=None)
883
      Creates and returns a tree iterator for the root element. The iterator
      loops over all elements in this tree, in document order. *tag* is the tag
      to look for (default is to return all elements).
887
888
Eli Bendersky737b1732012-05-29 06:02:56 +0300889 .. method:: iterfind(match, namespaces=None)
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000890
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200891 Same as :meth:`Element.iterfind`, starting at the root of the tree.
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000892
Ezio Melottif8754a62010-03-21 07:16:43 +0000893 .. versionadded:: 3.2
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000894
895
Georg Brandl7f01a132009-09-16 15:58:14 +0000896 .. method:: parse(source, parser=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000897
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000898 Loads an external XML section into this element tree. *source* is a file
Antoine Pitrou11cb9612010-09-15 11:11:28 +0000899 name or :term:`file object`. *parser* is an optional parser instance.
Eli Bendersky52467b12012-06-01 07:13:08 +0300900 If not given, the standard :class:`XMLParser` parser is used. Returns the
901 section root element.
Georg Brandl116aa622007-08-15 14:28:22 +0000902
903
Eli Benderskyf96cf912012-07-15 06:19:44 +0300904 .. method:: write(file, encoding="us-ascii", xml_declaration=None, \
Serhiy Storchaka9e189f02013-01-13 22:24:27 +0200905 default_namespace=None, method="xml", *, \
Eli Benderskye9af8272013-01-13 06:27:51 -0800906 short_empty_elements=True)
Georg Brandl116aa622007-08-15 14:28:22 +0000907
      Writes the element tree to a file, as XML. *file* is a file name, or a
      :term:`file object` opened for writing. *encoding* [1]_ is the output
      encoding (default is US-ASCII).
      *xml_declaration* controls whether an XML declaration is added to the
      file. Use ``False`` for never, ``True`` for always, or ``None`` to add
      one only when the encoding is not US-ASCII, UTF-8 or Unicode (default
      is ``None``).
      *default_namespace* sets the default XML namespace (for "xmlns").
      *method* is either ``"xml"``, ``"html"`` or ``"text"`` (default is
      ``"xml"``).
      The keyword-only *short_empty_elements* parameter controls the formatting
      of elements that contain no content. If ``True`` (the default), they are
      emitted as a single self-closed tag, otherwise they are emitted as a pair
      of start/end tags.
Eli Benderskyf96cf912012-07-15 06:19:44 +0300921
922 The output is either a string (:class:`str`) or binary (:class:`bytes`).
923 This is controlled by the *encoding* argument. If *encoding* is
924 ``"unicode"``, the output is a string; otherwise, it's binary. Note that
925 this may conflict with the type of *file* if it's an open
926 :term:`file object`; make sure you do not try to write a string to a
927 binary stream and vice versa.
928
R David Murray575fb312013-12-25 23:21:03 -0500929 .. versionadded:: 3.4
930 The *short_empty_elements* parameter.
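The interaction between *encoding* and the type of *file* can be sketched with in-memory streams. ``io.StringIO`` and ``io.BytesIO`` stand in for real files here.

```python
import io
import xml.etree.ElementTree as ET

tree = ET.ElementTree(ET.XML("<root><empty/></root>"))

# encoding="unicode" produces str output, so the stream must be a text stream.
text_buf = io.StringIO()
tree.write(text_buf, encoding="unicode")

# Any other encoding produces bytes, so the stream must be binary.
bytes_buf = io.BytesIO()
tree.write(bytes_buf, encoding="utf-8", xml_declaration=True,
           short_empty_elements=False)

as_text = text_buf.getvalue()
as_bytes = bytes_buf.getvalue()
```

Swapping the two buffers would typically fail with a :exc:`TypeError`, which is the mismatch the paragraph above warns about.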
Eli Benderskya9a2ef52013-01-13 06:04:43 -0800931
Georg Brandl116aa622007-08-15 14:28:22 +0000932
This is the XML file that is going to be manipulated::

    <html>
        <head>
            <title>Example page</title>
        </head>
        <body>
            <p>Moved to <a href="http://example.org/">example.org</a>
            or <a href="http://example.com/">example.com</a>.</p>
        </body>
    </html>
944
Example of changing the attribute "target" of every link in the first paragraph::

    >>> from xml.etree.ElementTree import ElementTree
    >>> tree = ElementTree()
    >>> tree.parse("index.xhtml")
    <Element 'html' at 0xb77e6fac>
    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
    >>> p
    <Element 'p' at 0xb77ec26c>
    >>> links = list(p.iter("a"))   # Returns list of all links
    >>> links
    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
    >>> for i in links:             # Iterates through all found links
    ...     i.attrib["target"] = "blank"
    ...
    >>> tree.write("output.xhtml")
Georg Brandl116aa622007-08-15 14:28:22 +0000960
961.. _elementtree-qname-objects:
962
963QName Objects
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200964^^^^^^^^^^^^^
Georg Brandl116aa622007-08-15 14:28:22 +0000965
966
Georg Brandl7f01a132009-09-16 15:58:14 +0000967.. class:: QName(text_or_uri, tag=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000968
   QName wrapper. This can be used to wrap a QName attribute value, in order
   to get proper namespace handling on output. *text_or_uri* is a string
   containing the QName value, in the form ``{uri}local``, or, if the *tag*
   argument is given, the URI part of a QName. If *tag* is given, the first
   argument is interpreted as a URI, and this argument is interpreted as a
   local name. :class:`QName` instances are opaque.
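The two calling conventions described above are sketched below with a hypothetical namespace URI. Using the instance as an element tag is one way to observe the namespace handling on output; the generated prefix is chosen by the serializer.

```python
import xml.etree.ElementTree as ET

# One-argument form: the value is already in {uri}local notation.
q1 = ET.QName("{http://example.org/ns}item")

# Two-argument form: URI and local name are given separately.
q2 = ET.QName("http://example.org/ns", "item")

# Serializing an element tagged with a QName emits a namespace declaration.
serialized = ET.tostring(ET.Element(q2), encoding="unicode")
```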
Georg Brandl116aa622007-08-15 14:28:22 +0000975
976
Antoine Pitrou5b235d02013-04-18 19:37:06 +0200977
Georg Brandl116aa622007-08-15 14:28:22 +0000978.. _elementtree-treebuilder-objects:
979
980TreeBuilder Objects
Eli Bendersky3a4875e2012-03-26 20:43:32 +0200981^^^^^^^^^^^^^^^^^^^
Georg Brandl116aa622007-08-15 14:28:22 +0000982
983
Georg Brandl7f01a132009-09-16 15:58:14 +0000984.. class:: TreeBuilder(element_factory=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000985
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000986 Generic element structure builder. This builder converts a sequence of
987 start, data, and end method calls to a well-formed element structure. You
988 can use this class to build an element structure using a custom XML parser,
Eli Bendersky48d358b2012-05-30 17:57:50 +0300989 or a parser for some other XML-like format. *element_factory*, when given,
990 must be a callable accepting two positional arguments: a tag and
991 a dict of attributes. It is expected to return a new element instance.
Georg Brandl116aa622007-08-15 14:28:22 +0000992
Benjamin Petersone41251e2008-04-25 01:59:09 +0000993 .. method:: close()
Georg Brandl116aa622007-08-15 14:28:22 +0000994
Florent Xiclunaf15351d2010-03-13 23:24:31 +0000995 Flushes the builder buffers, and returns the toplevel document
996 element. Returns an :class:`Element` instance.
Georg Brandl116aa622007-08-15 14:28:22 +0000997
998
Benjamin Petersone41251e2008-04-25 01:59:09 +0000999 .. method:: data(data)
Georg Brandl116aa622007-08-15 14:28:22 +00001000
Florent Xiclunaf15351d2010-03-13 23:24:31 +00001001 Adds text to the current element. *data* is a string. This should be
1002 either a bytestring, or a Unicode string.
Georg Brandl116aa622007-08-15 14:28:22 +00001003
1004
Benjamin Petersone41251e2008-04-25 01:59:09 +00001005 .. method:: end(tag)
Georg Brandl116aa622007-08-15 14:28:22 +00001006
Florent Xiclunaf15351d2010-03-13 23:24:31 +00001007 Closes the current element. *tag* is the element name. Returns the
1008 closed element.
Georg Brandl116aa622007-08-15 14:28:22 +00001009
1010
Benjamin Petersone41251e2008-04-25 01:59:09 +00001011 .. method:: start(tag, attrs)
Georg Brandl116aa622007-08-15 14:28:22 +00001012
Florent Xiclunaf15351d2010-03-13 23:24:31 +00001013 Opens a new element. *tag* is the element name. *attrs* is a dictionary
1014 containing element attributes. Returns the opened element.
Georg Brandl116aa622007-08-15 14:28:22 +00001015
1016
Florent Xiclunaf15351d2010-03-13 23:24:31 +00001017 In addition, a custom :class:`TreeBuilder` object can provide the
1018 following method:
Georg Brandl116aa622007-08-15 14:28:22 +00001019
Florent Xiclunaf15351d2010-03-13 23:24:31 +00001020 .. method:: doctype(name, pubid, system)
1021
1022 Handles a doctype declaration. *name* is the doctype name. *pubid* is
1023 the public identifier. *system* is the system identifier. This method
1024 does not exist on the default :class:`TreeBuilder` class.
1025
Ezio Melottif8754a62010-03-21 07:16:43 +00001026 .. versionadded:: 3.2
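A :class:`TreeBuilder` can also be driven by hand, which is roughly what a parser does on its behalf. The following sketch issues the start, data and end calls directly; tag and attribute names are illustrative.

```python
import xml.etree.ElementTree as ET

builder = ET.TreeBuilder()

# Each start() must be balanced by an end() with the same tag.
builder.start("root", {"version": "1"})
builder.start("child", {})
builder.data("hello")
builder.end("child")
builder.end("root")

# close() returns the top-level element that was built.
root = builder.close()
serialized = ET.tostring(root, encoding="unicode")
```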
Georg Brandl116aa622007-08-15 14:28:22 +00001027
1028
Florent Xiclunaf15351d2010-03-13 23:24:31 +00001029.. _elementtree-xmlparser-objects:
Georg Brandl116aa622007-08-15 14:28:22 +00001030
Florent Xiclunaf15351d2010-03-13 23:24:31 +00001031XMLParser Objects
Eli Bendersky3a4875e2012-03-26 20:43:32 +02001032^^^^^^^^^^^^^^^^^
Florent Xiclunaf15351d2010-03-13 23:24:31 +00001033
1034
1035.. class:: XMLParser(html=0, target=None, encoding=None)
1036
   This class is the low-level building block of the module. It uses
   :mod:`xml.parsers.expat` for efficient, event-based parsing of XML. It can
   be fed XML data incrementally with the :meth:`feed` method, and parsing
   events are translated to a push API by invoking callbacks on the *target*
   object. If *target* is omitted, the standard :class:`TreeBuilder` is used.
   The *html* argument was historically used for backwards compatibility and is
   now deprecated. If *encoding* [1]_ is given, the value overrides the
   encoding specified in the XML file.
Georg Brandl116aa622007-08-15 14:28:22 +00001045
   .. deprecated:: 3.4
      The *html* argument. The remaining arguments should be passed via
      keyword to prepare for the removal of the *html* argument.
Georg Brandl116aa622007-08-15 14:28:22 +00001049
Benjamin Petersone41251e2008-04-25 01:59:09 +00001050 .. method:: close()
Georg Brandl116aa622007-08-15 14:28:22 +00001051
Eli Benderskybfd78372013-08-24 15:11:44 -07001052 Finishes feeding data to the parser. Returns the result of calling the
Eli Benderskybf8ab772013-08-25 15:27:36 -07001053 ``close()`` method of the *target* passed during construction; by default,
1054 this is the toplevel document element.
Georg Brandl116aa622007-08-15 14:28:22 +00001055
1056
Benjamin Petersone41251e2008-04-25 01:59:09 +00001057 .. method:: doctype(name, pubid, system)
Georg Brandl116aa622007-08-15 14:28:22 +00001058
Georg Brandl67b21b72010-08-17 15:07:14 +00001059 .. deprecated:: 3.2
Florent Xiclunaf15351d2010-03-13 23:24:31 +00001060 Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
1061 target.
Georg Brandl116aa622007-08-15 14:28:22 +00001062
1063
Benjamin Petersone41251e2008-04-25 01:59:09 +00001064 .. method:: feed(data)
Georg Brandl116aa622007-08-15 14:28:22 +00001065
Florent Xiclunaf15351d2010-03-13 23:24:31 +00001066 Feeds data to the parser. *data* is encoded data.
Georg Brandl116aa622007-08-15 14:28:22 +00001067
Eli Benderskyb5869342013-08-30 05:51:20 -07001068 :meth:`XMLParser.feed` calls *target*\'s ``start(tag, attrs_dict)`` method
1069 for each opening tag, its ``end(tag)`` method for each closing tag, and data
1070 is processed by method ``data(data)``. :meth:`XMLParser.close` calls
1071 *target*\'s method ``close()``. :class:`XMLParser` can be used not only for
1072 building a tree structure. This is an example of counting the maximum depth
1073 of an XML file::
Christian Heimesd8654cf2007-12-02 15:22:16 +00001074
   >>> from xml.etree.ElementTree import XMLParser
   >>> class MaxDepth:                     # The target object of the parser
   ...     maxDepth = 0
   ...     depth = 0
   ...     def start(self, tag, attrib):   # Called for each opening tag.
   ...         self.depth += 1
   ...         if self.depth > self.maxDepth:
   ...             self.maxDepth = self.depth
   ...     def end(self, tag):             # Called for each closing tag.
   ...         self.depth -= 1
   ...     def data(self, data):
   ...         pass            # We do not need to do anything with data.
   ...     def close(self):    # Called when all data has been parsed.
   ...         return self.maxDepth
   ...
   >>> target = MaxDepth()
   >>> parser = XMLParser(target=target)
   >>> exampleXml = """
   ... <a>
   ...   <b>
   ...   </b>
   ...   <b>
   ...     <c>
   ...       <d>
   ...       </d>
   ...     </c>
   ...   </b>
   ... </a>"""
   >>> parser.feed(exampleXml)
   >>> parser.close()
   4
Christian Heimesb186d002008-03-18 15:15:01 +00001106
Eli Benderskyb5869342013-08-30 05:51:20 -07001107
1108.. _elementtree-xmlpullparser-objects:
1109
1110XMLPullParser Objects
1111^^^^^^^^^^^^^^^^^^^^^
1112
1113.. class:: XMLPullParser(events=None)
1114
Eli Bendersky2c68e302013-08-31 07:37:23 -07001115 A pull parser suitable for non-blocking applications. Its input-side API is
1116 similar to that of :class:`XMLParser`, but instead of pushing calls to a
1117 callback target, :class:`XMLPullParser` collects an internal list of parsing
1118 events and lets the user read from it. *events* is a sequence of events to
1119 report back. The supported events are the strings ``"start"``, ``"end"``,
1120 ``"start-ns"`` and ``"end-ns"`` (the "ns" events are used to get detailed
1121 namespace information). If *events* is omitted, only ``"end"`` events are
1122 reported.
Eli Benderskyb5869342013-08-30 05:51:20 -07001123
1124 .. method:: feed(data)
1125
1126 Feed the given bytes data to the parser.
1127
1128 .. method:: close()
1129
Nick Coghlan4cc2afa2013-09-28 23:50:35 +10001130 Signal the parser that the data stream is terminated. Unlike
1131 :meth:`XMLParser.close`, this method always returns :const:`None`.
1132 Any events not yet retrieved when the parser is closed can still be
1133 read with :meth:`read_events`.
Eli Benderskyb5869342013-08-30 05:51:20 -07001134
1135 .. method:: read_events()
1136
      Return an iterator over the events which have been encountered in the
      data fed to the parser. The iterator yields ``(event, elem)`` pairs,
      where *event* is a string representing the type of event (e.g.
      ``"end"``) and *elem* is the encountered :class:`Element` object.
1142
1143 Events provided in a previous call to :meth:`read_events` will not be
R David Murray410d3202014-01-04 23:52:50 -05001144 yielded again. Events are consumed from the internal queue only when
1145 they are retrieved from the iterator, so multiple readers iterating in
1146 parallel over iterators obtained from :meth:`read_events` will have
1147 unpredictable results.
Eli Benderskyb5869342013-08-30 05:51:20 -07001148
1149 .. note::
1150
1151 :class:`XMLPullParser` only guarantees that it has seen the ">"
1152 character of a starting tag when it emits a "start" event, so the
1153 attributes are defined, but the contents of the text and tail attributes
1154 are undefined at that point. The same applies to the element children;
1155 they may or may not be present.
1156
1157 If you need a fully populated element, look for "end" events instead.
1158
1159 .. versionadded:: 3.4
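A minimal sketch of the pull-style workflow: data is fed in pieces and events are drained whenever convenient. The chunk boundaries and tag names are arbitrary. How soon events become available after a given :meth:`feed` call may depend on the underlying expat version, so the sketch only relies on the combined sequence after :meth:`close`.

```python
import xml.etree.ElementTree as ET

parser = ET.XMLPullParser(events=("start", "end"))
seen = []

# Feed the document in two arbitrary pieces, draining events after each one.
for chunk in ("<doc><item>one</item>", "<item>two</item></doc>"):
    parser.feed(chunk)
    seen.extend((event, elem.tag) for event, elem in parser.read_events())

# close() returns None; any remaining events can still be read afterwards.
result = parser.close()
seen.extend((event, elem.tag) for event, elem in parser.read_events())
```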
1160
Eli Bendersky5b77d812012-03-16 08:20:05 +02001161Exceptions
Eli Bendersky3a4875e2012-03-26 20:43:32 +02001162^^^^^^^^^^
Eli Bendersky5b77d812012-03-16 08:20:05 +02001163
1164.. class:: ParseError
1165
1166 XML parse error, raised by the various parsing methods in this module when
1167 parsing fails. The string representation of an instance of this exception
1168 will contain a user-friendly error message. In addition, it will have
1169 the following attributes available:
1170
1171 .. attribute:: code
1172
1173 A numeric error code from the expat parser. See the documentation of
1174 :mod:`xml.parsers.expat` for the list of error codes and their meanings.
1175
1176 .. attribute:: position
1177
1178 A tuple of *line*, *column* numbers, specifying where the error occurred.
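The attributes above can be inspected when handling a failed parse. This sketch deliberately parses a malformed string and uses ``xml.parsers.expat.ErrorString`` to turn the numeric code into a message.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

code = position = message = None
try:
    ET.fromstring("<a><b></a>")     # <b> is never closed
except ET.ParseError as exc:
    code = exc.code                 # numeric expat error code
    position = exc.position         # (line, column) tuple
    message = expat.ErrorString(code)
```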
Christian Heimesb186d002008-03-18 15:15:01 +00001179
1180.. rubric:: Footnotes
1181
1182.. [#] The encoding string included in XML output should conform to the
Florent Xiclunaf15351d2010-03-13 23:24:31 +00001183 appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
1184 not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
Georg Brandlb7354a62014-10-29 10:57:37 +01001185 and http://www.iana.org/assignments/character-sets/character-sets.xhtml.