\section{\module{xml.dom} ---
         The Document Object Model API}

\declaremodule{standard}{xml.dom}
\modulesynopsis{Document Object Model API for Python.}
\sectionauthor{Paul Prescod}{paul@prescod.net}
\sectionauthor{Martin v. L\"owis}{loewis@informatik.hu-berlin.de}

\versionadded{2.0}

The Document Object Model, or ``DOM,'' is a cross-language API from
the World Wide Web Consortium (W3C) for accessing and modifying XML
documents. A DOM implementation presents an XML document as a tree
structure, or allows client code to build such a structure from
scratch. It then gives access to the structure through a set of
objects which provide well-known interfaces.

The DOM is extremely useful for random-access applications. SAX only
allows you a view of one bit of the document at a time. If you are
looking at one SAX element, you have no access to another. If you are
looking at a text node, you have no access to a containing element.
When you write a SAX application, you need to keep track of your
program's position in the document somewhere in your own code. SAX
does not do it for you. Also, if you need to look ahead in the XML
document, you are just out of luck.

Some applications are simply impossible in an event-driven model with
no access to a tree. Of course you could build some sort of tree
yourself from SAX events, but the DOM allows you to avoid writing that
code. The DOM is a standard tree representation for XML data.

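The random access described above can be sketched as follows. This is
only an illustration: it assumes \module{xml.dom.minidom} as the DOM
implementation and a current Python interpreter, and the sample
document is made up for the example.

```python
from xml.dom import minidom

# Hypothetical sample document, used only for illustration.
SOURCE = (
    "<library>"
    "<book><title>Dune</title></book>"
    "<book><title>Emma</title></book>"
    "</library>"
)

doc = minidom.parseString(SOURCE)

# Start from a text node deep inside the tree...
titles = doc.getElementsByTagName("title")
text = titles[1].firstChild

# ...and walk back up to the containing elements.
title = text.parentNode
book = title.parentNode

# Sibling links give access to neighbouring parts of the document,
# which a SAX handler would have had to remember by itself.
previous_book = book.previousSibling
first_title = previous_book.firstChild.firstChild.data
```

Starting from the text of the second title, the code reaches both its
containing elements and the preceding \code{book} element without
keeping any position information of its own.
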
%What if your needs are somewhere between SAX and the DOM? Perhaps
%you cannot afford to load the entire tree in memory but you find the
%SAX model somewhat cumbersome and low-level. There is also a module
%called xml.dom.pulldom that allows you to build trees of only the
%parts of a document that you need structured access to. It also has
%features that allow you to find your way around the DOM.
% See http://www.prescod.net/python/pulldom

The Document Object Model is being defined by the W3C in stages, or
``levels'' in their terminology. The Python mapping of the API is
substantially based on the DOM Level 2 recommendation. Some aspects
of the API will only become available in Python 2.1, or may only be
available in particular DOM implementations.

DOM applications typically start by parsing some XML into a DOM. How
this is accomplished is not covered at all by DOM Level 1, and Level 2
provides only limited improvements. There is a
\class{DOMImplementation} object class which provides access to
\class{Document} creation methods, but these methods were only added
in DOM Level 2 and were not implemented in time for Python 2.0. There
is also no well-defined way to access these methods without an
existing \class{Document} object. For Python 2.0, consult the
documentation for each particular DOM implementation to determine the
bootstrap procedure needed to create and initialize \class{Document}
and \class{DocumentType} instances.

Once you have a DOM document object, you can access the parts of your
XML document through its properties and methods. These properties are
defined in the DOM specification; this portion of the reference manual
describes the interpretation of the specification in Python.

The specification provided by the W3C defines the DOM API for Java,
ECMAScript, and OMG IDL. The Python mapping defined here is based in
large part on the IDL version of the specification, but strict
compliance is not required (though implementations are free to support
the strict mapping from IDL). See section \ref{dom-conformance},
``Conformance,'' for a detailed discussion of mapping requirements.


\begin{seealso}
  \seetitle[http://www.w3.org/TR/DOM-Level-2-Core/]{Document Object
            Model (DOM) Level 2 Specification}
           {The W3C recommendation upon which the Python DOM API is
            based.}
  \seetitle[http://www.w3.org/TR/REC-DOM-Level-1/]{Document Object
            Model (DOM) Level 1 Specification}
           {The W3C recommendation for the
            DOM supported by \module{xml.dom.minidom}.}
  \seetitle[http://pyxml.sourceforge.net]{PyXML}{Users who require a
            full-featured implementation of DOM should use the PyXML
            package.}
  \seetitle[http://cgi.omg.org/cgi-bin/doc?orbos/99-08-02.pdf]{CORBA
            Scripting with Python}
           {This specifies the mapping from OMG IDL to Python.}
\end{seealso}

\subsection{Module Contents}

The \module{xml.dom} module contains the following functions:

\begin{funcdesc}{registerDOMImplementation}{name, factory}
Register the \var{factory} function with the name \var{name}. The
factory function should return an object which implements the
\class{DOMImplementation} interface. The factory function can return
the same object every time, or a new one for each call, as appropriate
for the specific implementation (e.g. if that implementation supports
some customization).
\end{funcdesc}

\begin{funcdesc}{getDOMImplementation}{name = None, features = ()}
Return a suitable DOM implementation. The \var{name} is either
well-known, the module name of a DOM implementation, or
\code{None}. If it is not \code{None}, the corresponding module is
imported and a \class{DOMImplementation} object is returned if the
import succeeds. If no name is given, and if the environment variable
\envvar{PYTHON_DOM} is set, this variable is used to find the
implementation.

If \var{name} is not given, the available implementations are
considered to find one with the required feature set. If no
implementation can be found, an \exception{ImportError} is raised.
The \var{features} list must be a sequence of \code{(\var{feature},
\var{version})} pairs which are passed to the \method{hasFeature()}
method of the candidate \class{DOMImplementation} objects.
\end{funcdesc}

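A minimal sketch of both functions follows. It assumes a current
Python interpreter with \module{xml.dom.minidom} available and the
\envvar{PYTHON_DOM} environment variable unset; the registered name
\code{example-dom} and the factory function are invented for this
example.

```python
import xml.dom
from xml.dom import minidom


def example_factory():
    # A factory may return the same object on every call; this one
    # simply hands back minidom's implementation object.
    return minidom.getDOMImplementation()


# "example-dom" is a made-up name used only in this sketch.
xml.dom.registerDOMImplementation("example-dom", example_factory)

# Look the implementation up by its registered name...
by_name = xml.dom.getDOMImplementation("example-dom")

# ...or ask for any implementation that offers the listed
# (feature, version) pairs.
by_features = xml.dom.getDOMImplementation(features=[("core", "2.0")])

supports_core = by_features.hasFeature("core", "2.0")
```

Selecting by feature keeps application code independent of any one
DOM package, at the cost of not knowing in advance which
implementation will be returned.
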
% Should the Node documentation go here?

In addition, \module{xml.dom} contains the \class{Node} class and the
DOM exceptions.

\subsection{Objects in the DOM \label{dom-objects}}

The definitive documentation for the DOM is the DOM specification from
the W3C.

Note that DOM attributes may also be manipulated as nodes instead of
as simple strings. It is fairly rare that you must do this, however,
so this usage is not yet documented.


\begin{tableiii}{l|l|l}{class}{Interface}{Section}{Purpose}
  \lineiii{DOMImplementation}{\ref{dom-implementation-objects}}
          {Interface to the underlying implementation.}
  \lineiii{Node}{\ref{dom-node-objects}}
          {Base interface for most objects in a document.}
  \lineiii{NodeList}{\ref{dom-nodelist-objects}}
          {Interface for a sequence of nodes.}
  \lineiii{DocumentType}{\ref{dom-documenttype-objects}}
          {Information about the declarations needed to process a document.}
  \lineiii{Document}{\ref{dom-document-objects}}
          {Object which represents an entire document.}
  \lineiii{Element}{\ref{dom-element-objects}}
          {Element nodes in the document hierarchy.}
  \lineiii{Attr}{\ref{dom-attr-objects}}
          {Attribute value nodes on element nodes.}
  \lineiii{Comment}{\ref{dom-comment-objects}}
          {Representation of comments in the source document.}
  \lineiii{Text}{\ref{dom-text-objects}}
          {Nodes containing textual content from the document.}
  \lineiii{ProcessingInstruction}{\ref{dom-pi-objects}}
          {Processing instruction representation.}
\end{tableiii}

An additional section describes the exceptions defined for working
with the DOM in Python.


\subsubsection{DOMImplementation Objects
               \label{dom-implementation-objects}}

The \class{DOMImplementation} interface provides a way for
applications to determine the availability of particular features in
the DOM they are using. DOM Level 2 added the ability to create new
\class{Document} and \class{DocumentType} objects using the
\class{DOMImplementation} as well.

\begin{methoddesc}[DOMImplementation]{hasFeature}{feature, version}
Return true if the feature identified by the pair of strings
\var{feature} and \var{version} is implemented.
\end{methoddesc}


\subsubsection{Node Objects \label{dom-node-objects}}

All of the components of an XML document are subclasses of
\class{Node}.

\begin{memberdesc}[Node]{nodeType}
An integer representing the node type. Symbolic constants for the
types are on the \class{Node} object:
\constant{ELEMENT_NODE}, \constant{ATTRIBUTE_NODE},
\constant{TEXT_NODE}, \constant{CDATA_SECTION_NODE},
\constant{ENTITY_NODE}, \constant{PROCESSING_INSTRUCTION_NODE},
\constant{COMMENT_NODE}, \constant{DOCUMENT_NODE},
\constant{DOCUMENT_TYPE_NODE}, \constant{NOTATION_NODE}.
This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{parentNode}
The parent of the current node, or \code{None} for the document node.
The value is always a \class{Node} object or \code{None}. For
\class{Element} nodes, this will be the parent element, except for the
root element, in which case it will be the \class{Document} object.
For \class{Attr} nodes, this is always \code{None}.
This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{attributes}
A \class{NamedNodeMap} of attribute objects. Only elements have
actual values for this; others provide \code{None} for this attribute.
This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{previousSibling}
The node that immediately precedes this one with the same parent. For
instance, the element with an end-tag that comes just before the
\var{self} element's start-tag. Of course, XML documents are made
up of more than just elements, so the previous sibling could be text, a
comment, or something else. If this node is the first child of the
parent, this attribute will be \code{None}.
This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{nextSibling}
The node that immediately follows this one with the same parent. See
also \member{previousSibling}. If this is the last child of the
parent, this attribute will be \code{None}.
This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{childNodes}
A list of nodes contained within this node.
This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{firstChild}
The first child of the node, if there is one, or \code{None}.
This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{lastChild}
The last child of the node, if there is one, or \code{None}.
This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{localName}
The part of the \member{tagName} following the colon if there is one,
else the entire \member{tagName}. The value is a string.
\end{memberdesc}

\begin{memberdesc}[Node]{prefix}
The part of the \member{tagName} preceding the colon if there is one,
else the empty string. The value is a string, or \code{None}.
\end{memberdesc}

\begin{memberdesc}[Node]{namespaceURI}
The namespace associated with the element name. This will be a
string or \code{None}. This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{nodeName}
This has a different meaning for each node type; see the DOM
specification for details. You can always get the information you
would get here from another property such as the \member{tagName}
property for elements or the \member{name} property for attributes.
For all node types, the value of this attribute will be either a
string or \code{None}. This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{nodeValue}
This has a different meaning for each node type; see the DOM
specification for details. The situation is similar to that with
\member{nodeName}. The value is a string or \code{None}.
\end{memberdesc}

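The attributes described above can be explored with a short sketch
such as the following. It assumes \module{xml.dom.minidom} and a
current Python interpreter; the one-line sample document is invented
for the example.

```python
from xml.dom import Node, minidom

# Hypothetical sample: an element, some text, a comment, and an
# element carrying one attribute, all children of <a>.
doc = minidom.parseString("<a><b/>text<!-- note --><c x='1'/></a>")
root = doc.documentElement

# nodeType distinguishes the different kinds of children.
kinds = [child.nodeType for child in root.childNodes]

first = root.firstChild       # the <b/> element
last = root.lastChild         # the <c/> element
text = first.nextSibling      # the text node between them

# The root element's parent is the Document, which has no parent.
root_parent = root.parentNode
doc_parent = doc.parentNode

# Only elements provide a real value for the attributes member.
element_attrs = last.attributes
text_attrs = text.attributes
```

Note that \member{nodeName} and \member{nodeValue} depend on the node
type: for the text node in this sketch, minidom reports
\code{'\#text'} as the name and the character data as the value.
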
\begin{methoddesc}[Node]{hasAttributes}{}
Returns true if the node has any attributes.
\end{methoddesc}

\begin{methoddesc}[Node]{hasChildNodes}{}
Returns true if the node has any child nodes.
\end{methoddesc}

\begin{methoddesc}[Node]{isSameNode}{other}
Returns true if \var{other} refers to the same node as this node.
This is especially useful for DOM implementations which use any sort
of proxy architecture (because more than one object can refer to the
same node).

\strong{Note:} This is based on a proposed DOM Level 3 API which is
still in the ``working draft'' stage, but this particular interface
appears uncontroversial. Changes from the W3C will not necessarily
affect this method in the Python DOM interface (though any new W3C
API for this would also be supported).
\end{methoddesc}

\begin{methoddesc}[Node]{appendChild}{newChild}
Add a new child node to this node at the end of the list of children,
returning \var{newChild}.
\end{methoddesc}

\begin{methoddesc}[Node]{insertBefore}{newChild, refChild}
Insert a new child node before an existing child. It must be the case
that \var{refChild} is a child of this node; if not,
\exception{ValueError} is raised. \var{newChild} is returned.
\end{methoddesc}

\begin{methoddesc}[Node]{removeChild}{oldChild}
Remove a child node. \var{oldChild} must be a child of this node; if
not, \exception{ValueError} is raised. \var{oldChild} is returned on
success. If \var{oldChild} will not be used further, its
\method{unlink()} method should be called.
\end{methoddesc}

\begin{methoddesc}[Node]{replaceChild}{newChild, oldChild}
Replace an existing node with a new node. It must be the case that
\var{oldChild} is a child of this node; if not,
\exception{ValueError} is raised.
\end{methoddesc}

\begin{methoddesc}[Node]{normalize}{}
Join adjacent text nodes so that all stretches of text are stored as
single \class{Text} instances. This simplifies processing text from a
DOM tree for many applications.
\versionadded{2.1}
\end{methoddesc}

\begin{methoddesc}[Node]{cloneNode}{deep}
Clone this node. Setting \var{deep} means to clone all child nodes as
well. This returns the clone.
\end{methoddesc}


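The tree-modifying methods can be combined as in the sketch below,
which assumes \module{xml.dom.minidom} and a current Python
interpreter. The element names are invented, and error cases are left
out because the exception raised for a non-child argument can differ
between implementations.

```python
from xml.dom import minidom

doc = minidom.parseString("<list><item>b</item></list>")
root = doc.documentElement
existing = root.firstChild

# insertBefore() places a new node ahead of an existing child.
first = doc.createElement("item")
first.appendChild(doc.createTextNode("a"))
root.insertBefore(first, existing)

# appendChild() adds at the end and returns the node it was given.
last = root.appendChild(doc.createElement("item"))

# replaceChild() swaps an existing child for a new node.
replacement = doc.createElement("entry")
root.replaceChild(replacement, existing)

# removeChild() returns the removed node; unlink() releases it once
# it is no longer needed.
removed = root.removeChild(last)
removed.unlink()

# normalize() merges adjacent text nodes into one.
first.appendChild(doc.createTextNode("bc"))
count_before = len(first.childNodes)
first.normalize()
count_after = len(first.childNodes)

# cloneNode() copies the children as well only when deep is true.
shallow = first.cloneNode(False)
deep = first.cloneNode(True)

names = [child.tagName for child in root.childNodes]
```
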
\subsubsection{NodeList Objects \label{dom-nodelist-objects}}

A \class{NodeList} represents a sequence of nodes. These objects are
used in two ways in the DOM Core recommendation: the
\class{Element} objects provide one as their list of child nodes, and
the \method{getElementsByTagName()} and
\method{getElementsByTagNameNS()} methods of \class{Node} return
objects with this interface to represent query results.

The DOM Level 2 recommendation defines one method and one attribute
for these objects:

\begin{methoddesc}[NodeList]{item}{i}
  Return the \var{i}'th item from the sequence, if there is one, or
  \code{None}. The index \var{i} is not allowed to be less than zero
  or greater than or equal to the length of the sequence.
\end{methoddesc}

\begin{memberdesc}[NodeList]{length}
  The number of nodes in the sequence.
\end{memberdesc}

In addition, the Python DOM interface requires that some additional
support is provided to allow \class{NodeList} objects to be used as
Python sequences. All \class{NodeList} implementations must include
support for \method{__len__()} and \method{__getitem__()}; this allows
iteration over the \class{NodeList} in \keyword{for} statements and
proper support for the \function{len()} built-in function.

If a DOM implementation supports modification of the document, the
\class{NodeList} implementation must also support the
\method{__setitem__()} and \method{__delitem__()} methods.


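The two styles of access, the DOM-defined \method{item()} and
\member{length} on one hand and the Python sequence protocol on the
other, can be compared in a short sketch. It assumes
\module{xml.dom.minidom} and a current Python interpreter; the sample
document is made up.

```python
from xml.dom import minidom

doc = minidom.parseString("<r><a/><b/><c/></r>")
nodes = doc.documentElement.childNodes

# DOM-style access.
dom_length = nodes.length
first_name = nodes.item(0).tagName

# Python sequence access to the same object.
py_length = len(nodes)
last_name = nodes[2].tagName
names = [node.tagName for node in nodes]

# With minidom, item() gives None when there is no such item.
missing = nodes.item(3)

# Query results are NodeList objects as well.
found = doc.getElementsByTagName("b")
```
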
\subsubsection{DocumentType Objects \label{dom-documenttype-objects}}

Information about the notations and entities declared by a document
(including the external subset if the parser uses it and can provide
the information) is available from a \class{DocumentType} object. The
\class{DocumentType} for a document is available from the
\class{Document} object's \member{doctype} attribute.

\class{DocumentType} is a specialization of \class{Node}, and adds the
following attributes:

\begin{memberdesc}[DocumentType]{publicId}
  The public identifier for the external subset of the document type
  definition. This will be a string or \code{None}.
\end{memberdesc}

\begin{memberdesc}[DocumentType]{systemId}
  The system identifier for the external subset of the document type
  definition. This will be a URI as a string, or \code{None}.
\end{memberdesc}

\begin{memberdesc}[DocumentType]{internalSubset}
  A string giving the complete internal subset from the document.
  This does not include the brackets which enclose the subset. If the
  document has no internal subset, this should be \code{None}.
\end{memberdesc}

\begin{memberdesc}[DocumentType]{name}
  The name of the root element as given in the \code{DOCTYPE}
  declaration, if present. If there was no \code{DOCTYPE} declaration,
  this will be \code{None}.
\end{memberdesc}

\begin{memberdesc}[DocumentType]{entities}
  This is a \class{NamedNodeMap} giving the definitions of external
  entities. For entity names defined more than once, only the first
  definition is provided (others are ignored as required by the XML
  recommendation). This may be \code{None} if the information is not
  provided by the parser, or if no entities are defined.
\end{memberdesc}

\begin{memberdesc}[DocumentType]{notations}
  This is a \class{NamedNodeMap} giving the definitions of notations.
  For notation names defined more than once, only the first definition
  is provided (others are ignored as required by the XML
  recommendation). This may be \code{None} if the information is not
  provided by the parser, or if no notations are defined.
\end{memberdesc}


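As a rough illustration, the following sketch reads the identifiers
from a parsed document. It assumes \module{xml.dom.minidom} with its
default parser and a current Python interpreter; the public and system
identifiers are invented, and the external subset they name is never
loaded.

```python
from xml.dom import Node, minidom

# Hypothetical document with a DOCTYPE declaration.
SOURCE = """<!DOCTYPE note PUBLIC "-//Example//DTD Note//EN" "note.dtd">
<note/>"""

doc = minidom.parseString(SOURCE)
doctype = doc.doctype

root_name = doctype.name
public_id = doctype.publicId
system_id = doctype.systemId

# With minidom, a document without a DOCTYPE declaration has no
# DocumentType object at all.
plain = minidom.parseString("<note/>")
no_doctype = plain.doctype
```
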
\subsubsection{Document Objects \label{dom-document-objects}}

A \class{Document} represents an entire XML document, including its
constituent elements, attributes, processing instructions, comments,
etc. Remember that it inherits properties from \class{Node}.

\begin{memberdesc}[Document]{documentElement}
The one and only root element of the document.
\end{memberdesc}

\begin{methoddesc}[Document]{createElement}{tagName}
Create and return a new element node. The element is not inserted
into the document when it is created. You need to explicitly insert
it with one of the other methods such as \method{insertBefore()} or
\method{appendChild()}.
\end{methoddesc}

\begin{methoddesc}[Document]{createElementNS}{namespaceURI, tagName}
Create and return a new element with a namespace. The
\var{tagName} may have a prefix. The element is not inserted into the
document when it is created. You need to explicitly insert it with
one of the other methods such as \method{insertBefore()} or
\method{appendChild()}.
\end{methoddesc}

\begin{methoddesc}[Document]{createTextNode}{data}
Create and return a text node containing the data passed as a
parameter. As with the other creation methods, this one does not
insert the node into the tree.
\end{methoddesc}

\begin{methoddesc}[Document]{createComment}{data}
Create and return a comment node containing the data passed as a
parameter. As with the other creation methods, this one does not
insert the node into the tree.
\end{methoddesc}

\begin{methoddesc}[Document]{createProcessingInstruction}{target, data}
Create and return a processing instruction node containing the
\var{target} and \var{data} passed as parameters. As with the other
creation methods, this one does not insert the node into the tree.
\end{methoddesc}

\begin{methoddesc}[Document]{createAttribute}{name}
Create and return an attribute node. This method does not associate
the attribute node with any particular element. You must use
\method{setAttributeNode()} on the appropriate \class{Element} object
to use the newly created attribute instance.
\end{methoddesc}

\begin{methoddesc}[Document]{createAttributeNS}{namespaceURI, qualifiedName}
Create and return an attribute node with a namespace. The
\var{qualifiedName} may have a prefix. This method does not associate
the attribute node with any particular element. You must use
\method{setAttributeNode()} on the appropriate \class{Element} object
to use the newly created attribute instance.
\end{methoddesc}

\begin{methoddesc}[Document]{getElementsByTagName}{tagName}
Search for all descendants (direct children, children's children,
etc.) with a particular element type name.
\end{methoddesc}

\begin{methoddesc}[Document]{getElementsByTagNameNS}{namespaceURI, localName}
Search for all descendants (direct children, children's children,
etc.) with a particular namespace URI and localname. The localname is
the part of the qualified name after the prefix.
\end{methoddesc}


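The creation methods only produce detached nodes, so a document built
from scratch needs an explicit insertion step for each of them. The
sketch below assumes \module{xml.dom.minidom}, a current Python
interpreter, and the DOM Level 2 \method{createDocument()} method of
its \class{DOMImplementation}; all element names and the namespace URI
are invented.

```python
from xml.dom import minidom

impl = minidom.getDOMImplementation()
doc = impl.createDocument(None, "catalog", None)
root = doc.documentElement

# Each created node must be inserted explicitly.
root.appendChild(doc.createComment(" generated "))
for name in ("first", "second"):
    entry = doc.createElement("entry")
    entry.appendChild(doc.createTextNode(name))
    root.appendChild(entry)

# An attribute node is attached with setAttributeNode().
attr = doc.createAttribute("id")
attr.value = "c1"
root.setAttributeNode(attr)

# A processing instruction placed before the root element.
pi = doc.createProcessingInstruction("xml-stylesheet", 'href="style.css"')
doc.insertBefore(pi, root)

# A namespaced element; the tag name carries the "ex" prefix.
root.appendChild(doc.createElementNS("http://example.org/ns", "ex:extra"))

entries = doc.getElementsByTagName("entry")
extras = doc.getElementsByTagNameNS("http://example.org/ns", "extra")
```

The namespace-aware query at the end uses the local name
\code{extra}, not the prefixed name given when the element was
created.
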
\subsubsection{Element Objects \label{dom-element-objects}}

\class{Element} is a subclass of \class{Node}, so inherits all the
attributes of that class.

\begin{memberdesc}[Element]{tagName}
The element type name. In a namespace-using document it may have
colons in it. The value is a string.
\end{memberdesc}

\begin{methoddesc}[Element]{getElementsByTagName}{tagName}
Same as the equivalent method in the \class{Document} class.
\end{methoddesc}

\begin{methoddesc}[Element]{getElementsByTagNameNS}{namespaceURI, localName}
Same as the equivalent method in the \class{Document} class.
\end{methoddesc}

\begin{methoddesc}[Element]{getAttribute}{attname}
Return an attribute value as a string.
\end{methoddesc}

\begin{methoddesc}[Element]{getAttributeNode}{attrname}
Return the \class{Attr} node for the attribute named by
\var{attrname}.
\end{methoddesc}

\begin{methoddesc}[Element]{getAttributeNS}{namespaceURI, localName}
Return an attribute value as a string, given a \var{namespaceURI} and
\var{localName}.
\end{methoddesc}

\begin{methoddesc}[Element]{getAttributeNodeNS}{namespaceURI, localName}
Return an attribute value as a node, given a \var{namespaceURI} and
\var{localName}.
\end{methoddesc}

\begin{methoddesc}[Element]{removeAttribute}{attname}
Remove an attribute by name. No exception is raised if there is no
matching attribute.
\end{methoddesc}

\begin{methoddesc}[Element]{removeAttributeNode}{oldAttr}
Remove and return \var{oldAttr} from the attribute list, if present.
If \var{oldAttr} is not present, \exception{NotFoundErr} is raised.
\end{methoddesc}

\begin{methoddesc}[Element]{removeAttributeNS}{namespaceURI, localName}
Remove an attribute by name. Note that it uses a localName, not a
qname. No exception is raised if there is no matching attribute.
\end{methoddesc}

\begin{methoddesc}[Element]{setAttribute}{attname, value}
Set an attribute value from a string.
\end{methoddesc}

\begin{methoddesc}[Element]{setAttributeNode}{newAttr}
Add a new attribute node to the element, replacing an existing
attribute if the \member{name} attribute matches. If a
replacement occurs, the old attribute node will be returned. If
\var{newAttr} is already in use, \exception{InuseAttributeErr} will be
raised.
\end{methoddesc}

\begin{methoddesc}[Element]{setAttributeNodeNS}{newAttr}
Add a new attribute node to the element, replacing an existing
attribute if the \member{namespaceURI} and
\member{localName} attributes match. If a replacement occurs, the old
attribute node will be returned. If \var{newAttr} is already in use,
\exception{InuseAttributeErr} will be raised.
\end{methoddesc}

\begin{methoddesc}[Element]{setAttributeNS}{namespaceURI, qname, value}
Set an attribute value from a string, given a \var{namespaceURI} and a
\var{qname}. Note that a qname is the whole attribute name, including
any prefix. This differs from the \method{get*NS()} and
\method{remove*NS()} methods above, which take a local name.
\end{methoddesc}


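The difference between qualified names and local names in the
attribute methods is easiest to see side by side. The following
sketch assumes \module{xml.dom.minidom} and a current Python
interpreter; the attribute names and the namespace URI are invented.
Only attributes that exist are removed, since implementations may
differ in how they treat a missing attribute.

```python
from xml.dom import minidom

doc = minidom.parseString("<user/>")
user = doc.documentElement

# Plain attributes, set and read as strings.
user.setAttribute("name", "ada")
user.setAttribute("role", "admin")
name = user.getAttribute("name")

# The same attribute, obtained as an Attr node.
name_node = user.getAttributeNode("name")

user.removeAttribute("role")
has_role = user.hasAttribute("role")

# Namespaced attributes: set with the qualified name ("ex:lang"),
# then read and removed with the local name ("lang").
NS = "http://example.org/ns"   # made-up namespace URI
user.setAttributeNS(NS, "ex:lang", "en")
lang = user.getAttributeNS(NS, "lang")
lang_node = user.getAttributeNodeNS(NS, "lang")
lang_qname = lang_node.name
lang_local = lang_node.localName

user.removeAttributeNS(NS, "lang")
has_lang = user.hasAttributeNS(NS, "lang")
```
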
Fred Drakeeaf57aa2000-11-29 06:10:22 +0000554\subsubsection{Attr Objects \label{dom-attr-objects}}
Fred Drake669d36f2000-10-24 02:34:45 +0000555
Fred Drakeeaf57aa2000-11-29 06:10:22 +0000556\class{Attr} inherits from \class{Node}, so inherits all its
557attributes.
Fred Drake669d36f2000-10-24 02:34:45 +0000558
Fred Drakeeaf57aa2000-11-29 06:10:22 +0000559\begin{memberdesc}[Attr]{name}
Fred Drake669d36f2000-10-24 02:34:45 +0000560The attribute name. In a namespace-using document it may have colons
561in it.
562\end{memberdesc}
563
Fred Drakeeaf57aa2000-11-29 06:10:22 +0000564\begin{memberdesc}[Attr]{localName}
Fred Drake669d36f2000-10-24 02:34:45 +0000565The part of the name following the colon if there is one, else the
Fred Drake9a29dd62000-12-08 06:54:51 +0000566entire name. This is a read-only attribute.
Fred Drake669d36f2000-10-24 02:34:45 +0000567\end{memberdesc}
568
Fred Drakeeaf57aa2000-11-29 06:10:22 +0000569\begin{memberdesc}[Attr]{prefix}
Fred Drake669d36f2000-10-24 02:34:45 +0000570The part of the name preceding the colon if there is one, else the
571empty string.
572\end{memberdesc}
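
As a small illustration of how the three name-related attributes
relate, the sketch below parses a document with one prefixed attribute.
It assumes \module{xml.dom.minidom} with its default namespace-aware
parsing; the namespace URI and the names are made up. Only a prefixed
attribute is inspected, since implementations may differ in what they
report as the prefix of an unprefixed name.

\begin{verbatim}
from xml.dom import minidom

doc = minidom.parseString(
    '<root xmlns:ex="http://example.com/ns" ex:color="red"/>')
attr = doc.documentElement.getAttributeNode("ex:color")

# name is the qualified name; prefix and localName are its two halves.
parts = (attr.name, attr.prefix, attr.localName, attr.value)
\end{verbatim}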


\subsubsection{NamedNodeMap Objects \label{dom-attributelist-objects}}

\class{NamedNodeMap} does \emph{not} inherit from \class{Node}.

\begin{memberdesc}[NamedNodeMap]{length}
The length of the attribute list.
\end{memberdesc}

\begin{methoddesc}[NamedNodeMap]{item}{index}
Return the attribute at a particular index. The order in which the
attributes are returned is arbitrary but will be consistent for the
life of a DOM. Each item is an attribute node. Get its value with the
\member{value} attribute.
\end{methoddesc}

There are also experimental methods that give this class more mapping
behavior. You can use them or you can use the standardized
\method{getAttribute*()}-family methods on the \class{Element} objects.
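
A short sketch of walking a \class{NamedNodeMap} with \member{length}
and \method{item()}, next to the equivalent \method{getAttribute()}
call on the element. It assumes \module{xml.dom.minidom}; the document
content is invented. The pairs are sorted because the order returned by
\method{item()} is arbitrary.

\begin{verbatim}
from xml.dom import minidom

doc = minidom.parseString('<root a="1" b="2"/>')
root = doc.documentElement
attrs = root.attributes             # a NamedNodeMap

pairs = []
for i in range(attrs.length):
    node = attrs.item(i)            # each item is an Attr node
    pairs.append((node.name, node.value))
pairs.sort()

# The standardized route to a single value goes through the element.
direct = root.getAttribute("a")
\end{verbatim}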


\subsubsection{Comment Objects \label{dom-comment-objects}}

\class{Comment} represents a comment in the XML document. It is a
subclass of \class{Node}, but cannot have child nodes.

\begin{memberdesc}[Comment]{data}
The content of the comment as a string. The attribute contains all
characters between the leading \code{<!-}\code{-} and trailing
\code{-}\code{->}, but does not include them.
\end{memberdesc}
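
For example, the following sketch (assuming \module{xml.dom.minidom};
the comment text is invented) shows that \member{data} holds the
characters between the delimiters, surrounding whitespace included.

\begin{verbatim}
from xml.dom import minidom, Node

doc = minidom.parseString("<root><!-- remember me --></root>")
comment = doc.documentElement.firstChild

is_comment = comment.nodeType == Node.COMMENT_NODE
text = comment.data                 # delimiters are not included
\end{verbatim}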


\subsubsection{Text and CDATASection Objects \label{dom-text-objects}}

The \class{Text} interface represents text in the XML document. If
the parser and DOM implementation support the DOM's XML extension,
portions of the text enclosed in CDATA marked sections are stored in
\class{CDATASection} objects. These two interfaces are identical, but
provide different values for the \member{nodeType} attribute.

These interfaces extend the \class{Node} interface. They cannot have
child nodes.

\begin{memberdesc}[Text]{data}
The content of the text node as a string.
\end{memberdesc}

\strong{Note:} The use of a \class{CDATASection} node does not
indicate that the node represents a complete CDATA marked section,
only that the content of the node was part of a CDATA section. A
single CDATA section may be represented by more than one node in the
document tree. There is no way to determine whether two adjacent
\class{CDATASection} nodes represent different CDATA marked sections.
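
The sketch below tells the two kinds of node apart by
\member{nodeType}. It assumes \module{xml.dom.minidom}, which for this
small, invented input is expected to produce one node of each kind;
as the note above says, code should not rely on a marked section
always mapping to a single node.

\begin{verbatim}
from xml.dom import minidom, Node

doc = minidom.parseString("<root>plain<![CDATA[a < b]]></root>")

# Both node kinds expose their content through the data attribute.
kinds = [(child.nodeType, child.data)
         for child in doc.documentElement.childNodes]
\end{verbatim}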


\subsubsection{ProcessingInstruction Objects \label{dom-pi-objects}}

Represents a processing instruction in the XML document; this inherits
from the \class{Node} interface and cannot have child nodes.

\begin{memberdesc}[ProcessingInstruction]{target}
The content of the processing instruction up to the first whitespace
character. This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[ProcessingInstruction]{data}
The content of the processing instruction following the first
whitespace character.
\end{memberdesc}
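
As an illustration of how \member{target} and \member{data} split a
processing instruction, consider this sketch. It assumes
\module{xml.dom.minidom}; the stylesheet reference is invented.

\begin{verbatim}
from xml.dom import minidom, Node

doc = minidom.parseString(
    '<?xml-stylesheet href="style.css" type="text/css"?><root/>')
pi = doc.firstChild                 # precedes the document element

is_pi = pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE
target = pi.target                  # text before the first whitespace
data = pi.data                      # everything after it
\end{verbatim}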


\subsubsection{Exceptions \label{dom-exceptions}}

\versionadded{2.1}

The DOM Level 2 recommendation defines a single exception,
\exception{DOMException}, and a number of constants that allow
applications to determine what sort of error occurred.
\exception{DOMException} instances carry a \member{code} attribute
that provides the appropriate value for the specific exception.

The Python DOM interface provides the constants, but also expands the
set of exceptions so that a specific exception exists for each of the
exception codes defined by the DOM. The implementations must raise
the appropriate specific exception, each of which carries the
appropriate value for the \member{code} attribute.

\begin{excdesc}{DOMException}
  Base exception class used for all specific DOM exceptions. This
  exception class cannot be directly instantiated.
\end{excdesc}
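
A minimal sketch of the two ways an application can react to a DOM
error: catching the base class and inspecting \member{code}, or
testing for the specific subclass. It assumes
\module{xml.dom.minidom}, where appending an attribute node as a child
of an element is rejected; the names used are invented.

\begin{verbatim}
import xml.dom
from xml.dom import minidom

doc = minidom.parseString("<root/>")
root = doc.documentElement

caught = None
try:
    # An attribute node is not a legal child of an element.
    root.appendChild(doc.createAttribute("x"))
except xml.dom.DOMException as exc:
    caught = exc

# The specific exception class and the code constant agree.
is_specific = isinstance(caught, xml.dom.HierarchyRequestErr)
same_code = caught.code == xml.dom.HIERARCHY_REQUEST_ERR
\end{verbatim}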

\begin{excdesc}{DomstringSizeErr}
  Raised when a specified range of text does not fit into a string.
  This is not known to be used in the Python DOM implementations, but
  may be received from DOM implementations not written in Python.
\end{excdesc}

\begin{excdesc}{HierarchyRequestErr}
  Raised when an attempt is made to insert a node where the node type
  is not allowed.
\end{excdesc}

\begin{excdesc}{IndexSizeErr}
  Raised when an index or size parameter to a method is negative or
  exceeds the allowed values.
\end{excdesc}

\begin{excdesc}{InuseAttributeErr}
  Raised when an attempt is made to insert an \class{Attr} node that
  is already present elsewhere in the document.
\end{excdesc}

\begin{excdesc}{InvalidAccessErr}
  Raised if a parameter or an operation is not supported on the
  underlying object.
\end{excdesc}

\begin{excdesc}{InvalidCharacterErr}
  Raised when a string parameter contains a character that the XML 1.0
  recommendation does not permit in the context in which it is used.
  For example, attempting to create an \class{Element} node with a
  space in the element type name will cause this error to be raised.
\end{excdesc}

\begin{excdesc}{InvalidModificationErr}
  Raised when an attempt is made to modify the type of a node.
\end{excdesc}

\begin{excdesc}{InvalidStateErr}
  Raised when an attempt is made to use an object that is not or is no
  longer usable.
\end{excdesc}

\begin{excdesc}{NamespaceErr}
  Raised when an attempt is made to change any object in a way that is
  not permitted with regard to the
  \citetitle[http://www.w3.org/TR/REC-xml-names/]{Namespaces in XML}
  recommendation.
\end{excdesc}

\begin{excdesc}{NotFoundErr}
  Raised when a node does not exist in the referenced context. For
  example, \method{NamedNodeMap.removeNamedItem()} will raise this if
  the node passed in does not exist in the map.
\end{excdesc}
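
The \method{removeNamedItem()} case can be sketched as follows,
assuming \module{xml.dom.minidom}, where the method is called with the
attribute name; the document content is invented.

\begin{verbatim}
import xml.dom
from xml.dom import minidom

doc = minidom.parseString('<root a="1"/>')
attrs = doc.documentElement.attributes

removed = attrs.removeNamedItem("a")    # present: the node is returned

code = None
try:
    attrs.removeNamedItem("a")          # no longer in the map
except xml.dom.NotFoundErr as exc:
    code = exc.code
\end{verbatim}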

\begin{excdesc}{NotSupportedErr}
  Raised when the implementation does not support the requested type
  of object or operation.
\end{excdesc}

\begin{excdesc}{NoDataAllowedErr}
  This is raised if data is specified for a node which does not
  support data.
  % XXX a better explanation is needed!
\end{excdesc}

\begin{excdesc}{NoModificationAllowedErr}
  Raised on attempts to modify an object where modifications are not
  allowed (such as for read-only nodes).
\end{excdesc}

\begin{excdesc}{SyntaxErr}
  Raised when an invalid or illegal string is specified.
  % XXX how is this different from InvalidCharacterErr ???
\end{excdesc}

\begin{excdesc}{WrongDocumentErr}
  Raised when a node is inserted in a different document than it
  currently belongs to, and the implementation does not support
  migrating the node from one document to the other.
\end{excdesc}

The exception codes defined in the DOM recommendation map to the
exceptions described above according to this table:

\begin{tableii}{l|l}{constant}{Constant}{Exception}
  \lineii{DOMSTRING_SIZE_ERR}{\exception{DomstringSizeErr}}
  \lineii{HIERARCHY_REQUEST_ERR}{\exception{HierarchyRequestErr}}
  \lineii{INDEX_SIZE_ERR}{\exception{IndexSizeErr}}
  \lineii{INUSE_ATTRIBUTE_ERR}{\exception{InuseAttributeErr}}
  \lineii{INVALID_ACCESS_ERR}{\exception{InvalidAccessErr}}
  \lineii{INVALID_CHARACTER_ERR}{\exception{InvalidCharacterErr}}
  \lineii{INVALID_MODIFICATION_ERR}{\exception{InvalidModificationErr}}
  \lineii{INVALID_STATE_ERR}{\exception{InvalidStateErr}}
  \lineii{NAMESPACE_ERR}{\exception{NamespaceErr}}
  \lineii{NOT_FOUND_ERR}{\exception{NotFoundErr}}
  \lineii{NOT_SUPPORTED_ERR}{\exception{NotSupportedErr}}
  \lineii{NO_DATA_ALLOWED_ERR}{\exception{NoDataAllowedErr}}
  \lineii{NO_MODIFICATION_ALLOWED_ERR}{\exception{NoModificationAllowedErr}}
  \lineii{SYNTAX_ERR}{\exception{SyntaxErr}}
  \lineii{WRONG_DOCUMENT_ERR}{\exception{WrongDocumentErr}}
\end{tableii}
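
The naming pattern in the table is regular enough to be checked
mechanically. The helper below is purely illustrative and is not part
of \module{xml.dom}; it assumes that both the constants and the
exception classes are exposed as attributes of the \module{xml.dom}
package.

\begin{verbatim}
import xml.dom

def exception_class_for(constant_name):
    """Return the exception class named after a DOM code constant.

    Hypothetical helper: INDEX_SIZE_ERR -> IndexSizeErr, and so on.
    """
    class_name = "".join(part.capitalize()
                         for part in constant_name.split("_"))
    return getattr(xml.dom, class_name)

cls = exception_class_for("NO_MODIFICATION_ALLOWED_ERR")
matches = cls.code == xml.dom.NO_MODIFICATION_ALLOWED_ERR
\end{verbatim}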


\subsection{Conformance \label{dom-conformance}}

This section describes the conformance requirements and relationships
between the Python DOM API, the W3C DOM recommendations, and the OMG
IDL mapping for Python.


\subsubsection{Type Mapping \label{dom-type-mapping}}

The primitive IDL types used in the DOM specification are mapped to
Python types according to the following table.

\begin{tableii}{l|l}{code}{IDL Type}{Python Type}
  \lineii{boolean}{\code{IntegerType} (with a value of \code{0} or \code{1})}
  \lineii{int}{\code{IntegerType}}
  \lineii{long int}{\code{IntegerType}}
  \lineii{unsigned int}{\code{IntegerType}}
\end{tableii}

Additionally, the \class{DOMString} defined in the recommendation is
mapped to a Python string or Unicode string. Applications should
be able to handle Unicode whenever a string is returned from the DOM.

The IDL \keyword{null} value is mapped to \code{None}, which may be
accepted or provided by the implementation whenever \keyword{null} is
allowed by the API.
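
A brief sketch of what these mappings look like from client code,
assuming \module{xml.dom.minidom} on a current Python version, where
boolean results may be \class{bool} objects (an integer subclass) and
strings are Unicode. The document content is invented.

\begin{verbatim}
from xml.dom import minidom

doc = minidom.parseString('<root size="3"><child/></root>')
root = doc.documentElement

flag = root.hasChildNodes()                 # IDL boolean
count = root.childNodes.length              # IDL unsigned int
name = root.tagName                         # DOMString
missing = root.getAttributeNode("nope")     # IDL null
\end{verbatim}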


\subsubsection{Accessor Methods \label{dom-accessor-methods}}

The mapping from OMG IDL to Python defines accessor functions for IDL
\keyword{attribute} declarations in much the way the Java mapping
does. Mapping the IDL declarations

\begin{verbatim}
readonly attribute string someValue;
         attribute string anotherValue;
\end{verbatim}

yields three accessor functions: a ``get'' method for
\member{someValue} (\method{_get_someValue()}), and ``get'' and
``set'' methods for
\member{anotherValue} (\method{_get_anotherValue()} and
\method{_set_anotherValue()}). The mapping, in particular, does not
require that the IDL attributes are accessible as normal Python
attributes: \code{\var{object}.someValue} is \emph{not} required to
work, and may raise an \exception{AttributeError}.

The Python DOM API, however, \emph{does} require that normal attribute
access work. This means that the typical surrogates generated by
Python IDL compilers are not likely to work, and wrapper objects may
be needed on the client if the DOM objects are accessed via CORBA.
While this does require some additional consideration for CORBA DOM
clients, the implementers with experience using DOM over CORBA from
Python do not consider this a problem. Attributes that are declared
\keyword{readonly} may not restrict write access in all DOM
implementations.
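
One way such a client-side wrapper could look is sketched below. Both
classes are hypothetical: \class{Surrogate} merely stands in for an
object generated by an IDL compiler, and \class{AccessorWrapper} is an
illustration of the idea rather than part of any DOM implementation.

\begin{verbatim}
class Surrogate:
    """Stand-in for an IDL-generated object with accessor methods."""
    def __init__(self):
        self._some = "fixed"
        self._another = "initial"
    def _get_someValue(self):
        return self._some
    def _get_anotherValue(self):
        return self._another
    def _set_anotherValue(self, value):
        self._another = value


class AccessorWrapper:
    """Expose _get_X()/_set_X() accessors as normal attributes."""
    def __init__(self, target):
        # Bypass our own __setattr__ while storing the wrapped object.
        object.__setattr__(self, "_target", target)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        getter = getattr(self._target, "_get_" + name, None)
        if getter is None:
            raise AttributeError(name)
        return getter()

    def __setattr__(self, name, value):
        setter = getattr(self._target, "_set_" + name, None)
        if setter is None:
            raise AttributeError("cannot set %r" % name)
        setter(value)


node = AccessorWrapper(Surrogate())
before = node.anotherValue
node.anotherValue = "changed"
after = node.anotherValue
\end{verbatim}

In this sketch, assigning to \code{node.someValue} raises
\exception{AttributeError}, because the stand-in object provides no
``set'' accessor for the \keyword{readonly} attribute.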

Additionally, the accessor functions are not required. If provided,
they should take the form defined by the Python IDL mapping, but
these methods are considered unnecessary since the attributes are
accessible directly from Python. ``Set'' accessors should never be
provided for \keyword{readonly} attributes.