\section{\module{xml.dom} ---
         The Document Object Model API}

\declaremodule{standard}{xml.dom}
\modulesynopsis{Document Object Model API for Python.}
\sectionauthor{Paul Prescod}{paul@prescod.net}
\sectionauthor{Martin v. L\"owis}{loewis@informatik.hu-berlin.de}

\versionadded{2.0}

The Document Object Model, or ``DOM,'' is a cross-language API from
the World Wide Web Consortium (W3C) for accessing and modifying XML
documents. A DOM implementation presents an XML document as a tree
structure, or allows client code to build such a structure from
scratch. It then gives access to the structure through a set of
objects which provide well-known interfaces.

The DOM is extremely useful for random-access applications. SAX only
allows you a view of one bit of the document at a time. If you are
looking at one SAX element, you have no access to another. If you are
looking at a text node, you have no access to a containing element.
When you write a SAX application, you need to keep track of your
program's position in the document somewhere in your own code. SAX
does not do it for you. Also, if you need to look ahead in the XML
document, you are just out of luck.

Some applications are simply impossible in an event-driven model with
no access to a tree. Of course you could build some sort of tree
yourself in SAX events, but the DOM allows you to avoid writing that
code. The DOM is a standard tree representation for XML data.

%What if your needs are somewhere between SAX and the DOM? Perhaps
%you cannot afford to load the entire tree in memory but you find the
%SAX model somewhat cumbersome and low-level. There is also a module
%called xml.dom.pulldom that allows you to build trees of only the
%parts of a document that you need structured access to. It also has
%features that allow you to find your way around the DOM.
% See http://www.prescod.net/python/pulldom

The Document Object Model is being defined by the W3C in stages, or
``levels'' in their terminology. The Python mapping of the API is
substantially based on the DOM Level 2 recommendation. Some aspects
of the API will only become available in Python 2.1, or may only be
available in particular DOM implementations.

DOM applications typically start by parsing some XML into a DOM. How
this is accomplished is not covered at all by DOM Level 1, and Level 2
provides only limited improvements. There is a
\class{DOMImplementation} object class which provides access to
\class{Document} creation methods, but these methods were only added
in DOM Level 2 and were not implemented in time for Python 2.0. There
is also no well-defined way to access these methods without an
existing \class{Document} object. For Python 2.0, consult the
documentation for each particular DOM implementation to determine the
bootstrap procedure needed to create and initialize \class{Document}
and \class{DocumentType} instances.
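
As a rough illustration of the bootstrap step, the sketch below assumes
the \module{xml.dom.minidom} implementation and its
\function{parseString()} helper. Neither is required by the DOM
specification, and other implementations may offer a different entry
point; the sample markup is made up for the example.

```python
from xml.dom import minidom

# Parse a small XML string into a DOM Document (minidom-specific helper).
doc = minidom.parseString("<greeting lang='en'>hello</greeting>")

root = doc.documentElement      # the single root Element
text = root.firstChild          # its only child, a Text node

root_name = root.tagName        # name of the root element
message = text.data             # character data held by the Text node
```

Everything after the call to \function{parseString()} uses only
attributes described in this section.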

Once you have a DOM document object, you can access the parts of your
XML document through its properties and methods. These properties are
defined in the DOM specification; this portion of the reference manual
describes the interpretation of the specification in Python.

The specification provided by the W3C defines the DOM API for Java,
ECMAScript, and OMG IDL. The Python mapping defined here is based in
large part on the IDL version of the specification, but strict
compliance is not required (though implementations are free to support
the strict mapping from IDL). See section \ref{dom-conformance},
``Conformance,'' for a detailed discussion of mapping requirements.


\begin{seealso}
  \seetitle[http://www.w3.org/TR/DOM-Level-2-Core/]{Document Object
            Model (DOM) Level 2 Specification}
           {The W3C recommendation upon which the Python DOM API is
            based.}
  \seetitle[http://www.w3.org/TR/REC-DOM-Level-1/]{Document Object
            Model (DOM) Level 1 Specification}
           {The W3C recommendation for the
            DOM supported by \module{xml.dom.minidom}.}
  \seetitle[http://pyxml.sourceforge.net]{PyXML}{Users that require a
            full-featured implementation of DOM should use the PyXML
            package.}
  \seetitle[http://cgi.omg.org/cgi-bin/doc?orbos/99-08-02.pdf]{CORBA
            Scripting with Python}
           {This specifies the mapping from OMG IDL to Python.}
\end{seealso}


\subsection{Objects in the DOM \label{dom-objects}}

The definitive documentation for the DOM is the DOM specification from
the W3C.

Note that DOM attributes may also be manipulated as nodes instead of
as simple strings. It is fairly rare that you must do this, however,
so this usage is not yet documented.


\begin{tableiii}{l|l|l}{class}{Interface}{Section}{Purpose}
  \lineiii{DOMImplementation}{\ref{dom-implementation-objects}}
          {Interface to the underlying implementation.}
  \lineiii{Node}{\ref{dom-node-objects}}
          {Base interface for most objects in a document.}
  \lineiii{NodeList}{\ref{dom-nodelist-objects}}
          {Interface for a sequence of nodes.}
  \lineiii{DocumentType}{\ref{dom-documenttype-objects}}
          {Information about the declarations needed to process a document.}
  \lineiii{Document}{\ref{dom-document-objects}}
          {Object which represents an entire document.}
  \lineiii{Element}{\ref{dom-element-objects}}
          {Element nodes in the document hierarchy.}
  \lineiii{Attr}{\ref{dom-attr-objects}}
          {Attribute value nodes on element nodes.}
  \lineiii{Comment}{\ref{dom-comment-objects}}
          {Representation of comments in the source document.}
  \lineiii{Text}{\ref{dom-text-objects}}
          {Nodes containing textual content from the document.}
  \lineiii{ProcessingInstruction}{\ref{dom-pi-objects}}
          {Processing instruction representation.}
\end{tableiii}

An additional section describes the exceptions defined for working
with the DOM in Python.


\subsubsection{DOMImplementation Objects
               \label{dom-implementation-objects}}

The \class{DOMImplementation} interface provides a way for
applications to determine the availability of particular features in
the DOM they are using. DOM Level 2 added the ability to create new
\class{Document} and \class{DocumentType} objects using the
\class{DOMImplementation} as well.

\begin{methoddesc}[DOMImplementation]{hasFeature}{feature, version}
Return true if the feature identified by the strings \var{feature}
and \var{version} is implemented.
\end{methoddesc}
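
A feature query might look like the following sketch. It assumes
\module{xml.dom.minidom} and its \function{getDOMImplementation()}
function as the way to obtain a \class{DOMImplementation}; the second
feature name is deliberately made up.

```python
from xml.dom.minidom import getDOMImplementation

# Obtain minidom's DOMImplementation object (implementation-specific).
impl = getDOMImplementation()

# Ask whether the DOM Level 2 core feature set is available.
has_core = impl.hasFeature("core", "2.0")

# A feature name that no implementation is expected to know about.
has_made_up = impl.hasFeature("no-such-feature", "1.0")
```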


\subsubsection{Node Objects \label{dom-node-objects}}

All of the components of an XML document are subclasses of
\class{Node}.

\begin{memberdesc}[Node]{nodeType}
An integer representing the node type. Symbolic constants for the
types are on the \class{Node} object:
\constant{ELEMENT_NODE}, \constant{ATTRIBUTE_NODE},
\constant{TEXT_NODE}, \constant{CDATA_SECTION_NODE},
\constant{ENTITY_NODE}, \constant{PROCESSING_INSTRUCTION_NODE},
\constant{COMMENT_NODE}, \constant{DOCUMENT_NODE},
\constant{DOCUMENT_TYPE_NODE}, \constant{NOTATION_NODE}.
This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{parentNode}
The parent of the current node, or \code{None} for the document node.
The value is always a \class{Node} object or \code{None}. For
\class{Element} nodes, this will be the parent element, except for the
root element, in which case it will be the \class{Document} object.
For \class{Attr} nodes, this is always \code{None}.
This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{attributes}
A \class{NamedNodeMap} of attribute objects. Only elements have
actual values for this; others provide \code{None} for this attribute.
This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{previousSibling}
The node that immediately precedes this one with the same parent. For
instance, this may be the element whose end-tag comes just before the
\var{self} element's start-tag. Of course, XML documents are made
up of more than just elements, so the previous sibling could be text, a
comment, or something else. If this node is the first child of the
parent, this attribute will be \code{None}.
This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{nextSibling}
The node that immediately follows this one with the same parent. See
also \member{previousSibling}. If this is the last child of the
parent, this attribute will be \code{None}.
This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{childNodes}
A list of nodes contained within this node.
This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{firstChild}
The first child of the node, if there is one, or \code{None}.
This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{lastChild}
The last child of the node, if there is one, or \code{None}.
This is a read-only attribute.
\end{memberdesc}
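
The navigation attributes described so far can be exercised as in the
sketch below, which again assumes \module{xml.dom.minidom} as the
implementation. The document is written without whitespace between
tags so that the root element has exactly three element children.

```python
from xml.dom import Node, minidom

doc = minidom.parseString("<list><a/><b/><c/></list>")
root = doc.documentElement

# childNodes behaves like a sequence, so it can be unpacked.
first, middle, last = root.childNodes

# Walk the children from first to last using nextSibling.
names = []
node = root.firstChild
while node is not None:
    names.append(node.nodeName)
    node = node.nextSibling

root_is_element = (root.nodeType == Node.ELEMENT_NODE)
```

The identity comparisons a reader might try here
(\code{middle.previousSibling is first}, for example) happen to hold
for \module{xml.dom.minidom}; implementations built on proxies may need
\method{isSameNode()} instead.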

\begin{memberdesc}[Node]{localName}
The part of the \member{tagName} following the colon if there is one,
else the entire \member{tagName}. The value is a string.
\end{memberdesc}

\begin{memberdesc}[Node]{prefix}
The part of the \member{tagName} preceding the colon if there is one,
else the empty string. The value is a string, or \code{None}.
\end{memberdesc}

\begin{memberdesc}[Node]{namespaceURI}
The namespace associated with the element name. This will be a
string or \code{None}. This is a read-only attribute.
\end{memberdesc}
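
For a namespace-qualified element, the three attributes above relate to
\member{tagName} as in this sketch. It assumes
\module{xml.dom.minidom}, whose parser is namespace-aware by default;
the namespace URI is a placeholder chosen for the example.

```python
from xml.dom import minidom

doc = minidom.parseString('<p:item xmlns:p="urn:example:ns"/>')
elem = doc.documentElement

qualified = elem.tagName      # full name, including the prefix
local = elem.localName        # part after the colon
prefix = elem.prefix          # part before the colon
uri = elem.namespaceURI       # URI bound to the prefix
```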

\begin{memberdesc}[Node]{nodeName}
This has a different meaning for each node type; see the DOM
specification for details. You can always get the information you
would get here from another property such as the \member{tagName}
property for elements or the \member{name} property for attributes.
For all node types, the value of this attribute will be either a
string or \code{None}. This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Node]{nodeValue}
This has a different meaning for each node type; see the DOM
specification for details. The situation is similar to that with
\member{nodeName}. The value is a string or \code{None}.
\end{memberdesc}

\begin{methoddesc}[Node]{hasAttributes}{}
Returns true if the node has any attributes.
\end{methoddesc}

\begin{methoddesc}[Node]{hasChildNodes}{}
Returns true if the node has any child nodes.
\end{methoddesc}

\begin{methoddesc}[Node]{isSameNode}{other}
Returns true if \var{other} refers to the same node as this node.
This is especially useful for DOM implementations which use any sort
of proxy architecture (because more than one object can refer to the
same node).

\strong{Note:} This is based on a proposed DOM Level 3 API which is
still in the ``working draft'' stage, but this particular interface
appears uncontroversial. Changes from the W3C will not necessarily
affect this method in the Python DOM interface (though any new W3C
API for this would also be supported).
\end{methoddesc}

\begin{methoddesc}[Node]{appendChild}{newChild}
Add a new child node to this node at the end of the list of children,
returning \var{newChild}.
\end{methoddesc}

\begin{methoddesc}[Node]{insertBefore}{newChild, refChild}
Insert a new child node before an existing child. It must be the case
that \var{refChild} is a child of this node; if not,
\exception{ValueError} is raised. \var{newChild} is returned.
\end{methoddesc}

\begin{methoddesc}[Node]{removeChild}{oldChild}
Remove a child node. \var{oldChild} must be a child of this node; if
not, \exception{ValueError} is raised. \var{oldChild} is returned on
success. If \var{oldChild} will not be used further, its
\method{unlink()} method should be called.
\end{methoddesc}

\begin{methoddesc}[Node]{replaceChild}{newChild, oldChild}
Replace an existing node with a new node. It must be the case that
\var{oldChild} is a child of this node; if not,
\exception{ValueError} is raised.
\end{methoddesc}
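
The four tree-editing methods can be combined as in the following
sketch, which assumes \module{xml.dom.minidom}; the element names are
arbitrary. Only the success paths are shown, since the exception raised
for a non-child argument may differ between implementations and
versions.

```python
from xml.dom import minidom

doc = minidom.parseString("<root/>")
root = doc.documentElement

a = root.appendChild(doc.createElement("a"))       # children: a
c = root.appendChild(doc.createElement("c"))       # children: a, c
b = root.insertBefore(doc.createElement("b"), c)   # children: a, b, c

d = doc.createElement("d")
root.replaceChild(d, a)                            # children: d, b, c
removed = root.removeChild(b)                      # children: d, c

names = [child.tagName for child in root.childNodes]
```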

\begin{methoddesc}[Node]{normalize}{}
Join adjacent text nodes so that all stretches of text are stored as
single \class{Text} instances. This simplifies processing text from a
DOM tree for many applications.
\versionadded{2.1}
\end{methoddesc}
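
A small sketch of the effect of \method{normalize()}, assuming
\module{xml.dom.minidom}: two text nodes are appended separately, so
they remain distinct children until the tree is normalized.

```python
from xml.dom import minidom

doc = minidom.parseString("<p/>")
p = doc.documentElement

# Appending two text nodes leaves them as separate, adjacent children.
p.appendChild(doc.createTextNode("Hello, "))
p.appendChild(doc.createTextNode("world"))
before = len(p.childNodes)

p.normalize()                 # merge adjacent Text nodes in place
after = len(p.childNodes)
joined = p.firstChild.data
```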

\begin{methoddesc}[Node]{cloneNode}{deep}
Clone this node. Setting \var{deep} means to clone all child nodes as
well. This returns the clone.
\end{methoddesc}
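
The difference between a shallow and a deep clone is sketched below,
assuming \module{xml.dom.minidom}. In that implementation an element's
attributes are copied in both cases, while child nodes are copied only
when \var{deep} is true.

```python
from xml.dom import minidom

doc = minidom.parseString("<a id='1'><b/></a>")
a = doc.documentElement

shallow = a.cloneNode(False)   # the element and its attributes only
deep = a.cloneNode(True)       # the element plus copies of all descendants

shallow_has_children = shallow.hasChildNodes()
deep_child_name = deep.firstChild.tagName
```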


\subsubsection{NodeList Objects \label{dom-nodelist-objects}}

A \class{NodeList} represents a sequence of nodes. These objects are
used in two ways in the DOM Core recommendation: 
\class{Element} objects provide one as their list of child nodes, and
the \method{getElementsByTagName()} and
\method{getElementsByTagNameNS()} methods of \class{Node} return
objects with this interface to represent query results.

The DOM Level 2 recommendation defines one method and one attribute
for these objects:

\begin{methoddesc}[NodeList]{item}{i}
  Return the \var{i}'th item from the sequence, if there is one, or
  \code{None}. The index \var{i} is not allowed to be less than zero
  or greater than or equal to the length of the sequence.
\end{methoddesc}

\begin{memberdesc}[NodeList]{length}
  The number of nodes in the sequence.
\end{memberdesc}

In addition, the Python DOM interface requires that some additional
support is provided to allow \class{NodeList} objects to be used as
Python sequences. All \class{NodeList} implementations must include
support for \method{__len__()} and \method{__getitem__()}; this allows
iteration over the \class{NodeList} in \keyword{for} statements and
proper support for the \function{len()} built-in function.

If a DOM implementation supports modification of the document, the
\class{NodeList} implementation must also support the
\method{__setitem__()} and \method{__delitem__()} methods.
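
The sequence behaviour described above can be seen in this sketch,
which assumes \module{xml.dom.minidom}. It uses a query result as the
\class{NodeList}; the out-of-range call to \method{item()} reflects
what that implementation does and is shown only for illustration.

```python
from xml.dom import minidom

doc = minidom.parseString("<r><x/><y/><x/></r>")
found = doc.getElementsByTagName("x")

count = len(found)            # Python protocol; same value as found.length
first = found[0]              # __getitem__, same node as found.item(0)
missing = found.item(5)       # no such item

# Iteration in a for statement relies on the sequence protocol.
tags = [node.tagName for node in found]
```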


\subsubsection{DocumentType Objects \label{dom-documenttype-objects}}

Information about the notations and entities declared by a document
(including the external subset if the parser uses it and can provide
the information) is available from a \class{DocumentType} object. The
\class{DocumentType} for a document is available from the
\class{Document} object's \member{doctype} attribute.

\class{DocumentType} is a specialization of \class{Node}, and adds the
following attributes:

\begin{memberdesc}[DocumentType]{publicId}
  The public identifier for the external subset of the document type
  definition. This will be a string or \code{None}.
\end{memberdesc}

\begin{memberdesc}[DocumentType]{systemId}
  The system identifier for the external subset of the document type
  definition. This will be a URI as a string, or \code{None}.
\end{memberdesc}

\begin{memberdesc}[DocumentType]{internalSubset}
  A string giving the complete internal subset from the document.
\end{memberdesc}

\begin{memberdesc}[DocumentType]{name}
  The name of the root element as given in the \code{DOCTYPE}
  declaration, if present. If there was no \code{DOCTYPE} declaration,
  this will be \code{None}.
\end{memberdesc}

\begin{memberdesc}[DocumentType]{entities}
  This is a \class{NamedNodeMap} giving the definitions of external
  entities. For entity names defined more than once, only the first
  definition is provided (others are ignored as required by the XML
  recommendation). This may be \code{None} if the information is not
  provided by the parser, or if no entities are defined.
\end{memberdesc}

\begin{memberdesc}[DocumentType]{notations}
  This is a \class{NamedNodeMap} giving the definitions of notations.
  For notation names defined more than once, only the first definition
  is provided (others are ignored as required by the XML
  recommendation). This may be \code{None} if the information is not
  provided by the parser, or if no notations are defined.
\end{memberdesc}
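
The sketch below shows where these attributes come from when a
document is built programmatically. It assumes
\module{xml.dom.minidom} and the DOM Level 2 creation methods
\method{createDocumentType()} and \method{createDocument()} on its
\class{DOMImplementation}; the public and system identifiers are
invented for the example.

```python
from xml.dom.minidom import getDOMImplementation

impl = getDOMImplementation()

# Hypothetical identifiers; nothing is fetched from "report.dtd".
doctype = impl.createDocumentType(
    "report", "-//Example//DTD Report//EN", "report.dtd")
doc = impl.createDocument(None, "report", doctype)

dt = doc.doctype
summary = (dt.name, dt.publicId, dt.systemId)
```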


\subsubsection{Document Objects \label{dom-document-objects}}

A \class{Document} represents an entire XML document, including its
constituent elements, attributes, processing instructions, comments,
etc. Remember that it inherits properties from \class{Node}.

\begin{memberdesc}[Document]{documentElement}
The one and only root element of the document.
\end{memberdesc}

\begin{methoddesc}[Document]{createElement}{tagName}
Create and return a new element node. The element is not inserted
into the document when it is created. You need to explicitly insert
it with one of the other methods such as \method{insertBefore()} or
\method{appendChild()}.
\end{methoddesc}

\begin{methoddesc}[Document]{createElementNS}{namespaceURI, tagName}
Create and return a new element with a namespace. The
\var{tagName} may have a prefix. The element is not inserted into the
document when it is created. You need to explicitly insert it with
one of the other methods such as \method{insertBefore()} or
\method{appendChild()}.
\end{methoddesc}

\begin{methoddesc}[Document]{createTextNode}{data}
Create and return a text node containing the data passed as a
parameter. As with the other creation methods, this one does not
insert the node into the tree.
\end{methoddesc}

\begin{methoddesc}[Document]{createComment}{data}
Create and return a comment node containing the data passed as a
parameter. As with the other creation methods, this one does not
insert the node into the tree.
\end{methoddesc}

\begin{methoddesc}[Document]{createProcessingInstruction}{target, data}
Create and return a processing instruction node containing the
\var{target} and \var{data} passed as parameters. As with the other
creation methods, this one does not insert the node into the tree.
\end{methoddesc}

\begin{methoddesc}[Document]{createAttribute}{name}
Create and return an attribute node. This method does not associate
the attribute node with any particular element. You must use
\method{setAttributeNode()} on the appropriate \class{Element} object
to use the newly created attribute instance.
\end{methoddesc}

\begin{methoddesc}[Document]{createAttributeNS}{namespaceURI, qualifiedName}
Create and return an attribute node with a namespace. The
\var{qualifiedName} may have a prefix. This method does not associate the
attribute node with any particular element. You must use
\method{setAttributeNode()} on the appropriate \class{Element} object
to use the newly created attribute instance.
\end{methoddesc}
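
The creation methods all follow the same two-step pattern: create a
detached node, then attach it explicitly. The sketch below assumes
\module{xml.dom.minidom}; the element, attribute and processing
instruction names are made up.

```python
from xml.dom import Node
from xml.dom.minidom import getDOMImplementation

doc = getDOMImplementation().createDocument(None, "note", None)
root = doc.documentElement

title = doc.createElement("title")
text = doc.createTextNode("Remember the milk")
comment = doc.createComment(" drafted by hand ")
pi = doc.createProcessingInstruction("render", 'mode="plain"')
attr = doc.createAttribute("priority")
attr.value = "high"

# Newly created nodes are not part of the tree yet.
detached = title.parentNode is None

# Attach each node explicitly.
title.appendChild(text)
title.setAttributeNode(attr)
root.appendChild(comment)
root.appendChild(title)
root.appendChild(pi)

kinds = [child.nodeType for child in root.childNodes]
```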

\begin{methoddesc}[Document]{getElementsByTagName}{tagName}
Search for all descendants (direct children, children's children,
etc.) with a particular element type name.
\end{methoddesc}

\begin{methoddesc}[Document]{getElementsByTagNameNS}{namespaceURI, localName}
Search for all descendants (direct children, children's children,
etc.) with a particular namespace URI and localname. The localname is
the part of the qualified name after the prefix.
\end{methoddesc}
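
Both searches look at descendants at any depth, as the following
sketch shows. It assumes \module{xml.dom.minidom}; the markup and the
namespace URI are invented for the example.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<lib xmlns:m="urn:example:meta">'
    '<book><m:tag>a</m:tag></book>'
    '<shelf><book/></shelf>'
    '</lib>')

# Matches both <book> elements, although they sit at different depths.
books = doc.getElementsByTagName("book")

# The namespace-aware search takes the local name, without the prefix.
tags = doc.getElementsByTagNameNS("urn:example:meta", "tag")
```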


\subsubsection{Element Objects \label{dom-element-objects}}

\class{Element} is a subclass of \class{Node}, so inherits all the
attributes of that class.

\begin{memberdesc}[Element]{tagName}
The element type name. In a namespace-using document it may have
colons in it. The value is a string.
\end{memberdesc}

\begin{methoddesc}[Element]{getElementsByTagName}{tagName}
Same as equivalent method in the \class{Document} class.
\end{methoddesc}

\begin{methoddesc}[Element]{getElementsByTagNameNS}{namespaceURI, localName}
Same as equivalent method in the \class{Document} class.
\end{methoddesc}

\begin{methoddesc}[Element]{getAttribute}{attname}
Return an attribute value as a string.
\end{methoddesc}

\begin{methoddesc}[Element]{getAttributeNode}{attrname}
Return the \class{Attr} node for the attribute named by
\var{attrname}.
\end{methoddesc}

\begin{methoddesc}[Element]{getAttributeNS}{namespaceURI, localName}
Return an attribute value as a string, given a \var{namespaceURI} and
\var{localName}.
\end{methoddesc}

\begin{methoddesc}[Element]{getAttributeNodeNS}{namespaceURI, localName}
Return an attribute value as a node, given a \var{namespaceURI} and
\var{localName}.
\end{methoddesc}

\begin{methoddesc}[Element]{removeAttribute}{attname}
Remove an attribute by name. No exception is raised if there is no
matching attribute.
\end{methoddesc}

\begin{methoddesc}[Element]{removeAttributeNode}{oldAttr}
Remove and return \var{oldAttr} from the attribute list, if present.
If \var{oldAttr} is not present, \exception{NotFoundErr} is raised.
\end{methoddesc}

\begin{methoddesc}[Element]{removeAttributeNS}{namespaceURI, localName}
Remove an attribute by name. Note that it uses a localName, not a
qname. No exception is raised if there is no matching attribute.
\end{methoddesc}

\begin{methoddesc}[Element]{setAttribute}{attname, value}
Set an attribute value from a string.
\end{methoddesc}

\begin{methoddesc}[Element]{setAttributeNode}{newAttr}
Add a new attribute node to the element, replacing an existing
attribute if the \member{name} attribute matches. If a
replacement occurs, the old attribute node will be returned. If
\var{newAttr} is already in use, \exception{InuseAttributeErr} will be
raised.
\end{methoddesc}

\begin{methoddesc}[Element]{setAttributeNodeNS}{newAttr}
Add a new attribute node to the element, replacing an existing
attribute if the \member{namespaceURI} and
\member{localName} attributes match. If a replacement occurs, the old
attribute node will be returned. If \var{newAttr} is already in use,
\exception{InuseAttributeErr} will be raised.
\end{methoddesc}

\begin{methoddesc}[Element]{setAttributeNS}{namespaceURI, qname, value}
Set an attribute value from a string, given a \var{namespaceURI} and a
\var{qname}. Note that a qname is the whole attribute name, including
any prefix. This differs from the \method{getAttributeNS()} and
\method{removeAttributeNS()} methods above, which take a local name.
\end{methoddesc}
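
The qname versus local name distinction is easiest to see side by
side. The sketch below assumes \module{xml.dom.minidom}; the attribute
names and the namespace URI are placeholders. Only attributes that
exist are removed, since implementations may differ in how they treat
removal of a missing attribute.

```python
from xml.dom import minidom

doc = minidom.parseString("<img/>")
img = doc.documentElement

img.setAttribute("src", "cat.png")
# Setting takes the qualified name, prefix included ...
img.setAttributeNS("urn:example:meta", "m:rating", "5")

src = img.getAttribute("src")
# ... while getting and removing take the local name only.
rating = img.getAttributeNS("urn:example:meta", "rating")
src_node_value = img.getAttributeNode("src").value

img.removeAttribute("src")
img.removeAttributeNS("urn:example:meta", "rating")
remaining = img.attributes.length
```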


\subsubsection{Attr Objects \label{dom-attr-objects}}

\class{Attr} inherits from \class{Node}, so inherits all its
attributes.

\begin{memberdesc}[Attr]{name}
The attribute name. In a namespace-using document it may have colons
in it.
\end{memberdesc}

\begin{memberdesc}[Attr]{localName}
The part of the name following the colon if there is one, else the
entire name. This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[Attr]{prefix}
The part of the name preceding the colon if there is one, else the
empty string.
\end{memberdesc}


\subsubsection{NamedNodeMap Objects \label{dom-attributelist-objects}}

\class{NamedNodeMap} does \emph{not} inherit from \class{Node}.

\begin{memberdesc}[NamedNodeMap]{length}
The length of the attribute list.
\end{memberdesc}

\begin{methoddesc}[NamedNodeMap]{item}{index}
Return an attribute with a particular index. The order you get the
attributes in is arbitrary but will be consistent for the life of a
DOM. Each item is an attribute node. Get its value with the
\member{value} attribute.
\end{methoddesc}

There are also experimental methods that give this class more mapping
behavior. You can use them or you can use the standardized
\method{getAttribute*()}-family methods on the \class{Element} objects.
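
Using only the standardized \member{length} and \method{item()}
members, an element's attributes can be listed as in the sketch below,
which assumes \module{xml.dom.minidom}. Because the order of items is
arbitrary, the result is sorted before it is used.

```python
from xml.dom import minidom

doc = minidom.parseString('<point x="1" y="2"/>')
attrs = doc.documentElement.attributes   # a NamedNodeMap

# Collect (name, value) pairs by index, then sort for a stable order.
pairs = sorted(
    (attrs.item(i).name, attrs.item(i).value)
    for i in range(attrs.length)
)
```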


\subsubsection{Comment Objects \label{dom-comment-objects}}

\class{Comment} represents a comment in the XML document. It is a
subclass of \class{Node}, but cannot have child nodes.

\begin{memberdesc}[Comment]{data}
The content of the comment as a string. The attribute contains all
characters between the leading \code{<!-}\code{-} and trailing
\code{-}\code{->}, but does not include them.
\end{memberdesc}


\subsubsection{Text and CDATASection Objects \label{dom-text-objects}}

The \class{Text} interface represents text in the XML document. If
the parser and DOM implementation support the DOM's XML extension,
portions of the text enclosed in CDATA marked sections are stored in
\class{CDATASection} objects. These two interfaces are identical, but
provide different values for the \member{nodeType} attribute.

These interfaces extend the \class{Node} interface. They cannot have
child nodes.

\begin{memberdesc}[Text]{data}
The content of the text node as a string.
\end{memberdesc}

\strong{Note:} The use of a \class{CDATASection} node does not
indicate that the node represents a complete CDATA marked section,
only that the content of the node was part of a CDATA section. A
single CDATA section may be represented by more than one node in the
document tree. There is no way to determine whether two adjacent
\class{CDATASection} nodes represent different CDATA marked sections.
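The relationship between the two interfaces can be sketched with
\module{xml.dom.minidom}, assuming that implementation is available.
The nodes are created directly through the document's factory methods
instead of being parsed, since whether a parser produces
\class{CDATASection} nodes depends on the implementation. The string
contents are made up for illustration.

```python
from xml.dom import Node, minidom

doc = minidom.Document()
text = doc.createTextNode("plain text")
cdata = doc.createCDATASection("<not markup>")

# Both node kinds expose their content through the same data attribute.
contents = (text.data, cdata.data)

# Only the nodeType value distinguishes them.
is_text = text.nodeType == Node.TEXT_NODE
is_cdata = cdata.nodeType == Node.CDATA_SECTION_NODE
print(contents, is_text, is_cdata)
```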


\subsubsection{ProcessingInstruction Objects \label{dom-pi-objects}}

Represents a processing instruction in the XML document; this inherits
from the \class{Node} interface and cannot have child nodes.

\begin{memberdesc}[ProcessingInstruction]{target}
The content of the processing instruction up to the first whitespace
character. This is a read-only attribute.
\end{memberdesc}

\begin{memberdesc}[ProcessingInstruction]{data}
The content of the processing instruction following the first
whitespace character.
\end{memberdesc}
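The split between \member{target} and \member{data} can be
illustrated with a short sketch using \module{xml.dom.minidom}. The
stylesheet processing instruction and the file name in it are
assumptions chosen only for the example.

```python
from xml.dom import Node, minidom

# Illustrative input: a processing instruction ahead of the root element.
doc = minidom.parseString(
    '<?xml-stylesheet href="style.css" type="text/css"?><root/>'
)

pi = doc.firstChild   # the processing instruction precedes <root/>
kind_ok = pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE

print(pi.target)      # text up to the first whitespace character
print(pi.data)        # everything after that whitespace
```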


\subsubsection{Exceptions \label{dom-exceptions}}

\versionadded{2.1}

The DOM Level 2 recommendation defines a single exception,
\exception{DOMException}, and a number of constants that allow
applications to determine what sort of error occurred.
\exception{DOMException} instances carry a \member{code} attribute
that provides the appropriate value for the specific exception.

The Python DOM interface provides the constants, but also expands the
set of exceptions so that a specific exception exists for each of the
exception codes defined by the DOM. The implementations must raise
the appropriate specific exception, each of which carries the
appropriate value for the \member{code} attribute.

\begin{excdesc}{DOMException}
  Base exception class used for all specific DOM exceptions. This
  exception class cannot be directly instantiated.
\end{excdesc}

\begin{excdesc}{DomstringSizeErr}
  Raised when a specified range of text does not fit into a string.
  This is not known to be used in the Python DOM implementations, but
  may be received from DOM implementations not written in Python.
\end{excdesc}

\begin{excdesc}{HierarchyRequestErr}
  Raised when an attempt is made to insert a node where the node type
  is not allowed.
\end{excdesc}

\begin{excdesc}{IndexSizeErr}
  Raised when an index or size parameter to a method is negative or
  exceeds the allowed values.
\end{excdesc}

\begin{excdesc}{InuseAttributeErr}
  Raised when an attempt is made to insert an \class{Attr} node that
  is already present elsewhere in the document.
\end{excdesc}

\begin{excdesc}{InvalidAccessErr}
  Raised if a parameter or an operation is not supported on the
  underlying object.
\end{excdesc}

\begin{excdesc}{InvalidCharacterErr}
  Raised when a string parameter contains a character that the XML 1.0
  recommendation does not permit in the context in which it is used.
  For example, attempting to create an \class{Element} node with a
  space in the element type name will cause this error to be raised.
\end{excdesc}

\begin{excdesc}{InvalidModificationErr}
  Raised when an attempt is made to modify the type of a node.
\end{excdesc}

\begin{excdesc}{InvalidStateErr}
  Raised when an attempt is made to use an object that is not or is no
  longer usable.
\end{excdesc}

\begin{excdesc}{NamespaceErr}
  Raised when an attempt is made to change any object in a way that is
  not permitted with regard to the
  \citetitle[http://www.w3.org/TR/REC-xml-names/]{Namespaces in XML}
  recommendation.
\end{excdesc}

\begin{excdesc}{NotFoundErr}
  Raised when a node does not exist in the referenced context. For
  example, \method{NamedNodeMap.removeNamedItem()} will raise this if
  the node passed in does not exist in the map.
\end{excdesc}

\begin{excdesc}{NotSupportedErr}
  Raised when the implementation does not support the requested type
  of object or operation.
\end{excdesc}

\begin{excdesc}{NoDataAllowedErr}
  Raised if data is specified for a node which does not support data.
  % XXX  a better explanation is needed!
\end{excdesc}

\begin{excdesc}{NoModificationAllowedErr}
  Raised on attempts to modify an object where modifications are not
  allowed (such as for read-only nodes).
\end{excdesc}

\begin{excdesc}{SyntaxErr}
  Raised when an invalid or illegal string is specified.
  % XXX  how is this different from InvalidCharacterErr ???
\end{excdesc}

\begin{excdesc}{WrongDocumentErr}
  Raised when a node is inserted in a different document than it
  currently belongs to, and the implementation does not support
  migrating the node from one document to the other.
\end{excdesc}

The exception codes defined in the DOM recommendation map to the
exceptions described above according to this table:

\begin{tableii}{l|l}{constant}{Constant}{Exception}
  \lineii{DOMSTRING_SIZE_ERR}{\exception{DomstringSizeErr}}
  \lineii{HIERARCHY_REQUEST_ERR}{\exception{HierarchyRequestErr}}
  \lineii{INDEX_SIZE_ERR}{\exception{IndexSizeErr}}
  \lineii{INUSE_ATTRIBUTE_ERR}{\exception{InuseAttributeErr}}
  \lineii{INVALID_ACCESS_ERR}{\exception{InvalidAccessErr}}
  \lineii{INVALID_CHARACTER_ERR}{\exception{InvalidCharacterErr}}
  \lineii{INVALID_MODIFICATION_ERR}{\exception{InvalidModificationErr}}
  \lineii{INVALID_STATE_ERR}{\exception{InvalidStateErr}}
  \lineii{NAMESPACE_ERR}{\exception{NamespaceErr}}
  \lineii{NOT_FOUND_ERR}{\exception{NotFoundErr}}
  \lineii{NOT_SUPPORTED_ERR}{\exception{NotSupportedErr}}
  \lineii{NO_DATA_ALLOWED_ERR}{\exception{NoDataAllowedErr}}
  \lineii{NO_MODIFICATION_ALLOWED_ERR}{\exception{NoModificationAllowedErr}}
  \lineii{SYNTAX_ERR}{\exception{SyntaxErr}}
  \lineii{WRONG_DOCUMENT_ERR}{\exception{WrongDocumentErr}}
\end{tableii}
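The pairing of specific exception classes with \member{code} values
can be sketched as follows. The example assumes the
\module{xml.dom.minidom} implementation, whose \method{removeChild()}
raises \exception{NotFoundErr} when the node is not a child; the
element names are made up.

```python
import xml.dom
from xml.dom import minidom

doc = minidom.Document()
parent = doc.createElement("parent")
stranger = doc.createElement("stranger")   # never appended to parent

caught = None
try:
    parent.removeChild(stranger)
except xml.dom.DOMException as exc:
    # Catching the base class also catches every specific exception.
    caught = exc

# The specific class and the code constant identify the same error.
print(type(caught).__name__, caught.code == xml.dom.NOT_FOUND_ERR)
```

Catching \exception{DOMException} and inspecting \member{code} mirrors
the style of the DOM recommendation, while catching
\exception{NotFoundErr} directly is the more specific alternative.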


\subsection{Conformance \label{dom-conformance}}

This section describes the conformance requirements and relationships
between the Python DOM API, the W3C DOM recommendations, and the OMG
IDL mapping for Python.


\subsubsection{Type Mapping \label{dom-type-mapping}}

The primitive IDL types used in the DOM specification are mapped to
Python types according to the following table.

\begin{tableii}{l|l}{code}{IDL Type}{Python Type}
  \lineii{boolean}{\code{IntegerType} (with a value of \code{0} or \code{1})}
  \lineii{int}{\code{IntegerType}}
  \lineii{long int}{\code{IntegerType}}
  \lineii{unsigned int}{\code{IntegerType}}
\end{tableii}

Additionally, the \class{DOMString} defined in the recommendation is
mapped to a Python string or Unicode string. Applications should
be able to handle Unicode whenever a string is returned from the DOM.

The IDL \keyword{null} value is mapped to \code{None}, which may be
accepted or provided by the implementation whenever \keyword{null} is
allowed by the API.
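A brief sketch of how these mappings look from client code, assuming
\module{xml.dom.minidom} as the implementation and a made-up sample
document. In current Python versions a boolean result is a
\code{bool}, which is an integer subtype and still compares equal to
\code{0} or \code{1}.

```python
from xml.dom import minidom

doc = minidom.parseString("<root>text</root>")
root = doc.documentElement

has_children = root.hasChildNodes()    # IDL boolean
child_count = root.childNodes.length   # IDL integer type
missing = root.nextSibling             # IDL null is mapped to None
content = root.firstChild.data         # DOMString is a Python string

print(has_children, child_count, missing, content)
```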


\subsubsection{Accessor Methods \label{dom-accessor-methods}}

The mapping from OMG IDL to Python defines accessor functions for IDL
\keyword{attribute} declarations in much the way the Java mapping
does. Mapping the IDL declarations

\begin{verbatim}
readonly attribute string someValue;
         attribute string anotherValue;
\end{verbatim}

yields three accessor functions: a ``get'' method for
\member{someValue} (\method{_get_someValue()}), and ``get'' and
``set'' methods for
\member{anotherValue} (\method{_get_anotherValue()} and
\method{_set_anotherValue()}). The mapping, in particular, does not
require that the IDL attributes are accessible as normal Python
attributes: \code{\var{object}.someValue} is \emph{not} required to
work, and may raise an \exception{AttributeError}.

The Python DOM API, however, \emph{does} require that normal attribute
access work. This means that the typical surrogates generated by
Python IDL compilers are not likely to work, and wrapper objects may
be needed on the client if the DOM objects are accessed via CORBA.
While this does require some additional consideration for CORBA DOM
clients, the implementers with experience using DOM over CORBA from
Python do not consider this a problem. Not all DOM implementations
restrict write access to attributes that are declared
\keyword{readonly}.

Additionally, the accessor functions are not required. If provided,
they should take the form defined by the Python IDL mapping, but
these methods are considered unnecessary since the attributes are
accessible directly from Python. ``Set'' accessors should never be
provided for \keyword{readonly} attributes.
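One way such a client-side wrapper could look is sketched below. Both
classes are hypothetical and belong to no DOM or CORBA package:
\class{StubNode} stands in for a surrogate generated by an IDL
compiler for the two declarations shown earlier, and
\class{AccessorWrapper} forwards normal attribute access to the
\method{_get_*()} and \method{_set_*()} accessors.

```python
class AccessorWrapper:
    """Hypothetical adapter exposing _get_X()/_set_X() as attribute X."""

    def __init__(self, target):
        # Bypass our own __setattr__ when storing the wrapped object.
        object.__setattr__(self, "_target", target)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        getter = getattr(self._target, "_get_" + name, None)
        if getter is None:
            raise AttributeError(name)
        return getter()

    def __setattr__(self, name, value):
        # Attributes without a "set" accessor are treated as read-only.
        setter = getattr(self._target, "_set_" + name, None)
        if setter is None:
            raise AttributeError("cannot set attribute %r" % name)
        setter(value)


class StubNode:
    """Stand-in for an IDL-generated surrogate (illustrative only)."""

    def __init__(self):
        self._some = "fixed"
        self._another = "initial"

    def _get_someValue(self):
        return self._some

    def _get_anotherValue(self):
        return self._another

    def _set_anotherValue(self, value):
        self._another = value


node = AccessorWrapper(StubNode())
node.anotherValue = "changed"
print(node.someValue, node.anotherValue)
```

Because \class{StubNode} provides no \method{_set_someValue()},
assigning to \code{node.someValue} raises \exception{AttributeError},
which matches the rule that ``set'' accessors are never provided for
\keyword{readonly} attributes.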