\section{\module{xml.parsers.expat} ---
         Fast XML parsing using Expat}

% Markup notes:
%
% Many of the attributes of the XMLParser objects are callbacks.
% Since signature information must be presented, these are described
% using the methoddesc environment. Since they are attributes which
% are set by client code, in-text references to these attributes
% should be marked using the \member macro and should not include the
% parentheses used when marking functions and methods.

\declaremodule{standard}{xml.parsers.expat}
\modulesynopsis{An interface to the Expat non-validating XML parser.}
\moduleauthor{Paul Prescod}{paul@prescod.net}

\versionadded{2.0}

The \module{xml.parsers.expat} module is a Python interface to the
Expat\index{Expat} non-validating XML parser.
The module provides a single extension type, \class{xmlparser}, that
represents the current state of an XML parser. After an
\class{xmlparser} object has been created, various attributes of the object
can be set to handler functions. When an XML document is then fed to
the parser, the handler functions are called for the character data
and markup in the XML document.
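
The handler-based workflow described above can be sketched as follows.
This is a minimal illustration, assuming a Python 3 interpreter; the
helper name \code{collect_events} and the sample document are
hypothetical and not part of the module.

```python
import xml.parsers.expat


def collect_events(document):
    """Parse `document` and return (events, text) as seen by the handlers."""
    events = []
    text_chunks = []

    def start_element(name, attributes):
        events.append(("start", name, dict(attributes)))

    def end_element(name):
        events.append(("end", name))

    def character_data(data):
        # Character data may arrive in several chunks, so accumulate it.
        text_chunks.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = character_data
    parser.Parse(document, True)
    return events, "".join(text_chunks)


events, text = collect_events('<greeting lang="en">hello</greeting>')
```

Handlers are plain attributes, so any callable with the right number of
arguments can be assigned.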

This module uses the \module{pyexpat}\refbimodindex{pyexpat} module to
provide access to the Expat parser. Direct use of the
\module{pyexpat} module is deprecated.

This module provides one exception, an alias for it, and one type object:

\begin{excdesc}{ExpatError}
The exception raised when Expat reports an error. See section
\ref{expaterror-objects}, ``ExpatError Exceptions,'' for more
information on interpreting Expat errors.
\end{excdesc}

\begin{excdesc}{error}
Alias for \exception{ExpatError}.
\end{excdesc}

\begin{datadesc}{XMLParserType}
The type of the return values from the \function{ParserCreate()}
function.
\end{datadesc}


The \module{xml.parsers.expat} module contains two functions:

\begin{funcdesc}{ErrorString}{errno}
Returns an explanatory string for a given error number \var{errno}.
\end{funcdesc}
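
As a rough sketch of how \function{ErrorString()} might be used, the
following catches \exception{ExpatError} and looks up the message for
its \member{code} attribute. It assumes Python 3; the helper name
\code{describe_failure} is hypothetical.

```python
import xml.parsers.expat


def describe_failure(document):
    """Return (code, message) for a malformed document, or None if it parses."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError as exc:
        # exc.code is Expat's numeric error code for this failure.
        return exc.code, xml.parsers.expat.ErrorString(exc.code)
    return None


result = describe_failure("<a><b></a>")   # mismatched tags
```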

\begin{funcdesc}{ParserCreate}{\optional{encoding\optional{,
                               namespace_separator}}}
Creates and returns a new \class{xmlparser} object.
\var{encoding}, if specified, must be a string naming the encoding
used by the XML data. Expat doesn't support as many encodings as
Python does, and its repertoire of encodings can't be extended; it
supports UTF-8, UTF-16, ISO-8859-1 (Latin1), and ASCII. If
\var{encoding} is given it will override the implicit or explicit
encoding of the document.

Expat can optionally do XML namespace processing for you, enabled by
providing a value for \var{namespace_separator}. The value must be a
one-character string; a \exception{ValueError} will be raised if the
string has an illegal length (\code{None} is considered the same as
omission). When namespace processing is enabled, element type names
and attribute names that belong to a namespace will be expanded. The
element name passed to the element handlers
\member{StartElementHandler} and \member{EndElementHandler}
will be the concatenation of the namespace URI, the namespace
separator character, and the local part of the name. If the namespace
separator is a zero byte (\code{chr(0)}) then the namespace URI and
the local part will be concatenated without any separator.

For example, if \var{namespace_separator} is set to a space character
(\character{ }) and the following document is parsed:

\begin{verbatim}
<?xml version="1.0"?>
<root xmlns = "http://default-namespace.org/"
      xmlns:py = "http://www.python.org/ns/">
  <py:elem1 />
  <elem2 xmlns="" />
</root>
\end{verbatim}

\member{StartElementHandler} will receive the following strings
for each element:

\begin{verbatim}
http://default-namespace.org/ root
http://www.python.org/ns/ elem1
elem2
\end{verbatim}
\end{funcdesc}
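
The namespace example above can be reproduced with a short script. This
is a sketch assuming Python 3, where \function{ParserCreate()} accepts
\var{namespace_separator} as a keyword argument; the variable names are
illustrative only.

```python
import xml.parsers.expat

DOCUMENT = (
    '<?xml version="1.0"?>\n'
    '<root xmlns = "http://default-namespace.org/"\n'
    '      xmlns:py = "http://www.python.org/ns/">\n'
    '  <py:elem1 />\n'
    '  <elem2 xmlns="" />\n'
    '</root>\n'
)

names = []
# A single space is used as the separator between URI and local name.
parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
parser.StartElementHandler = lambda name, attributes: names.append(name)
parser.Parse(DOCUMENT, True)
```

After parsing, \code{names} holds one expanded name per element, in
document order.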


\begin{seealso}
  \seetitle[http://www.libexpat.org/]{The Expat XML Parser}
           {Home page of the Expat project.}
\end{seealso}


\subsection{XMLParser Objects \label{xmlparser-objects}}

\class{xmlparser} objects have the following methods:

\begin{methoddesc}[xmlparser]{Parse}{data\optional{, isfinal}}
Parses the contents of the string \var{data}, calling the appropriate
handler functions to process the parsed data. \var{isfinal} must be
true on the final call to this method. \var{data} can be the empty
string at any time.
\end{methoddesc}
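
Incremental parsing with \method{Parse()} might look like the sketch
below, which feeds a document in arbitrary pieces and finishes with an
empty final call. It assumes Python 3; \code{parse_in_chunks} is a
hypothetical helper.

```python
import xml.parsers.expat


def parse_in_chunks(chunks):
    """Feed string chunks to one parser and return the element names seen."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attributes: names.append(name)
    for chunk in chunks:
        parser.Parse(chunk, False)   # more data may follow
    parser.Parse("", True)           # empty final call ends the document
    return names


# Chunk boundaries may fall anywhere, even inside a tag.
names = parse_in_chunks(["<doc><it", "em/><item", "/></doc>"])
```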

\begin{methoddesc}[xmlparser]{ParseFile}{file}
Parse XML data reading from the object \var{file}. \var{file} only
needs to provide the \method{read(\var{nbytes})} method, returning the
empty string when there's no more data.
\end{methoddesc}
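
A minimal sketch of \method{ParseFile()}, assuming Python 3, where the
\method{read()} method is expected to return bytes; an in-memory
\class{io.BytesIO} object stands in for a real file here.

```python
import io
import xml.parsers.expat

names = []
parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = lambda name, attributes: names.append(name)

# Any object with a suitable read() method works; BytesIO stands in for a file.
source = io.BytesIO(b"<doc><item/><item/></doc>")
parser.ParseFile(source)
```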

\begin{methoddesc}[xmlparser]{SetBase}{base}
Sets the base to be used for resolving relative URIs in system
identifiers in declarations. Resolving relative identifiers is left
to the application: this value will be passed through as the
\var{base} argument to the \member{ExternalEntityRefHandler},
\member{NotationDeclHandler}, and
\member{UnparsedEntityDeclHandler} handlers.
\end{methoddesc}

\begin{methoddesc}[xmlparser]{GetBase}{}
Returns a string containing the base set by a previous call to
\method{SetBase()}, or \code{None} if
\method{SetBase()} hasn't been called.
\end{methoddesc}
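
The round trip between \method{SetBase()} and \method{GetBase()} can be
sketched as follows, assuming Python 3; the base URI is a made-up
example value.

```python
import xml.parsers.expat

parser = xml.parsers.expat.ParserCreate()
base_before = parser.GetBase()               # nothing has been set yet
parser.SetBase("http://example.org/docs/")   # hypothetical base URI
base_after = parser.GetBase()
```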

\begin{methoddesc}[xmlparser]{GetInputContext}{}
Returns the input data that generated the current event as a string.
The data is in the encoding of the entity which contains the text.
When called while an event handler is not active, the return value is
\code{None}.
\versionadded{2.1}
\end{methoddesc}

\begin{methoddesc}[xmlparser]{ExternalEntityParserCreate}{context\optional{,
                                                          encoding}}
Create a ``child'' parser which can be used to parse an external
parsed entity referred to by content parsed by the parent parser. The
\var{context} parameter should be the string passed to the
\member{ExternalEntityRefHandler} handler function, described below.
The child parser is created with the \member{ordered_attributes},
\member{returns_unicode} and \member{specified_attributes} set to the
values of this parser.
\end{methoddesc}

\begin{methoddesc}[xmlparser]{UseForeignDTD}{\optional{flag}}
Calling this with a true value for \var{flag} (the default) will cause
Expat to call the \member{ExternalEntityRefHandler} with
\constant{None} for all arguments to allow an alternate DTD to be
loaded. If the document does not contain a document type declaration,
the \member{ExternalEntityRefHandler} will still be called, but the
\member{StartDoctypeDeclHandler} and \member{EndDoctypeDeclHandler}
will not be called.

Passing a false value for \var{flag} will cancel a previous call that
passed a true value, but otherwise has no effect.

This method can only be called before the \method{Parse()} or
\method{ParseFile()} methods are called; calling it after either of
those has been called causes \exception{ExpatError} to be raised with
the \member{code} attribute set to
\constant{errors.XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING}.

\versionadded{2.3}
\end{methoddesc}


\class{xmlparser} objects have the following attributes:

\begin{memberdesc}[xmlparser]{buffer_size}
The size of the buffer used when \member{buffer_text} is true. This
value cannot be changed at this time.
\versionadded{2.3}
\end{memberdesc}

\begin{memberdesc}[xmlparser]{buffer_text}
Setting this to true causes the \class{xmlparser} object to buffer
textual content returned by Expat to avoid multiple calls to the
\member{CharacterDataHandler} callback whenever possible. This can
improve performance substantially since Expat normally breaks
character data into chunks at every line ending. This attribute is
false by default, and may be changed at any time.
\versionadded{2.3}
\end{memberdesc}
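
The effect of \member{buffer_text} can be observed with a small
comparison such as the one below. It is a sketch assuming Python 3;
\code{character_chunks} is a hypothetical helper, and the exact number
of unbuffered chunks is left unspecified because it depends on Expat.

```python
import xml.parsers.expat


def character_chunks(document, buffer_text):
    """Return the list of strings passed to CharacterDataHandler."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = buffer_text
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return chunks


DOCUMENT = "<doc>first line\nsecond line\nthird line</doc>"
unbuffered = character_chunks(DOCUMENT, False)   # possibly several chunks
buffered = character_chunks(DOCUMENT, True)      # one call for the whole run
```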

\begin{memberdesc}[xmlparser]{buffer_used}
If \member{buffer_text} is enabled, the number of bytes stored in the
buffer. These bytes represent UTF-8 encoded text. This attribute has
no meaningful interpretation when \member{buffer_text} is false.
\versionadded{2.3}
\end{memberdesc}

\begin{memberdesc}[xmlparser]{ordered_attributes}
Setting this attribute to a non-zero integer causes the attributes to
be reported as a list rather than a dictionary. The attributes are
presented in the order found in the document text. For each
attribute, two list entries are presented: the attribute name and the
attribute value. (Older versions of this module also used this
format.) By default, this attribute is false; it may be changed at
any time.
\versionadded{2.1}
\end{memberdesc}
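
The two attribute formats can be compared side by side. The following
is an illustrative sketch assuming Python 3; \code{first_attributes}
is a hypothetical helper.

```python
import xml.parsers.expat


def first_attributes(document, ordered):
    """Return the attributes object passed for the first start tag."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.ordered_attributes = ordered
    parser.StartElementHandler = lambda name, attributes: seen.append(attributes)
    parser.Parse(document, True)
    return seen[0]


DOCUMENT = '<item zeta="1" alpha="2"/>'
as_dict = first_attributes(DOCUMENT, False)   # mapping of name to value
as_list = first_attributes(DOCUMENT, True)    # flat [name, value, ...] list
```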

\begin{memberdesc}[xmlparser]{returns_unicode}
If this attribute is set to a non-zero integer, the handler functions
will be passed Unicode strings. If \member{returns_unicode} is 0,
8-bit strings containing UTF-8 encoded data will be passed to the
handlers.
\versionchanged[Can be changed at any time to affect the result
                type]{1.6}
\end{memberdesc}

\begin{memberdesc}[xmlparser]{specified_attributes}
If set to a non-zero integer, the parser will report only those
attributes which were specified in the document instance and not those
which were derived from attribute declarations. Applications which
set this need to be especially careful to use what additional
information is available from the declarations as needed to comply
with the standards for the behavior of XML processors. By default,
this attribute is false; it may be changed at any time.
\versionadded{2.1}
\end{memberdesc}
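
To illustrate \member{specified_attributes}, the sketch below parses a
document whose internal subset declares a default attribute value. It
assumes Python 3; the document and the helper
\code{reported_attributes} are made up for the example.

```python
import xml.parsers.expat

DOCUMENT = (
    '<!DOCTYPE item [<!ATTLIST item colour CDATA "red">]>'
    '<item size="3"/>'
)


def reported_attributes(specified_only):
    """Return the attribute dictionary reported for the <item> element."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.specified_attributes = specified_only
    parser.StartElementHandler = lambda name, attributes: seen.append(attributes)
    parser.Parse(DOCUMENT, True)
    return seen[0]


with_defaults = reported_attributes(False)   # includes the declared default
specified = reported_attributes(True)        # only what the instance wrote
```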

The following attributes contain values relating to the most recent
error encountered by an \class{xmlparser} object, and will only have
correct values once a call to \method{Parse()} or \method{ParseFile()}
has raised an \exception{xml.parsers.expat.ExpatError} exception.

\begin{memberdesc}[xmlparser]{ErrorByteIndex}
Byte index at which an error occurred.
\end{memberdesc}

\begin{memberdesc}[xmlparser]{ErrorCode}
Numeric code specifying the problem. This value can be passed to the
\function{ErrorString()} function, or compared to one of the constants
defined in the \code{errors} object.
\end{memberdesc}

\begin{memberdesc}[xmlparser]{ErrorColumnNumber}
Column number at which an error occurred.
\end{memberdesc}

\begin{memberdesc}[xmlparser]{ErrorLineNumber}
Line number at which an error occurred.
\end{memberdesc}
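
A sketch of reading the error attributes after a failed parse, assuming
Python 3, where the exception object also carries \member{code},
\member{lineno} and \member{offset} attributes:

```python
import xml.parsers.expat

parser = xml.parsers.expat.ParserCreate()
error = None
try:
    parser.Parse("<a>\n<b></a>", True)   # mismatched tags on line 2
except xml.parsers.expat.ExpatError as exc:
    error = exc

# The parser attributes are only meaningful once an error has been raised.
location = (parser.ErrorLineNumber, parser.ErrorColumnNumber)
error_code = parser.ErrorCode
```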
Andrew M. Kuchling | 6b14eeb | 2000-06-11 02:42:07 +0000 | [diff] [blame] | 259 | |
Dave Cole | 3203efb | 2004-08-26 00:37:31 +0000 | [diff] [blame] | 260 | The following attributes contain values relating to the current parse |
| 261 | location in an \class{xmlparser} object. During a callback reporting |
| 262 | a parse event they indicate the location of the first of the sequence |
| 263 | of characters that generated the event. When called outside of a |
| 264 | callback, the position indicated will be just past the last parse |
| 265 | event (regardless of whether there was an associated callback). |
| 266 | \versionadded{2.4} |
| 267 | |
| 268 | \begin{memberdesc}[xmlparser]{CurrentByteIndex} |
| 269 | Current byte index in the parser input. |
| 270 | \end{memberdesc} |
| 271 | |
| 272 | \begin{memberdesc}[xmlparser]{CurrentColumnNumber} |
| 273 | Current column number in the parser input. |
| 274 | \end{memberdesc} |
| 275 | |
| 276 | \begin{memberdesc}[xmlparser]{CurrentLineNumber} |
| 277 | Current line number in the parser input. |
| 278 | \end{memberdesc} |
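
The location attributes are most useful inside a callback. The sketch
below, assuming Python 3, records the position of each start tag; it
relies on Expat counting lines from 1 and columns from 0.

```python
import xml.parsers.expat

positions = []
parser = xml.parsers.expat.ParserCreate()


def start_element(name, attributes):
    # Inside a callback, the position refers to the start of this event.
    positions.append(
        (name, parser.CurrentLineNumber, parser.CurrentColumnNumber))


parser.StartElementHandler = start_element
parser.Parse("<a>\n  <b/>\n</a>", True)
```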

Here is the list of handlers that can be set. To set a handler on an
\class{xmlparser} object \var{o}, use
\code{\var{o}.\var{handlername} = \var{func}}. \var{handlername} must
be taken from the following list, and \var{func} must be a callable
object accepting the correct number of arguments. The arguments are
all strings, unless otherwise stated.

\begin{methoddesc}[xmlparser]{XmlDeclHandler}{version, encoding, standalone}
Called when the XML declaration is parsed. The XML declaration is the
(optional) declaration of the applicable version of the XML
recommendation, the encoding of the document text, and an optional
``standalone'' declaration. \var{version} and \var{encoding} will be
strings of the type dictated by the \member{returns_unicode}
attribute, and \var{standalone} will be \code{1} if the document is
declared standalone, \code{0} if it is declared not to be standalone,
or \code{-1} if the standalone clause was omitted.
This is only available with Expat version 1.95.0 or newer.
\versionadded{2.1}
\end{methoddesc}
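
The three values of \var{standalone} can be checked with a short
sketch like this one, assuming Python 3; \code{xml_declaration} is a
hypothetical helper.

```python
import xml.parsers.expat


def xml_declaration(document):
    """Return (version, encoding, standalone) from the XML declaration."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = (
        lambda version, encoding, standalone:
            seen.append((version, encoding, standalone)))
    parser.Parse(document, True)
    return seen[0]


full = xml_declaration(
    '<?xml version="1.0" encoding="utf-8" standalone="yes"?><doc/>')
minimal = xml_declaration('<?xml version="1.0"?><doc/>')
```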

\begin{methoddesc}[xmlparser]{StartDoctypeDeclHandler}{doctypeName,
                                                       systemId, publicId,
                                                       has_internal_subset}
Called when Expat begins parsing the document type declaration
(\code{<!DOCTYPE \ldots}). The \var{doctypeName} is provided exactly
as presented. The \var{systemId} and \var{publicId} parameters give
the system and public identifiers if specified, or \code{None} if
omitted. \var{has_internal_subset} will be true if the document
contains an internal document declaration subset.
This requires Expat version 1.2 or newer.
\end{methoddesc}

\begin{methoddesc}[xmlparser]{EndDoctypeDeclHandler}{}
Called when Expat is done parsing the document type declaration.
This requires Expat version 1.2 or newer.
\end{methoddesc}
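
The two document type declaration handlers can be exercised together,
as in the following sketch. It assumes Python 3; the system identifier
\code{doc.dtd} is a made-up name and is never fetched, since no
external entity handler is installed.

```python
import xml.parsers.expat

events = []
parser = xml.parsers.expat.ParserCreate()


def start_doctype(doctype_name, system_id, public_id, has_internal_subset):
    events.append(("start", doctype_name, system_id, public_id,
                   bool(has_internal_subset)))


parser.StartDoctypeDeclHandler = start_doctype
parser.EndDoctypeDeclHandler = lambda: events.append(("end",))
parser.Parse(
    '<!DOCTYPE doc SYSTEM "doc.dtd" [<!ELEMENT doc EMPTY>]><doc/>', True)
```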

\begin{methoddesc}[xmlparser]{ElementDeclHandler}{name, model}
Called once for each element type declaration. \var{name} is the name
of the element type, and \var{model} is a representation of the
content model.
\end{methoddesc}

\begin{methoddesc}[xmlparser]{AttlistDeclHandler}{elname, attname,
                                                  type, default, required}
Called for each declared attribute for an element type. If an
attribute list declaration declares three attributes, this handler is
called three times, once for each attribute. \var{elname} is the name
of the element to which the declaration applies and \var{attname} is
the name of the attribute declared. The attribute type is a string
passed as \var{type}; the possible values are \code{'CDATA'},
\code{'ID'}, \code{'IDREF'}, \ldots
\var{default} gives the default value for the attribute used when the
attribute is not specified by the document instance, or \code{None} if
there is no default value (\code{\#IMPLIED} values). If the attribute
is required to be given in the document instance, \var{required} will
be true.
This requires Expat version 1.95.0 or newer.
\end{methoddesc}
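
As a sketch of the ``called once per attribute'' behaviour, the
following declares three attributes in one attribute list declaration.
It assumes Python 3; the element and attribute names are invented for
the example.

```python
import xml.parsers.expat

declared = []
parser = xml.parsers.expat.ParserCreate()


def attlist_decl(elname, attname, att_type, default, required):
    declared.append((elname, attname, att_type, default, bool(required)))


parser.AttlistDeclHandler = attlist_decl
parser.Parse(
    '<!DOCTYPE item ['
    '<!ATTLIST item id ID #REQUIRED'
    '               colour CDATA "red"'
    '               note CDATA #IMPLIED>'
    ']><item id="i1"/>',
    True,
)
```

One call is recorded for each of the three declared attributes.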

\begin{methoddesc}[xmlparser]{StartElementHandler}{name, attributes}
Called for the start of every element. \var{name} is a string
containing the element name, and \var{attributes} is a dictionary
mapping attribute names to their values.
\end{methoddesc}

\begin{methoddesc}[xmlparser]{EndElementHandler}{name}
Called for the end of every element.
\end{methoddesc}

\begin{methoddesc}[xmlparser]{ProcessingInstructionHandler}{target, data}
Called for every processing instruction.
\end{methoddesc}

\begin{methoddesc}[xmlparser]{CharacterDataHandler}{data}
Called for character data. This will be called for normal character
data, CDATA marked content, and ignorable whitespace. Applications
which must distinguish these cases can use the
\member{StartCdataSectionHandler}, \member{EndCdataSectionHandler},
and \member{ElementDeclHandler} callbacks to collect the required
information.
\end{methoddesc}

\begin{methoddesc}[xmlparser]{UnparsedEntityDeclHandler}{entityName, base,
                                                         systemId, publicId,
                                                         notationName}
Called for unparsed (NDATA) entity declarations. This is only present
for version 1.2 of the Expat library; for more recent versions, use
\member{EntityDeclHandler} instead. (The underlying function in the
Expat library has been declared obsolete.)
\end{methoddesc}

\begin{methoddesc}[xmlparser]{EntityDeclHandler}{entityName,
                                                 is_parameter_entity, value,
                                                 base, systemId,
                                                 publicId,
                                                 notationName}
Called for all entity declarations. For parameter and internal
entities, \var{value} will be a string giving the declared contents
of the entity; this will be \code{None} for external entities. The
\var{notationName} parameter will be \code{None} for parsed entities,
and the name of the notation for unparsed entities.
\var{is_parameter_entity} will be true if the entity is a parameter
entity or false for general entities (most applications only need to
be concerned with general entities).
This is only available starting with version 1.95.0 of the Expat
library.
\versionadded{2.1}
\end{methoddesc}
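
The difference between internal and unparsed entities, as reported to
\member{EntityDeclHandler}, can be sketched as follows. This assumes
Python 3; the entity names and the \code{logo.png} identifier are
hypothetical.

```python
import xml.parsers.expat

entities = []
parser = xml.parsers.expat.ParserCreate()


def entity_decl(entity_name, is_parameter_entity, value, base,
                system_id, public_id, notation_name):
    entities.append((entity_name, bool(is_parameter_entity), value,
                     system_id, notation_name))


parser.EntityDeclHandler = entity_decl
parser.Parse(
    '<!DOCTYPE doc ['
    '<!ENTITY greeting "hello">'                    # internal entity
    '<!ENTITY logo SYSTEM "logo.png" NDATA png>'    # unparsed entity
    ']><doc/>',
    True,
)
```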

\begin{methoddesc}[xmlparser]{NotationDeclHandler}{notationName, base,
                                                   systemId, publicId}
Called for notation declarations. \var{notationName}, \var{base},
\var{systemId}, and \var{publicId} are strings if given. If the
public identifier is omitted, \var{publicId} will be \code{None}.
\end{methoddesc}

\begin{methoddesc}[xmlparser]{StartNamespaceDeclHandler}{prefix, uri}
Called when an element contains a namespace declaration. Namespace
declarations are processed before the \member{StartElementHandler} is
called for the element on which declarations are placed.
\end{methoddesc}

\begin{methoddesc}[xmlparser]{EndNamespaceDeclHandler}{prefix}
Called when the closing tag is reached for an element
that contained a namespace declaration. This is called once for each
namespace declaration on the element in the reverse of the order for
which the \member{StartNamespaceDeclHandler} was called to indicate
the start of each namespace declaration's scope. Calls to this
handler are made after the corresponding \member{EndElementHandler}
for the end of the element.
\end{methoddesc}
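
The ordering of the namespace declaration handlers relative to the
element handlers can be observed with a sketch like the following. It
assumes Python 3 and a parser created with a namespace separator; the
URI \code{urn:example} is invented.

```python
import xml.parsers.expat

events = []
parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
parser.StartNamespaceDeclHandler = (
    lambda prefix, uri: events.append(("ns-start", prefix, uri)))
parser.EndNamespaceDeclHandler = (
    lambda prefix: events.append(("ns-end", prefix)))
parser.StartElementHandler = (
    lambda name, attributes: events.append(("start", name)))
parser.EndElementHandler = lambda name: events.append(("end", name))
parser.Parse('<r xmlns:p="urn:example"><p:c/></r>', True)
```

The namespace events bracket the element events of the element that
carries the declaration.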

\begin{methoddesc}[xmlparser]{CommentHandler}{data}
Called for comments. \var{data} is the text of the comment, excluding
the leading `\code{<!-}\code{-}' and trailing `\code{-}\code{->}'.
\end{methoddesc}

\begin{methoddesc}[xmlparser]{StartCdataSectionHandler}{}
Called at the start of a CDATA section. This and
\member{EndCdataSectionHandler} are needed to be able to identify
the syntactical start and end for CDATA sections.
\end{methoddesc}

\begin{methoddesc}[xmlparser]{EndCdataSectionHandler}{}
Called at the end of a CDATA section.
\end{methoddesc}
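
One way to tell CDATA content apart from ordinary character data is to
track a flag between the two CDATA section handlers, as in this sketch.
It assumes Python 3; the \code{state} dictionary and list names are
illustrative.

```python
import xml.parsers.expat

plain, cdata = [], []
state = {"in_cdata": False}
parser = xml.parsers.expat.ParserCreate()


def start_cdata():
    state["in_cdata"] = True


def end_cdata():
    state["in_cdata"] = False


def character_data(data):
    # Route each chunk according to whether a CDATA section is open.
    (cdata if state["in_cdata"] else plain).append(data)


parser.StartCdataSectionHandler = start_cdata
parser.EndCdataSectionHandler = end_cdata
parser.CharacterDataHandler = character_data
parser.Parse("<doc>before<![CDATA[<raw>]]>after</doc>", True)
```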

\begin{methoddesc}[xmlparser]{DefaultHandler}{data}
Called for any characters in the XML document for
which no applicable handler has been specified. This means
characters that are part of a construct which could be reported, but
for which no handler has been supplied.
\end{methoddesc}

\begin{methoddesc}[xmlparser]{DefaultHandlerExpand}{data}
This is the same as the \member{DefaultHandler},
but doesn't inhibit expansion of internal entities.
The entity reference will not be passed to the default handler.
\end{methoddesc}

\begin{methoddesc}[xmlparser]{NotStandaloneHandler}{}
Called if the XML document hasn't been declared as being a standalone
document. This happens when there is an external subset or a reference
to a parameter entity, but the XML declaration does not set standalone
to \code{yes}. If this handler returns \code{0}, then the parser will
raise an \constant{XML_ERROR_NOT_STANDALONE} error. If this handler is
not set, no exception is raised by the parser for this condition.
\end{methoddesc}

\begin{methoddesc}[xmlparser]{ExternalEntityRefHandler}{context, base,
                                                        systemId, publicId}
Called for references to external entities. \var{base} is the current
base, as set by a previous call to \method{SetBase()}. The system and
public identifiers, \var{systemId} and \var{publicId}, are strings if
given; if the public identifier is not given, \var{publicId} will be
\code{None}. The \var{context} value is opaque and should only be
used as described below.

For external entities to be parsed, this handler must be implemented.
It is responsible for creating the sub-parser using
\code{ExternalEntityParserCreate(\var{context})}, initializing it with
the appropriate callbacks, and parsing the entity. This handler
should return an integer; if it returns \code{0}, the parser will
raise an \constant{XML_ERROR_EXTERNAL_ENTITY_HANDLING} error,
otherwise parsing will continue.

If this handler is not provided, external entities are reported by the
\member{DefaultHandler} callback, if provided.
\end{methoddesc}
| 471 | |
| 472 | |
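The steps described for \member{ExternalEntityRefHandler} can be
sketched as follows.  The dictionary \code{ENTITY_SOURCES} is a
hypothetical stand-in for whatever an application would really use to
fetch entity content (a file, a catalog, a network resource), and the
names \code{parse_with_entities} and \code{greeting.xml} are likewise
invented for the example.

```python
import xml.parsers.expat

# Hypothetical resolver table: system identifier -> entity content.
ENTITY_SOURCES = {'greeting.xml': 'hello from the entity'}

DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ENTITY greeting SYSTEM "greeting.xml">
]>
<doc>&greeting;</doc>"""


def parse_with_entities(document):
    """Parse document, expanding external entities; return its text."""
    text = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = text.append

    def external_entity_ref(context, base, system_id, public_id):
        source = ENTITY_SOURCES.get(system_id)
        if source is None:
            # 0 makes the parser report an external entity handling error.
            return 0
        # The opaque context is only passed on to create the sub-parser.
        sub = parser.ExternalEntityParserCreate(context)
        sub.CharacterDataHandler = text.append
        sub.Parse(source, True)
        return 1

    parser.ExternalEntityRefHandler = external_entity_ref
    parser.Parse(document, True)
    return ''.join(text)


result = parse_with_entities(DOCUMENT)
print(result)
```

A real handler would normally resolve \var{systemId} relative to
\var{base} and should be careful about which sources it is willing to
open.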
\subsection{ExpatError Exceptions \label{expaterror-objects}}
\sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org}

\exception{ExpatError} exceptions have a number of interesting
attributes:

\begin{memberdesc}[ExpatError]{code}
  Expat's internal error number for the specific error.  This will
  match one of the constants defined in the \code{errors} object from
  this module.
  \versionadded{2.1}
\end{memberdesc}

\begin{memberdesc}[ExpatError]{lineno}
  Line number on which the error was detected.  The first line is
  numbered \code{1}.
  \versionadded{2.1}
\end{memberdesc}

\begin{memberdesc}[ExpatError]{offset}
  Character offset into the line where the error occurred.  The first
  column is numbered \code{0}.
  \versionadded{2.1}
\end{memberdesc}


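A short sketch of reading these attributes from a caught
\exception{ExpatError} follows.  The helper name \code{describe_error}
is invented for the example.  In CPython the constants in \code{errors}
are message strings rather than numbers, so this sketch assumes that
\member{code} has to be translated with \function{ErrorString()} before
it is compared with one of them.

```python
import xml.parsers.expat
from xml.parsers.expat import errors


def describe_error(document):
    """Return (message, lineno, offset) for the first error, or None."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return message, err.lineno, err.offset
    return None


# The end tag on the second line does not match the open <b> element.
info = describe_error('<a>\n<b></a>')
print(info)
print(info[0] == errors.XML_ERROR_TAG_MISMATCH)
```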
\subsection{Example \label{expat-example}}

The following program defines three handlers that just print out their
arguments.

\begin{verbatim}
import xml.parsers.expat

# 3 handler functions
def start_element(name, attrs):
    print('Start element:', name, attrs)
def end_element(name):
    print('End element:', name)
def char_data(data):
    print('Character data:', repr(data))

p = xml.parsers.expat.ParserCreate()

p.StartElementHandler = start_element
p.EndElementHandler = end_element
p.CharacterDataHandler = char_data

# The second argument marks this as the final chunk of input.
p.Parse("""<?xml version="1.0"?>
<parent id="top"><child1 name="paul">Text goes here</child1>
<child2 name="fred">More text</child2>
</parent>""", 1)
\end{verbatim}

The output from this program is:

\begin{verbatim}
Start element: parent {'id': 'top'}
Start element: child1 {'name': 'paul'}
Character data: 'Text goes here'
End element: child1
Character data: '\n'
Start element: child2 {'name': 'fred'}
Character data: 'More text'
End element: child2
Character data: '\n'
End element: parent
\end{verbatim}


\subsection{Content Model Descriptions \label{expat-content-models}}
\sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org}

Content models are described using nested tuples.  Each tuple
contains four values: the type, the quantifier, the name, and a tuple
of children.  Children are simply additional content model
descriptions.

The values of the first two fields are constants defined in the
\code{model} object of the \module{xml.parsers.expat} module.  These
constants can be collected in two groups: the model type group and the
quantifier group.

The constants in the model type group are:

\begin{datadescni}{XML_CTYPE_ANY}
The element named by the model name was declared to have a content
model of \code{ANY}.
\end{datadescni}

\begin{datadescni}{XML_CTYPE_CHOICE}
The named element allows a choice from a number of options; this is
used for content models such as \code{(A | B | C)}.
\end{datadescni}

\begin{datadescni}{XML_CTYPE_EMPTY}
Elements which are declared to be \code{EMPTY} have this model type.
\end{datadescni}

\begin{datadescni}{XML_CTYPE_MIXED}
\end{datadescni}

\begin{datadescni}{XML_CTYPE_NAME}
\end{datadescni}

\begin{datadescni}{XML_CTYPE_SEQ}
Models which represent a series of models which follow one after the
other are indicated with this model type.  This is used for models
such as \code{(A, B, C)}.
\end{datadescni}


The constants in the quantifier group are:

\begin{datadescni}{XML_CQUANT_NONE}
No modifier is given, so it can appear exactly once, as for \code{A}.
\end{datadescni}

\begin{datadescni}{XML_CQUANT_OPT}
The model is optional: it can appear once or not at all, as for
\code{A?}.
\end{datadescni}

\begin{datadescni}{XML_CQUANT_PLUS}
The model must occur one or more times, as for \code{A+}.
\end{datadescni}

\begin{datadescni}{XML_CQUANT_REP}
The model may occur zero or more times, as for \code{A*}.
\end{datadescni}


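To make the nested tuples more concrete, the sketch below collects the
models reported through \member{ElementDeclHandler} and turns each one
back into DTD-like syntax.  The names \code{render},
\code{element_models} and \code{QUANTIFIERS} are invented for the
example.  The handling of \constant{XML_CTYPE_MIXED} rests on the
assumption that only the element names appear as children, so
\code{\#PCDATA} has to be put back by hand.

```python
import xml.parsers.expat
from xml.parsers.expat import model

# Suffix written after a model for each quantifier constant.
QUANTIFIERS = {
    model.XML_CQUANT_NONE: '',
    model.XML_CQUANT_OPT: '?',
    model.XML_CQUANT_PLUS: '+',
    model.XML_CQUANT_REP: '*',
}


def render(content_model):
    """Turn a (type, quantifier, name, children) tuple into DTD-like text."""
    kind, quant, name, children = content_model
    suffix = QUANTIFIERS[quant]
    if kind == model.XML_CTYPE_EMPTY:
        return 'EMPTY'
    if kind == model.XML_CTYPE_ANY:
        return 'ANY'
    if kind == model.XML_CTYPE_NAME:
        return name + suffix
    separator = ', ' if kind == model.XML_CTYPE_SEQ else ' | '
    parts = [render(child) for child in children]
    if kind == model.XML_CTYPE_MIXED:
        # Assumption: #PCDATA is implied and not listed among the children.
        parts.insert(0, '#PCDATA')
    return '(' + separator.join(parts) + ')' + suffix


def element_models(document):
    """Return a dict mapping declared element names to content models."""
    models = {}
    parser = xml.parsers.expat.ParserCreate()

    def element_decl(name, content_model):
        models[name] = content_model

    parser.ElementDeclHandler = element_decl
    parser.Parse(document, True)
    return models


DOCUMENT = """<!DOCTYPE doc [
  <!ELEMENT doc (title, (para | note)*)>
  <!ELEMENT title EMPTY>
]>
<doc><title/></doc>"""

models = element_models(DOCUMENT)
for name in sorted(models):
    print(name, render(models[name]))
```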
\subsection{Expat error constants \label{expat-errors}}

The following constants are provided in the \code{errors} object of
the \refmodule{xml.parsers.expat} module.  These constants are useful
in interpreting some of the attributes of the \exception{ExpatError}
exception objects raised when an error has occurred.

The \code{errors} object has the following attributes:

\begin{datadescni}{XML_ERROR_ASYNC_ENTITY}
\end{datadescni}

\begin{datadescni}{XML_ERROR_ATTRIBUTE_EXTERNAL_ENTITY_REF}
An entity reference in an attribute value referred to an external
entity instead of an internal entity.
\end{datadescni}

\begin{datadescni}{XML_ERROR_BAD_CHAR_REF}
A character reference referred to a character which is illegal in XML
(for example, character \code{0}, or `\code{\&\#0;}').
\end{datadescni}

\begin{datadescni}{XML_ERROR_BINARY_ENTITY_REF}
An entity reference referred to an entity which was declared with a
notation, so cannot be parsed.
\end{datadescni}

\begin{datadescni}{XML_ERROR_DUPLICATE_ATTRIBUTE}
An attribute was used more than once in a start tag.
\end{datadescni}

\begin{datadescni}{XML_ERROR_INCORRECT_ENCODING}
\end{datadescni}

\begin{datadescni}{XML_ERROR_INVALID_TOKEN}
Raised when an input byte could not properly be assigned to a
character; for example, a NUL byte (value \code{0}) in a UTF-8 input
stream.
\end{datadescni}

\begin{datadescni}{XML_ERROR_JUNK_AFTER_DOC_ELEMENT}
Something other than whitespace occurred after the document element.
\end{datadescni}

\begin{datadescni}{XML_ERROR_MISPLACED_XML_PI}
An XML declaration was found somewhere other than the start of the
input data.
\end{datadescni}

\begin{datadescni}{XML_ERROR_NO_ELEMENTS}
The document contains no elements (XML requires all documents to
contain exactly one top-level element).
\end{datadescni}

\begin{datadescni}{XML_ERROR_NO_MEMORY}
Expat was not able to allocate memory internally.
\end{datadescni}

\begin{datadescni}{XML_ERROR_PARAM_ENTITY_REF}
A parameter entity reference was found where it was not allowed.
\end{datadescni}

\begin{datadescni}{XML_ERROR_PARTIAL_CHAR}
An incomplete character was found in the input.
\end{datadescni}

\begin{datadescni}{XML_ERROR_RECURSIVE_ENTITY_REF}
An entity reference contained another reference to the same entity,
possibly via a different name, and possibly indirectly.
\end{datadescni}

\begin{datadescni}{XML_ERROR_SYNTAX}
Some unspecified syntax error was encountered.
\end{datadescni}

\begin{datadescni}{XML_ERROR_TAG_MISMATCH}
An end tag did not match the innermost open start tag.
\end{datadescni}

\begin{datadescni}{XML_ERROR_UNCLOSED_TOKEN}
Some token (such as a start tag) was not closed before the end of the
stream or the next token was encountered.
\end{datadescni}

\begin{datadescni}{XML_ERROR_UNDEFINED_ENTITY}
A reference was made to an entity which was not defined.
\end{datadescni}

\begin{datadescni}{XML_ERROR_UNKNOWN_ENCODING}
The document encoding is not supported by Expat.
\end{datadescni}

\begin{datadescni}{XML_ERROR_UNCLOSED_CDATA_SECTION}
A CDATA marked section was not closed.
\end{datadescni}

\begin{datadescni}{XML_ERROR_EXTERNAL_ENTITY_HANDLING}
\end{datadescni}

\begin{datadescni}{XML_ERROR_NOT_STANDALONE}
The parser determined that the document was not ``standalone'' though
it declared itself to be in the XML declaration, and the
\member{NotStandaloneHandler} was set and returned \code{0}.
\end{datadescni}

\begin{datadescni}{XML_ERROR_UNEXPECTED_STATE}
\end{datadescni}

\begin{datadescni}{XML_ERROR_ENTITY_DECLARED_IN_PE}
\end{datadescni}

\begin{datadescni}{XML_ERROR_FEATURE_REQUIRES_XML_DTD}
An operation was requested that requires DTD support to be compiled
in, but Expat was configured without DTD support.  This should never
be reported by a standard build of the \module{xml.parsers.expat}
module.
\end{datadescni}

\begin{datadescni}{XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING}
A behavioral change was requested after parsing started that can only
be changed before parsing has started.  This is (currently) only
raised by \method{UseForeignDTD()}.
\end{datadescni}

\begin{datadescni}{XML_ERROR_UNBOUND_PREFIX}
An undeclared prefix was found when namespace processing was enabled.
\end{datadescni}

\begin{datadescni}{XML_ERROR_UNDECLARING_PREFIX}
The document attempted to remove the namespace declaration associated
with a prefix.
\end{datadescni}

\begin{datadescni}{XML_ERROR_INCOMPLETE_PE}
A parameter entity contained incomplete markup.
\end{datadescni}

\begin{datadescni}{XML_ERROR_XML_DECL}
The XML declaration was not well-formed.
\end{datadescni}

\begin{datadescni}{XML_ERROR_TEXT_DECL}
There was an error parsing a text declaration in an external entity.
\end{datadescni}

\begin{datadescni}{XML_ERROR_PUBLICID}
Characters were found in the public id that are not allowed.
\end{datadescni}

\begin{datadescni}{XML_ERROR_SUSPENDED}
An operation was requested on a suspended parser that is not allowed
in that state.  This includes attempts to provide additional input or
to stop the parser.
\end{datadescni}

\begin{datadescni}{XML_ERROR_NOT_SUSPENDED}
An attempt to resume the parser was made when the parser had not been
suspended.
\end{datadescni}

\begin{datadescni}{XML_ERROR_ABORTED}
This should not be reported to Python applications.
\end{datadescni}

\begin{datadescni}{XML_ERROR_FINISHED}
An operation was requested on a parser which had finished parsing
input, and the operation is not allowed in that state.  This includes
attempts to provide additional input or to stop the parser.
\end{datadescni}

\begin{datadescni}{XML_ERROR_SUSPEND_PE}
\end{datadescni}
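
As a rough illustration of how a few of these constants arise, the
sketch below feeds several malformed documents to fresh parsers and
compares the resulting message with the corresponding constant.  The
names \code{error_message} and \code{CASES} are invented for the
example, and the comparison assumes that the constants in
\code{errors} hold the message strings returned by
\function{ErrorString()}, as they do in CPython.

```python
import xml.parsers.expat
from xml.parsers.expat import errors


def error_message(document):
    """Return Expat's message for the first error in document, or None."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError as err:
        return xml.parsers.expat.ErrorString(err.code)
    return None


# Each malformed document is paired with the constant it should trigger.
CASES = [
    ('<a></b>', errors.XML_ERROR_TAG_MISMATCH),
    ('<a/><b/>', errors.XML_ERROR_JUNK_AFTER_DOC_ELEMENT),
    ('<a x="1" x="2"/>', errors.XML_ERROR_DUPLICATE_ATTRIBUTE),
    ('<a>&undefined;</a>', errors.XML_ERROR_UNDEFINED_ENTITY),
    ('', errors.XML_ERROR_NO_ELEMENTS),
]

for document, expected in CASES:
    print(repr(document), '->', error_message(document) == expected)
```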