:mod:`xml.parsers.expat` --- Fast XML parsing using Expat
=========================================================

.. module:: xml.parsers.expat
   :synopsis: An interface to the Expat non-validating XML parser.
.. moduleauthor:: Paul Prescod <paul@prescod.net>


.. Markup notes:

   Many of the attributes of the XMLParser objects are callbacks. Since
   signature information must be presented, these are described using the method
   directive. Since they are attributes which are set by client code, in-text
   references to these attributes should be marked using the :member: role.


.. warning::

   The :mod:`pyexpat` module is not secure against maliciously
   constructed data. If you need to parse untrusted or unauthenticated data see
   :ref:`xml-vulnerabilities`.


.. versionadded:: 2.0

.. index:: single: Expat

The :mod:`xml.parsers.expat` module is a Python interface to the Expat
non-validating XML parser. The module provides a single extension type,
:class:`xmlparser`, that represents the current state of an XML parser. After
an :class:`xmlparser` object has been created, various attributes of the object
can be set to handler functions. When an XML document is then fed to the
parser, the handler functions are called for the character data and markup in
the XML document.

.. index:: module: pyexpat

This module uses the :mod:`pyexpat` module to provide access to the Expat
parser. Direct use of the :mod:`pyexpat` module is deprecated.

This module provides one exception and one type object:


.. exception:: ExpatError

   The exception raised when Expat reports an error. See section
   :ref:`expaterror-objects` for more information on interpreting Expat errors.


.. exception:: error

   Alias for :exc:`ExpatError`.


.. data:: XMLParserType

   The type of the return values from the :func:`ParserCreate` function.

The :mod:`xml.parsers.expat` module contains two functions:


.. function:: ErrorString(errno)

   Returns an explanatory string for a given error number *errno*.


.. function:: ParserCreate([encoding[, namespace_separator]])

   Creates and returns a new :class:`xmlparser` object. *encoding*, if specified,
   must be a string naming the encoding used by the XML data. Expat doesn't
   support as many encodings as Python does, and its repertoire of encodings can't
   be extended; it supports UTF-8, UTF-16, ISO-8859-1 (Latin1), and ASCII. If
   *encoding* [1]_ is given it will override the implicit or explicit encoding of the
   document.

   Expat can optionally do XML namespace processing for you, enabled by providing a
   value for *namespace_separator*. The value must be a one-character string; a
   :exc:`ValueError` will be raised if the string has an illegal length (``None``
   is considered the same as omission). When namespace processing is enabled,
   element type names and attribute names that belong to a namespace will be
   expanded. The element name passed to the element handlers
   :attr:`StartElementHandler` and :attr:`EndElementHandler` will be the
   concatenation of the namespace URI, the namespace separator character, and the
   local part of the name. If the namespace separator is a zero byte (``chr(0)``)
   then the namespace URI and the local part will be concatenated without any
   separator.

   For example, if *namespace_separator* is set to a space character (``' '``) and
   the following document is parsed::

      <?xml version="1.0"?>
      <root xmlns    = "http://default-namespace.org/"
            xmlns:py = "http://www.python.org/ns/">
        <py:elem1 />
        <elem2 xmlns="" />
      </root>

   :attr:`StartElementHandler` will receive the following strings for each
   element::

      http://default-namespace.org/ root
      http://www.python.org/ns/ elem1
      elem2
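
The namespace expansion described for :func:`ParserCreate` can be reproduced
with a short script. This is only a sketch: ``DOCUMENT``, ``seen_names`` and
``start_element`` are illustrative names, and the document is passed as a byte
string so the same code runs on current Python releases.

```python
import xml.parsers.expat

DOCUMENT = b"""<?xml version="1.0"?>
<root xmlns    = "http://default-namespace.org/"
      xmlns:py = "http://www.python.org/ns/">
  <py:elem1 />
  <elem2 xmlns="" />
</root>"""

seen_names = []  # element names exactly as Expat reports them


def start_element(name, attrs):
    seen_names.append(name)


# Passing a one-character separator switches namespace processing on.
parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
parser.StartElementHandler = start_element
parser.Parse(DOCUMENT, True)

for name in seen_names:
    print(name)
```

Splitting each reported name on the separator recovers the namespace URI and
the local part; names without a separator belong to no namespace.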


.. seealso::

   `The Expat XML Parser <http://www.libexpat.org/>`_
      Home page of the Expat project.


.. _xmlparser-objects:

XMLParser Objects
-----------------

:class:`xmlparser` objects have the following methods:


.. method:: xmlparser.Parse(data[, isfinal])

   Parses the contents of the string *data*, calling the appropriate handler
   functions to process the parsed data. *isfinal* must be true on the final call
   to this method. *data* can be the empty string at any time.
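
A minimal sketch of incremental parsing with :meth:`Parse`, assuming the
input arrives in arbitrary pieces (the ``chunks`` list below is made up and
deliberately splits tags in the middle):

```python
import xml.parsers.expat

# Hypothetical input pieces, e.g. as they might arrive from a socket.
chunks = [b"<log><entry>fir", b"st</entry><ent", b"ry>second</entry></log>"]

started = []


def start_element(name, attrs):
    started.append(name)


parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = start_element

for chunk in chunks:
    parser.Parse(chunk, False)  # isfinal is false: more data may follow

parser.Parse(b"", True)  # an empty final call tells Expat the input is complete

print(started)
```

Handlers fire as soon as Expat has seen enough input to recognize a
construct, so the caller does not need to align chunks with tag boundaries.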


.. method:: xmlparser.ParseFile(file)

   Parse XML data reading from the object *file*. *file* only needs to provide
   the ``read(nbytes)`` method, returning the empty string when there's no more
   data.
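
Any object with a suitable ``read()`` method will do for :meth:`ParseFile`.
The sketch below uses an in-memory ``io.BytesIO`` object as a stand-in for a
real file opened in binary mode; ``counts`` and ``start_element`` are
illustrative names.

```python
import io
import xml.parsers.expat

# In-memory stand-in for a file opened in binary mode.
source = io.BytesIO(b"<inventory><item/><item/></inventory>")

counts = {}  # element name -> number of start tags seen


def start_element(name, attrs):
    counts[name] = counts.get(name, 0) + 1


parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = start_element
parser.ParseFile(source)  # reads until read() returns an empty result

print(counts)
```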


.. method:: xmlparser.SetBase(base)

   Sets the base to be used for resolving relative URIs in system identifiers in
   declarations. Resolving relative identifiers is left to the application: this
   value will be passed through as the *base* argument to the
   :func:`ExternalEntityRefHandler`, :func:`NotationDeclHandler`, and
   :func:`UnparsedEntityDeclHandler` functions.


.. method:: xmlparser.GetBase()

   Returns a string containing the base set by a previous call to :meth:`SetBase`,
   or ``None`` if :meth:`SetBase` hasn't been called.


.. method:: xmlparser.GetInputContext()

   Returns the input data that generated the current event as a string. The data is
   in the encoding of the entity which contains the text. When called while an
   event handler is not active, the return value is ``None``.

   .. versionadded:: 2.1


.. method:: xmlparser.ExternalEntityParserCreate(context[, encoding])

   Create a "child" parser which can be used to parse an external parsed entity
   referred to by content parsed by the parent parser. The *context* parameter
   should be the string passed to the :meth:`ExternalEntityRefHandler` handler
   function, described below. The child parser is created with the
   :attr:`ordered_attributes`, :attr:`returns_unicode` and
   :attr:`specified_attributes` set to the values of this parser.

.. method:: xmlparser.SetParamEntityParsing(flag)

   Control parsing of parameter entities (including the external DTD subset).
   Possible *flag* values are :const:`XML_PARAM_ENTITY_PARSING_NEVER`,
   :const:`XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE` and
   :const:`XML_PARAM_ENTITY_PARSING_ALWAYS`. Return true if setting the flag
   was successful.

.. method:: xmlparser.UseForeignDTD([flag])

   Calling this with a true value for *flag* (the default) will cause Expat to call
   the :attr:`ExternalEntityRefHandler` with :const:`None` for all arguments to
   allow an alternate DTD to be loaded. If the document does not contain a
   document type declaration, the :attr:`ExternalEntityRefHandler` will still be
   called, but the :attr:`StartDoctypeDeclHandler` and
   :attr:`EndDoctypeDeclHandler` will not be called.

   Passing a false value for *flag* will cancel a previous call that passed a true
   value, but otherwise has no effect.

   This method can only be called before the :meth:`Parse` or :meth:`ParseFile`
   methods are called; calling it after either of those have been called causes
   :exc:`ExpatError` to be raised with the :attr:`code` attribute set to
   :const:`errors.XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING`.

   .. versionadded:: 2.3

:class:`xmlparser` objects have the following attributes:


.. attribute:: xmlparser.buffer_size

   The size of the buffer used when :attr:`buffer_text` is true.
   A new buffer size can be set by assigning a new integer value
   to this attribute.
   When the size is changed, the buffer will be flushed.

   .. versionadded:: 2.3

   .. versionchanged:: 2.6
      The buffer size can now be changed.

.. attribute:: xmlparser.buffer_text

   Setting this to true causes the :class:`xmlparser` object to buffer textual
   content returned by Expat to avoid multiple calls to the
   :meth:`CharacterDataHandler` callback whenever possible. This can improve
   performance substantially since Expat normally breaks character data into chunks
   at every line ending. This attribute is false by default, and may be changed at
   any time.

   .. versionadded:: 2.3
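
The effect of :attr:`buffer_text` can be observed by counting calls to the
character data handler. The following is a sketch; ``collect_chunks`` is a
hypothetical helper, and the exact number of unbuffered calls depends on how
Expat splits the text.

```python
import xml.parsers.expat

DOCUMENT = b"<text>line one\nline two\nline three</text>"


def collect_chunks(buffer_text):
    """Parse DOCUMENT and return the strings given to CharacterDataHandler."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = buffer_text
    parser.CharacterDataHandler = chunks.append
    parser.Parse(DOCUMENT, True)
    return chunks


unbuffered = collect_chunks(False)
buffered = collect_chunks(True)

print(len(unbuffered), "handler call(s) without buffering")
print(len(buffered), "handler call(s) with buffering")
```

Either way the concatenated text is the same; buffering only changes how many
callbacks deliver it.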


.. attribute:: xmlparser.buffer_used

   If :attr:`buffer_text` is enabled, the number of bytes stored in the buffer.
   These bytes represent UTF-8 encoded text. This attribute has no meaningful
   interpretation when :attr:`buffer_text` is false.

   .. versionadded:: 2.3


.. attribute:: xmlparser.ordered_attributes

   Setting this attribute to a non-zero integer causes the attributes to be
   reported as a list rather than a dictionary. The attributes are presented in
   the order found in the document text. For each attribute, two list entries are
   presented: the attribute name and the attribute value. (Older versions of this
   module also used this format.) By default, this attribute is false; it may be
   changed at any time.

   .. versionadded:: 2.1
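
A small sketch of the flat list produced when :attr:`ordered_attributes` is
enabled, and one way to regroup it into name/value pairs (``captured`` and
``pairs`` are illustrative names):

```python
import xml.parsers.expat

captured = []  # (element name, attribute list) per start tag


def start_element(name, attrs):
    captured.append((name, attrs))


parser = xml.parsers.expat.ParserCreate()
parser.ordered_attributes = True
parser.StartElementHandler = start_element
parser.Parse(b'<point x="1" y="2"/>', True)

name, attrs = captured[0]
# attrs alternates name, value, name, value, ... in document order.
pairs = list(zip(attrs[0::2], attrs[1::2]))
print(name, attrs, pairs)
```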


.. attribute:: xmlparser.returns_unicode

   If this attribute is set to a non-zero integer, the handler functions will be
   passed Unicode strings. If :attr:`returns_unicode` is :const:`False`, 8-bit
   strings containing UTF-8 encoded data will be passed to the handlers. This is
   :const:`True` by default when Python is built with Unicode support.

   .. versionchanged:: 1.6
      Can be changed at any time to affect the result type.


.. attribute:: xmlparser.specified_attributes

   If set to a non-zero integer, the parser will report only those attributes which
   were specified in the document instance and not those which were derived from
   attribute declarations. Applications which set this need to be especially
   careful to use what additional information is available from the declarations as
   needed to comply with the standards for the behavior of XML processors. By
   default, this attribute is false; it may be changed at any time.

   .. versionadded:: 2.1

The following attributes contain values relating to the most recent error
encountered by an :class:`xmlparser` object, and will only have correct values
once a call to :meth:`Parse` or :meth:`ParseFile` has raised a
:exc:`xml.parsers.expat.ExpatError` exception.


.. attribute:: xmlparser.ErrorByteIndex

   Byte index at which an error occurred.


.. attribute:: xmlparser.ErrorCode

   Numeric code specifying the problem. This value can be passed to the
   :func:`ErrorString` function, or compared to one of the constants defined in the
   ``errors`` object.


.. attribute:: xmlparser.ErrorColumnNumber

   Column number at which an error occurred.


.. attribute:: xmlparser.ErrorLineNumber

   Line number at which an error occurred.

The following attributes contain values relating to the current parse location
in an :class:`xmlparser` object. During a callback reporting a parse event they
indicate the location of the first of the sequence of characters that generated
the event. When called outside of a callback, the position indicated will be
just past the last parse event (regardless of whether there was an associated
callback).

.. versionadded:: 2.4


.. attribute:: xmlparser.CurrentByteIndex

   Current byte index in the parser input.


.. attribute:: xmlparser.CurrentColumnNumber

   Current column number in the parser input.


.. attribute:: xmlparser.CurrentLineNumber

   Current line number in the parser input.
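
The location attributes are most useful when read from inside a handler. The
sketch below records where each start tag begins; ``positions`` is an
illustrative name and the handler reaches the parser through a module-level
variable for brevity.

```python
import xml.parsers.expat

DOCUMENT = b"<a>\n  <b/>\n</a>"

positions = []  # (element name, line, column) for each start tag


def start_element(name, attrs):
    # Inside a callback these attributes describe the start of the event.
    positions.append(
        (name, parser.CurrentLineNumber, parser.CurrentColumnNumber)
    )


parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = start_element
parser.Parse(DOCUMENT, True)

for name, line, column in positions:
    print("%s starts at line %d, column %d" % (name, line, column))
```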

Here is the list of handlers that can be set. To set a handler on an
:class:`xmlparser` object *o*, use ``o.handlername = func``. *handlername* must
be taken from the following list, and *func* must be a callable object accepting
the correct number of arguments. The arguments are all strings, unless
otherwise stated.


.. method:: xmlparser.XmlDeclHandler(version, encoding, standalone)

   Called when the XML declaration is parsed. The XML declaration is the
   (optional) declaration of the applicable version of the XML recommendation, the
   encoding of the document text, and an optional "standalone" declaration.
   *version* and *encoding* will be strings of the type dictated by the
   :attr:`returns_unicode` attribute, and *standalone* will be ``1`` if the
   document is declared standalone, ``0`` if it is declared not to be standalone,
   or ``-1`` if the standalone clause was omitted. This is only available with
   Expat version 1.95.0 or newer.

   .. versionadded:: 2.1
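
A sketch of :attr:`XmlDeclHandler` in use. ``read_declaration`` is a
hypothetical helper that returns the handler arguments for a document, or
``None`` when the document has no XML declaration.

```python
import xml.parsers.expat


def read_declaration(data):
    """Return (version, encoding, standalone) from the XML declaration."""
    seen = []

    def xml_decl(version, encoding, standalone):
        seen.append((version, encoding, standalone))

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = xml_decl
    parser.Parse(data, True)
    return seen[0] if seen else None


with_standalone = read_declaration(
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><r/>'
)
without_standalone = read_declaration(b'<?xml version="1.0"?><r/>')

print(with_standalone)
print(without_standalone)
```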


.. method:: xmlparser.StartDoctypeDeclHandler(doctypeName, systemId, publicId, has_internal_subset)

   Called when Expat begins parsing the document type declaration (``<!DOCTYPE
   ...``). The *doctypeName* is provided exactly as presented. The *systemId* and
   *publicId* parameters give the system and public identifiers if specified, or
   ``None`` if omitted. *has_internal_subset* will be true if the document
   contains an internal document declaration subset. This requires Expat version
   1.2 or newer.


.. method:: xmlparser.EndDoctypeDeclHandler()

   Called when Expat is done parsing the document type declaration. This requires
   Expat version 1.2 or newer.


.. method:: xmlparser.ElementDeclHandler(name, model)

   Called once for each element type declaration. *name* is the name of the
   element type, and *model* is a representation of the content model.


.. method:: xmlparser.AttlistDeclHandler(elname, attname, type, default, required)

   Called for each declared attribute for an element type. If an attribute list
   declaration declares three attributes, this handler is called three times, once
   for each attribute. *elname* is the name of the element to which the
   declaration applies and *attname* is the name of the attribute declared. The
   attribute type is a string passed as *type*; the possible values are
   ``'CDATA'``, ``'ID'``, ``'IDREF'``, ... *default* gives the default value for
   the attribute used when the attribute is not specified by the document instance,
   or ``None`` if there is no default value (``#IMPLIED`` values). If the
   attribute is required to be given in the document instance, *required* will be
   true. This requires Expat version 1.95.0 or newer.


.. method:: xmlparser.StartElementHandler(name, attributes)

   Called for the start of every element. *name* is a string containing the
   element name, and *attributes* is a dictionary mapping attribute names to their
   values.


.. method:: xmlparser.EndElementHandler(name)

   Called for the end of every element.


.. method:: xmlparser.ProcessingInstructionHandler(target, data)

   Called for every processing instruction.


.. method:: xmlparser.CharacterDataHandler(data)

   Called for character data. This will be called for normal character data, CDATA
   marked content, and ignorable whitespace. Applications which must distinguish
   these cases can use the :attr:`StartCdataSectionHandler`,
   :attr:`EndCdataSectionHandler`, and :attr:`ElementDeclHandler` callbacks to
   collect the required information.
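
One way to tell CDATA content from ordinary character data is to keep a flag
that the CDATA section handlers toggle. The following is a sketch under that
assumption; ``TextCollector`` is a hypothetical helper class, and
:attr:`buffer_text` is enabled so each run of text arrives in one piece.

```python
import xml.parsers.expat


class TextCollector(object):
    """Record each piece of text together with whether it was in CDATA."""

    def __init__(self):
        self.in_cdata = False
        self.pieces = []  # list of (text, came_from_cdata)

    def start_cdata(self):
        self.in_cdata = True

    def end_cdata(self):
        self.in_cdata = False

    def char_data(self, data):
        self.pieces.append((data, self.in_cdata))


collector = TextCollector()
parser = xml.parsers.expat.ParserCreate()
parser.buffer_text = True
parser.StartCdataSectionHandler = collector.start_cdata
parser.EndCdataSectionHandler = collector.end_cdata
parser.CharacterDataHandler = collector.char_data
parser.Parse(b"<r>plain<![CDATA[<raw>]]></r>", True)

print(collector.pieces)
```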


.. method:: xmlparser.UnparsedEntityDeclHandler(entityName, base, systemId, publicId, notationName)

   Called for unparsed (NDATA) entity declarations. This is only present for
   version 1.2 of the Expat library; for more recent versions, use
   :attr:`EntityDeclHandler` instead. (The underlying function in the Expat
   library has been declared obsolete.)


.. method:: xmlparser.EntityDeclHandler(entityName, is_parameter_entity, value, base, systemId, publicId, notationName)

   Called for all entity declarations. For parameter and internal entities,
   *value* will be a string giving the declared contents of the entity; this will
   be ``None`` for external entities. The *notationName* parameter will be
   ``None`` for parsed entities, and the name of the notation for unparsed
   entities. *is_parameter_entity* will be true if the entity is a parameter entity
   or false for general entities (most applications only need to be concerned with
   general entities). This is only available starting with version 1.95.0 of the
   Expat library.

   .. versionadded:: 2.1


.. method:: xmlparser.NotationDeclHandler(notationName, base, systemId, publicId)

   Called for notation declarations. *notationName*, *base*, *systemId*, and
   *publicId* are strings if given. If the public identifier is omitted,
   *publicId* will be ``None``.


.. method:: xmlparser.StartNamespaceDeclHandler(prefix, uri)

   Called when an element contains a namespace declaration. Namespace declarations
   are processed before the :attr:`StartElementHandler` is called for the element
   on which declarations are placed.


.. method:: xmlparser.EndNamespaceDeclHandler(prefix)

   Called when the closing tag is reached for an element that contained a
   namespace declaration. This is called once for each namespace declaration on
   the element in the reverse of the order for which the
   :attr:`StartNamespaceDeclHandler` was called to indicate the start of each
   namespace declaration's scope. Calls to this handler are made after the
   corresponding :attr:`EndElementHandler` for the end of the element.


.. method:: xmlparser.CommentHandler(data)

   Called for comments. *data* is the text of the comment, excluding the leading
   ``'<!-``\ ``-'`` and trailing ``'-``\ ``->'``.


.. method:: xmlparser.StartCdataSectionHandler()

   Called at the start of a CDATA section. This and :attr:`EndCdataSectionHandler`
   are needed to be able to identify the syntactical start and end for CDATA
   sections.


.. method:: xmlparser.EndCdataSectionHandler()

   Called at the end of a CDATA section.


.. method:: xmlparser.DefaultHandler(data)

   Called for any characters in the XML document for which no applicable handler
   has been specified. This means characters that are part of a construct which
   could be reported, but for which no handler has been supplied.


.. method:: xmlparser.DefaultHandlerExpand(data)

   This is the same as the :func:`DefaultHandler`, but doesn't inhibit expansion
   of internal entities. The entity reference will not be passed to the default
   handler.


.. method:: xmlparser.NotStandaloneHandler()

   Called if the XML document hasn't been declared as being a standalone document.
   This happens when there is an external subset or a reference to a parameter
   entity, but the XML declaration does not set standalone to ``yes``. If this
   handler returns ``0``, then the parser will raise an
   :const:`XML_ERROR_NOT_STANDALONE` error. If this handler is not set, no
   exception is raised by the parser for this condition.


.. method:: xmlparser.ExternalEntityRefHandler(context, base, systemId, publicId)

   Called for references to external entities. *base* is the current base, as set
   by a previous call to :meth:`SetBase`. The public and system identifiers,
   *systemId* and *publicId*, are strings if given; if the public identifier is not
   given, *publicId* will be ``None``. The *context* value is opaque and should
   only be used as described below.

   For external entities to be parsed, this handler must be implemented. It is
   responsible for creating the sub-parser using
   ``ExternalEntityParserCreate(context)``, initializing it with the appropriate
   callbacks, and parsing the entity. This handler should return an integer; if it
   returns ``0``, the parser will raise an
   :const:`XML_ERROR_EXTERNAL_ENTITY_HANDLING` error, otherwise parsing will
   continue.

   If this handler is not provided, external entities are reported by the
   :attr:`DefaultHandler` callback, if provided.
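
The steps listed for :attr:`ExternalEntityRefHandler` might be sketched as
follows. ``ENTITY_STORE`` is a made-up in-memory stand-in for wherever the
application actually resolves system identifiers (files, a catalog, and so
on); a real handler would also need to decide which identifiers it trusts.

```python
import xml.parsers.expat

DOCUMENT = b"""<!DOCTYPE doc [
  <!ENTITY chapter SYSTEM "chapter.xml">
]>
<doc>&chapter;</doc>"""

# Hypothetical storage: maps system identifiers to entity content.
ENTITY_STORE = {"chapter.xml": b"<section>text</section>"}

started = []


def start_element(name, attrs):
    started.append(name)


def external_entity_ref(context, base, system_id, public_id):
    # Create the child parser from the opaque context value, give it its
    # callbacks, and feed it the entity's replacement text.
    child = parser.ExternalEntityParserCreate(context)
    child.StartElementHandler = start_element
    child.Parse(ENTITY_STORE[system_id], True)
    return 1  # a non-zero result lets the parent parser continue


parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = start_element
parser.ExternalEntityRefHandler = external_entity_ref
parser.Parse(DOCUMENT, True)

print(started)
```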


.. _expaterror-objects:

ExpatError Exceptions
---------------------

.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>


:exc:`ExpatError` exceptions have a number of interesting attributes:


.. attribute:: ExpatError.code

   Expat's internal error number for the specific error. This will match one of
   the constants defined in the ``errors`` object from this module.

   .. versionadded:: 2.1


.. attribute:: ExpatError.lineno

   Line number on which the error was detected. The first line is numbered ``1``.

   .. versionadded:: 2.1


.. attribute:: ExpatError.offset

   Character offset into the line where the error occurred. The first column is
   numbered ``0``.

   .. versionadded:: 2.1
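
A sketch of inspecting an :exc:`ExpatError`. It deliberately parses a
malformed document (``BROKEN`` is made up) and turns the numeric
:attr:`code` into text with :func:`ErrorString`. The comparison against the
``errors`` object is done on the message text, on the assumption, true of the
CPython builds the editor is aware of, that those constants hold message
strings.

```python
import xml.parsers.expat

BROKEN = b"<a>\n  <b>\n</a>"  # <b> is never closed

failure = None
message = None

parser = xml.parsers.expat.ParserCreate()
try:
    parser.Parse(BROKEN, True)
except xml.parsers.expat.ExpatError as exc:
    failure = exc
    message = xml.parsers.expat.ErrorString(exc.code)
    print("line %d, column %d: %s" % (exc.lineno, exc.offset, message))

if message == xml.parsers.expat.errors.XML_ERROR_TAG_MISMATCH:
    print("an end tag did not match the innermost open start tag")
```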


.. _expat-example:

Example
-------

The following program defines three handlers that just print out their
arguments. ::

   from __future__ import print_function

   import xml.parsers.expat

   # 3 handler functions
   def start_element(name, attrs):
       print('Start element:', name, attrs)
   def end_element(name):
       print('End element:', name)
   def char_data(data):
       print('Character data:', repr(data))

   p = xml.parsers.expat.ParserCreate()

   # Attach the handlers to the parser
   p.StartElementHandler = start_element
   p.EndElementHandler = end_element
   p.CharacterDataHandler = char_data

   p.Parse("""<?xml version="1.0"?>
   <parent id="top"><child1 name="paul">Text goes here</child1>
   <child2 name="fred">More text</child2>
   </parent>""", 1)

The output from this program is::

   Start element: parent {'id': 'top'}
   Start element: child1 {'name': 'paul'}
   Character data: 'Text goes here'
   End element: child1
   Character data: '\n'
   Start element: child2 {'name': 'fred'}
   Character data: 'More text'
   End element: child2
   Character data: '\n'
   End element: parent


.. _expat-content-models:

Content Model Descriptions
--------------------------

.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>


Content models are described using nested tuples. Each tuple contains four
values: the type, the quantifier, the name, and a tuple of children. Children
are simply additional content model descriptions.

The values of the first two fields are constants defined in the ``model`` object
of the :mod:`xml.parsers.expat` module. These constants can be collected in two
groups: the model type group and the quantifier group.

The constants in the model type group are:


.. data:: XML_CTYPE_ANY
   :noindex:

   The element named by the model name was declared to have a content model of
   ``ANY``.


.. data:: XML_CTYPE_CHOICE
   :noindex:

   The named element allows a choice from a number of options; this is used for
   content models such as ``(A | B | C)``.


.. data:: XML_CTYPE_EMPTY
   :noindex:

   Elements which are declared to be ``EMPTY`` have this model type.


.. data:: XML_CTYPE_MIXED
   :noindex:


.. data:: XML_CTYPE_NAME
   :noindex:


.. data:: XML_CTYPE_SEQ
   :noindex:

   Models which represent a series of models which follow one after the other are
   indicated with this model type. This is used for models such as ``(A, B, C)``.

The constants in the quantifier group are:


.. data:: XML_CQUANT_NONE
   :noindex:

   No modifier is given, so it can appear exactly once, as for ``A``.


.. data:: XML_CQUANT_OPT
   :noindex:

   The model is optional: it can appear once or not at all, as for ``A?``.


.. data:: XML_CQUANT_PLUS
   :noindex:

   The model must occur one or more times (like ``A+``).


.. data:: XML_CQUANT_REP
   :noindex:

   The model must occur zero or more times, as for ``A*``.
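
To make the nested-tuple layout concrete, the sketch below captures the
content model of a small, made-up DTD through :attr:`ElementDeclHandler` and
unpacks the four fields. The names ``DOCUMENT``, ``models`` and
``element_decl`` are illustrative.

```python
import xml.parsers.expat
from xml.parsers.expat import model

DOCUMENT = b"""<!DOCTYPE note [
  <!ELEMENT note (to, body?)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note><to>someone</to></note>"""

models = {}  # element type name -> content model tuple


def element_decl(name, content_model):
    models[name] = content_model


parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = element_decl
parser.Parse(DOCUMENT, True)

# Each model is (type, quantifier, name, children).
model_type, quantifier, model_name, children = models["note"]
print(model_type == model.XML_CTYPE_SEQ)           # "(to, body?)" is a sequence
print(children[0][2], children[1][2])              # names of the two children
print(children[1][1] == model.XML_CQUANT_OPT)      # "body?" is optional
```

A recursive function over the ``children`` tuple is usually the simplest way
to walk deeper models.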


.. _expat-errors:

Expat error constants
---------------------

The following constants are provided in the ``errors`` object of the
:mod:`xml.parsers.expat` module. These constants are useful in interpreting
some of the attributes of the :exc:`ExpatError` exception objects raised when an
error has occurred.

The ``errors`` object has the following attributes:


.. data:: XML_ERROR_ASYNC_ENTITY
   :noindex:


.. data:: XML_ERROR_ATTRIBUTE_EXTERNAL_ENTITY_REF
   :noindex:

   An entity reference in an attribute value referred to an external entity instead
   of an internal entity.


.. data:: XML_ERROR_BAD_CHAR_REF
   :noindex:

   A character reference referred to a character which is illegal in XML (for
   example, character ``0``, or '``&#0;``').


.. data:: XML_ERROR_BINARY_ENTITY_REF
   :noindex:

   An entity reference referred to an entity which was declared with a notation, so
   cannot be parsed.


.. data:: XML_ERROR_DUPLICATE_ATTRIBUTE
   :noindex:

   An attribute was used more than once in a start tag.


.. data:: XML_ERROR_INCORRECT_ENCODING
   :noindex:


.. data:: XML_ERROR_INVALID_TOKEN
   :noindex:

   Raised when an input byte could not properly be assigned to a character; for
   example, a NUL byte (value ``0``) in a UTF-8 input stream.


.. data:: XML_ERROR_JUNK_AFTER_DOC_ELEMENT
   :noindex:

   Something other than whitespace occurred after the document element.


.. data:: XML_ERROR_MISPLACED_XML_PI
   :noindex:

   An XML declaration was found somewhere other than the start of the input data.


.. data:: XML_ERROR_NO_ELEMENTS
   :noindex:

   The document contains no elements (XML requires all documents to contain exactly
   one top-level element).


.. data:: XML_ERROR_NO_MEMORY
   :noindex:

   Expat was not able to allocate memory internally.


.. data:: XML_ERROR_PARAM_ENTITY_REF
   :noindex:

   A parameter entity reference was found where it was not allowed.


.. data:: XML_ERROR_PARTIAL_CHAR
   :noindex:

   An incomplete character was found in the input.


.. data:: XML_ERROR_RECURSIVE_ENTITY_REF
   :noindex:

   An entity reference contained another reference to the same entity; possibly via
   a different name, and possibly indirectly.


.. data:: XML_ERROR_SYNTAX
   :noindex:

   Some unspecified syntax error was encountered.


.. data:: XML_ERROR_TAG_MISMATCH
   :noindex:

   An end tag did not match the innermost open start tag.


.. data:: XML_ERROR_UNCLOSED_TOKEN
   :noindex:

   Some token (such as a start tag) was not closed before the end of the stream or
   the next token was encountered.


.. data:: XML_ERROR_UNDEFINED_ENTITY
   :noindex:

   A reference was made to an entity which was not defined.


.. data:: XML_ERROR_UNKNOWN_ENCODING
   :noindex:

   The document encoding is not supported by Expat.


.. data:: XML_ERROR_UNCLOSED_CDATA_SECTION
   :noindex:

   A CDATA marked section was not closed.


.. data:: XML_ERROR_EXTERNAL_ENTITY_HANDLING
   :noindex:


.. data:: XML_ERROR_NOT_STANDALONE
   :noindex:

   The parser determined that the document was not "standalone" though it declared
   itself to be in the XML declaration, and the :attr:`NotStandaloneHandler` was
   set and returned ``0``.


.. data:: XML_ERROR_UNEXPECTED_STATE
   :noindex:


.. data:: XML_ERROR_ENTITY_DECLARED_IN_PE
   :noindex:


.. data:: XML_ERROR_FEATURE_REQUIRES_XML_DTD
   :noindex:

   An operation was requested that requires DTD support to be compiled in, but
   Expat was configured without DTD support. This should never be reported by a
   standard build of the :mod:`xml.parsers.expat` module.


.. data:: XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING
   :noindex:

   A behavioral change was requested after parsing started that can only be changed
   before parsing has started. This is (currently) only raised by
   :meth:`UseForeignDTD`.


.. data:: XML_ERROR_UNBOUND_PREFIX
   :noindex:

   An undeclared prefix was found when namespace processing was enabled.


.. data:: XML_ERROR_UNDECLARING_PREFIX
   :noindex:

   The document attempted to remove the namespace declaration associated with a
   prefix.


.. data:: XML_ERROR_INCOMPLETE_PE
   :noindex:

   A parameter entity contained incomplete markup.


.. data:: XML_ERROR_XML_DECL
   :noindex:

   The XML declaration was not well-formed.


.. data:: XML_ERROR_TEXT_DECL
   :noindex:

   There was an error parsing a text declaration in an external entity.


.. data:: XML_ERROR_PUBLICID
   :noindex:

   Characters were found in the public id that are not allowed.


.. data:: XML_ERROR_SUSPENDED
   :noindex:

   The requested operation was made on a suspended parser, but isn't allowed. This
   includes attempts to provide additional input or to stop the parser.


.. data:: XML_ERROR_NOT_SUSPENDED
   :noindex:

   An attempt to resume the parser was made when the parser had not been suspended.


.. data:: XML_ERROR_ABORTED
   :noindex:

   This should not be reported to Python applications.


.. data:: XML_ERROR_FINISHED
   :noindex:

   The requested operation was made on a parser which was finished parsing input,
   but isn't allowed. This includes attempts to provide additional input or to
   stop the parser.


.. data:: XML_ERROR_SUSPEND_PE
   :noindex:


.. rubric:: Footnotes

.. [#] The encoding string included in XML output should conform to the
   appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
   not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and http://www.iana.org/assignments/character-sets .
