- parser.c: one must report spaces even if the DTD element
content proves that they are not part of the element content.
- result/valid/*.xml: this changed the output slightly
- parser.c: also return the result of xmlParseNameComplex() instead
of discarding it.
Daniel
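
The reordering in parser.c can be sketched in isolation. This is a minimal sketch, not the libxml2 code: sketch_ctxt, sketch_are_blanks and the dtd_element_content flag are hypothetical names, and it assumes that a DTD-based test sits between the moved keepBlanks check and the final heuristic, as the message above suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the parser context: only the two
 * inputs that matter for the ordering are modelled. */
typedef struct {
    int keep_blanks;          /* mirrors ctxt->keepBlanks */
    int dtd_element_content;  /* assumed flag: DTD declares element-only content */
} sketch_ctxt;

/* Returns 1 if a whitespace-only run may be dropped as ignorable,
 * 0 if it must be reported as character data. */
int sketch_are_blanks(const sketch_ctxt *ctxt)
{
    assert(ctxt != NULL);

    /* New position of the check: the caller's request is honoured first. */
    if (ctxt->keep_blanks)
        return 0;

    /* Only afterwards may DTD knowledge classify the run as ignorable. */
    if (ctxt->dtd_element_content)
        return 1;

    /* The real function falls back to a heuristic here; the sketch
     * conservatively reports the blanks. */
    return 0;
}
```

With keep_blanks set, the sketch reports the spaces whatever the DTD flag says, which is the behaviour change that reshapes the result/valid/*.xml output below.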
diff --git a/ChangeLog b/ChangeLog
index e44c85f..891141e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+Sat Mar 3 02:10:24 CET 2001 Daniel Veillard <Daniel.Veillard@imag.fr>
+
+ * parser.c: one must report spaces even if the Dtd element
+ content proves that this is not part of the element content.
+ * result/valid/*.xml: this changed the ouptu slightly
+
Thu Mar 1 17:53:39 CET 2001 Daniel Veillard <Daniel.Veillard@imag.fr>
* configure.in: bumped to 2.3.3
diff --git a/parser.c b/parser.c
index b6f6144..f1617a3 100644
--- a/parser.c
+++ b/parser.c
@@ -1393,6 +1393,9 @@
int i, ret;
xmlNodePtr lastChild;
+ if (ctxt->keepBlanks)
+ return(0);
+
/*
* Check for xml:space value.
*/
@@ -1417,8 +1420,6 @@
/*
* Otherwise, heuristic :-\
*/
- if (ctxt->keepBlanks)
- return(0);
if (RAW != '<') return(0);
if (ctxt->node == NULL) return(0);
if ((ctxt->node->children == NULL) &&
@@ -1641,7 +1642,7 @@
return(ret);
}
}
- xmlParseNameComplex(ctxt);
+ return(xmlParseNameComplex(ctxt));
}
xmlChar *
diff --git a/result/valid/REC-xml-19980210.xml b/result/valid/REC-xml-19980210.xml
index 2d4f035..45d941e 100644
--- a/result/valid/REC-xml-19980210.xml
+++ b/result/valid/REC-xml-19980210.xml
@@ -39,11 +39,8 @@
<version/>
<w3c-designation>REC-xml-&iso6.doc.date;</w3c-designation>
<w3c-doctype>W3C Recommendation</w3c-doctype>
-<pubdate>
-<day>&draft.day;</day>
-<month>&draft.month;</month>
-<year>&draft.year;</year>
-</pubdate>
+<pubdate><day>&draft.day;</day><month>&draft.month;</month><year>&draft.year;</year></pubdate>
+
<publoc>
<loc href="http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;">
http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;</loc>
@@ -76,21 +73,15 @@
http://www.w3.org/TR/WD-xml-971117</loc>-->
</prevlocs>
<authlist>
-<author>
-<name>Tim Bray</name>
+<author><name>Tim Bray</name>
<affiliation>Textuality and Netscape</affiliation>
-<email href="mailto:tbray@textuality.com">tbray@textuality.com</email>
-</author>
-<author>
-<name>Jean Paoli</name>
+<email href="mailto:tbray@textuality.com">tbray@textuality.com</email></author>
+<author><name>Jean Paoli</name>
<affiliation>Microsoft</affiliation>
-<email href="mailto:jeanpa@microsoft.com">jeanpa@microsoft.com</email>
-</author>
-<author>
-<name>C. M. Sperberg-McQueen</name>
+<email href="mailto:jeanpa@microsoft.com">jeanpa@microsoft.com</email></author>
+<author><name>C. M. Sperberg-McQueen</name>
<affiliation>University of Illinois at Chicago</affiliation>
-<email href="mailto:cmsmcq@uic.edu">cmsmcq@uic.edu</email>
-</author>
+<email href="mailto:cmsmcq@uic.edu">cmsmcq@uic.edu</email></author>
</authlist>
<abstract>
<p>The Extensible Markup Language (XML) is a subset of
@@ -128,6 +119,8 @@
<loc href="mailto:xml-editor@w3.org">xml-editor@w3.org</loc>.
</p>
</status>
+
+
<pubstmt>
<p>Chicago, Vancouver, Mountain View, et al.:
World-Wide Web Consortium, XML Working Group, 1996, 1997.</p>
@@ -358,7 +351,7 @@
</slist>
</revisiondesc>
</header>
-<body>
+<body>
<div1 id="sec-intro">
<head>Introduction</head>
<p>Extensible Markup Language, abbreviated XML, describes a class of
@@ -386,6 +379,7 @@
<term>application</term>.</termdef> This specification describes the
required behavior of an XML processor in terms of how it must read XML
data and the information it must provide to the application.</p>
+
<div2 id="sec-origin-goals">
<head>Origin and Goals</head>
<p>XML was developed by an XML Working Group (originally known as the
@@ -397,11 +391,21 @@
organized by the W3C. The membership of the XML Working Group is given
in an appendix. Dan Connolly served as the WG's contact with the W3C.
</p>
-<p>The design goals for XML are:<olist><item><p>XML shall be straightforwardly usable over the
-Internet.</p></item><item><p>XML shall support a wide variety of applications.</p></item><item><p>XML shall be compatible with SGML.</p></item><item><p>It shall be easy to write programs which process XML
-documents.</p></item><item><p>The number of optional features in XML is to be kept to the
-absolute minimum, ideally zero.</p></item><item><p>XML documents should be human-legible and reasonably
-clear.</p></item><item><p>The XML design should be prepared quickly.</p></item><item><p>The design of XML shall be formal and concise.</p></item><item><p>XML documents shall be easy to create.</p></item><item><p>Terseness in XML markup is of minimal importance.</p></item></olist>
+<p>The design goals for XML are:<olist>
+<item><p>XML shall be straightforwardly usable over the
+Internet.</p></item>
+<item><p>XML shall support a wide variety of applications.</p></item>
+<item><p>XML shall be compatible with SGML.</p></item>
+<item><p>It shall be easy to write programs which process XML
+documents.</p></item>
+<item><p>The number of optional features in XML is to be kept to the
+absolute minimum, ideally zero.</p></item>
+<item><p>XML documents should be human-legible and reasonably
+clear.</p></item>
+<item><p>The XML design should be prepared quickly.</p></item>
+<item><p>The design of XML shall be formal and concise.</p></item>
+<item><p>XML documents shall be easy to create.</p></item>
+<item><p>Terseness in XML markup is of minimal importance.</p></item></olist>
</p>
<p>This specification,
together with associated standards
@@ -415,23 +419,44 @@
<p>This version of the XML specification
<!-- is for &doc.audience;.-->
&doc.distribution;.</p>
+
</div2>
+
+
+
+
<div2 id="sec-terminology">
<head>Terminology</head>
+
<p>The terminology used to describe XML documents is defined in the body of
this specification.
The terms defined in the following list are used in building those
definitions and in describing the actions of an XML processor:
-<glist><gitem><label>may</label><def><p><termdef id="dt-may" term="May">Conforming documents and XML
+<glist>
+<gitem>
+<label>may</label>
+<def><p><termdef id="dt-may" term="May">Conforming documents and XML
processors are permitted to but need not behave as
-described.</termdef></p></def></gitem><gitem><label>must</label><def><p>Conforming documents and XML processors
+described.</termdef></p></def>
+</gitem>
+<gitem>
+<label>must</label>
+<def><p>Conforming documents and XML processors
are required to behave as described; otherwise they are in error.
<!-- do NOT change this! this is what defines a violation of
a 'must' clause as 'an error'. -MSM -->
-</p></def></gitem><gitem><label>error</label><def><p><termdef id="dt-error" term="Error">A violation of the rules of this
+</p></def>
+</gitem>
+<gitem>
+<label>error</label>
+<def><p><termdef id="dt-error" term="Error">A violation of the rules of this
specification; results are
undefined. Conforming software may detect and report an error and may
-recover from it.</termdef></p></def></gitem><gitem><label>fatal error</label><def><p><termdef id="dt-fatal" term="Fatal Error">An error
+recover from it.</termdef></p></def>
+</gitem>
+<gitem>
+<label>fatal error</label>
+<def><p><termdef id="dt-fatal" term="Fatal Error">An error
which a conforming <termref def="dt-xml-proc">XML processor</termref>
must detect and report to the application.
After encountering a fatal error, the
@@ -444,16 +469,33 @@
continue normal processing (i.e., it must not
continue to pass character data and information about the document's
logical structure to the application in the normal way).
-</termdef></p></def></gitem><gitem><label>at user option</label><def><p>Conforming software may or must (depending on the modal verb in the
+</termdef></p></def>
+</gitem>
+<gitem>
+<label>at user option</label>
+<def><p>Conforming software may or must (depending on the modal verb in the
sentence) behave as described; if it does, it must
provide users a means to enable or disable the behavior
-described.</p></def></gitem><gitem><label>validity constraint</label><def><p>A rule which applies to all
+described.</p></def>
+</gitem>
+<gitem>
+<label>validity constraint</label>
+<def><p>A rule which applies to all
<termref def="dt-valid">valid</termref> XML documents.
Violations of validity constraints are errors; they must, at user option,
be reported by
-<termref def="dt-validating">validating XML processors</termref>.</p></def></gitem><gitem><label>well-formedness constraint</label><def><p>A rule which applies to all <termref def="dt-wellformed">well-formed</termref> XML documents.
+<termref def="dt-validating">validating XML processors</termref>.</p></def>
+</gitem>
+<gitem>
+<label>well-formedness constraint</label>
+<def><p>A rule which applies to all <termref def="dt-wellformed">well-formed</termref> XML documents.
Violations of well-formedness constraints are
-<termref def="dt-fatal">fatal errors</termref>.</p></def></gitem><gitem><label>match</label><def><p><termdef id="dt-match" term="match">(Of strings or names:)
+<termref def="dt-fatal">fatal errors</termref>.</p></def>
+</gitem>
+
+<gitem>
+<label>match</label>
+<def><p><termdef id="dt-match" term="match">(Of strings or names:)
Two strings or names being compared must be identical.
Characters with multiple possible representations in ISO/IEC 10646 (e.g.
characters with
@@ -470,29 +512,42 @@
in the fashion described in the constraint
<specref ref="elementvalid"/>.
</termdef>
-</p></def></gitem><gitem><label>for compatibility</label><def><p><termdef id="dt-compat" term="For Compatibility">A feature of
+</p></def>
+</gitem>
+<gitem>
+<label>for compatibility</label>
+<def><p><termdef id="dt-compat" term="For Compatibility">A feature of
XML included solely to ensure that XML remains compatible with SGML.
-</termdef></p></def></gitem><gitem><label>for interoperability</label><def><p><termdef id="dt-interop" term="For interoperability">A
+</termdef></p></def>
+</gitem>
+<gitem>
+<label>for interoperability</label>
+<def><p><termdef id="dt-interop" term="For interoperability">A
non-binding recommendation included to increase the chances that XML
documents can be processed by the existing installed base of SGML
processors which predate the
-&WebSGML;.</termdef></p></def></gitem></glist>
+&WebSGML;.</termdef></p></def>
+</gitem>
+</glist>
</p>
</div2>
+
+
</div1>
<!-- &Docs; -->
+
<div1 id="sec-documents">
<head>Documents</head>
-<p>
-<termdef id="dt-xml-doc" term="XML Document">
+
+<p><termdef id="dt-xml-doc" term="XML Document">
A data object is an
<term>XML document</term> if it is
<termref def="dt-wellformed">well-formed</termref>, as
defined in this specification.
A well-formed XML document may in addition be
<termref def="dt-valid">valid</termref> if it meets certain further
-constraints.</termdef>
-</p>
+constraints.</termdef></p>
+
<p>Each XML document has both a logical and a physical structure.
Physically, the document is composed of units called <termref def="dt-entity">entities</termref>. An entity may <termref def="dt-entref">refer</termref> to other entities to cause their
inclusion in the document. A document begins in a "root" or <termref def="dt-docent">document entity</termref>.
@@ -505,42 +560,57 @@
The logical and physical structures must nest properly, as described
in <specref ref="wf-entities"/>.
</p>
+
<div2 id="sec-well-formed">
<head>Well-Formed XML Documents</head>
+
<p><termdef id="dt-wellformed" term="Well-Formed">
A textual object is
a well-formed XML document if:</termdef>
-<olist><item><p>Taken as a whole, it
-matches the production labeled <nt def="NT-document">document</nt>.</p></item><item><p>It
-meets all the well-formedness constraints given in this specification.</p></item><item><p>Each of the <termref def="dt-parsedent">parsed entities</termref>
+<olist>
+<item><p>Taken as a whole, it
+matches the production labeled <nt def="NT-document">document</nt>.</p></item>
+<item><p>It
+meets all the well-formedness constraints given in this specification.</p>
+</item>
+<item><p>Each of the <termref def="dt-parsedent">parsed entities</termref>
which is referenced directly or indirectly within the document is
-<titleref href="wf-entities">well-formed</titleref>.</p></item></olist></p>
+<titleref href="wf-entities">well-formed</titleref>.</p></item>
+</olist></p>
<p>
-<scrap lang="ebnf" id="document"><head>Document</head><prod id="NT-document"><lhs>document</lhs><rhs><nt def="NT-prolog">prolog</nt>
+<scrap lang="ebnf" id="document">
+<head>Document</head>
+<prod id="NT-document"><lhs>document</lhs>
+<rhs><nt def="NT-prolog">prolog</nt>
<nt def="NT-element">element</nt>
-<nt def="NT-Misc">Misc</nt>*</rhs></prod></scrap>
+<nt def="NT-Misc">Misc</nt>*</rhs></prod>
+</scrap>
</p>
<p>Matching the <nt def="NT-document">document</nt> production
implies that:
-<olist><item><p>It contains one or more
-<termref def="dt-element">elements</termref>.</p></item><!--* N.B. some readers (notably JC) find the following
+<olist>
+<item><p>It contains one or more
+<termref def="dt-element">elements</termref>.</p>
+</item>
+<!--* N.B. some readers (notably JC) find the following
paragraph awkward and redundant. I agree it's logically redundant:
it *says* it is summarizing the logical implications of
matching the grammar, and that means by definition it's
logically redundant. I don't think it's rhetorically
redundant or unnecessary, though, so I'm keeping it. It
could however use some recasting when the editors are feeling
-stronger. -MSM *--><item><p><termdef id="dt-root" term="Root Element">There is exactly
+stronger. -MSM *-->
+<item><p><termdef id="dt-root" term="Root Element">There is exactly
one element, called the <term>root</term>, or document element, no
part of which appears in the <termref def="dt-content">content</termref> of any other element.</termdef>
For all other elements, if the start-tag is in the content of another
element, the end-tag is in the content of the same element. More
simply stated, the elements, delimited by start- and end-tags, nest
properly within each other.
-</p></item></olist>
+</p></item>
+</olist>
</p>
-<p>
-<termdef id="dt-parentchild" term="Parent/Child">As a consequence
+<p><termdef id="dt-parentchild" term="Parent/Child">As a consequence
of this,
for each non-root element
<code>C</code> in the document, there is one other element <code>P</code>
@@ -550,11 +620,11 @@
<code>P</code>.
<code>P</code> is referred to as the
<term>parent</term> of <code>C</code>, and <code>C</code> as a
-<term>child</term> of <code>P</code>.</termdef>
-</p>
-</div2>
+<term>child</term> of <code>P</code>.</termdef></p></div2>
+
<div2 id="charsets">
<head>Characters</head>
+
<p><termdef id="dt-text" term="Text">A parsed entity contains
<term>text</term>, a sequence of
<termref def="dt-character">characters</termref>,
@@ -567,10 +637,18 @@
The use of "compatibility characters", as defined in section 6.8
of <bibref ref="Unicode"/>, is discouraged.
</termdef>
-<scrap lang="ebnf" id="char32"><head>Character Range</head><prodgroup pcw2="4" pcw4="17.5" pcw5="11"><prod id="NT-Char"><lhs>Char</lhs><rhs>#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
-| [#x10000-#x10FFFF]</rhs><com>any Unicode character, excluding the
-surrogate blocks, FFFE, and FFFF.</com></prod></prodgroup></scrap>
+<scrap lang="ebnf" id="char32">
+<head>Character Range</head>
+<prodgroup pcw2="4" pcw4="17.5" pcw5="11">
+<prod id="NT-Char"><lhs>Char</lhs>
+<rhs>#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
+| [#x10000-#x10FFFF]</rhs>
+<com>any Unicode character, excluding the
+surrogate blocks, FFFE, and FFFF.</com> </prod>
+</prodgroup>
+</scrap>
</p>
+
<p>The mechanism for encoding character code points into bit patterns may
vary from entity to entity. All XML processors must accept the UTF-8
and UTF-16 encodings of 10646; the mechanisms for signaling which of
@@ -584,13 +662,22 @@
UCS-4 code value.
</p>-->
</div2>
+
<div2 id="sec-common-syn">
<head>Common Syntactic Constructs</head>
+
<p>This section defines some symbols used widely in the grammar.</p>
<p><nt def="NT-S">S</nt> (white space) consists of one or more space (#x20)
characters, carriage returns, line feeds, or tabs.
-<scrap lang="ebnf" id="white"><head>White Space</head><prodgroup pcw2="4" pcw4="17.5" pcw5="11"><prod id="NT-S"><lhs>S</lhs><rhs>(#x20 | #x9 | #xD | #xA)+</rhs></prod></prodgroup></scrap></p>
+<scrap lang="ebnf" id="white">
+<head>White Space</head>
+<prodgroup pcw2="4" pcw4="17.5" pcw5="11">
+<prod id="NT-S"><lhs>S</lhs>
+<rhs>(#x20 | #x9 | #xD | #xA)+</rhs>
+</prod>
+</prodgroup>
+</scrap></p>
<p>Characters are classified for convenience as letters, digits, or other
characters. Letters consist of an alphabetic or syllabic
base character possibly
@@ -622,13 +709,26 @@
<p>An
<nt def="NT-Nmtoken">Nmtoken</nt> (name token) is any mixture of
name characters.
-<scrap lang="ebnf"><head>Names and Tokens</head><prod id="NT-NameChar"><lhs>NameChar</lhs><rhs><nt def="NT-Letter">Letter</nt>
+<scrap lang="ebnf">
+<head>Names and Tokens</head>
+<prod id="NT-NameChar"><lhs>NameChar</lhs>
+<rhs><nt def="NT-Letter">Letter</nt>
| <nt def="NT-Digit">Digit</nt>
| '.' | '-' | '_' | ':'
| <nt def="NT-CombiningChar">CombiningChar</nt>
-| <nt def="NT-Extender">Extender</nt></rhs></prod><prod id="NT-Name"><lhs>Name</lhs><rhs>(<nt def="NT-Letter">Letter</nt> | '_' | ':')
-(<nt def="NT-NameChar">NameChar</nt>)*</rhs></prod><prod id="NT-Names"><lhs>Names</lhs><rhs><nt def="NT-Name">Name</nt>
-(<nt def="NT-S">S</nt> <nt def="NT-Name">Name</nt>)*</rhs></prod><prod id="NT-Nmtoken"><lhs>Nmtoken</lhs><rhs>(<nt def="NT-NameChar">NameChar</nt>)+</rhs></prod><prod id="NT-Nmtokens"><lhs>Nmtokens</lhs><rhs><nt def="NT-Nmtoken">Nmtoken</nt> (<nt def="NT-S">S</nt> <nt def="NT-Nmtoken">Nmtoken</nt>)*</rhs></prod></scrap>
+| <nt def="NT-Extender">Extender</nt></rhs>
+</prod>
+<prod id="NT-Name"><lhs>Name</lhs>
+<rhs>(<nt def="NT-Letter">Letter</nt> | '_' | ':')
+(<nt def="NT-NameChar">NameChar</nt>)*</rhs></prod>
+<prod id="NT-Names"><lhs>Names</lhs>
+<rhs><nt def="NT-Name">Name</nt>
+(<nt def="NT-S">S</nt> <nt def="NT-Name">Name</nt>)*</rhs></prod>
+<prod id="NT-Nmtoken"><lhs>Nmtoken</lhs>
+<rhs>(<nt def="NT-NameChar">NameChar</nt>)+</rhs></prod>
+<prod id="NT-Nmtokens"><lhs>Nmtokens</lhs>
+<rhs><nt def="NT-Nmtoken">Nmtoken</nt> (<nt def="NT-S">S</nt> <nt def="NT-Nmtoken">Nmtoken</nt>)*</rhs></prod>
+</scrap>
</p>
<p>Literal data is any quoted string not containing
the quotation mark used as a delimiter for that string.
@@ -640,34 +740,56 @@
(<nt def="NT-SystemLiteral">SystemLiteral</nt>).
Note that a <nt def="NT-SystemLiteral">SystemLiteral</nt>
can be parsed without scanning for markup.
-<scrap lang="ebnf"><head>Literals</head><prod id="NT-EntityValue"><lhs>EntityValue</lhs><rhs>'"'
+<scrap lang="ebnf">
+<head>Literals</head>
+<prod id="NT-EntityValue"><lhs>EntityValue</lhs>
+<rhs>'"'
([^%&"]
| <nt def="NT-PEReference">PEReference</nt>
| <nt def="NT-Reference">Reference</nt>)*
'"'
-</rhs><rhs>|
+</rhs>
+<rhs>|
"'"
([^%&']
| <nt def="NT-PEReference">PEReference</nt>
| <nt def="NT-Reference">Reference</nt>)*
-"'"</rhs></prod><prod id="NT-AttValue"><lhs>AttValue</lhs><rhs>'"'
+"'"</rhs>
+</prod>
+<prod id="NT-AttValue"><lhs>AttValue</lhs>
+<rhs>'"'
([^<&"]
| <nt def="NT-Reference">Reference</nt>)*
'"'
-</rhs><rhs>|
+</rhs>
+<rhs>|
"'"
([^<&']
| <nt def="NT-Reference">Reference</nt>)*
-"'"</rhs></prod><prod id="NT-SystemLiteral"><lhs>SystemLiteral</lhs><rhs>('"' [^"]* '"') | ("'" [^']* "'")
-</rhs></prod><prod id="NT-PubidLiteral"><lhs>PubidLiteral</lhs><rhs>'"' <nt def="NT-PubidChar">PubidChar</nt>*
+"'"</rhs>
+</prod>
+<prod id="NT-SystemLiteral"><lhs>SystemLiteral</lhs>
+<rhs>('"' [^"]* '"') | ("'" [^']* "'")
+</rhs>
+</prod>
+<prod id="NT-PubidLiteral"><lhs>PubidLiteral</lhs>
+<rhs>'"' <nt def="NT-PubidChar">PubidChar</nt>*
'"'
-| "'" (<nt def="NT-PubidChar">PubidChar</nt> - "'")* "'"</rhs></prod><prod id="NT-PubidChar"><lhs>PubidChar</lhs><rhs>#x20 | #xD | #xA
+| "'" (<nt def="NT-PubidChar">PubidChar</nt> - "'")* "'"</rhs>
+</prod>
+<prod id="NT-PubidChar"><lhs>PubidChar</lhs>
+<rhs>#x20 | #xD | #xA
| [a-zA-Z0-9]
-| [-'()+,./:=?;!*#@$_%]</rhs></prod></scrap>
+| [-'()+,./:=?;!*#@$_%]</rhs>
+</prod>
+</scrap>
</p>
+
</div2>
+
<div2 id="syntax">
<head>Character Data and Markup</head>
+
<p><termref def="dt-text">Text</termref> consists of intermingled
<termref def="dt-chardata">character
data</termref> and markup.
@@ -683,11 +805,9 @@
<termref def="dt-pi">processing instructions</termref>.
</termdef>
</p>
-<p>
-<termdef id="dt-chardata" term="Character Data">All text that is not markup
+<p><termdef id="dt-chardata" term="Character Data">All text that is not markup
constitutes the <term>character data</term> of
-the document.</termdef>
-</p>
+the document.</termdef></p>
<p>The ampersand character (&) and the left angle bracket (<)
may appear in their literal form <emph>only</emph> when used as markup
delimiters, or within a <termref def="dt-comment">comment</termref>, a
@@ -727,13 +847,20 @@
apostrophe or single-quote character (') may be represented as
"<code>&apos;</code>", and the double-quote character (") as
"<code>&quot;</code>".
-<scrap lang="ebnf"><head>Character Data</head><prod id="NT-CharData"><lhs>CharData</lhs><rhs>[^<&]* - ([^<&]* ']]>' [^<&]*)</rhs></prod></scrap>
+<scrap lang="ebnf">
+<head>Character Data</head>
+<prod id="NT-CharData">
+<lhs>CharData</lhs>
+<rhs>[^<&]* - ([^<&]* ']]>' [^<&]*)</rhs>
+</prod>
+</scrap>
</p>
</div2>
+
<div2 id="sec-comments">
<head>Comments</head>
-<p>
-<termdef id="dt-comment" term="Comment"><term>Comments</term> may
+
+<p><termdef id="dt-comment" term="Comment"><term>Comments</term> may
appear anywhere in a document outside other
<termref def="dt-markup">markup</termref>; in addition,
they may appear within the document type declaration
@@ -745,28 +872,41 @@
<termref def="dt-compat">For compatibility</termref>, the string
"<code>--</code>" (double-hyphen) must not occur within
comments.
-<scrap lang="ebnf"><head>Comments</head><prod id="NT-Comment"><lhs>Comment</lhs><rhs>'<!--'
+<scrap lang="ebnf">
+<head>Comments</head>
+<prod id="NT-Comment"><lhs>Comment</lhs>
+<rhs>'<!--'
((<nt def="NT-Char">Char</nt> - '-')
| ('-' (<nt def="NT-Char">Char</nt> - '-')))*
-'-->'</rhs></prod></scrap>
-</termdef>
-</p>
+'-->'</rhs>
+</prod>
+</scrap>
+</termdef></p>
<p>An example of a comment:
<eg><!&como; declarations for <head> & <body> &comc;></eg>
</p>
</div2>
+
<div2 id="sec-pi">
<head>Processing Instructions</head>
+
<p><termdef id="dt-pi" term="Processing instruction"><term>Processing
instructions</term> (PIs) allow documents to contain instructions
for applications.
-<scrap lang="ebnf"><head>Processing Instructions</head><prod id="NT-PI"><lhs>PI</lhs><rhs>'<?' <nt def="NT-PITarget">PITarget</nt>
+<scrap lang="ebnf">
+<head>Processing Instructions</head>
+<prod id="NT-PI"><lhs>PI</lhs>
+<rhs>'<?' <nt def="NT-PITarget">PITarget</nt>
(<nt def="NT-S">S</nt>
(<nt def="NT-Char">Char</nt>* -
(<nt def="NT-Char">Char</nt>* &pic; <nt def="NT-Char">Char</nt>*)))?
-&pic;</rhs></prod><prod id="NT-PITarget"><lhs>PITarget</lhs><rhs><nt def="NT-Name">Name</nt> -
-(('X' | 'x') ('M' | 'm') ('L' | 'l'))</rhs></prod></scrap></termdef>
+&pic;</rhs></prod>
+<prod id="NT-PITarget"><lhs>PITarget</lhs>
+<rhs><nt def="NT-Name">Name</nt> -
+(('X' | 'x') ('M' | 'm') ('L' | 'l'))</rhs>
+</prod>
+</scrap></termdef>
PIs are not part of the document's <termref def="dt-chardata">character
data</termref>, but must be passed through to the application. The
PI begins with a target (<nt def="NT-PITarget">PITarget</nt>) used
@@ -780,8 +920,10 @@
formal declaration of PI targets.
</p>
</div2>
+
<div2 id="sec-cdata-sect">
<head>CDATA Sections</head>
+
<p><termdef id="dt-cdsection" term="CDATA Section"><term>CDATA sections</term>
may occur
anywhere character data may occur; they are
@@ -789,11 +931,24 @@
otherwise be recognized as markup. CDATA sections begin with the
string "<code><![CDATA[</code>" and end with the string
"<code>]]></code>":
-<scrap lang="ebnf"><head>CDATA Sections</head><prod id="NT-CDSect"><lhs>CDSect</lhs><rhs><nt def="NT-CDStart">CDStart</nt>
+<scrap lang="ebnf">
+<head>CDATA Sections</head>
+<prod id="NT-CDSect"><lhs>CDSect</lhs>
+<rhs><nt def="NT-CDStart">CDStart</nt>
<nt def="NT-CData">CData</nt>
-<nt def="NT-CDEnd">CDEnd</nt></rhs></prod><prod id="NT-CDStart"><lhs>CDStart</lhs><rhs>'<![CDATA['</rhs></prod><prod id="NT-CData"><lhs>CData</lhs><rhs>(<nt def="NT-Char">Char</nt>* -
+<nt def="NT-CDEnd">CDEnd</nt></rhs></prod>
+<prod id="NT-CDStart"><lhs>CDStart</lhs>
+<rhs>'<![CDATA['</rhs>
+</prod>
+<prod id="NT-CData"><lhs>CData</lhs>
+<rhs>(<nt def="NT-Char">Char</nt>* -
(<nt def="NT-Char">Char</nt>* ']]>' <nt def="NT-Char">Char</nt>*))
-</rhs></prod><prod id="NT-CDEnd"><lhs>CDEnd</lhs><rhs>']]>'</rhs></prod></scrap>
+</rhs>
+</prod>
+<prod id="NT-CDEnd"><lhs>CDEnd</lhs>
+<rhs>']]>'</rhs>
+</prod>
+</scrap>
Within a CDATA section, only the <nt def="NT-CDEnd">CDEnd</nt> string is
recognized as markup, so that left angle brackets and ampersands may occur in
@@ -801,6 +956,7 @@
"<code>&lt;</code>" and "<code>&amp;</code>". CDATA sections
cannot nest.</termdef>
</p>
+
<p>An example of a CDATA section, in which "<code><greeting></code>" and
"<code></greeting></code>"
are recognized as <termref def="dt-chardata">character data</termref>, not
@@ -808,8 +964,10 @@
<eg><![CDATA[<greeting>Hello, world!</greeting>]]></eg>
</p>
</div2>
+
<div2 id="sec-prolog-dtd">
<head>Prolog and Document Type Declaration</head>
+
<p><termdef id="dt-xmldecl" term="XML Declaration">XML documents
may, and should,
begin with an <term>XML declaration</term> which specifies
@@ -824,6 +982,7 @@
<eg><![CDATA[<greeting>Hello, world!</greeting>
]]></eg>
</p>
+
<p>The version number "<code>1.0</code>" should be used to indicate
conformance to this version of this specification; it is an error
for a document to use the value "<code>1.0</code>"
@@ -852,18 +1011,39 @@
complies with the constraints expressed in it.</termdef></p>
<p>The document type declaration must appear before
the first <termref def="dt-element">element</termref> in the document.
-<scrap lang="ebnf" id="xmldoc"><head>Prolog</head><prodgroup pcw2="6" pcw4="17.5" pcw5="9"><prod id="NT-prolog"><lhs>prolog</lhs><rhs><nt def="NT-XMLDecl">XMLDecl</nt>?
+<scrap lang="ebnf" id="xmldoc">
+<head>Prolog</head>
+<prodgroup pcw2="6" pcw4="17.5" pcw5="9">
+<prod id="NT-prolog"><lhs>prolog</lhs>
+<rhs><nt def="NT-XMLDecl">XMLDecl</nt>?
<nt def="NT-Misc">Misc</nt>*
(<nt def="NT-doctypedecl">doctypedecl</nt>
-<nt def="NT-Misc">Misc</nt>*)?</rhs></prod><prod id="NT-XMLDecl"><lhs>XMLDecl</lhs><rhs>&xmlpio;
+<nt def="NT-Misc">Misc</nt>*)?</rhs></prod>
+<prod id="NT-XMLDecl"><lhs>XMLDecl</lhs>
+<rhs>&xmlpio;
<nt def="NT-VersionInfo">VersionInfo</nt>
<nt def="NT-EncodingDecl">EncodingDecl</nt>?
<nt def="NT-SDDecl">SDDecl</nt>?
<nt def="NT-S">S</nt>?
-&pic;</rhs></prod><prod id="NT-VersionInfo"><lhs>VersionInfo</lhs><rhs><nt def="NT-S">S</nt> 'version' <nt def="NT-Eq">Eq</nt>
+&pic;</rhs>
+</prod>
+<prod id="NT-VersionInfo"><lhs>VersionInfo</lhs>
+<rhs><nt def="NT-S">S</nt> 'version' <nt def="NT-Eq">Eq</nt>
(' <nt def="NT-VersionNum">VersionNum</nt> '
-| " <nt def="NT-VersionNum">VersionNum</nt> ")</rhs></prod><prod id="NT-Eq"><lhs>Eq</lhs><rhs><nt def="NT-S">S</nt>? '=' <nt def="NT-S">S</nt>?</rhs></prod><prod id="NT-VersionNum"><lhs>VersionNum</lhs><rhs>([a-zA-Z0-9_.:] | '-')+</rhs></prod><prod id="NT-Misc"><lhs>Misc</lhs><rhs><nt def="NT-Comment">Comment</nt> | <nt def="NT-PI">PI</nt> |
-<nt def="NT-S">S</nt></rhs></prod></prodgroup></scrap></p>
+| " <nt def="NT-VersionNum">VersionNum</nt> ")</rhs>
+</prod>
+<prod id="NT-Eq"><lhs>Eq</lhs>
+<rhs><nt def="NT-S">S</nt>? '=' <nt def="NT-S">S</nt>?</rhs></prod>
+<prod id="NT-VersionNum">
+<lhs>VersionNum</lhs>
+<rhs>([a-zA-Z0-9_.:] | '-')+</rhs>
+</prod>
+<prod id="NT-Misc"><lhs>Misc</lhs>
+<rhs><nt def="NT-Comment">Comment</nt> | <nt def="NT-PI">PI</nt> |
+<nt def="NT-S">S</nt></rhs></prod>
+</prodgroup>
+</scrap></p>
+
<p><termdef id="dt-doctype" term="Document Type Declaration">The XML
<term>document type declaration</term>
contains or points to
@@ -896,8 +1076,7 @@
<scrap lang="ebnf" id="dtd">
<head>Document Type Definition</head>
<prodgroup pcw2="6" pcw4="17.5" pcw5="9">
-<prod id="NT-doctypedecl">
-<lhs>doctypedecl</lhs>
+<prod id="NT-doctypedecl"><lhs>doctypedecl</lhs>
<rhs>'<!DOCTYPE' <nt def="NT-S">S</nt>
<nt def="NT-Name">Name</nt> (<nt def="NT-S">S</nt>
<nt def="NT-ExternalID">ExternalID</nt>)?
@@ -909,8 +1088,7 @@
<nt def="NT-S">S</nt>?)? '>'</rhs>
<vc def="vc-roottype"/>
</prod>
-<prod id="NT-markupdecl">
-<lhs>markupdecl</lhs>
+<prod id="NT-markupdecl"><lhs>markupdecl</lhs>
<rhs><nt def="NT-elementdecl">elementdecl</nt>
| <nt def="NT-AttlistDecl">AttlistDecl</nt>
| <nt def="NT-EntityDecl">EntityDecl</nt>
@@ -921,8 +1099,10 @@
<vc def="vc-PEinMarkupDecl"/>
<wfc def="wfc-PEinInternalSubset"/>
</prod>
+
</prodgroup>
</scrap>
+
<p>The markup declarations may be made up in whole or in part of
the <termref def="dt-repltext">replacement text</termref> of
<termref def="dt-PE">parameter entities</termref>.
@@ -931,6 +1111,7 @@
<nt def="NT-AttlistDecl">AttlistDecl</nt>, and so on) describe
the declarations <emph>after</emph> all the parameter entities have been
<termref def="dt-include">included</termref>.</p>
+
<vcnote id="vc-roottype">
<head>Root Element Type</head>
<p>
@@ -938,6 +1119,7 @@
match the element type of the <termref def="dt-root">root element</termref>.
</p>
</vcnote>
+
<vcnote id="vc-PEinMarkupDecl">
<head>Proper Declaration/PE Nesting</head>
<p>Parameter-entity
@@ -974,13 +1156,22 @@
the <termref def="dt-cond-section">conditional section</termref>
construct; this is not allowed in the internal subset.
-<scrap id="ext-Subset"><head>External Subset</head><prodgroup pcw2="6" pcw4="17.5" pcw5="9"><prod id="NT-extSubset"><lhs>extSubset</lhs><rhs><nt def="NT-TextDecl">TextDecl</nt>?
-<nt def="NT-extSubsetDecl">extSubsetDecl</nt></rhs></prod><prod id="NT-extSubsetDecl"><lhs>extSubsetDecl</lhs><rhs>(
+<scrap id="ext-Subset">
+<head>External Subset</head>
+<prodgroup pcw2="6" pcw4="17.5" pcw5="9">
+<prod id="NT-extSubset"><lhs>extSubset</lhs>
+<rhs><nt def="NT-TextDecl">TextDecl</nt>?
+<nt def="NT-extSubsetDecl">extSubsetDecl</nt></rhs></prod>
+<prod id="NT-extSubsetDecl"><lhs>extSubsetDecl</lhs>
+<rhs>(
<nt def="NT-markupdecl">markupdecl</nt>
| <nt def="NT-conditionalSect">conditionalSect</nt>
| <nt def="NT-PEReference">PEReference</nt>
| <nt def="NT-S">S</nt>
-)*</rhs></prod></prodgroup></scrap></p>
+)*</rhs>
+</prod>
+</prodgroup>
+</scrap></p>
<p>The external subset and external parameter entities also differ
from the internal subset in that in them,
<termref def="dt-PERef">parameter-entity references</termref>
@@ -1008,6 +1199,7 @@
internal subset take precedence over those in the external subset.
</p>
</div2>
+
<div2 id="sec-rmd">
<head>Standalone Document Declaration</head>
<p>Markup declarations can affect the content of the document,
@@ -1018,11 +1210,18 @@
which may appear as a component of the XML declaration, signals
whether or not there are such declarations which appear external to
the <termref def="dt-docent">document entity</termref>.
-<scrap lang="ebnf" id="fulldtd"><head>Standalone Document Declaration</head><prodgroup pcw2="4" pcw4="19.5" pcw5="9"><prod id="NT-SDDecl"><lhs>SDDecl</lhs><rhs>
+<scrap lang="ebnf" id="fulldtd">
+<head>Standalone Document Declaration</head>
+<prodgroup pcw2="4" pcw4="19.5" pcw5="9">
+<prod id="NT-SDDecl"><lhs>SDDecl</lhs>
+<rhs>
<nt def="NT-S">S</nt>
'standalone' <nt def="NT-Eq">Eq</nt>
(("'" ('yes' | 'no') "'") | ('"' ('yes' | 'no') '"'))
-</rhs><vc def="vc-check-rmd"/></prod></prodgroup></scrap></p>
+</rhs>
+<vc def="vc-check-rmd"/></prod>
+</prodgroup>
+</scrap></p>
<p>
In a standalone document declaration, the value "<code>yes</code>" indicates
that there
@@ -1050,21 +1249,16 @@
<head>Standalone Document Declaration</head>
<p>The standalone document declaration must have
the value "<code>no</code>" if any external markup declarations
-contain declarations of:</p>
-<ulist>
-<item>
-<p>attributes with <termref def="dt-default">default</termref> values, if
+contain declarations of:</p><ulist>
+<item><p>attributes with <termref def="dt-default">default</termref> values, if
elements to which
these attributes apply appear in the document without
-specifications of values for these attributes, or</p>
-</item>
-<item>
-<p>entities (other than &magicents;),
+specifications of values for these attributes, or</p></item>
+<item><p>entities (other than &magicents;),
if <termref def="dt-entref">references</termref> to those
entities appear in the document, or</p>
</item>
-<item>
-<p>attributes with values subject to
+<item><p>attributes with values subject to
<titleref href="AVNormalize">normalization</titleref>, where the
attribute appears in the document with a value which will
change as a result of normalization, or</p>
@@ -1073,14 +1267,15 @@
<p>element types with <termref def="dt-elemcontent">element content</termref>,
if white space occurs
directly within any instance of those types.
-</p>
-</item>
+</p></item>
</ulist>
+
</vcnote>
<p>An example XML declaration with a standalone document declaration:<eg><?xml version="&XML.version;" standalone='yes'?></eg></p>
</div2>
<div2 id="sec-white-space">
<head>White Space Handling</head>
+
<p>In editing XML documents, it is often convenient to use "white space"
(spaces, tabs, and blank lines, denoted by the nonterminal
<nt def="NT-S">S</nt> in this specification) to
@@ -1119,6 +1314,7 @@
handling, unless it provides a value for
this attribute or the attribute is declared with a default value.
</p>
+
</div2>
<div2 id="sec-line-ends">
<head>End-of-Line Handling</head>
@@ -1152,19 +1348,38 @@
<termref def="dt-attdecl">declared</termref> if it is used.
The values of the attribute are language identifiers as defined
by <bibref ref="RFC1766"/>, "Tags for the Identification of Languages":
-<scrap lang="ebnf"><head>Language Identification</head><prod id="NT-LanguageID"><lhs>LanguageID</lhs><rhs><nt def="NT-Langcode">Langcode</nt>
-('-' <nt def="NT-Subcode">Subcode</nt>)*</rhs></prod><prod id="NT-Langcode"><lhs>Langcode</lhs><rhs><nt def="NT-ISO639Code">ISO639Code</nt> |
+<scrap lang="ebnf">
+<head>Language Identification</head>
+<prod id="NT-LanguageID"><lhs>LanguageID</lhs>
+<rhs><nt def="NT-Langcode">Langcode</nt>
+('-' <nt def="NT-Subcode">Subcode</nt>)*</rhs></prod>
+<prod id="NT-Langcode"><lhs>Langcode</lhs>
+<rhs><nt def="NT-ISO639Code">ISO639Code</nt> |
<nt def="NT-IanaCode">IanaCode</nt> |
-<nt def="NT-UserCode">UserCode</nt></rhs></prod><prod id="NT-ISO639Code"><lhs>ISO639Code</lhs><rhs>([a-z] | [A-Z]) ([a-z] | [A-Z])</rhs></prod><prod id="NT-IanaCode"><lhs>IanaCode</lhs><rhs>('i' | 'I') '-' ([a-z] | [A-Z])+</rhs></prod><prod id="NT-UserCode"><lhs>UserCode</lhs><rhs>('x' | 'X') '-' ([a-z] | [A-Z])+</rhs></prod><prod id="NT-Subcode"><lhs>Subcode</lhs><rhs>([a-z] | [A-Z])+</rhs></prod></scrap>
+<nt def="NT-UserCode">UserCode</nt></rhs>
+</prod>
+<prod id="NT-ISO639Code"><lhs>ISO639Code</lhs>
+<rhs>([a-z] | [A-Z]) ([a-z] | [A-Z])</rhs></prod>
+<prod id="NT-IanaCode"><lhs>IanaCode</lhs>
+<rhs>('i' | 'I') '-' ([a-z] | [A-Z])+</rhs></prod>
+<prod id="NT-UserCode"><lhs>UserCode</lhs>
+<rhs>('x' | 'X') '-' ([a-z] | [A-Z])+</rhs></prod>
+<prod id="NT-Subcode"><lhs>Subcode</lhs>
+<rhs>([a-z] | [A-Z])+</rhs></prod>
+</scrap>
The <nt def="NT-Langcode">Langcode</nt> may be any of the following:
-<ulist><item><p>a two-letter language code as defined by
+<ulist>
+<item><p>a two-letter language code as defined by
<bibref ref="ISO639"/>, "Codes
-for the representation of names of languages"</p></item><item><p>a language identifier registered with the Internet
+for the representation of names of languages"</p></item>
+<item><p>a language identifier registered with the Internet
Assigned Numbers Authority <bibref ref="IANA"/>; these begin with the
-prefix "<code>i-</code>" (or "<code>I-</code>")</p></item><item><p>a language identifier assigned by the user, or agreed on
+prefix "<code>i-</code>" (or "<code>I-</code>")</p></item>
+<item><p>a language identifier assigned by the user, or agreed on
between parties in private use; these must begin with the
prefix "<code>x-</code>" or "<code>X-</code>" in order to ensure that they do not conflict
-with names later standardized or registered with IANA</p></item></ulist></p>
+with names later standardized or registered with IANA</p></item>
+</ulist></p>
<p>There may be any number of <nt def="NT-Subcode">Subcode</nt> segments; if
the first
subcode segment exists and the Subcode consists of two
@@ -1224,11 +1439,14 @@
<!ATTLIST gloss xml:lang NMTOKEN 'en'>
<!ATTLIST note xml:lang NMTOKEN 'en'>]]></eg>
</p>
+
</div2>
</div1>
<!-- &Elements; -->
+
<div1 id="sec-logical-struct">
<head>Logical Structures</head>
+
<p><termdef id="dt-element" term="Element">Each <termref def="dt-xml-doc">XML document</termref> contains one or more
<term>elements</term>, the boundaries of which are
either delimited by <termref def="dt-stag">start-tags</termref>
@@ -1238,13 +1456,9 @@
attribute specifications.</termdef> Each attribute specification
has a <termref def="dt-attrname">name</termref> and a <termref def="dt-attrval">value</termref>.
</p>
-<scrap lang="ebnf">
-<head>Element</head>
-<prod id="NT-element">
-<lhs>element</lhs>
-<rhs>
-<nt def="NT-EmptyElemTag">EmptyElemTag</nt>
-</rhs>
+<scrap lang="ebnf"><head>Element</head>
+<prod id="NT-element"><lhs>element</lhs>
+<rhs><nt def="NT-EmptyElemTag">EmptyElemTag</nt></rhs>
<rhs>| <nt def="NT-STag">STag</nt> <nt def="NT-content">content</nt>
<nt def="NT-ETag">ETag</nt></rhs>
<wfc def="GIMatch"/>
@@ -1274,40 +1488,47 @@
<nt def="NT-Name">Name</nt> matches the element type, and
one of the following holds:</p>
<olist>
-<item>
-<p>The declaration matches <kw>EMPTY</kw> and the element has no
-<termref def="dt-content">content</termref>.</p>
-</item>
-<item>
-<p>The declaration matches <nt def="NT-children">children</nt> and
+<item><p>The declaration matches <kw>EMPTY</kw> and the element has no
+<termref def="dt-content">content</termref>.</p></item>
+<item><p>The declaration matches <nt def="NT-children">children</nt> and
the sequence of
<termref def="dt-parentchild">child elements</termref>
belongs to the language generated by the regular expression in
the content model, with optional white space (characters
matching the nonterminal <nt def="NT-S">S</nt>) between each pair
-of child elements.</p>
-</item>
-<item>
-<p>The declaration matches <nt def="NT-Mixed">Mixed</nt> and
+of child elements.</p></item>
+<item><p>The declaration matches <nt def="NT-Mixed">Mixed</nt> and
the content consists of <termref def="dt-chardata">character
data</termref> and <termref def="dt-parentchild">child elements</termref>
-whose types match names in the content model.</p>
-</item>
-<item>
-<p>The declaration matches <kw>ANY</kw>, and the types
+whose types match names in the content model.</p></item>
+<item><p>The declaration matches <kw>ANY</kw>, and the types
of any <termref def="dt-parentchild">child elements</termref> have
-been declared.</p>
-</item>
+been declared.</p></item>
</olist>
</vcnote>
+
<div2 id="sec-starttags">
<head>Start-Tags, End-Tags, and Empty-Element Tags</head>
+
<p><termdef id="dt-stag" term="Start-Tag">The beginning of every
non-empty XML element is marked by a <term>start-tag</term>.
-<scrap lang="ebnf"><head>Start-tag</head><prodgroup pcw2="6" pcw4="15" pcw5="11.5"><prod id="NT-STag"><lhs>STag</lhs><rhs>'<' <nt def="NT-Name">Name</nt>
+<scrap lang="ebnf">
+<head>Start-tag</head>
+<prodgroup pcw2="6" pcw4="15" pcw5="11.5">
+<prod id="NT-STag"><lhs>STag</lhs>
+<rhs>'<' <nt def="NT-Name">Name</nt>
(<nt def="NT-S">S</nt> <nt def="NT-Attribute">Attribute</nt>)*
-<nt def="NT-S">S</nt>? '>'</rhs><wfc def="uniqattspec"/></prod><prod id="NT-Attribute"><lhs>Attribute</lhs><rhs><nt def="NT-Name">Name</nt> <nt def="NT-Eq">Eq</nt>
-<nt def="NT-AttValue">AttValue</nt></rhs><vc def="ValueType"/><wfc def="NoExternalRefs"/><wfc def="CleanAttrVals"/></prod></prodgroup></scrap>
+<nt def="NT-S">S</nt>? '>'</rhs>
+<wfc def="uniqattspec"/>
+</prod>
+<prod id="NT-Attribute"><lhs>Attribute</lhs>
+<rhs><nt def="NT-Name">Name</nt> <nt def="NT-Eq">Eq</nt>
+<nt def="NT-AttValue">AttValue</nt></rhs>
+<vc def="ValueType"/>
+<wfc def="NoExternalRefs"/>
+<wfc def="CleanAttrVals"/></prod>
+</prodgroup>
+</scrap>
The <nt def="NT-Name">Name</nt> in
the start- and end-tags gives the
element's <term>type</term>.</termdef>
@@ -1351,39 +1572,55 @@
referred to directly or indirectly in an attribute
value (other than "<code>&lt;</code>") must not contain
a <code><</code>.
-</p>
-</wfcnote>
+</p></wfcnote>
<p>An example of a start-tag:
<eg><termdef id="dt-dog" term="dog"></eg></p>
-<p>
-<termdef id="dt-etag" term="End Tag">The end of every element
+<p><termdef id="dt-etag" term="End Tag">The end of every element
that begins with a start-tag must
be marked by an <term>end-tag</term>
containing a name that echoes the element's type as given in the
start-tag:
-<scrap lang="ebnf"><head>End-tag</head><prodgroup pcw2="6" pcw4="15" pcw5="11.5"><prod id="NT-ETag"><lhs>ETag</lhs><rhs>'</' <nt def="NT-Name">Name</nt>
-<nt def="NT-S">S</nt>? '>'</rhs></prod></prodgroup></scrap>
-</termdef>
-</p>
+<scrap lang="ebnf">
+<head>End-tag</head>
+<prodgroup pcw2="6" pcw4="15" pcw5="11.5">
+<prod id="NT-ETag"><lhs>ETag</lhs>
+<rhs>'</' <nt def="NT-Name">Name</nt>
+<nt def="NT-S">S</nt>? '>'</rhs></prod>
+</prodgroup>
+</scrap>
+</termdef></p>
<p>An example of an end-tag:<eg></termdef></eg></p>
-<p>
-<termdef id="dt-content" term="Content">The
+<p><termdef id="dt-content" term="Content">The
<termref def="dt-text">text</termref> between the start-tag and
end-tag is called the element's
<term>content</term>:
-<scrap lang="ebnf"><head>Content of Elements</head><prodgroup pcw2="6" pcw4="15" pcw5="11.5"><prod id="NT-content"><lhs>content</lhs><rhs>(<nt def="NT-element">element</nt> | <nt def="NT-CharData">CharData</nt>
+<scrap lang="ebnf">
+<head>Content of Elements</head>
+<prodgroup pcw2="6" pcw4="15" pcw5="11.5">
+<prod id="NT-content"><lhs>content</lhs>
+<rhs>(<nt def="NT-element">element</nt> | <nt def="NT-CharData">CharData</nt>
| <nt def="NT-Reference">Reference</nt> | <nt def="NT-CDSect">CDSect</nt>
-| <nt def="NT-PI">PI</nt> | <nt def="NT-Comment">Comment</nt>)*</rhs></prod></prodgroup></scrap>
-</termdef>
-</p>
+| <nt def="NT-PI">PI</nt> | <nt def="NT-Comment">Comment</nt>)*</rhs>
+</prod>
+</prodgroup>
+</scrap>
+</termdef></p>
<p><termdef id="dt-empty" term="Empty">If an element is <term>empty</term>,
it must be represented either by a start-tag immediately followed
by an end-tag or by an empty-element tag.</termdef>
<termdef id="dt-eetag" term="empty-element tag">An
<term>empty-element tag</term> takes a special form:
-<scrap lang="ebnf"><head>Tags for Empty Elements</head><prodgroup pcw2="6" pcw4="15" pcw5="11.5"><prod id="NT-EmptyElemTag"><lhs>EmptyElemTag</lhs><rhs>'<' <nt def="NT-Name">Name</nt> (<nt def="NT-S">S</nt>
+<scrap lang="ebnf">
+<head>Tags for Empty Elements</head>
+<prodgroup pcw2="6" pcw4="15" pcw5="11.5">
+<prod id="NT-EmptyElemTag"><lhs>EmptyElemTag</lhs>
+<rhs>'<' <nt def="NT-Name">Name</nt> (<nt def="NT-S">S</nt>
<nt def="NT-Attribute">Attribute</nt>)* <nt def="NT-S">S</nt>?
-'/>'</rhs><wfc def="uniqattspec"/></prod></prodgroup></scrap>
+'/>'</rhs>
+<wfc def="uniqattspec"/>
+</prod>
+</prodgroup>
+</scrap>
</termdef></p>
<p>Empty-element tags may be used for any element which has no
content, whether or not it is declared using the keyword
@@ -1397,8 +1634,10 @@
<br></br>
<br/></eg></p>
</div2>
+
<div2 id="elemdecls">
<head>Element Type Declarations</head>
+
<p>The <termref def="dt-element">element</termref> structure of an
<termref def="dt-xml-doc">XML document</termref> may, for
<termref def="dt-valid">validation</termref> purposes,
@@ -1407,6 +1646,7 @@
An element type declaration constrains the element's
<termref def="dt-content">content</termref>.
</p>
+
<p>Element type declarations often constrain which element types can
appear as <termref def="dt-parentchild">children</termref> of the element.
At user option, an XML processor may issue a warning
@@ -1414,31 +1654,45 @@
is provided, but this is not an error.</p>
<p><termdef id="dt-eldecl" term="Element Type declaration">An <term>element
type declaration</term> takes the form:
-<scrap lang="ebnf"><head>Element Type Declaration</head><prodgroup pcw2="5.5" pcw4="18" pcw5="9"><prod id="NT-elementdecl"><lhs>elementdecl</lhs><rhs>'<!ELEMENT' <nt def="NT-S">S</nt>
+<scrap lang="ebnf">
+<head>Element Type Declaration</head>
+<prodgroup pcw2="5.5" pcw4="18" pcw5="9">
+<prod id="NT-elementdecl"><lhs>elementdecl</lhs>
+<rhs>'<!ELEMENT' <nt def="NT-S">S</nt>
<nt def="NT-Name">Name</nt>
<nt def="NT-S">S</nt>
<nt def="NT-contentspec">contentspec</nt>
-<nt def="NT-S">S</nt>? '>'</rhs><vc def="EDUnique"/></prod><prod id="NT-contentspec"><lhs>contentspec</lhs><rhs>'EMPTY'
+<nt def="NT-S">S</nt>? '>'</rhs>
+<vc def="EDUnique"/></prod>
+<prod id="NT-contentspec"><lhs>contentspec</lhs>
+<rhs>'EMPTY'
| 'ANY'
| <nt def="NT-Mixed">Mixed</nt>
| <nt def="NT-children">children</nt>
-</rhs></prod></prodgroup></scrap>
+</rhs>
+</prod>
+</prodgroup>
+</scrap>
where the <nt def="NT-Name">Name</nt> gives the element type
being declared.</termdef>
</p>
+
<vcnote id="EDUnique">
<head>Unique Element Type Declaration</head>
<p>
No element type may be declared more than once.
</p>
</vcnote>
+
<p>Examples of element type declarations:
<eg><!ELEMENT br EMPTY>
<!ELEMENT p (#PCDATA|emph)* >
<!ELEMENT %name.para; %content.para; >
<!ELEMENT container ANY></eg></p>
+
<div3 id="sec-element-content">
<head>Element Content</head>
+
<p><termdef id="dt-elemcontent" term="Element content">An element <termref def="dt-stag">type</termref> has
<term>element content</term> when elements of that
type must contain only <termref def="dt-parentchild">child</termref>
@@ -1454,16 +1708,31 @@
content particles (<nt def="NT-cp">cp</nt>s), which consist of names,
choice lists of content particles, or
sequence lists of content particles:
-<scrap lang="ebnf"><head>Element-content Models</head><prodgroup pcw2="5.5" pcw4="16" pcw5="11"><prod id="NT-children"><lhs>children</lhs><rhs>(<nt def="NT-choice">choice</nt>
+<scrap lang="ebnf">
+<head>Element-content Models</head>
+<prodgroup pcw2="5.5" pcw4="16" pcw5="11">
+<prod id="NT-children"><lhs>children</lhs>
+<rhs>(<nt def="NT-choice">choice</nt>
| <nt def="NT-seq">seq</nt>)
-('?' | '*' | '+')?</rhs></prod><prod id="NT-cp"><lhs>cp</lhs><rhs>(<nt def="NT-Name">Name</nt>
+('?' | '*' | '+')?</rhs></prod>
+<prod id="NT-cp"><lhs>cp</lhs>
+<rhs>(<nt def="NT-Name">Name</nt>
| <nt def="NT-choice">choice</nt>
| <nt def="NT-seq">seq</nt>)
-('?' | '*' | '+')?</rhs></prod><prod id="NT-choice"><lhs>choice</lhs><rhs>'(' <nt def="NT-S">S</nt>? cp
+('?' | '*' | '+')?</rhs></prod>
+<prod id="NT-choice"><lhs>choice</lhs>
+<rhs>'(' <nt def="NT-S">S</nt>? cp
( <nt def="NT-S">S</nt>? '|' <nt def="NT-S">S</nt>? <nt def="NT-cp">cp</nt> )*
-<nt def="NT-S">S</nt>? ')'</rhs><vc def="vc-PEinGroup"/></prod><prod id="NT-seq"><lhs>seq</lhs><rhs>'(' <nt def="NT-S">S</nt>? cp
+<nt def="NT-S">S</nt>? ')'</rhs>
+<vc def="vc-PEinGroup"/></prod>
+<prod id="NT-seq"><lhs>seq</lhs>
+<rhs>'(' <nt def="NT-S">S</nt>? cp
( <nt def="NT-S">S</nt>? ',' <nt def="NT-S">S</nt>? <nt def="NT-cp">cp</nt> )*
-<nt def="NT-S">S</nt>? ')'</rhs><vc def="vc-PEinGroup"/></prod></prodgroup></scrap>
+<nt def="NT-S">S</nt>? ')'</rhs>
+<vc def="vc-PEinGroup"/></prod>
+
+</prodgroup>
+</scrap>
where each <nt def="NT-Name">Name</nt> is the type of an element which may
appear as a <termref def="dt-parentchild">child</termref>.
Any content
@@ -1518,8 +1787,10 @@
<!ELEMENT div1 (head, (p | list | note)*, div2*)>
<!ELEMENT dictionary-body (%div.mix; | %dict.mix;)*></eg></p>
</div3>
+
<div3 id="sec-mixed-content">
<head>Mixed Content</head>
+
<p><termdef id="dt-mixed" term="Mixed Content">An element
<termref def="dt-stag">type</termref> has
<term>mixed content</term> when elements of that type may contain
@@ -1527,15 +1798,25 @@
<termref def="dt-parentchild">child</termref> elements.</termdef>
In this case, the types of the child elements
may be constrained, but not their order or their number of occurrences:
-<scrap lang="ebnf"><head>Mixed-content Declaration</head><prodgroup pcw2="5.5" pcw4="16" pcw5="11"><prod id="NT-Mixed"><lhs>Mixed</lhs><rhs>'(' <nt def="NT-S">S</nt>?
+<scrap lang="ebnf">
+<head>Mixed-content Declaration</head>
+<prodgroup pcw2="5.5" pcw4="16" pcw5="11">
+<prod id="NT-Mixed"><lhs>Mixed</lhs>
+<rhs>'(' <nt def="NT-S">S</nt>?
'#PCDATA'
(<nt def="NT-S">S</nt>?
'|'
<nt def="NT-S">S</nt>?
<nt def="NT-Name">Name</nt>)*
<nt def="NT-S">S</nt>?
-')*' </rhs><rhs>| '(' <nt def="NT-S">S</nt>? '#PCDATA' <nt def="NT-S">S</nt>? ')'
-</rhs><vc def="vc-PEinGroup"/><vc def="vc-MixedChildrenUnique"/></prod></prodgroup></scrap>
+')*' </rhs>
+<rhs>| '(' <nt def="NT-S">S</nt>? '#PCDATA' <nt def="NT-S">S</nt>? ')'
+</rhs><vc def="vc-PEinGroup"/>
+<vc def="vc-MixedChildrenUnique"/>
+</prod>
+
+</prodgroup>
+</scrap>
where the <nt def="NT-Name">Name</nt>s give the types of elements
that may appear as children.
</p>
@@ -1543,16 +1824,17 @@
<head>No Duplicate Types</head>
<p>The same name must not appear more than once in a single mixed-content
declaration.
-</p>
-</vcnote>
+</p></vcnote>
<p>Examples of mixed content declarations:
<eg><!ELEMENT p (#PCDATA|a|ul|b|i|em)*>
<!ELEMENT p (#PCDATA | %font; | %phrase; | %special; | %form;)* >
<!ELEMENT b (#PCDATA)></eg></p>
</div3>
</div2>
+
<div2 id="attdecls">
<head>Attribute-List Declarations</head>
+
<p><termref def="dt-attr">Attributes</termref> are used to associate
name-value pairs with <termref def="dt-element">elements</termref>.
Attribute specifications may appear only within <termref def="dt-stag">start-tags</termref>
@@ -1561,29 +1843,39 @@
recognize them appear in <specref ref="sec-starttags"/>.
Attribute-list
declarations may be used:
-<ulist><item><p>To define the set of attributes pertaining to a given
-element type.</p></item><item><p>To establish type constraints for these
-attributes.</p></item><item><p>To provide <termref def="dt-default">default values</termref>
-for attributes.</p></item></ulist>
+<ulist>
+<item><p>To define the set of attributes pertaining to a given
+element type.</p></item>
+<item><p>To establish type constraints for these
+attributes.</p></item>
+<item><p>To provide <termref def="dt-default">default values</termref>
+for attributes.</p></item>
+</ulist>
</p>
-<p>
-<termdef id="dt-attdecl" term="Attribute-List Declaration">
+<p><termdef id="dt-attdecl" term="Attribute-List Declaration">
<term>Attribute-list declarations</term> specify the name, data type, and default
value (if any) of each attribute associated with a given element type:
-<scrap lang="ebnf"><head>Attribute-list Declaration</head><prod id="NT-AttlistDecl"><lhs>AttlistDecl</lhs><rhs>'<!ATTLIST' <nt def="NT-S">S</nt>
+<scrap lang="ebnf">
+<head>Attribute-list Declaration</head>
+<prod id="NT-AttlistDecl"><lhs>AttlistDecl</lhs>
+<rhs>'<!ATTLIST' <nt def="NT-S">S</nt>
<nt def="NT-Name">Name</nt>
<nt def="NT-AttDef">AttDef</nt>*
-<nt def="NT-S">S</nt>? '>'</rhs></prod><prod id="NT-AttDef"><lhs>AttDef</lhs><rhs><nt def="NT-S">S</nt> <nt def="NT-Name">Name</nt>
+<nt def="NT-S">S</nt>? '>'</rhs>
+</prod>
+<prod id="NT-AttDef"><lhs>AttDef</lhs>
+<rhs><nt def="NT-S">S</nt> <nt def="NT-Name">Name</nt>
<nt def="NT-S">S</nt> <nt def="NT-AttType">AttType</nt>
-<nt def="NT-S">S</nt> <nt def="NT-DefaultDecl">DefaultDecl</nt></rhs></prod></scrap>
+<nt def="NT-S">S</nt> <nt def="NT-DefaultDecl">DefaultDecl</nt></rhs>
+</prod>
+</scrap>
The <nt def="NT-Name">Name</nt> in the
<nt def="NT-AttlistDecl">AttlistDecl</nt> rule is the type of an element. At
user option, an XML processor may issue a warning if attributes are
declared for an element type not itself declared, but this is not an
error. The <nt def="NT-Name">Name</nt> in the
<nt def="NT-AttDef">AttDef</nt> rule is
-the name of the attribute.</termdef>
-</p>
+the name of the attribute.</termdef></p>
<p>
When more than one <nt def="NT-AttlistDecl">AttlistDecl</nt> is provided for a
given element type, the contents of all those provided are merged. When
@@ -1601,16 +1893,45 @@
is provided
for a given attribute, but this is not an error.
</p>
+
<div3 id="sec-attribute-types">
<head>Attribute Types</head>
+
<p>XML attribute types are of three kinds: a string type, a
set of tokenized types, and enumerated types. The string type may take
any literal string as a value; the tokenized types have varying lexical
and semantic constraints, as noted:
-<scrap lang="ebnf"><head>Attribute Types</head><prodgroup pcw4="14" pcw5="11.5"><prod id="NT-AttType"><lhs>AttType</lhs><rhs><nt def="NT-StringType">StringType</nt>
+<scrap lang="ebnf">
+<head>Attribute Types</head>
+<prodgroup pcw4="14" pcw5="11.5">
+<prod id="NT-AttType"><lhs>AttType</lhs>
+<rhs><nt def="NT-StringType">StringType</nt>
| <nt def="NT-TokenizedType">TokenizedType</nt>
| <nt def="NT-EnumeratedType">EnumeratedType</nt>
-</rhs></prod><prod id="NT-StringType"><lhs>StringType</lhs><rhs>'CDATA'</rhs></prod><prod id="NT-TokenizedType"><lhs>TokenizedType</lhs><rhs>'ID'</rhs><vc def="id"/><vc def="one-id-per-el"/><vc def="id-default"/><rhs>| 'IDREF'</rhs><vc def="idref"/><rhs>| 'IDREFS'</rhs><vc def="idref"/><rhs>| 'ENTITY'</rhs><vc def="entname"/><rhs>| 'ENTITIES'</rhs><vc def="entname"/><rhs>| 'NMTOKEN'</rhs><vc def="nmtok"/><rhs>| 'NMTOKENS'</rhs><vc def="nmtok"/></prod></prodgroup></scrap>
+</rhs>
+</prod>
+<prod id="NT-StringType"><lhs>StringType</lhs>
+<rhs>'CDATA'</rhs>
+</prod>
+<prod id="NT-TokenizedType"><lhs>TokenizedType</lhs>
+<rhs>'ID'</rhs>
+<vc def="id"/>
+<vc def="one-id-per-el"/>
+<vc def="id-default"/>
+<rhs>| 'IDREF'</rhs>
+<vc def="idref"/>
+<rhs>| 'IDREFS'</rhs>
+<vc def="idref"/>
+<rhs>| 'ENTITY'</rhs>
+<vc def="entname"/>
+<rhs>| 'ENTITIES'</rhs>
+<vc def="entname"/>
+<rhs>| 'NMTOKEN'</rhs>
+<vc def="nmtok"/>
+<rhs>| 'NMTOKENS'</rhs>
+<vc def="nmtok"/></prod>
+</prodgroup>
+</scrap>
</p>
<vcnote id="id">
<head>ID</head>
@@ -1672,9 +1993,14 @@
<p><termdef id="dt-enumerated" term="Enumerated Attribute Values"><term>Enumerated attributes</term> can take one
of a list of values provided in the declaration</termdef>. There are two
kinds of enumerated types:
-<scrap lang="ebnf"><head>Enumerated Attribute Types</head><prod id="NT-EnumeratedType"><lhs>EnumeratedType</lhs><rhs><nt def="NT-NotationType">NotationType</nt>
+<scrap lang="ebnf">
+<head>Enumerated Attribute Types</head>
+<prod id="NT-EnumeratedType"><lhs>EnumeratedType</lhs>
+<rhs><nt def="NT-NotationType">NotationType</nt>
| <nt def="NT-Enumeration">Enumeration</nt>
-</rhs></prod><prod id="NT-NotationType"><lhs>NotationType</lhs><rhs>'NOTATION'
+</rhs></prod>
+<prod id="NT-NotationType"><lhs>NotationType</lhs>
+<rhs>'NOTATION'
<nt def="NT-S">S</nt>
'('
<nt def="NT-S">S</nt>?
@@ -1682,19 +2008,25 @@
(<nt def="NT-S">S</nt>? '|' <nt def="NT-S">S</nt>?
<nt def="NT-Name">Name</nt>)*
<nt def="NT-S">S</nt>? ')'
-</rhs><vc def="notatn"/></prod><prod id="NT-Enumeration"><lhs>Enumeration</lhs><rhs>'(' <nt def="NT-S">S</nt>?
+</rhs>
+<vc def="notatn"/></prod>
+<prod id="NT-Enumeration"><lhs>Enumeration</lhs>
+<rhs>'(' <nt def="NT-S">S</nt>?
<nt def="NT-Nmtoken">Nmtoken</nt>
(<nt def="NT-S">S</nt>? '|'
<nt def="NT-S">S</nt>?
<nt def="NT-Nmtoken">Nmtoken</nt>)*
<nt def="NT-S">S</nt>?
-')'</rhs><vc def="enum"/></prod></scrap>
+')'</rhs>
+<vc def="enum"/></prod>
+</scrap>
A <kw>NOTATION</kw> attribute identifies a
<termref def="dt-notation">notation</termref>, declared in the
DTD with associated system and/or public identifiers, to
be used in interpreting the element to which the attribute
is attached.
</p>
+
<vcnote id="notatn">
<head>Notation Attributes</head>
<p>
@@ -1717,14 +2049,28 @@
enumerated attribute types of a single element type.
</p>
</div3>
+
<div3 id="sec-attr-defaults">
<head>Attribute Defaults</head>
+
<p>An <termref def="dt-attdecl">attribute declaration</termref> provides
information on whether
the attribute's presence is required, and if not, how an XML processor should
react if a declared attribute is absent in a document.
-<scrap lang="ebnf"><head>Attribute Defaults</head><prodgroup pcw4="14" pcw5="11.5"><prod id="NT-DefaultDecl"><lhs>DefaultDecl</lhs><rhs>'#REQUIRED'
-| '#IMPLIED' </rhs><rhs>| (('#FIXED' S)? <nt def="NT-AttValue">AttValue</nt>)</rhs><vc def="RequiredAttr"/><vc def="defattrvalid"/><wfc def="CleanAttrVals"/><vc def="FixedAttr"/></prod></prodgroup></scrap>
+<scrap lang="ebnf">
+<head>Attribute Defaults</head>
+<prodgroup pcw4="14" pcw5="11.5">
+<prod id="NT-DefaultDecl"><lhs>DefaultDecl</lhs>
+<rhs>'#REQUIRED'
+| '#IMPLIED' </rhs>
+<rhs>| (('#FIXED' S)? <nt def="NT-AttValue">AttValue</nt>)</rhs>
+<vc def="RequiredAttr"/>
+<vc def="defattrvalid"/>
+<wfc def="CleanAttrVals"/>
+<vc def="FixedAttr"/>
+</prod>
+</prodgroup>
+</scrap>
</p>
<p>In an attribute declaration, <kw>#REQUIRED</kw> means that the
@@ -1751,8 +2097,7 @@
<p>If the default declaration is the keyword <kw>#REQUIRED</kw>, then
the attribute must be specified for
all elements of the type in the attribute-list declaration.
-</p>
-</vcnote>
+</p></vcnote>
<vcnote id="defattrvalid">
<head>Attribute Default Legal</head>
<p>
@@ -1765,8 +2110,8 @@
<p>If an attribute has a default value declared with the
<kw>#FIXED</kw> keyword, instances of that attribute must
match the default value.
-</p>
-</vcnote>
+</p></vcnote>
+
<p>Examples of attribute-list declarations:
<eg><!ATTLIST termdef
id ID #REQUIRED
@@ -1781,14 +2126,19 @@
<p>Before the value of an attribute is passed to the application
or checked for validity, the
XML processor must normalize it as follows:
-<ulist><item><p>a character reference is processed by appending the referenced
-character to the attribute value</p></item><item><p>an entity reference is processed by recursively processing the
-replacement text of the entity</p></item><item><p>a whitespace character (#x20, #xD, #xA, #x9) is processed by
+<ulist>
+<item><p>a character reference is processed by appending the referenced
+character to the attribute value</p></item>
+<item><p>an entity reference is processed by recursively processing the
+replacement text of the entity</p></item>
+<item><p>a whitespace character (#x20, #xD, #xA, #x9) is processed by
appending #x20 to the normalized value, except that only a single #x20
is appended for a "#xD#xA" sequence that is part of an external
parsed entity or the literal entity value of an internal parsed
-entity</p></item><item><p>other characters are processed by appending them to the normalized
-value</p></item></ulist>
+entity</p></item>
+<item><p>other characters are processed by appending them to the normalized
+value</p>
+</item></ulist>
</p>
<p>If the declared value is not CDATA, then the XML processor must
further process the normalized attribute value by discarding any
@@ -1810,20 +2160,39 @@
which are
included in, or excluded from, the logical structure of the DTD based on
the keyword which governs them.</termdef>
-<scrap lang="ebnf"><head>Conditional Section</head><prodgroup pcw2="9" pcw4="14.5"><prod id="NT-conditionalSect"><lhs>conditionalSect</lhs><rhs><nt def="NT-includeSect">includeSect</nt>
+<scrap lang="ebnf">
+<head>Conditional Section</head>
+<prodgroup pcw2="9" pcw4="14.5">
+<prod id="NT-conditionalSect"><lhs>conditionalSect</lhs>
+<rhs><nt def="NT-includeSect">includeSect</nt>
| <nt def="NT-ignoreSect">ignoreSect</nt>
-</rhs></prod><prod id="NT-includeSect"><lhs>includeSect</lhs><rhs>'<![' S? 'INCLUDE' S? '['
+</rhs>
+</prod>
+<prod id="NT-includeSect"><lhs>includeSect</lhs>
+<rhs>'<![' S? 'INCLUDE' S? '['
<nt def="NT-extSubsetDecl">extSubsetDecl</nt>
']]>'
-</rhs></prod><prod id="NT-ignoreSect"><lhs>ignoreSect</lhs><rhs>'<![' S? 'IGNORE' S? '['
+</rhs>
+</prod>
+<prod id="NT-ignoreSect"><lhs>ignoreSect</lhs>
+<rhs>'<![' S? 'IGNORE' S? '['
<nt def="NT-ignoreSectContents">ignoreSectContents</nt>*
-']]>'</rhs></prod><prod id="NT-ignoreSectContents"><lhs>ignoreSectContents</lhs><rhs><nt def="NT-Ignore">Ignore</nt>
+']]>'</rhs>
+</prod>
+
+<prod id="NT-ignoreSectContents"><lhs>ignoreSectContents</lhs>
+<rhs><nt def="NT-Ignore">Ignore</nt>
('<![' <nt def="NT-ignoreSectContents">ignoreSectContents</nt> ']]>'
-<nt def="NT-Ignore">Ignore</nt>)*</rhs></prod><prod id="NT-Ignore"><lhs>Ignore</lhs><rhs><nt def="NT-Char">Char</nt>* -
+<nt def="NT-Ignore">Ignore</nt>)*</rhs></prod>
+<prod id="NT-Ignore"><lhs>Ignore</lhs>
+<rhs><nt def="NT-Char">Char</nt>* -
(<nt def="NT-Char">Char</nt>* ('<![' | ']]>')
<nt def="NT-Char">Char</nt>*)
-</rhs></prod></prodgroup></scrap>
+</rhs></prod>
+
+</prodgroup>
+</scrap>
</p>
<p>Like the internal and external DTD subsets, a conditional section
may contain one or more complete declarations,
@@ -1861,6 +2230,8 @@
</eg>
</p>
</div2>
+
+
<!--
<div2 id='sec-pass-to-app'>
<head>XML Processor Treatment of Logical Structure</head>
@@ -1881,11 +2252,14 @@
</ulist>
</p>
</div2>
--->
+-->
+
</div1>
<!-- &Entities; -->
+
<div1 id="sec-physical-struct">
<head>Physical Structures</head>
+
<p><termdef id="dt-entity" term="Entity">An XML document may consist
of one or many storage units. These are called
<term>entities</term>; they all have <term>content</term> and are all
@@ -1903,6 +2277,7 @@
<termref def="dt-repltext">replacement text</termref>;
this <termref def="dt-text">text</termref> is considered an
integral part of the document.</termdef></p>
+
<p><termdef id="dt-unparsed" term="Unparsed Entity">An
<term>unparsed entity</term>
is a resource whose contents may or may not be
@@ -1931,16 +2306,27 @@
Furthermore, they occupy different namespaces; a parameter entity and
a general entity with the same name are two distinct entities.
</p>
+
<div2 id="sec-references">
<head>Character and Entity References</head>
<p><termdef id="dt-charref" term="Character Reference">
A <term>character reference</term> refers to a specific character in the
ISO/IEC 10646 character set, for example one not directly accessible from
available input devices.
-<scrap lang="ebnf"><head>Character Reference</head><prod id="NT-CharRef"><lhs>CharRef</lhs><rhs>'&#' [0-9]+ ';' </rhs><rhs>| '&hcro;' [0-9a-fA-F]+ ';'</rhs><wfc def="wf-Legalchar"/></prod></scrap>
-<wfcnote id="wf-Legalchar"><head>Legal Character</head><p>Characters referred to using character references must
+<scrap lang="ebnf">
+<head>Character Reference</head>
+<prod id="NT-CharRef"><lhs>CharRef</lhs>
+<rhs>'&#' [0-9]+ ';' </rhs>
+<rhs>| '&hcro;' [0-9a-fA-F]+ ';'</rhs>
+<wfc def="wf-Legalchar"/>
+</prod>
+</scrap>
+<wfcnote id="wf-Legalchar">
+<head>Legal Character</head>
+<p>Characters referred to using character references must
match the production for
-<termref def="NT-Char">Char</termref>.</p></wfcnote>
+<termref def="NT-Char">Char</termref>.</p>
+</wfcnote>
If the character reference begins with "<code>&#x</code>", the digits and
letters up to the terminating <code>;</code> provide a hexadecimal
representation of the character's code point in ISO/IEC 10646.
@@ -1962,27 +2348,24 @@
</p>
<scrap lang="ebnf">
<head>Entity Reference</head>
-<prod id="NT-Reference">
-<lhs>Reference</lhs>
+<prod id="NT-Reference"><lhs>Reference</lhs>
<rhs><nt def="NT-EntityRef">EntityRef</nt>
-| <nt def="NT-CharRef">CharRef</nt></rhs>
-</prod>
-<prod id="NT-EntityRef">
-<lhs>EntityRef</lhs>
+| <nt def="NT-CharRef">CharRef</nt></rhs></prod>
+<prod id="NT-EntityRef"><lhs>EntityRef</lhs>
<rhs>'&' <nt def="NT-Name">Name</nt> ';'</rhs>
<wfc def="wf-entdeclared"/>
<vc def="vc-entdeclared"/>
<wfc def="textent"/>
<wfc def="norecursion"/>
</prod>
-<prod id="NT-PEReference">
-<lhs>PEReference</lhs>
+<prod id="NT-PEReference"><lhs>PEReference</lhs>
<rhs>'%' <nt def="NT-Name">Name</nt> ';'</rhs>
<vc def="vc-entdeclared"/>
<wfc def="norecursion"/>
<wfc def="indtd"/>
</prod>
</scrap>
+
<wfcnote id="wf-entdeclared">
<head>Entity Declared</head>
<p>In a document without any DTD, a document with only an internal
@@ -2052,20 +2435,44 @@
<!-- ... now reference it. -->
%ISOLat2;]]></eg></p>
</div2>
+
<div2 id="sec-entity-decl">
<head>Entity Declarations</head>
+
<p><termdef id="dt-entdecl" term="entity declaration">
Entities are declared thus:
-<scrap lang="ebnf"><head>Entity Declaration</head><prodgroup pcw2="5" pcw4="18.5"><prod id="NT-EntityDecl"><lhs>EntityDecl</lhs><rhs><nt def="NT-GEDecl">GEDecl</nt><!--</rhs><com>General entities</com>
-<rhs>--> | <nt def="NT-PEDecl">PEDecl</nt></rhs><!--<com>Parameter entities</com>--></prod><prod id="NT-GEDecl"><lhs>GEDecl</lhs><rhs>'<!ENTITY' <nt def="NT-S">S</nt> <nt def="NT-Name">Name</nt>
+<scrap lang="ebnf">
+<head>Entity Declaration</head>
+<prodgroup pcw2="5" pcw4="18.5">
+<prod id="NT-EntityDecl"><lhs>EntityDecl</lhs>
+<rhs><nt def="NT-GEDecl">GEDecl</nt><!--</rhs><com>General entities</com>
+<rhs>--> | <nt def="NT-PEDecl">PEDecl</nt></rhs>
+<!--<com>Parameter entities</com>-->
+</prod>
+<prod id="NT-GEDecl"><lhs>GEDecl</lhs>
+<rhs>'<!ENTITY' <nt def="NT-S">S</nt> <nt def="NT-Name">Name</nt>
<nt def="NT-S">S</nt> <nt def="NT-EntityDef">EntityDef</nt>
-<nt def="NT-S">S</nt>? '>'</rhs></prod><prod id="NT-PEDecl"><lhs>PEDecl</lhs><rhs>'<!ENTITY' <nt def="NT-S">S</nt> '%' <nt def="NT-S">S</nt>
+<nt def="NT-S">S</nt>? '>'</rhs>
+</prod>
+<prod id="NT-PEDecl"><lhs>PEDecl</lhs>
+<rhs>'<!ENTITY' <nt def="NT-S">S</nt> '%' <nt def="NT-S">S</nt>
<nt def="NT-Name">Name</nt> <nt def="NT-S">S</nt>
-<nt def="NT-PEDef">PEDef</nt> <nt def="NT-S">S</nt>? '>'</rhs><!--<com>Parameter entities</com>--></prod><prod id="NT-EntityDef"><lhs>EntityDef</lhs><rhs><nt def="NT-EntityValue">EntityValue</nt>
+<nt def="NT-PEDef">PEDef</nt> <nt def="NT-S">S</nt>? '>'</rhs>
+<!--<com>Parameter entities</com>-->
+</prod>
+<prod id="NT-EntityDef"><lhs>EntityDef</lhs>
+<rhs><nt def="NT-EntityValue">EntityValue</nt>
<!--</rhs>
<rhs>-->| (<nt def="NT-ExternalID">ExternalID</nt>
-<nt def="NT-NDataDecl">NDataDecl</nt>?)</rhs><!-- <nt def='NT-ExternalDef'>ExternalDef</nt></rhs> --></prod><!-- FINAL EDIT: what happened to WFs here? --><prod id="NT-PEDef"><lhs>PEDef</lhs><rhs><nt def="NT-EntityValue">EntityValue</nt>
-| <nt def="NT-ExternalID">ExternalID</nt></rhs></prod></prodgroup></scrap>
+<nt def="NT-NDataDecl">NDataDecl</nt>?)</rhs>
+<!-- <nt def='NT-ExternalDef'>ExternalDef</nt></rhs> -->
+</prod>
+<!-- FINAL EDIT: what happened to WFs here? -->
+<prod id="NT-PEDef"><lhs>PEDef</lhs>
+<rhs><nt def="NT-EntityValue">EntityValue</nt>
+| <nt def="NT-ExternalID">ExternalID</nt></rhs></prod>
+</prodgroup>
+</scrap>
The <nt def="NT-Name">Name</nt> identifies the entity in an
<termref def="dt-entref">entity reference</termref> or, in the case of an
unparsed entity, in the value of an <kw>ENTITY</kw> or <kw>ENTITIES</kw>
@@ -2074,8 +2481,10 @@
encountered is binding; at user option, an XML processor may issue a
warning if entities are declared multiple times.</termdef>
</p>
+
<div3 id="sec-internal-ent">
<head>Internal Entities</head>
+
<p><termdef id="dt-internent" term="Internal Entity Replacement Text">If
the entity definition is an
<nt def="NT-EntityValue">EntityValue</nt>,
@@ -2094,25 +2503,35 @@
<eg><!ENTITY Pub-Status "This is a pre-release of the
specification."></eg></p>
</div3>
+
<div3 id="sec-external-ent">
<head>External Entities</head>
-<p>
-<termdef id="dt-extent" term="External Entity">If the entity is not
+
+<p><termdef id="dt-extent" term="External Entity">If the entity is not
internal, it is an <term>external
entity</term>, declared as follows:
-<scrap lang="ebnf"><head>External Entity Declaration</head><!--
+<scrap lang="ebnf">
+<head>External Entity Declaration</head>
+<!--
<prod id='NT-ExternalDef'><lhs>ExternalDef</lhs>
-<rhs></prod> --><prod id="NT-ExternalID"><lhs>ExternalID</lhs><rhs>'SYSTEM' <nt def="NT-S">S</nt>
-<nt def="NT-SystemLiteral">SystemLiteral</nt></rhs><rhs>| 'PUBLIC' <nt def="NT-S">S</nt>
+<rhs></prod> -->
+<prod id="NT-ExternalID"><lhs>ExternalID</lhs>
+<rhs>'SYSTEM' <nt def="NT-S">S</nt>
+<nt def="NT-SystemLiteral">SystemLiteral</nt></rhs>
+<rhs>| 'PUBLIC' <nt def="NT-S">S</nt>
<nt def="NT-PubidLiteral">PubidLiteral</nt>
<nt def="NT-S">S</nt>
<nt def="NT-SystemLiteral">SystemLiteral</nt>
-</rhs></prod><prod id="NT-NDataDecl"><lhs>NDataDecl</lhs><rhs><nt def="NT-S">S</nt> 'NDATA' <nt def="NT-S">S</nt>
-<nt def="NT-Name">Name</nt></rhs><vc def="not-declared"/></prod></scrap>
+</rhs>
+</prod>
+<prod id="NT-NDataDecl"><lhs>NDataDecl</lhs>
+<rhs><nt def="NT-S">S</nt> 'NDATA' <nt def="NT-S">S</nt>
+<nt def="NT-Name">Name</nt></rhs>
+<vc def="not-declared"/></prod>
+</scrap>
If the <nt def="NT-NDataDecl">NDataDecl</nt> is present, this is a
general <termref def="dt-unparsed">unparsed
-entity</termref>; otherwise it is a parsed entity.</termdef>
-</p>
+entity</termref>; otherwise it is a parsed entity.</termdef></p>
<vcnote id="not-declared">
<head>Notation Declared</head>
<p>
@@ -2162,17 +2581,26 @@
SYSTEM "../grafix/OpenHatch.gif"
NDATA gif ></eg></p>
</div3>
+
</div2>
+
<div2 id="TextEntities">
<head>Parsed Entities</head>
<div3 id="sec-TextDecl">
<head>The Text Declaration</head>
<p>External parsed entities may each begin with a <term>text
declaration</term>.
-<scrap lang="ebnf"><head>Text Declaration</head><prodgroup pcw4="12.5" pcw5="13"><prod id="NT-TextDecl"><lhs>TextDecl</lhs><rhs>&xmlpio;
+<scrap lang="ebnf">
+<head>Text Declaration</head>
+<prodgroup pcw4="12.5" pcw5="13">
+<prod id="NT-TextDecl"><lhs>TextDecl</lhs>
+<rhs>&xmlpio;
<nt def="NT-VersionInfo">VersionInfo</nt>?
<nt def="NT-EncodingDecl">EncodingDecl</nt>
-<nt def="NT-S">S</nt>? &pic;</rhs></prod></prodgroup></scrap>
+<nt def="NT-S">S</nt>? &pic;</rhs>
+</prod>
+</prodgroup>
+</scrap>
</p>
<p>The text declaration must be provided literally, not
by reference to a parsed entity.
@@ -2189,9 +2617,17 @@
An external parameter
entity is well-formed if it matches the production labeled
<nt def="NT-extPE">extPE</nt>.
-<scrap lang="ebnf"><head>Well-Formed External Parsed Entity</head><prod id="NT-extParsedEnt"><lhs>extParsedEnt</lhs><rhs><nt def="NT-TextDecl">TextDecl</nt>?
-<nt def="NT-content">content</nt></rhs></prod><prod id="NT-extPE"><lhs>extPE</lhs><rhs><nt def="NT-TextDecl">TextDecl</nt>?
-<nt def="NT-extSubsetDecl">extSubsetDecl</nt></rhs></prod></scrap>
+<scrap lang="ebnf">
+<head>Well-Formed External Parsed Entity</head>
+<prod id="NT-extParsedEnt"><lhs>extParsedEnt</lhs>
+<rhs><nt def="NT-TextDecl">TextDecl</nt>?
+<nt def="NT-content">content</nt></rhs>
+</prod>
+<prod id="NT-extPE"><lhs>extPE</lhs>
+<rhs><nt def="NT-TextDecl">TextDecl</nt>?
+<nt def="NT-extSubsetDecl">extSubsetDecl</nt></rhs>
+</prod>
+</scrap>
An internal general parsed entity is well-formed if its replacement text
matches the production labeled
<nt def="NT-content">content</nt>.
@@ -2212,6 +2648,7 @@
</div3>
<div3 id="charencoding">
<head>Character Encoding in Entities</head>
+
<p>Each external parsed entity in an XML document may use a different
encoding for its characters. All XML processors must be able to read
entities in either UTF-8 or UTF-16.
@@ -2231,11 +2668,20 @@
Parsed entities which are stored in an encoding other than
UTF-8 or UTF-16 must begin with a <titleref href="TextDecl">text
declaration</titleref> containing an encoding declaration:
-<scrap lang="ebnf"><head>Encoding Declaration</head><prod id="NT-EncodingDecl"><lhs>EncodingDecl</lhs><rhs><nt def="NT-S">S</nt>
+<scrap lang="ebnf">
+<head>Encoding Declaration</head>
+<prod id="NT-EncodingDecl"><lhs>EncodingDecl</lhs>
+<rhs><nt def="NT-S">S</nt>
'encoding' <nt def="NT-Eq">Eq</nt>
('"' <nt def="NT-EncName">EncName</nt> '"' |
"'" <nt def="NT-EncName">EncName</nt> "'" )
-</rhs></prod><prod id="NT-EncName"><lhs>EncName</lhs><rhs>[A-Za-z] ([A-Za-z0-9._] | '-')*</rhs><com>Encoding name contains only Latin characters</com></prod></scrap>
+</rhs>
+</prod>
+<prod id="NT-EncName"><lhs>EncName</lhs>
+<rhs>[A-Za-z] ([A-Za-z0-9._] | '-')*</rhs>
+<com>Encoding name contains only Latin characters</com>
+</prod>
+</scrap>
In the <termref def="dt-docent">document entity</termref>, the encoding
declaration is part of the <termref def="dt-xmldecl">XML declaration</termref>.
The <nt def="NT-EncName">EncName</nt> is the name of the encoding used.
@@ -2278,6 +2724,7 @@
Note that since ASCII
is a subset of UTF-8, ordinary ASCII entities do not strictly need
an encoding declaration.</p>
+
<p>It is a <termref def="dt-fatal">fatal error</termref> when an XML processor
encounters an entity with an encoding that it is unable to process.</p>
<p>Examples of encoding declarations:
@@ -2292,29 +2739,44 @@
required behavior of an <termref def="dt-xml-proc">XML processor</termref> in
each case.
The labels in the leftmost column describe the recognition context:
-<glist><gitem><label>Reference in Content</label><def><p>as a reference
+<glist>
+<gitem><label>Reference in Content</label>
+<def><p>as a reference
anywhere after the <termref def="dt-stag">start-tag</termref> and
before the <termref def="dt-etag">end-tag</termref> of an element; corresponds
-to the nonterminal <nt def="NT-content">content</nt>.</p></def></gitem><gitem><label>Reference in Attribute Value</label><def><p>as a reference within either the value of an attribute in a
+to the nonterminal <nt def="NT-content">content</nt>.</p></def>
+</gitem>
+<gitem>
+<label>Reference in Attribute Value</label>
+<def><p>as a reference within either the value of an attribute in a
<termref def="dt-stag">start-tag</termref>, or a default
value in an <termref def="dt-attdecl">attribute declaration</termref>;
corresponds to the nonterminal
-<nt def="NT-AttValue">AttValue</nt>.</p></def></gitem><gitem><label>Occurs as Attribute Value</label><def><p>as a <nt def="NT-Name">Name</nt>, not a reference, appearing either as
+<nt def="NT-AttValue">AttValue</nt>.</p></def></gitem>
+<gitem>
+<label>Occurs as Attribute Value</label>
+<def><p>as a <nt def="NT-Name">Name</nt>, not a reference, appearing either as
the value of an
attribute which has been declared as type <kw>ENTITY</kw>, or as one of
the space-separated tokens in the value of an attribute which has been
-declared as type <kw>ENTITIES</kw>.</p></def></gitem><gitem><label>Reference in Entity Value</label><def><p>as a reference
+declared as type <kw>ENTITIES</kw>.</p>
+</def></gitem>
+<gitem><label>Reference in Entity Value</label>
+<def><p>as a reference
within a parameter or internal entity's
<termref def="dt-litentval">literal entity value</termref> in
the entity's declaration; corresponds to the nonterminal
-<nt def="NT-EntityValue">EntityValue</nt>.</p></def></gitem><gitem><label>Reference in DTD</label><def><p>as a reference within either the internal or external subsets of the
+<nt def="NT-EntityValue">EntityValue</nt>.</p></def></gitem>
+<gitem><label>Reference in DTD</label>
+<def><p>as a reference within either the internal or external subsets of the
<termref def="dt-doctype">DTD</termref>, but outside
of an <nt def="NT-EntityValue">EntityValue</nt> or
-<nt def="NT-AttValue">AttValue</nt>.</p></def></gitem></glist></p>
+<nt def="NT-AttValue">AttValue</nt>.</p></def>
+</gitem>
+</glist></p>
<htable border="1" cellpadding="7" align="center">
<htbody>
-<tr>
-<td bgcolor="&cellback;" rowspan="2" colspan="1"/>
+<tr><td bgcolor="&cellback;" rowspan="2" colspan="1"/>
<td bgcolor="&cellback;" align="center" valign="bottom" colspan="4">Entity Type</td>
<td bgcolor="&cellback;" rowspan="2" align="center">Character</td>
</tr>
@@ -2327,99 +2789,50 @@
<td bgcolor="&cellback;">Unparsed</td>
</tr>
<tr align="center" valign="middle">
+
<td bgcolor="&cellback;" align="right">Reference
in Content</td>
-<td bgcolor="&cellback;">
-<titleref href="not-recognized">Not recognized</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="included">Included</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="include-if-valid">Included if validating</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="forbidden">Forbidden</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="included">Included</titleref>
-</td>
+<td bgcolor="&cellback;"><titleref href="not-recognized">Not recognized</titleref></td>
+<td bgcolor="&cellback;"><titleref href="included">Included</titleref></td>
+<td bgcolor="&cellback;"><titleref href="include-if-valid">Included if validating</titleref></td>
+<td bgcolor="&cellback;"><titleref href="forbidden">Forbidden</titleref></td>
+<td bgcolor="&cellback;"><titleref href="included">Included</titleref></td>
</tr>
<tr align="center" valign="middle">
<td bgcolor="&cellback;" align="right">Reference
in Attribute Value</td>
-<td bgcolor="&cellback;">
-<titleref href="not-recognized">Not recognized</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="inliteral">Included in literal</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="forbidden">Forbidden</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="forbidden">Forbidden</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="included">Included</titleref>
-</td>
+<td bgcolor="&cellback;"><titleref href="not-recognized">Not recognized</titleref></td>
+<td bgcolor="&cellback;"><titleref href="inliteral">Included in literal</titleref></td>
+<td bgcolor="&cellback;"><titleref href="forbidden">Forbidden</titleref></td>
+<td bgcolor="&cellback;"><titleref href="forbidden">Forbidden</titleref></td>
+<td bgcolor="&cellback;"><titleref href="included">Included</titleref></td>
</tr>
<tr align="center" valign="middle">
<td bgcolor="&cellback;" align="right">Occurs as
Attribute Value</td>
-<td bgcolor="&cellback;">
-<titleref href="not-recognized">Not recognized</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="not-recognized">Forbidden</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="not-recognized">Forbidden</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="notify">Notify</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="not recognized">Not recognized</titleref>
-</td>
+<td bgcolor="&cellback;"><titleref href="not-recognized">Not recognized</titleref></td>
+<td bgcolor="&cellback;"><titleref href="not-recognized">Forbidden</titleref></td>
+<td bgcolor="&cellback;"><titleref href="not-recognized">Forbidden</titleref></td>
+<td bgcolor="&cellback;"><titleref href="notify">Notify</titleref></td>
+<td bgcolor="&cellback;"><titleref href="not recognized">Not recognized</titleref></td>
</tr>
<tr align="center" valign="middle">
<td bgcolor="&cellback;" align="right">Reference
in EntityValue</td>
-<td bgcolor="&cellback;">
-<titleref href="inliteral">Included in literal</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="bypass">Bypassed</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="bypass">Bypassed</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="forbidden">Forbidden</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="included">Included</titleref>
-</td>
+<td bgcolor="&cellback;"><titleref href="inliteral">Included in literal</titleref></td>
+<td bgcolor="&cellback;"><titleref href="bypass">Bypassed</titleref></td>
+<td bgcolor="&cellback;"><titleref href="bypass">Bypassed</titleref></td>
+<td bgcolor="&cellback;"><titleref href="forbidden">Forbidden</titleref></td>
+<td bgcolor="&cellback;"><titleref href="included">Included</titleref></td>
</tr>
<tr align="center" valign="middle">
<td bgcolor="&cellback;" align="right">Reference
in DTD</td>
-<td bgcolor="&cellback;">
-<titleref href="as-PE">Included as PE</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="forbidden">Forbidden</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="forbidden">Forbidden</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="forbidden">Forbidden</titleref>
-</td>
-<td bgcolor="&cellback;">
-<titleref href="forbidden">Forbidden</titleref>
-</td>
+<td bgcolor="&cellback;"><titleref href="as-PE">Included as PE</titleref></td>
+<td bgcolor="&cellback;"><titleref href="forbidden">Forbidden</titleref></td>
+<td bgcolor="&cellback;"><titleref href="forbidden">Forbidden</titleref></td>
+<td bgcolor="&cellback;"><titleref href="forbidden">Forbidden</titleref></td>
+<td bgcolor="&cellback;"><titleref href="forbidden">Forbidden</titleref></td>
</tr>
</htbody>
</htable>
@@ -2434,8 +2847,7 @@
</div3>
<div3 id="included">
<head>Included</head>
-<p>
-<termdef id="dt-include" term="Include">An entity is
+<p><termdef id="dt-include" term="Include">An entity is
<term>included</term> when its
<termref def="dt-repltext">replacement text</termref> is retrieved
and processed, in place of the reference itself,
@@ -2452,8 +2864,7 @@
as an entity-reference delimiter.)
A character reference is <term>included</term> when the indicated
character is processed in place of the reference itself.
-</termdef>
-</p>
+</termdef></p>
</div3>
<div3 id="include-if-valid">
<head>Included If Validating</head>
@@ -2482,11 +2893,16 @@
<head>Forbidden</head>
<p>The following are forbidden, and constitute
<termref def="dt-fatal">fatal</termref> errors:
-<ulist><item><p>the appearance of a reference to an
+<ulist>
+<item><p>the appearance of a reference to an
<termref def="dt-unparsed">unparsed entity</termref>.
-</p></item><item><p>the appearance of any character or general-entity reference in the
+</p></item>
+<item><p>the appearance of any character or general-entity reference in the
DTD except within an <nt def="NT-EntityValue">EntityValue</nt> or
-<nt def="NT-AttValue">AttValue</nt>.</p></item><item><p>a reference to an external entity in an attribute value.</p></item></ulist>
+<nt def="NT-AttValue">AttValue</nt>.</p></item>
+<item><p>a reference to an external entity in an attribute value.</p>
+</item>
+</ulist>
</p>
</div3>
<div3 id="inliteral">
@@ -2505,8 +2921,7 @@
while this is not:
<eg><!ENTITY EndAttr "27'" >
<element attribute='a-&EndAttr;></eg>
-</p>
-</div3>
+</p></div3>
<div3 id="notify">
<head>Notify</head>
<p>When the name of an <termref def="dt-unparsed">unparsed
@@ -2538,6 +2953,7 @@
entities to contain an integral number of grammatical tokens in the DTD.
</p>
</div3>
+
</div2>
<div2 id="intern-replacement">
<head>Construction of Internal Entity Replacement Text</head>
@@ -2553,6 +2969,7 @@
replacement of character references and parameter-entity
references.
</termdef></p>
+
<p>The literal entity value
as given in an internal entity declaration
(<nt def="NT-EntityValue">EntityValue</nt>) may contain character,
@@ -2582,11 +2999,11 @@
discussion of a difficult example, see
<specref ref="sec-entexpand"/>.
</p>
+
</div2>
<div2 id="sec-predefined-ent">
<head>Predefined Entities</head>
-<p>
-<termdef id="dt-escape" term="escape">Entity and character
+<p><termdef id="dt-escape" term="escape">Entity and character
references can both be used to <term>escape</term> the left angle bracket,
ampersand, and other delimiters. A set of general entities
(&magicents;) is specified for this purpose.
@@ -2595,8 +3012,7 @@
character data, so the numeric character references
"<code>&#60;</code>" and "<code>&#38;</code>" may be used to
escape <code><</code> and <code>&</code> when they occur
-in character data.</termdef>
-</p>
+in character data.</termdef></p>
<p>All XML processors must recognize these entities whether they
are declared or not.
<termref def="dt-interop">For interoperability</termref>,
@@ -2618,34 +3034,38 @@
be well-formed.
</p>
</div2>
+
<div2 id="Notations">
<head>Notation Declarations</head>
-<p>
-<termdef id="dt-notation" term="Notation"><term>Notations</term> identify by
+
+<p><termdef id="dt-notation" term="Notation"><term>Notations</term> identify by
name the format of <termref def="dt-extent">unparsed
entities</termref>, the
format of elements which bear a notation attribute,
or the application to which
a <termref def="dt-pi">processing instruction</termref> is
-addressed.</termdef>
-</p>
-<p>
-<termdef id="dt-notdecl" term="Notation Declaration">
+addressed.</termdef></p>
+<p><termdef id="dt-notdecl" term="Notation Declaration">
<term>Notation declarations</term>
provide a name for the notation, for use in
entity and attribute-list declarations and in attribute specifications,
and an external identifier for the notation which may allow an XML
processor or its client application to locate a helper application
capable of processing data in the given notation.
-<scrap lang="ebnf"><head>Notation Declarations</head><prod id="NT-NotationDecl"><lhs>NotationDecl</lhs><rhs>'<!NOTATION' <nt def="NT-S">S</nt> <nt def="NT-Name">Name</nt>
+<scrap lang="ebnf">
+<head>Notation Declarations</head>
+<prod id="NT-NotationDecl"><lhs>NotationDecl</lhs>
+<rhs>'<!NOTATION' <nt def="NT-S">S</nt> <nt def="NT-Name">Name</nt>
<nt def="NT-S">S</nt>
(<nt def="NT-ExternalID">ExternalID</nt> |
<nt def="NT-PublicID">PublicID</nt>)
-<nt def="NT-S">S</nt>? '>'</rhs></prod><prod id="NT-PublicID"><lhs>PublicID</lhs><rhs>'PUBLIC' <nt def="NT-S">S</nt>
+<nt def="NT-S">S</nt>? '>'</rhs></prod>
+<prod id="NT-PublicID"><lhs>PublicID</lhs>
+<rhs>'PUBLIC' <nt def="NT-S">S</nt>
<nt def="NT-PubidLiteral">PubidLiteral</nt>
-</rhs></prod></scrap>
-</termdef>
-</p>
+</rhs></prod>
+</scrap>
+</termdef></p>
<p>XML processors must provide applications with the name and external
identifier(s) of any notation declared and referred to in an attribute
value, attribute definition, or entity declaration. They may
@@ -2657,8 +3077,11 @@
notations for which notation-specific applications are not available on
the system where the XML processor or application is running.)</p>
</div2>
+
+
<div2 id="sec-doc-entity">
<head>Document Entity</head>
+
<p><termdef id="dt-docent" term="Document Entity">The <term>document
entity</term> serves as the root of the entity
tree and a starting-point for an <termref def="dt-xml-proc">XML
@@ -2669,10 +3092,14 @@
well appear on a processor input stream
without any identification at all.</p>
</div2>
+
+
</div1>
<!-- &Conformance; -->
+
<div1 id="sec-conformance">
<head>Conformance</head>
+
<div2 id="proc-types">
<head>Validating and Non-Validating Processors</head>
<p>Conforming <termref def="dt-xml-proc">XML processors</termref> fall into two
@@ -2725,7 +3152,8 @@
Less is required of a non-validating processor; it need not read any
part of the document other than the document entity.
This has two effects that may be important to users of XML processors:
-<ulist><item><p>Certain well-formedness errors, specifically those that require
+<ulist>
+<item><p>Certain well-formedness errors, specifically those that require
reading external entities, may not be detected by a non-validating processor.
Examples include the constraints entitled
<titleref href="wf-entdeclared">Entity Declared</titleref>,
@@ -2733,7 +3161,8 @@
<titleref href="wf-norecursion">No Recursion</titleref>, as well
as some of the cases described as
<titleref href="forbidden">forbidden</titleref> in
-<specref ref="entproc"/>.</p></item><item><p>The information passed from the processor to the application may
+<specref ref="entproc"/>.</p></item>
+<item><p>The information passed from the processor to the application may
vary, depending on whether the processor reads
parameter and external entities.
For example, a non-validating processor may not
@@ -2742,7 +3171,8 @@
internal entities, or supply
<titleref href="sec-attr-defaults">default attribute values</titleref>,
where doing so depends on having read declarations in
-external or parameter entities.</p></item></ulist>
+external or parameter entities.</p></item>
+</ulist>
</p>
<p>For maximum reliability in interoperating between different XML
processors, applications which use non-validating processors should not
@@ -2752,8 +3182,10 @@
entities should use validating XML processors.</p>
</div2>
</div1>
+
<div1 id="sec-notation">
<head>Notation</head>
+
<p>The formal grammar of XML is given in this specification using a simple
Extended Backus-Naur Form (EBNF) notation. Each rule in the grammar defines
one symbol, in the form
@@ -2764,9 +3196,13 @@
Literal strings are quoted.
</p>
+
<p>Within the expression on the right-hand side of a rule, the following
expressions are used to match strings of one or more characters:
-<glist><gitem><label><code>#xN</code></label><def><p>where <code>N</code> is a hexadecimal integer, the
+<glist>
+<gitem>
+<label><code>#xN</code></label>
+<def><p>where <code>N</code> is a hexadecimal integer, the
expression matches the character in ISO/IEC 10646 whose canonical
(UCS-4)
code value, when interpreted as an unsigned binary number, has
@@ -2774,36 +3210,105 @@
<code>#xN</code> form is insignificant; the number of leading
zeros in the corresponding code value
is governed by the character
-encoding in use and is not significant for XML.</p></def></gitem><gitem><label><code>[a-zA-Z]</code>, <code>[#xN-#xN]</code></label><def><p>matches any <termref def="dt-character">character</termref>
-with a value in the range(s) indicated (inclusive).</p></def></gitem><gitem><label><code>[^a-z]</code>, <code>[^#xN-#xN]</code></label><def><p>matches any <termref def="dt-character">character</termref>
+encoding in use and is not significant for XML.</p></def>
+</gitem>
+<gitem>
+<label><code>[a-zA-Z]</code>, <code>[#xN-#xN]</code></label>
+<def><p>matches any <termref def="dt-character">character</termref>
+with a value in the range(s) indicated (inclusive).</p></def>
+</gitem>
+<gitem>
+<label><code>[^a-z]</code>, <code>[^#xN-#xN]</code></label>
+<def><p>matches any <termref def="dt-character">character</termref>
with a value <emph>outside</emph> the
-range indicated.</p></def></gitem><gitem><label><code>[^abc]</code>, <code>[^#xN#xN#xN]</code></label><def><p>matches any <termref def="dt-character">character</termref>
-with a value not among the characters given.</p></def></gitem><gitem><label><code>"string"</code></label><def><p>matches a literal string <termref def="dt-match">matching</termref>
-that given inside the double quotes.</p></def></gitem><gitem><label><code>'string'</code></label><def><p>matches a literal string <termref def="dt-match">matching</termref>
-that given inside the single quotes.</p></def></gitem></glist>
+range indicated.</p></def>
+</gitem>
+<gitem>
+<label><code>[^abc]</code>, <code>[^#xN#xN#xN]</code></label>
+<def><p>matches any <termref def="dt-character">character</termref>
+with a value not among the characters given.</p></def>
+</gitem>
+<gitem>
+<label><code>"string"</code></label>
+<def><p>matches a literal string <termref def="dt-match">matching</termref>
+that given inside the double quotes.</p></def>
+</gitem>
+<gitem>
+<label><code>'string'</code></label>
+<def><p>matches a literal string <termref def="dt-match">matching</termref>
+that given inside the single quotes.</p></def>
+</gitem>
+</glist>
These symbols may be combined to match more complex patterns as follows,
where <code>A</code> and <code>B</code> represent simple expressions:
-<glist><gitem><label>(<code>expression</code>)</label><def><p><code>expression</code> is treated as a unit
-and may be combined as described in this list.</p></def></gitem><gitem><label><code>A?</code></label><def><p>matches <code>A</code> or nothing; optional <code>A</code>.</p></def></gitem><gitem><label><code>A B</code></label><def><p>matches <code>A</code> followed by <code>B</code>.</p></def></gitem><gitem><label><code>A | B</code></label><def><p>matches <code>A</code> or <code>B</code> but not both.</p></def></gitem><gitem><label><code>A - B</code></label><def><p>matches any string that matches <code>A</code> but does not match
+<glist>
+<gitem>
+<label>(<code>expression</code>)</label>
+<def><p><code>expression</code> is treated as a unit
+and may be combined as described in this list.</p></def>
+</gitem>
+<gitem>
+<label><code>A?</code></label>
+<def><p>matches <code>A</code> or nothing; optional <code>A</code>.</p></def>
+</gitem>
+<gitem>
+<label><code>A B</code></label>
+<def><p>matches <code>A</code> followed by <code>B</code>.</p></def>
+</gitem>
+<gitem>
+<label><code>A | B</code></label>
+<def><p>matches <code>A</code> or <code>B</code> but not both.</p></def>
+</gitem>
+<gitem>
+<label><code>A - B</code></label>
+<def><p>matches any string that matches <code>A</code> but does not match
<code>B</code>.
-</p></def></gitem><gitem><label><code>A+</code></label><def><p>matches one or more occurrences of <code>A</code>.</p></def></gitem><gitem><label><code>A*</code></label><def><p>matches zero or more occurrences of <code>A</code>.</p></def></gitem></glist>
+</p></def>
+</gitem>
+<gitem>
+<label><code>A+</code></label>
+<def><p>matches one or more occurrences of <code>A</code>.</p></def>
+</gitem>
+<gitem>
+<label><code>A*</code></label>
+<def><p>matches zero or more occurrences of <code>A</code>.</p></def>
+</gitem>
+
+</glist>
Other notations used in the productions are:
-<glist><gitem><label><code>/* ... */</code></label><def><p>comment.</p></def></gitem><gitem><label><code>[ wfc: ... ]</code></label><def><p>well-formedness constraint; this identifies by name a
+<glist>
+<gitem>
+<label><code>/* ... */</code></label>
+<def><p>comment.</p></def>
+</gitem>
+<gitem>
+<label><code>[ wfc: ... ]</code></label>
+<def><p>well-formedness constraint; this identifies by name a
constraint on
<termref def="dt-wellformed">well-formed</termref> documents
-associated with a production.</p></def></gitem><gitem><label><code>[ vc: ... ]</code></label><def><p>validity constraint; this identifies by name a constraint on
+associated with a production.</p></def>
+</gitem>
+<gitem>
+<label><code>[ vc: ... ]</code></label>
+<def><p>validity constraint; this identifies by name a constraint on
<termref def="dt-valid">valid</termref> documents associated with
-a production.</p></def></gitem></glist>
-</p>
-</div1>
+a production.</p></def>
+</gitem>
+</glist>
+</p></div1>
+
</body>
<back>
<!-- &SGML; -->
+
+
<!-- &Biblio; -->
<div1 id="sec-bibliography">
+
<head>References</head>
<div2 id="sec-existing-stds">
<head>Normative References</head>
+
<blist>
<bibl id="IANA" key="IANA">
(Internet Assigned Numbers Authority) <emph>Official Names for
@@ -2811,18 +3316,21 @@
ed. Keld Simonsen et al.
See <loc href="ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets">ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets</loc>.
</bibl>
+
<bibl id="RFC1766" key="IETF RFC 1766">
IETF (Internet Engineering Task Force).
<emph>RFC 1766: Tags for the Identification of Languages</emph>,
ed. H. Alvestrand.
1995.
</bibl>
+
<bibl id="ISO639" key="ISO 639">
(International Organization for Standardization).
<emph>ISO 639:1988 (E).
Code for the representation of names of languages.</emph>
[Geneva]: International Organization for
Standardization, 1988.</bibl>
+
<bibl id="ISO3166" key="ISO 3166">
(International Organization for Standardization).
<emph>ISO 3166-1:1997 (E).
@@ -2830,6 +3338,7 @@
— Part 1: Country codes</emph>
[Geneva]: International Organization for
Standardization, 1997.</bibl>
+
<bibl id="ISO10646" key="ISO/IEC 10646">ISO
(International Organization for Standardization).
<emph>ISO/IEC 10646-1993 (E). Information technology — Universal
@@ -2838,24 +3347,31 @@
[Geneva]: International Organization for
Standardization, 1993 (plus amendments AM 1 through AM 7).
</bibl>
+
<bibl id="Unicode" key="Unicode">The Unicode Consortium.
<emph>The Unicode Standard, Version 2.0.</emph>
Reading, Mass.: Addison-Wesley Developers Press, 1996.</bibl>
+
</blist>
+
</div2>
-<div2>
-<head>Other References</head>
+
+<div2><head>Other References</head>
+
<blist>
+
<bibl id="Aho" key="Aho/Ullman">Aho, Alfred V.,
Ravi Sethi, and Jeffrey D. Ullman.
<emph>Compilers: Principles, Techniques, and Tools</emph>.
Reading: Addison-Wesley, 1986, rpt. corr. 1988.</bibl>
+
<bibl xml-link="simple" id="Berners-Lee" key="Berners-Lee et al.">
Berners-Lee, T., R. Fielding, and L. Masinter.
<emph>Uniform Resource Identifiers (URI): Generic Syntax and
Semantics</emph>.
1997.
(Work in progress; see updates to RFC1738.)</bibl>
+
<bibl id="ABK" key="Brüggemann-Klein">Brüggemann-Klein, Anne.
<emph>Regular Expressions into Finite Automata</emph>.
Extended abstract in I. Simon, Hrsg., LATIN 1992,
@@ -2863,12 +3379,14 @@
Full Version in Theoretical Computer Science 120: 197-213, 1993.
</bibl>
+
<bibl id="ABKDW" key="Brüggemann-Klein and Wood">Brüggemann-Klein, Anne,
and Derick Wood.
<emph>Deterministic Regular Languages</emph>.
Universität Freiburg, Institut für Informatik,
Bericht 38, Oktober 1991.
</bibl>
+
<bibl id="Clark" key="Clark">James Clark.
Comparison of SGML and XML. See
<loc href="http://www.w3.org/TR/NOTE-sgml-xml-971215">http://www.w3.org/TR/NOTE-sgml-xml-971215</loc>.
@@ -2879,18 +3397,21 @@
ed. T. Berners-Lee, L. Masinter, M. McCahill.
1994.
</bibl>
+
<bibl xml-link="simple" id="RFC1808" key="IETF RFC1808">
IETF (Internet Engineering Task Force).
<emph>RFC 1808: Relative Uniform Resource Locators</emph>,
ed. R. Fielding.
1995.
</bibl>
+
<bibl xml-link="simple" id="RFC2141" key="IETF RFC2141">
IETF (Internet Engineering Task Force).
<emph>RFC 2141: URN Syntax</emph>,
ed. R. Moats.
1997.
</bibl>
+
<bibl id="ISO8879" key="ISO 8879">ISO
(International Organization for Standardization).
<emph>ISO 8879:1986(E). Information processing — Text and Office
@@ -2898,6 +3419,8 @@
edition — 1986-10-15. [Geneva]: International Organization for
Standardization, 1986.
</bibl>
+
+
<bibl id="ISO10744" key="ISO/IEC 10744">ISO
(International Organization for Standardization).
<emph>ISO/IEC 10744-1992 (E). Information technology —
@@ -2909,6 +3432,9 @@
[Geneva]: International Organization for
Standardization, 1996.
</bibl>
+
+
+
</blist>
</div2>
</div1>
@@ -2921,8 +3447,14 @@
others, this class contains most diacritics); these classes combine
to form the class of letters. Digits and extenders are
also distinguished.
-<scrap lang="ebnf" id="CHARACTERS"><head>Characters</head><prodgroup pcw3="3" pcw4="15"><prod id="NT-Letter"><lhs>Letter</lhs><rhs><nt def="NT-BaseChar">BaseChar</nt>
-| <nt def="NT-Ideographic">Ideographic</nt></rhs></prod><prod id="NT-BaseChar"><lhs>BaseChar</lhs><rhs>[#x0041-#x005A]
+<scrap lang="ebnf" id="CHARACTERS">
+<head>Characters</head>
+<prodgroup pcw3="3" pcw4="15">
+<prod id="NT-Letter"><lhs>Letter</lhs>
+<rhs><nt def="NT-BaseChar">BaseChar</nt>
+| <nt def="NT-Ideographic">Ideographic</nt></rhs> </prod>
+<prod id="NT-BaseChar"><lhs>BaseChar</lhs>
+<rhs>[#x0041-#x005A]
| [#x0061-#x007A]
| [#x00C0-#x00D6]
| [#x00D8-#x00F6]
@@ -3124,10 +3656,14 @@
| [#x30A1-#x30FA]
| [#x3105-#x312C]
| [#xAC00-#xD7A3]
-</rhs></prod><prod id="NT-Ideographic"><lhs>Ideographic</lhs><rhs>[#x4E00-#x9FA5]
+</rhs></prod>
+<prod id="NT-Ideographic"><lhs>Ideographic</lhs>
+<rhs>[#x4E00-#x9FA5]
| #x3007
| [#x3021-#x3029]
-</rhs></prod><prod id="NT-CombiningChar"><lhs>CombiningChar</lhs><rhs>[#x0300-#x0345]
+</rhs></prod>
+<prod id="NT-CombiningChar"><lhs>CombiningChar</lhs>
+<rhs>[#x0300-#x0345]
| [#x0360-#x0361]
| [#x0483-#x0486]
| [#x0591-#x05A1]
@@ -3222,7 +3758,9 @@
| [#x302A-#x302F]
| #x3099
| #x309A
-</rhs></prod><prod id="NT-Digit"><lhs>Digit</lhs><rhs>[#x0030-#x0039]
+</rhs></prod>
+<prod id="NT-Digit"><lhs>Digit</lhs>
+<rhs>[#x0030-#x0039]
| [#x0660-#x0669]
| [#x06F0-#x06F9]
| [#x0966-#x096F]
@@ -3237,7 +3775,9 @@
| [#x0E50-#x0E59]
| [#x0ED0-#x0ED9]
| [#x0F20-#x0F29]
-</rhs></prod><prod id="NT-Extender"><lhs>Extender</lhs><rhs>#x00B7
+</rhs></prod>
+<prod id="NT-Extender"><lhs>Extender</lhs>
+<rhs>#x00B7
| #x02D0
| #x02D1
| #x0387
@@ -3248,26 +3788,61 @@
| [#x3031-#x3035]
| [#x309D-#x309E]
| [#x30FC-#x30FE]
-</rhs></prod></prodgroup></scrap>
+</rhs></prod>
+
+</prodgroup>
+</scrap>
</p>
<p>The character classes defined here can be derived from the
Unicode character database as follows:
-<ulist><item><p>Name start characters must have one of the categories Ll, Lu,
-Lo, Lt, Nl.</p></item><item><p>Name characters other than Name-start characters
-must have one of the categories Mc, Me, Mn, Lm, or Nd.</p></item><item><p>Characters in the compatibility area (i.e. with character code
+<ulist>
+<item>
+<p>Name start characters must have one of the categories Ll, Lu,
+Lo, Lt, Nl.</p>
+</item>
+<item>
+<p>Name characters other than Name-start characters
+must have one of the categories Mc, Me, Mn, Lm, or Nd.</p>
+</item>
+<item>
+<p>Characters in the compatibility area (i.e. with character code
greater than #xF900 and less than #xFFFE) are not allowed in XML
-names.</p></item><item><p>Characters which have a font or compatibility decomposition (i.e. those
+names.</p>
+</item>
+<item>
+<p>Characters which have a font or compatibility decomposition (i.e. those
with a "compatibility formatting tag" in field 5 of the database --
-marked by field 5 beginning with a "<") are not allowed.</p></item><item><p>The following characters are treated as name-start characters
+marked by field 5 beginning with a "<") are not allowed.</p>
+</item>
+<item>
+<p>The following characters are treated as name-start characters
rather than name characters, because the property file classifies
-them as Alphabetic: [#x02BB-#x02C1], #x0559, #x06E5, #x06E6.</p></item><item><p>Characters #x20DD-#x20E0 are excluded (in accordance with
-Unicode, section 5.14).</p></item><item><p>Character #x00B7 is classified as an extender, because the
-property list so identifies it.</p></item><item><p>Character #x0387 is added as a name character, because #x00B7
-is its canonical equivalent.</p></item><item><p>Characters ':' and '_' are allowed as name-start characters.</p></item><item><p>Characters '-' and '.' are allowed as name characters.</p></item></ulist>
+them as Alphabetic: [#x02BB-#x02C1], #x0559, #x06E5, #x06E6.</p>
+</item>
+<item>
+<p>Characters #x20DD-#x20E0 are excluded (in accordance with
+Unicode, section 5.14).</p>
+</item>
+<item>
+<p>Character #x00B7 is classified as an extender, because the
+property list so identifies it.</p>
+</item>
+<item>
+<p>Character #x0387 is added as a name character, because #x00B7
+is its canonical equivalent.</p>
+</item>
+<item>
+<p>Characters ':' and '_' are allowed as name-start characters.</p>
+</item>
+<item>
+<p>Characters '-' and '.' are allowed as name characters.</p>
+</item>
+</ulist>
</p>
</div1>
<inform-div1 id="sec-xml-and-sgml">
<head>XML and SGML</head>
+
<p>XML is designed to be a subset of SGML, in that every
<termref def="dt-valid">valid</termref> XML document should also be a
conformant SGML document.
@@ -3318,29 +3893,34 @@
8 <test>This sample shows a &tricky; method.</test>
]]></eg>
This produces the following:
-<ulist spacing="compact"><item><p>in line 4, the reference to character 37 is expanded immediately,
+<ulist spacing="compact">
+<item><p>in line 4, the reference to character 37 is expanded immediately,
and the parameter entity "<code>xx</code>" is stored in the symbol
table with the value "<code>%zz;</code>". Since the replacement text
is not rescanned, the reference to parameter entity "<code>zz</code>"
is not recognized. (And it would be an error if it were, since
-"<code>zz</code>" is not yet declared.)</p></item><item><p>in line 5, the character reference "<code>&#60;</code>" is
+"<code>zz</code>" is not yet declared.)</p></item>
+<item><p>in line 5, the character reference "<code>&#60;</code>" is
expanded immediately and the parameter entity "<code>zz</code>" is
stored with the replacement text
"<code><!ENTITY tricky "error-prone" ></code>",
-which is a well-formed entity declaration.</p></item><item><p>in line 6, the reference to "<code>xx</code>" is recognized,
+which is a well-formed entity declaration.</p></item>
+<item><p>in line 6, the reference to "<code>xx</code>" is recognized,
and the replacement text of "<code>xx</code>" (namely
"<code>%zz;</code>") is parsed. The reference to "<code>zz</code>"
is recognized in its turn, and its replacement text
("<code><!ENTITY tricky "error-prone" ></code>") is parsed.
The general entity "<code>tricky</code>" has now been
-declared, with the replacement text "<code>error-prone</code>".</p></item><item><p>
+declared, with the replacement text "<code>error-prone</code>".</p></item>
+<item><p>
in line 8, the reference to the general entity "<code>tricky</code>" is
recognized, and it is expanded, so the full content of the
"<code>test</code>" element is the self-describing (and ungrammatical) string
<emph>This sample shows a error-prone method.</emph>
-</p></item></ulist>
+</p></item>
+</ulist>
</p>
-</inform-div1>
+</inform-div1>
<inform-div1 id="determinism">
<head>Deterministic Content Models</head>
<p><termref def="dt-compat">For compatibility</termref>, it is
@@ -3409,9 +3989,35 @@
"<code>#x0000003C</code>" and '?' is "<code>#x0000003F</code>", and the Byte
Order Mark required of UTF-16 data streams is "<code>#xFEFF</code>".</p>
<p>
-<ulist><item><p><code>00 00 00 3C</code>: UCS-4, big-endian machine (1234 order)</p></item><item><p><code>3C 00 00 00</code>: UCS-4, little-endian machine (4321 order)</p></item><item><p><code>00 00 3C 00</code>: UCS-4, unusual octet order (2143)</p></item><item><p><code>00 3C 00 00</code>: UCS-4, unusual octet order (3412)</p></item><item><p><code>FE FF</code>: UTF-16, big-endian</p></item><item><p><code>FF FE</code>: UTF-16, little-endian</p></item><item><p><code>00 3C 00 3F</code>: UTF-16, big-endian, no Byte Order Mark
-(and thus, strictly speaking, in error)</p></item><item><p><code>3C 00 3F 00</code>: UTF-16, little-endian, no Byte Order Mark
-(and thus, strictly speaking, in error)</p></item><item><p><code>3C 3F 78 6D</code>: UTF-8, ISO 646, ASCII, some part of ISO 8859,
+<ulist>
+<item>
+<p><code>00 00 00 3C</code>: UCS-4, big-endian machine (1234 order)</p>
+</item>
+<item>
+<p><code>3C 00 00 00</code>: UCS-4, little-endian machine (4321 order)</p>
+</item>
+<item>
+<p><code>00 00 3C 00</code>: UCS-4, unusual octet order (2143)</p>
+</item>
+<item>
+<p><code>00 3C 00 00</code>: UCS-4, unusual octet order (3412)</p>
+</item>
+<item>
+<p><code>FE FF</code>: UTF-16, big-endian</p>
+</item>
+<item>
+<p><code>FF FE</code>: UTF-16, little-endian</p>
+</item>
+<item>
+<p><code>00 3C 00 3F</code>: UTF-16, big-endian, no Byte Order Mark
+(and thus, strictly speaking, in error)</p>
+</item>
+<item>
+<p><code>3C 00 3F 00</code>: UTF-16, little-endian, no Byte Order Mark
+(and thus, strictly speaking, in error)</p>
+</item>
+<item>
+<p><code>3C 3F 78 6D</code>: UTF-8, ISO 646, ASCII, some part of ISO 8859,
Shift-JIS, EUC, or any other 7-bit, 8-bit, or mixed-width encoding
which ensures that the characters of ASCII have their normal positions,
width,
@@ -3419,11 +4025,19 @@
detect which of these applies, but since all of these encodings
use the same bit patterns for the ASCII characters, the encoding
declaration itself may be read reliably
-</p></item><item><p><code>4C 6F A7 94</code>: EBCDIC (in some flavor; the full
+</p>
+</item>
+<item>
+<p><code>4C 6F A7 94</code>: EBCDIC (in some flavor; the full
encoding declaration must be read to tell which code page is in
-use)</p></item><item><p>other: UTF-8 without an encoding declaration, or else
+use)</p>
+</item>
+<item>
+<p>other: UTF-8 without an encoding declaration, or else
the data stream is corrupt, fragmentary, or enclosed in
-a wrapper of some kind</p></item></ulist>
+a wrapper of some kind</p>
+</item>
+</ulist>
</p>
<p>
This level of autodetection is enough to read the XML encoding
@@ -3469,97 +4083,64 @@
RFC document defining the text/xml and application/xml MIME types. In
the interests of interoperability, however, the following rules
are recommended.
-<ulist><item><p>If an XML entity is in a file, the Byte-Order Mark
+<ulist>
+<item><p>If an XML entity is in a file, the Byte-Order Mark
and encoding-declaration PI are used (if present) to determine the
character encoding. All other heuristics and sources of information
are solely for error recovery.
-</p></item><item><p>If an XML entity is delivered with a
+</p></item>
+<item><p>If an XML entity is delivered with a
MIME type of text/xml, then the <code>charset</code> parameter
on the MIME type determines the
character encoding method; all other heuristics and sources of
information are solely for error recovery.
-</p></item><item><p>If an XML entity is delivered
+</p></item>
+<item><p>If an XML entity is delivered
with a
MIME type of application/xml, then the Byte-Order Mark and
encoding-declaration PI are used (if present) to determine the
character encoding. All other heuristics and sources of
information are solely for error recovery.
-</p></item></ulist>
+</p></item>
+</ulist>
These rules apply only in the absence of protocol-level documentation;
in particular, when the MIME types text/xml and application/xml are
defined, the recommendations of the relevant RFC will supersede
these rules.
</p>
+
</inform-div1>
+
<inform-div1 id="sec-xml-wg">
<head>W3C XML Working Group</head>
+
<p>This specification was prepared and approved for publication by the
W3C XML Working Group (WG). WG approval of this specification does
not necessarily imply that all WG members voted for its approval.
The current and former members of the XML WG are:</p>
+
<orglist>
-<member>
-<name>Jon Bosak, Sun</name>
-<role>Chair</role>
-</member>
-<member>
-<name>James Clark</name>
-<role>Technical Lead</role>
-</member>
-<member>
-<name>Tim Bray, Textuality and Netscape</name>
-<role>XML Co-editor</role>
-</member>
-<member>
-<name>Jean Paoli, Microsoft</name>
-<role>XML Co-editor</role>
-</member>
-<member>
-<name>C. M. Sperberg-McQueen, U. of Ill.</name>
-<role>XML
-Co-editor</role>
-</member>
-<member>
-<name>Dan Connolly, W3C</name>
-<role>W3C Liaison</role>
-</member>
-<member>
-<name>Paula Angerstein, Texcel</name>
-</member>
-<member>
-<name>Steve DeRose, INSO</name>
-</member>
-<member>
-<name>Dave Hollander, HP</name>
-</member>
-<member>
-<name>Eliot Kimber, ISOGEN</name>
-</member>
-<member>
-<name>Eve Maler, ArborText</name>
-</member>
-<member>
-<name>Tom Magliery, NCSA</name>
-</member>
-<member>
-<name>Murray Maloney, Muzmo and Grif</name>
-</member>
-<member>
-<name>Makoto Murata, Fuji Xerox Information Systems</name>
-</member>
-<member>
-<name>Joel Nava, Adobe</name>
-</member>
-<member>
-<name>Conleth O'Connell, Vignette</name>
-</member>
-<member>
-<name>Peter Sharpe, SoftQuad</name>
-</member>
-<member>
-<name>John Tigue, DataChannel</name>
-</member>
+<member><name>Jon Bosak, Sun</name><role>Chair</role></member>
+<member><name>James Clark</name><role>Technical Lead</role></member>
+<member><name>Tim Bray, Textuality and Netscape</name><role>XML Co-editor</role></member>
+<member><name>Jean Paoli, Microsoft</name><role>XML Co-editor</role></member>
+<member><name>C. M. Sperberg-McQueen, U. of Ill.</name><role>XML
+Co-editor</role></member>
+<member><name>Dan Connolly, W3C</name><role>W3C Liaison</role></member>
+<member><name>Paula Angerstein, Texcel</name></member>
+<member><name>Steve DeRose, INSO</name></member>
+<member><name>Dave Hollander, HP</name></member>
+<member><name>Eliot Kimber, ISOGEN</name></member>
+<member><name>Eve Maler, ArborText</name></member>
+<member><name>Tom Magliery, NCSA</name></member>
+<member><name>Murray Maloney, Muzmo and Grif</name></member>
+<member><name>Makoto Murata, Fuji Xerox Information Systems</name></member>
+<member><name>Joel Nava, Adobe</name></member>
+<member><name>Conleth O'Connell, Vignette</name></member>
+<member><name>Peter Sharpe, SoftQuad</name></member>
+<member><name>John Tigue, DataChannel</name></member>
</orglist>
+
</inform-div1>
</back>
</spec>
diff --git a/result/valid/dia.xml b/result/valid/dia.xml
index c195984..f7f1853 100644
--- a/result/valid/dia.xml
+++ b/result/valid/dia.xml
@@ -39,100 +39,100 @@
<!ATTLIST font name CDATA #REQUIRED>
]>
<dia:diagram xmlns:dia="http://www.lysator.liu.se/~alla/dia/">
-<dia:diagramdata>
-<dia:attribute name="background">
-<dia:color val="#ffffff"/>
-</dia:attribute>
-</dia:diagramdata>
-<dia:layer name="Background" visible="true">
-<dia:object type="Standard - Line" version="0" id="O0">
-<dia:attribute name="obj_pos">
-<dia:point val="1.95,6.85"/>
-</dia:attribute>
-<dia:attribute name="obj_bb">
-<dia:rectangle val="1.9,6.8;11,8.55"/>
-</dia:attribute>
-<dia:attribute name="conn_endpoints">
-<dia:point val="1.95,6.85"/>
-<dia:point val="10.95,8.5"/>
-</dia:attribute>
-<dia:attribute name="line_color">
-<dia:color val="#000000"/>
-</dia:attribute>
-<dia:attribute name="line_width">
-<dia:real val="0.1"/>
-</dia:attribute>
-<dia:attribute name="line_style">
-<dia:enum val="0"/>
-</dia:attribute>
-<dia:attribute name="start_arrow">
-<dia:enum val="0"/>
-</dia:attribute>
-<dia:attribute name="end_arrow">
-<dia:enum val="0"/>
-</dia:attribute>
-<dia:connections>
-<dia:connection handle="1" to="O2" connection="3"/>
-</dia:connections>
-</dia:object>
-<dia:object type="Standard - Text" version="0" id="O1">
-<dia:attribute name="obj_pos">
-<dia:point val="4.8,4.75"/>
-</dia:attribute>
-<dia:attribute name="obj_bb">
-<dia:rectangle val="2.579,3.96359;7.021,4.96359"/>
-</dia:attribute>
-<dia:attribute name="text">
-<dia:composite type="text">
-<dia:attribute name="string">
-<dia:string val="sdfsdfg"/>
-</dia:attribute>
-<dia:attribute name="font">
-<dia:font name="Courier"/>
-</dia:attribute>
-<dia:attribute name="height">
-<dia:real val="1"/>
-</dia:attribute>
-<dia:attribute name="pos">
-<dia:point val="4.8,4.75"/>
-</dia:attribute>
-<dia:attribute name="color">
-<dia:color val="#000000"/>
-</dia:attribute>
-<dia:attribute name="alignment">
-<dia:enum val="1"/>
-</dia:attribute>
-</dia:composite>
-</dia:attribute>
-</dia:object>
-<dia:object type="Standard - Box" version="0" id="O2">
-<dia:attribute name="obj_pos">
-<dia:point val="10.95,7.5"/>
-</dia:attribute>
-<dia:attribute name="obj_bb">
-<dia:rectangle val="10.9,7.45;13.05,9.55"/>
-</dia:attribute>
-<dia:attribute name="elem_corner">
-<dia:point val="10.95,7.5"/>
-</dia:attribute>
-<dia:attribute name="elem_width">
-<dia:real val="2.05"/>
-</dia:attribute>
-<dia:attribute name="elem_height">
-<dia:real val="2"/>
-</dia:attribute>
-<dia:attribute name="border_width">
-<dia:real val="0.1"/>
-</dia:attribute>
-<dia:attribute name="border_color">
-<dia:color val="#000000"/>
-</dia:attribute>
-<dia:attribute name="inner_color">
-<dia:color val="#ffffff"/>
-</dia:attribute>
-<dia:attribute name="line_style">
-<dia:enum val="0"/>
-</dia:attribute>
-</dia:object>
-</dia:layer>
+ <dia:diagramdata>
+ <dia:attribute name="background">
+ <dia:color val="#ffffff"/>
+ </dia:attribute>
+ </dia:diagramdata>
+ <dia:layer name="Background" visible="true">
+ <dia:object type="Standard - Line" version="0" id="O0">
+ <dia:attribute name="obj_pos">
+ <dia:point val="1.95,6.85"/>
+ </dia:attribute>
+ <dia:attribute name="obj_bb">
+ <dia:rectangle val="1.9,6.8;11,8.55"/>
+ </dia:attribute>
+ <dia:attribute name="conn_endpoints">
+ <dia:point val="1.95,6.85"/>
+ <dia:point val="10.95,8.5"/>
+ </dia:attribute>
+ <dia:attribute name="line_color">
+ <dia:color val="#000000"/>
+ </dia:attribute>
+ <dia:attribute name="line_width">
+ <dia:real val="0.1"/>
+ </dia:attribute>
+ <dia:attribute name="line_style">
+ <dia:enum val="0"/>
+ </dia:attribute>
+ <dia:attribute name="start_arrow">
+ <dia:enum val="0"/>
+ </dia:attribute>
+ <dia:attribute name="end_arrow">
+ <dia:enum val="0"/>
+ </dia:attribute>
+ <dia:connections>
+ <dia:connection handle="1" to="O2" connection="3"/>
+ </dia:connections>
+ </dia:object>
+ <dia:object type="Standard - Text" version="0" id="O1">
+ <dia:attribute name="obj_pos">
+ <dia:point val="4.8,4.75"/>
+ </dia:attribute>
+ <dia:attribute name="obj_bb">
+ <dia:rectangle val="2.579,3.96359;7.021,4.96359"/>
+ </dia:attribute>
+ <dia:attribute name="text">
+ <dia:composite type="text">
+ <dia:attribute name="string">
+ <dia:string val="sdfsdfg"/>
+ </dia:attribute>
+ <dia:attribute name="font">
+ <dia:font name="Courier"/>
+ </dia:attribute>
+ <dia:attribute name="height">
+ <dia:real val="1"/>
+ </dia:attribute>
+ <dia:attribute name="pos">
+ <dia:point val="4.8,4.75"/>
+ </dia:attribute>
+ <dia:attribute name="color">
+ <dia:color val="#000000"/>
+ </dia:attribute>
+ <dia:attribute name="alignment">
+ <dia:enum val="1"/>
+ </dia:attribute>
+ </dia:composite>
+ </dia:attribute>
+ </dia:object>
+ <dia:object type="Standard - Box" version="0" id="O2">
+ <dia:attribute name="obj_pos">
+ <dia:point val="10.95,7.5"/>
+ </dia:attribute>
+ <dia:attribute name="obj_bb">
+ <dia:rectangle val="10.9,7.45;13.05,9.55"/>
+ </dia:attribute>
+ <dia:attribute name="elem_corner">
+ <dia:point val="10.95,7.5"/>
+ </dia:attribute>
+ <dia:attribute name="elem_width">
+ <dia:real val="2.05"/>
+ </dia:attribute>
+ <dia:attribute name="elem_height">
+ <dia:real val="2"/>
+ </dia:attribute>
+ <dia:attribute name="border_width">
+ <dia:real val="0.1"/>
+ </dia:attribute>
+ <dia:attribute name="border_color">
+ <dia:color val="#000000"/>
+ </dia:attribute>
+ <dia:attribute name="inner_color">
+ <dia:color val="#ffffff"/>
+ </dia:attribute>
+ <dia:attribute name="line_style">
+ <dia:enum val="0"/>
+ </dia:attribute>
+ </dia:object>
+ </dia:layer>
</dia:diagram>
diff --git a/result/valid/xhtml1.xhtml b/result/valid/xhtml1.xhtml
index 75644fa..6089e4e 100644
--- a/result/valid/xhtml1.xhtml
+++ b/result/valid/xhtml1.xhtml
@@ -65,16 +65,33 @@
<h3>W3C Proposed Recommendation 10 December 1999</h3>
-<dl><dt>This version:</dt><dd><a href="http://www.w3.org/TR/1999/PR-xhtml1-19991210">
+<dl>
+<dt>This version:</dt>
+
+<dd><a href="http://www.w3.org/TR/1999/PR-xhtml1-19991210">
http://www.w3.org/TR/1999/PR-xhtml1-19991210</a> <br/>
(<a href="xhtml1.ps">Postscript version</a>,
<a href="xhtml1.pdf">PDF version</a>,
<a href="xhtml1.zip">ZIP archive</a>, or
<a href="xhtml1.tgz">Gzip'd TAR archive</a>)
-</dd><dt>Latest version:</dt><dd><a href="http://www.w3.org/TR/xhtml1">
-http://www.w3.org/TR/xhtml1</a></dd><dt>Previous versions:</dt><dd><a href="http://www.w3.org/TR/1999/WD-xhtml1-19991124">
-http://www.w3.org/TR/1999/WD-xhtml1-19991124</a></dd><dd><a href="http://www.w3.org/TR/1999/PR-xhtml1-19990824">
-http://www.w3.org/TR/1999/PR-xhtml1-19990824</a></dd><dt>Authors:</dt><dd>See <a href="#acks">acknowledgements</a>.</dd></dl>
+</dd>
+
+<dt>Latest version:</dt>
+
+<dd><a href="http://www.w3.org/TR/xhtml1">
+http://www.w3.org/TR/xhtml1</a></dd>
+
+<dt>Previous versions:</dt>
+
+<dd><a href="http://www.w3.org/TR/1999/WD-xhtml1-19991124">
+http://www.w3.org/TR/1999/WD-xhtml1-19991124</a></dd>
+<dd><a href="http://www.w3.org/TR/1999/PR-xhtml1-19990824">
+http://www.w3.org/TR/1999/PR-xhtml1-19990824</a></dd>
+
+<dt>Authors:</dt>
+
+<dd>See <a href="#acks">acknowledgements</a>.</dd>
+</dl>
<p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">
Copyright</a> © 1999 <a href="http://www.w3.org/">W3C</a><sup>®</sup>
@@ -85,7 +102,9 @@
licensing</a> rules apply.</p>
<hr/>
</div>
+
<h2 class="notoc">Abstract</h2>
+
<p>This specification defines <abbr title="Extensible Hypertext Markup Language">XHTML</abbr> 1.0, a reformulation of HTML
4.0 as an XML 1.0 application, and three <abbr title="Document Type Definition">DTDs</abbr> corresponding to
the ones defined by HTML 4.0. The semantics of the elements and
@@ -93,12 +112,13 @@
4.0. These semantics provide the foundation for future
extensibility of XHTML. Compatibility with existing HTML user
agents is possible by following a small set of guidelines.</p>
+
<h2>Status of this document</h2>
-<p>
-<em>This section describes the status of this document at the time
+
+<p><em>This section describes the status of this document at the time
of its publication. Other documents may supersede this document. The
-latest status of this document series is maintained at the W3C.</em>
-</p>
+latest status of this document series is maintained at the W3C.</em></p>
+
<p>This specification is a Proposed Recommendation of the HTML Working Group. It is
a revision of the Proposed Recommendation dated <a href="http://www.w3.org/TR/1999/PR-xhtml1-19990824/">24 August
1999</a> incorporating changes as a result of comments from the Proposed
@@ -106,6 +126,7 @@
comments and further deliberations of the W3C HTML Working Group. A
<a href="xhtml1-diff-19991210.html">diff-marked version</a> from the previous
proposed recommendation is available for comparison purposes.</p>
+
<p>On 10 December 1999, this document enters a
<a href="http://www.w3.org/Consortium/Process/#RecsPR">
Proposed Recommendation</a> review period. From that date until 8 January
@@ -115,63 +136,115 @@
ballots to w3c-html-review@w3.org. Please send any comments of a
confidential nature in separate email to w3t-html@w3.org, which is
visible to the Team only.</p>
+
<p>No sooner than 14 days after the end of the review period, the
Director will announce the document's disposition: it may become a W3C
Recommendation (possibly with minor changes), it may revert to Working
Draft status, or it may be dropped as a W3C work item.</p>
+
<p>Publication as a Proposed Recommendation does not imply endorsement
by the W3C membership. This is still a draft document and may be
updated, replaced or obsoleted by other documents at any time. It is
inappropriate to cite W3C Proposed Recommendation as other than "work
in progress."</p>
+
<p>This document has been produced as part of the <a href="http://www.w3.org/MarkUp/">W3C HTML Activity</a>. The goals of
the <a href="http://www.w3.org/MarkUp/Group/">HTML Working
Group</a> <i>(<a href="http://cgi.w3.org/MemberAccess/">members
only</a>)</i> are discussed in the <a href="http://www.w3.org/MarkUp/Group/HTMLcharter">HTML Working Group
charter</a> <i>(<a href="http://cgi.w3.org/MemberAccess/">members
only</a>)</i>.</p>
+
<p>A list of current W3C Recommendations and other technical documents
can be found at <a href="http://www.w3.org/TR">http://www.w3.org/TR</a>.</p>
+
<p>Public discussion on <abbr title="HyperText Markup Language">HTML</abbr> features takes place on the mailing list <a href="mailto:www-html@w3.org"> www-html@w3.org</a> (<a href="http://lists.w3.org/Archives/Public/www-html/">archive</a>). The W3C
staff contact for work on HTML is <a href="mailto:dsr@w3.org">Dave
Raggett</a>.</p>
+
<p>Please report errors in this document to <a href="mailto:www-html-editor@w3.org">www-html-editor@w3.org</a>.</p>
+
<p>The list of known errors in this specification is available at <a href="http://www.w3.org/1999/12/PR-xhtml1-19991210-errata">http://www.w3.org/1999/12/PR-xhtml1-19991210-errata</a>.</p>
-<h2 class="notoc">
-<a id="toc" name="toc">Contents</a>
-</h2>
+
+<h2 class="notoc"><a id="toc" name="toc">Contents</a></h2>
+
<div class="contents">
-<ul class="toc"><li class="tocline">1. <a href="#xhtml">What is XHTML?</a>
+<ul class="toc">
+<li class="tocline">1. <a href="#xhtml">What is XHTML?</a>
-<ul class="toc"><li class="tocline">1.1 <a href="#html4">What is HTML 4.0?</a></li><li class="tocline">1.2 <a href="#xml">What is XML?</a></li><li class="tocline">1.3 <a href="#why">Why the need for XHTML?</a></li></ul>
-</li><li class="tocline">2. <a href="#defs">Definitions</a>
+<ul class="toc">
+<li class="tocline">1.1 <a href="#html4">What is HTML 4.0?</a></li>
-<ul class="toc"><li class="tocline">2.1 <a href="#terms">Terminology</a></li><li class="tocline">2.2 <a href="#general">General Terms</a></li></ul>
-</li><li class="tocline">3. <a href="#normative">Normative Definition of XHTML 1.0</a>
+<li class="tocline">1.2 <a href="#xml">What is XML?</a></li>
+
+<li class="tocline">1.3 <a href="#why">Why the need for XHTML?</a></li>
+</ul>
+</li>
+
+<li class="tocline">2. <a href="#defs">Definitions</a>
+
+<ul class="toc">
+<li class="tocline">2.1 <a href="#terms">Terminology</a></li>
+
+<li class="tocline">2.2 <a href="#general">General Terms</a></li>
+</ul>
+</li>
+
+<li class="tocline">3. <a href="#normative">Normative Definition of XHTML 1.0</a>
-<ul class="toc"><li class="tocline">3.1 <a href="#docconf">Document Conformance</a></li><li class="tocline">3.2 <a href="#uaconf">User Agent Conformance</a></li></ul>
-</li><li class="tocline">4. <a href="#diffs">Differences with HTML 4.0</a>
+<ul class="toc">
+<li class="tocline">3.1 <a href="#docconf">Document Conformance</a></li>
-</li><li class="tocline">5. <a href="#issues">Compatibility Issues</a>
+<li class="tocline">3.2 <a href="#uaconf">User Agent Conformance</a></li>
+</ul>
+</li>
-<ul class="toc"><li class="tocline">5.1 <a href="#media">Internet Media Types</a></li></ul>
-</li><li class="tocline">6. <a href="#future">Future Directions</a>
+<li class="tocline">4. <a href="#diffs">Differences with HTML 4.0</a>
-<ul class="toc"><li class="tocline">6.1 <a href="#mods">Modularizing HTML</a></li><li class="tocline">6.2 <a href="#extensions">Subsets and Extensibility</a></li><li class="tocline">6.3 <a href="#profiles">Document Profiles</a></li></ul>
-</li><li class="tocline"><a href="#dtds">Appendix A. DTDs</a></li><li class="tocline"><a href="#prohibitions">Appendix B. Element
-Prohibitions</a></li><li class="tocline"><a href="#guidelines">Appendix C. HTML Compatibility Guidelines</a></li><li class="tocline"><a href="#acks">Appendix D. Acknowledgements</a></li><li class="tocline"><a href="#refs">Appendix E. References</a></li></ul>
+</li>
+
+<li class="tocline">5. <a href="#issues">Compatibility Issues</a>
+
+<ul class="toc">
+<li class="tocline">5.1 <a href="#media">Internet Media Types</a></li>
+</ul>
+</li>
+
+<li class="tocline">6. <a href="#future">Future Directions</a>
+
+<ul class="toc">
+<li class="tocline">6.1 <a href="#mods">Modularizing HTML</a></li>
+
+<li class="tocline">6.2 <a href="#extensions">Subsets and Extensibility</a></li>
+
+<li class="tocline">6.3 <a href="#profiles">Document Profiles</a></li>
+</ul>
+</li>
+
+<li class="tocline"><a href="#dtds">Appendix A. DTDs</a></li>
+
+<li class="tocline"><a href="#prohibitions">Appendix B. Element
+Prohibitions</a></li>
+
+<li class="tocline"><a href="#guidelines">Appendix C. HTML Compatibility Guidelines</a></li>
+
+<li class="tocline"><a href="#acks">Appendix D. Acknowledgements</a></li>
+
+<li class="tocline"><a href="#refs">Appendix E. References</a></li>
+</ul>
</div>
+
<!--OddPage-->
-<h1>
-<a name="xhtml" id="xhtml">1. What is XHTML?</a>
-</h1>
+<h1><a name="xhtml" id="xhtml">1. What is XHTML?</a></h1>
+
<p>XHTML is a family of current and future document types and modules that
reproduce, subset, and extend HTML 4.0 <a href="#ref-html4">[HTML]</a>. XHTML family document types are <abbr title="Extensible Markup Language">XML</abbr> based,
and ultimately are designed to work in conjunction with XML-based user agents.
The details of this family and its evolution are
discussed in more detail in the section on <a href="#future">Future
Directions</a>. </p>
+
<p>XHTML 1.0 (this specification) is the first document type in the XHTML
family. It is a reformulation of the three HTML 4.0 document types as
applications of XML 1.0 <a href="#ref-xml"> [XML]</a>. It is intended
@@ -179,6 +252,7 @@
simple <a href="#guidelines">guidelines</a> are followed,
operates in HTML 4.0 conforming user agents. Developers who migrate
their content to XHTML 1.0 will realize the following benefits:</p>
+
<ul>
<li>XHTML documents are XML conforming. As such, they are readily viewed,
edited, and validated with standard XML tools.</li>
@@ -191,27 +265,31 @@
<li>As the XHTML family evolves, documents conforming to XHTML 1.0 will be more
likely to interoperate within and among various XHTML environments.</li>
</ul>
+
<p>The XHTML family is the next step in the evolution of the Internet. By
migrating to XHTML today, content developers can enter the XML world with all
of its attendant benefits, while still remaining confident in their
content's backward and future compatibility.</p>
-<h2>
-<a name="html4" id="html4">1.1 What is HTML 4.0?</a>
-</h2>
+
+<h2><a name="html4" id="html4">1.1 What is HTML 4.0?</a></h2>
+
<p>HTML 4.0 <a href="#ref-html4">[HTML]</a> is an <abbr title="Standard Generalized Markup Language">SGML</abbr> (Standard
Generalized Markup Language) application conforming to
International Standard <abbr title="Organization for International Standardization">ISO</abbr> 8879, and is widely regarded as the
standard publishing language of the World Wide Web.</p>
+
<p>SGML is a language for describing markup languages,
particularly those used in electronic document exchange, document
management, and document publishing. HTML is an example of a
language defined in SGML.</p>
+
<p>SGML has been around since the mid-1980s and has remained
quite stable. Much of this stability stems from the fact that the
language is both feature-rich and flexible. This flexibility,
however, comes at a price, and that price is a level of
complexity that has inhibited its adoption in a diversity of
environments, including the World Wide Web.</p>
+
<p>HTML, as originally conceived, was to be a language for the
exchange of scientific and other technical documents, suitable
for use by non-document specialists. HTML addressed the problem
@@ -220,6 +298,7 @@
In addition to simplifying the document structure, HTML added
support for hypertext. Multimedia capabilities were added
later.</p>
+
<p>In a remarkably short space of time, HTML became wildly
popular and rapidly outgrew its original purpose. Since HTML's
inception, there has been rapid invention of new elements for use
@@ -227,27 +306,31 @@
highly specialized, markets. This plethora of new elements has
led to compatibility problems for documents across different
platforms.</p>
+
<p>As the heterogeneity of both software and platforms rapidly
proliferates, it is clear that the suitability of 'classic' HTML
4.0 for use on these platforms is somewhat limited.</p>
-<h2>
-<a name="xml" id="xml">1.2 What is XML?</a>
-</h2>
+
+<h2><a name="xml" id="xml">1.2 What is XML?</a></h2>
+
<p>XML<sup>™</sup> is the shorthand for Extensible Markup
Language <a href="#ref-xml">[XML]</a>.</p>
+
<p>XML was conceived as a means of regaining the power and
flexibility of SGML without most of its complexity. Although a
restricted form of SGML, XML nonetheless preserves most of SGML's
power and richness, and yet still retains all of SGML's commonly
used features.</p>
+
<p>While retaining these beneficial features, XML removes many of
the more complex features of SGML that make the authoring and
design of suitable software both difficult and costly.</p>
-<h2>
-<a name="why" id="why">1.3 Why the need for XHTML?</a>
-</h2>
+
+<h2><a name="why" id="why">1.3 Why the need for XHTML?</a></h2>
+
<p>The benefits of migrating to XHTML 1.0 are described above. Some of the
benefits of migrating to XHTML in general are:</p>
+
<ul>
<li>Document developers and user agent designers are constantly
discovering new ways to express their ideas through new markup. In XML, it is
@@ -258,6 +341,7 @@
These modules will permit the combination of existing and
new feature sets when developing content and when designing new user
agents.</li>
+
<li>Alternate ways of accessing the Internet are constantly being
introduced. Some estimates indicate that by the year 2002, 75% of
Internet document viewing will be carried out on these alternate
@@ -267,49 +351,62 @@
best effort content transformation. Ultimately, it will be possible to
develop XHTML-conforming content that is usable by any XHTML-conforming
user agent.</li>
+
</ul>
<!--OddPage-->
-<h1>
-<a name="defs" id="defs">2. Definitions</a>
-</h1>
-<h2>
-<a name="terms" id="terms">2.1 Terminology</a>
-</h2>
+<h1><a name="defs" id="defs">2. Definitions</a></h1>
+
+<h2><a name="terms" id="terms">2.1 Terminology</a></h2>
+
<p>The following terms are used in this specification. These
terms extend the definitions in <a href="#ref-rfc2119">
[RFC2119]</a> in ways based upon similar definitions in ISO/<abbr title="International Electrotechnical Commission">IEC</abbr>
9945-1:1990 <a href="#ref-posix">[POSIX.1]</a>:</p>
+
<dl>
<dt>Implementation-defined</dt>
+
<dd>A value or behavior is implementation-defined when it is left
to the implementation to define [and document] the corresponding
requirements for correct document construction.</dd>
+
<dt>May</dt>
+
<dd>With respect to implementations, the word "may" is to be
interpreted as an optional feature that is not required in this
specification but can be provided. With respect to <a href="#docconf">Document Conformance</a>, the word "may" means that
the optional feature must not be used. The term "optional" has
the same definition as "may".</dd>
+
<dt>Must</dt>
+
<dd>In this specification, the word "must" is to be interpreted
as a mandatory requirement on the implementation or on Strictly
Conforming XHTML Documents, depending upon the context. The term
"shall" has the same definition as "must".</dd>
+
<dt>Reserved</dt>
+
<dd>A value or behavior is unspecified, but it is not allowed to
be used by Conforming Documents nor to be supported by a
Conforming User Agent.</dd>
+
<dt>Should</dt>
+
<dd>With respect to implementations, the word "should" is to be
interpreted as an implementation recommendation, but not a
requirement. With respect to documents, the word "should" is to
be interpreted as recommended programming practice for documents
and a requirement for Strictly Conforming XHTML Documents.</dd>
+
<dt>Supported</dt>
+
<dd>Certain facilities in this specification are optional. If a
facility is supported, it behaves as specified by this
specification.</dd>
+
<dt>Unspecified</dt>
+
<dd>When a value or behavior is unspecified, the specification
defines no portability requirements for a facility on an
implementation even when faced with a document that uses the
@@ -317,68 +414,85 @@
instance, rather than tolerating any behavior when using that
facility, is not a Strictly Conforming XHTML Document.</dd>
</dl>
-<h2>
-<a name="general" id="general">2.2 General Terms</a>
-</h2>
+
+<h2><a name="general" id="general">2.2 General Terms</a></h2>
+
<dl>
<dt>Attribute</dt>
+
<dd>An attribute is a parameter to an element declared in the
DTD. An attribute's type and value range, including a possible
default value, are defined in the DTD.</dd>
+
<dt>DTD</dt>
+
<dd>A DTD, or document type definition, is a collection of XML
declarations that, as a collection, defines the legal structure,
<span class="term">elements</span>, and <span class="term">
attributes</span> that are available for use in a document that
complies with the DTD.</dd>
+
<dt>Document</dt>
+
<dd>A document is a stream of data that, after being combined
with any other streams it references, is structured such that it
holds information contained within <span class="term">
elements</span> that are organized as defined in the associated
<span class="term">DTD</span>. See <a href="#docconf">Document
Conformance</a> for more information.</dd>
+
<dt>Element</dt>
+
<dd>An element is a document structuring unit declared in the
<span class="term">DTD</span>. The element's content model is
defined in the <span class="term">DTD</span>, and additional
semantics may be defined in the prose description of the
element.</dd>
-<dt>
-<a name="facilities" id="facilities">Facilities</a>
-</dt>
+
+<dt><a name="facilities" id="facilities">Facilities</a></dt>
+
<dd>Functionality includes <span class="term">elements</span>,
<span class="term">attributes</span>, and the semantics
associated with those <span class="term">elements</span> and
<span class="term">attributes</span>. An implementation
supporting that functionality is said to provide the necessary
facilities.</dd>
+
<dt>Implementation</dt>
+
<dd>An implementation is a system that provides a collection of
<span class="term">facilities</span> and services that supports
this specification. See <a href="#uaconf">User Agent
Conformance</a> for more information.</dd>
+
<dt>Parsing</dt>
+
<dd>Parsing is the act whereby a <span class="term">
document</span> is scanned, and the information contained within
the <span class="term">document</span> is filtered into the
context of the <span class="term">elements</span> in which the
information is structured.</dd>
+
<dt>Rendering</dt>
+
<dd>Rendering is the act whereby the information in a <span class="term">document</span> is presented. This presentation is
done in the form most appropriate to the environment (e.g.
aurally, visually, in print).</dd>
+
<dt>User Agent</dt>
+
<dd>A user agent is an <span class="term">implementation</span>
that retrieves and processes XHTML documents. See <a href="#uaconf">User Agent Conformance</a> for more information.</dd>
+
<dt>Validation</dt>
+
<dd>Validation is a process whereby <span class="term">
documents</span> are verified against the associated <span class="term">DTD</span>, ensuring that the structure, use of <span class="term">elements</span>, and use of <span class="term">
attributes</span> are consistent with the definitions in the
<span class="term">DTD</span>.</dd>
-<dt>
-<a name="wellformed" id="wellformed">Well-formed</a>
-</dt>
+
+<dt><a name="wellformed" id="wellformed">Well-formed</a></dt>
+
<dd>A <span class="term">document</span> is well-formed when it
is structured according to the rules defined in <a href="http://www.w3.org/TR/REC-xml#sec-well-formed">Section 2.1</a> of
the XML 1.0 Recommendation <a href="#ref-xml">[XML]</a>.
@@ -386,42 +500,45 @@
their start and end tags, are nested properly within one
another.</dd>
</dl>
+
<!--OddPage-->
-<h1>
-<a name="normative" id="normative">3. Normative Definition of
-XHTML 1.0</a>
-</h1>
-<h2>
-<a name="docconf" id="docconf">3.1 Document
-Conformance</a>
-</h2>
+<h1><a name="normative" id="normative">3. Normative Definition of
+XHTML 1.0</a></h1>
+
+<h2><a name="docconf" id="docconf">3.1 Document
+Conformance</a></h2>
+
<p>This version of XHTML provides a definition of strictly
conforming XHTML documents, which are restricted to tags and
attributes from the XHTML namespace. See <a href="#well-formed">Section 3.1.2</a> for information on using XHTML
with other namespaces, for instance, to include metadata
expressed in <abbr title="Resource Description Framework">RDF</abbr> within XHTML documents.</p>
-<h3>
-<a name="strict" id="strict">3.1.1 Strictly Conforming
-Documents</a>
-</h3>
+
+<h3><a name="strict" id="strict">3.1.1 Strictly Conforming
+Documents</a></h3>
+
<p>A Strictly Conforming XHTML Document is a document that
requires only the facilities described as mandatory in this
specification. Such a document must meet all of the following
criteria:</p>
+
<ol>
<li>
<p>It must validate against one of the three DTDs found in <a href="#dtds">Appendix A</a>.</p>
</li>
+
<li>
<p>The root element of the document must be <code>
<html></code>.</p>
</li>
+
<li>
<p>The root element of the document must designate the XHTML
namespace using the <code>xmlns</code> attribute <a href="#ref-xmlns">[XMLNAMES]</a>. The namespace for XHTML is
defined to be
<code>http://www.w3.org/1999/xhtml</code>.</p>
</li>
+
<li>
<p>There must be a DOCTYPE declaration in the document prior to
the root element. The public identifier included in
@@ -445,7 +562,9 @@
</pre>
</li>
</ol>
+
<p>Here is an example of a minimal XHTML document.</p>
+
<div class="good">
<pre>
<?xml version="1.0" encoding="UTF-8"?>
@@ -461,22 +580,25 @@
</body>
</html></pre>
</div>
+
<p>Note that in this example, the XML declaration is included. An XML
declaration like the one above is
not required in all XML documents. XHTML document authors are strongly encouraged to use XML declarations in all their documents. Such a declaration is required
when the character encoding of the document is other than the default UTF-8 or
UTF-16.</p>
-<h3>
-<a name="well-formed" id="well-formed">3.1.2 Using XHTML with
-other namespaces</a>
-</h3>
+
+<h3><a name="well-formed" id="well-formed">3.1.2 Using XHTML with
+other namespaces</a></h3>
+
<p>The XHTML namespace may be used with other XML namespaces
as per <a href="#ref-xmlns">[XMLNAMES]</a>, although such
documents are not strictly conforming XHTML 1.0 documents as
defined above. Future work by W3C will address ways to specify
conformance for documents involving multiple namespaces.</p>
+
<p>The following example shows the way in which XHTML 1.0 could
be used in conjunction with the MathML Recommendation:</p>
+
<div class="good">
<pre>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
@@ -497,8 +619,10 @@
</html>
</pre>
</div>
+
<p>The following example shows the way in which XHTML 1.0 markup
could be incorporated into another XML namespace:</p>
+
<div class="good">
<pre>
<?xml version="1.0" encoding="UTF-8"?>
@@ -516,33 +640,40 @@
</book>
</pre>
</div>
-<h2>
-<a name="uaconf" id="uaconf">3.2 User Agent
-Conformance</a>
-</h2>
+
+<h2><a name="uaconf" id="uaconf">3.2 User Agent
+Conformance</a></h2>
+
<p>A conforming user agent must meet all of the following
criteria:</p>
+
<ol>
<li>In order to be consistent with the XML 1.0 Recommendation <a href="#ref-xml">[XML]</a>, the user agent must parse and evaluate
an XHTML document for well-formedness. If the user agent claims
to be a validating user agent, it must also validate documents
against their referenced DTDs according to <a href="#ref-xml">
[XML]</a>.</li>
+
<li>When the user agent claims to support <a href="#facilities">
facilities</a> defined within this specification or required by
this specification through normative reference, it must do so in
ways consistent with the facilities' definition.</li>
+
<li>When a user agent processes an XHTML document as generic XML,
it shall only recognize attributes of type
<code>ID</code> (e.g. the <code>id</code> attribute on most XHTML elements)
as fragment identifiers.</li>
+
<li>If a user agent encounters an element it does not recognize,
it must render the element's content.</li>
+
<li>If a user agent encounters an attribute it does not
recognize, it must ignore the entire attribute specification
(i.e., the attribute and its value).</li>
+
<li>If a user agent encounters an attribute value it doesn't
recognize, it must use the default attribute value.</li>
+
<li>If it encounters an entity reference (other than one
of the predefined entities) for which the User Agent has
processed no declaration (which could happen if the declaration
@@ -550,12 +681,19 @@
reference should be rendered as the characters (starting
with the ampersand and ending with the semi-colon) that
make up the entity reference.</li>
+
<li>When rendering content, User Agents that encounter
characters or character entity references that are recognized but not renderable should display the document in such a way that it is obvious to the user that normal rendering has not taken place.</li>
+
<li>
The following characters are defined in [XML] as whitespace characters:
-<ul><li>Space (&#x0020;)</li><li>Tab (&#x0009;)</li><li>Carriage return (&#x000D;)</li><li>Line feed (&#x000A;)</li></ul>
+<ul>
+<li>Space (&#x0020;)</li>
+<li>Tab (&#x0009;)</li>
+<li>Carriage return (&#x000D;)</li>
+<li>Line feed (&#x000A;)</li>
+</ul>
<p>
The XML processor normalizes different systems' line-end codes into one
@@ -563,7 +701,10 @@
user agent must, in addition, treat the following characters as whitespace:
</p>
-<ul><li>Form feed (&#x000C;)</li><li>Zero-width space (&#x200B;)</li></ul>
+<ul>
+<li>Form feed (&#x000C;)</li>
+<li>Zero-width space (&#x200B;)</li>
+</ul>
<p>
In elements where the 'xml:space' attribute is set to 'preserve', the user
@@ -573,20 +714,26 @@
is handled according to the following rules:
</p>
-<ul><li>
+<ul>
+<li>
All whitespace surrounding block elements should be removed.
-</li><li>
+</li>
+<li>
Comments are removed entirely and do not affect whitespace handling. One
whitespace character on either side of a comment is treated as two
whitespace characters.
-</li><li>
+</li>
+<li>
Leading and trailing whitespace inside a block element must be removed.
-</li><li>Line feed characters within a block element must be converted into a
+</li>
+<li>Line feed characters within a block element must be converted into a
space (except when the 'xml:space' attribute is set to 'preserve').
-</li><li>
+</li>
+<li>
A sequence of whitespace characters must be reduced to a single space
character (except when the 'xml:space' attribute is set to 'preserve').
-</li><li>
+</li>
+<li>
With regard to rendition,
the User Agent should render the content in a
manner appropriate to the language in which the content is written.
@@ -602,134 +749,146 @@
e.g. 'kitAbuhum' = 'kitAbu-hum' = 'book them' == their book); and languages
in the Chinese script tradition typically neither encode such delimiters nor
use typographic whitespace in this way.
-</li></ul>
+</li>
+</ul>
<p>Whitespace in attribute values is processed according to <a href="#ref-xml">[XML]</a>.</p>
</li>
</ol>
+
<!--OddPage-->
-<h1>
-<a name="diffs" id="diffs">4. Differences with HTML
-4.0</a>
-</h1>
+<h1><a name="diffs" id="diffs">4. Differences with HTML
+4.0</a></h1>
+
<p>Because XHTML is an XML application, certain
practices that were perfectly legal in SGML-based HTML 4.0 <a href="#ref-html4">[HTML]</a> must be changed.</p>
-<h2>
-<a name="h-4.1" id="h-4.1">4.1 Documents must be
-well-formed</a>
-</h2>
+
+<h2><a name="h-4.1" id="h-4.1">4.1 Documents must be
+well-formed</a></h2>
+
<p><a href="#wellformed">Well-formedness</a> is a new concept
introduced by <a href="#ref-xml">[XML]</a>. Essentially this
means that all elements must either have closing tags or be
written in a special form (as described below), and that all the
elements must nest.</p>
+
<p>Although overlapping is illegal in SGML, it was widely
tolerated in existing browsers.</p>
+
<div class="good">
<p><strong><em>CORRECT: nested elements.</em></strong></p>
<p><p>here is an emphasized
<em>paragraph</em>.</p></p>
</div>
+
<div class="bad">
<p><strong><em>INCORRECT: overlapping elements</em></strong></p>
<p><p>here is an emphasized
<em>paragraph.</p></em></p>
</div>
-<h2>
-<a name="h-4.2" id="h-4.2">4.2 Element and attribute
-names must be in lower case</a>
-</h2>
+
+<h2><a name="h-4.2" id="h-4.2">4.2 Element and attribute
+names must be in lower case</a></h2>
+
<p>XHTML documents must use lower case for all HTML element and
attribute names. This difference is necessary because XML is
case-sensitive, e.g. <li> and <LI> are different
tags.</p>
-<h2>
-<a name="h-4.3" id="h-4.3">4.3 For non-empty elements,
-end tags are required</a>
-</h2>
+
+<h2><a name="h-4.3" id="h-4.3">4.3 For non-empty elements,
+end tags are required</a></h2>
+
<p>In SGML-based HTML 4.0 certain elements were permitted to omit
the end tag, with the elements that followed implying closure.
This omission is not permitted in XML-based XHTML. All elements
other than those declared in the DTD as <code>EMPTY</code> must
have an end tag.</p>
+
<div class="good">
<p><strong><em>CORRECT: terminated elements</em></strong></p>
<p><p>here is a paragraph.</p><p>here is
another paragraph.</p></p>
</div>
+
<div class="bad">
<p><strong><em>INCORRECT: unterminated elements</em></strong></p>
<p><p>here is a paragraph.<p>here is another
paragraph.</p>
</div>
-<h2>
-<a name="h-4.4" id="h-4.4">4.4 Attribute values must
-always be quoted</a>
-</h2>
+
+<h2><a name="h-4.4" id="h-4.4">4.4 Attribute values must
+always be quoted</a></h2>
+
<p>All attribute values must be quoted, even those which appear
to be numeric.</p>
+
<div class="good">
<p><strong><em>CORRECT: quoted attribute values</em></strong></p>
<p><table rows="3"></p>
</div>
+
<div class="bad">
<p><strong><em>INCORRECT: unquoted attribute values</em></strong></p>
<p><table rows=3></p>
</div>
-<h2>
-<a name="h-4.5" id="h-4.5">4.5 Attribute
-Minimization</a>
-</h2>
+
+<h2><a name="h-4.5" id="h-4.5">4.5 Attribute
+Minimization</a></h2>
+
<p>XML does not support attribute minimization. Attribute-value
pairs must be written in full. Attribute names such as <code>
compact</code> and <code>checked</code> cannot occur in elements
without their value being specified.</p>
+
<div class="good">
<p><strong><em>CORRECT: unminimized attributes</em></strong></p>
<p><dl compact="compact"></p>
</div>
+
<div class="bad">
<p><strong><em>INCORRECT: minimized attributes</em></strong></p>
<p><dl compact></p>
</div>
-<h2>
-<a name="h-4.6" id="h-4.6">4.6 Empty Elements</a>
-</h2>
+
+<h2><a name="h-4.6" id="h-4.6">4.6 Empty Elements</a></h2>
+
<p>Empty elements must either have an end tag or the start tag must end with <code>/></code>. For instance,
<code><br/></code> or <code><hr></hr></code>. See <a href="#guidelines">HTML Compatibility Guidelines</a> for information on ways to
ensure this is backward compatible with HTML 4.0 user agents.</p>
+
<div class="good">
<p><strong><em>CORRECT: terminated empty tags</em></strong></p>
<p><br/><hr/></p>
</div>
+
<div class="bad">
<p><strong><em>INCORRECT: unterminated empty tags</em></strong></p>
<p><br><hr></p>
</div>
-<h2>
-<a name="h-4.7" id="h-4.7">4.7 Whitespace handling in
-attribute values</a>
-</h2>
+
+<h2><a name="h-4.7" id="h-4.7">4.7 Whitespace handling in
+attribute values</a></h2>
+
<p>In attribute values, user agents will strip leading and
trailing whitespace and map sequences
of one or more whitespace characters (including line breaks) to
a single inter-word space (an ASCII space character for western
scripts). See <a href="http://www.w3.org/TR/REC-xml#AVNormalize">
Section 3.3.3</a> of <a href="#ref-xml">[XML]</a>.</p>
-<h2>
-<a name="h-4.8" id="h-4.8">4.8 Script and Style
-elements</a>
-</h2>
+
+<h2><a name="h-4.8" id="h-4.8">4.8 Script and Style
+elements</a></h2>
+
<p>In XHTML, the script and style elements are declared as having
<code>#PCDATA</code> content. As a result, <code><</code> and
<code>&</code> will be treated as the start of markup, and
@@ -739,6 +898,7 @@
the content of the script or style element within a <code>
CDATA</code> marked section avoids the expansion of these
entities.</p>
+
<div class="good">
<pre>
<script>
@@ -748,18 +908,21 @@
</script>
</pre>
</div>
+
<p><code>CDATA</code> sections are recognized by the XML
processor and appear as nodes in the Document Object Model; see
<a href="http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html#ID-E067D597">
Section 1.3</a> of the DOM Level 1 Recommendation <a href="#ref-dom">[DOM]</a>.</p>
+
<p>An alternative is to use external script and style
documents.</p>
-<h2>
-<a name="h-4.9" id="h-4.9">4.9 SGML exclusions</a>
-</h2>
+
+<h2><a name="h-4.9" id="h-4.9">4.9 SGML exclusions</a></h2>
+
<p>SGML gives the writer of a DTD the ability to exclude specific
elements from being contained within an element. Such
prohibitions (called "exclusions") are not possible in XML.</p>
+
<p>For example, the HTML 4.0 Strict DTD forbids the nesting of an
'<code>a</code>' element within another '<code>a</code>' element
to any descendant depth. It is not possible to spell out such
@@ -768,10 +931,10 @@
summary of such elements and the elements that should not be
nested in them is found in the normative <a href="#prohibitions">
Appendix B</a>.</p>
-<h2>
-<a name="h-4.10" id="h-4.10">4.10 The elements with 'id' and 'name'
-attributes</a>
-</h2>
+
+<h2><a name="h-4.10" id="h-4.10">4.10 The elements with 'id' and 'name'
+attributes</a></h2>
+
<p>HTML 4.0 defined the <code>name</code> attribute for the elements
<code>a</code>,
<code>applet</code>, <code>frame</code>,
@@ -794,190 +957,197 @@
<p>Note that in XHTML 1.0, the <code>name</code> attribute of these
elements is formally deprecated, and will be removed in a
subsequent version of XHTML.</p>
+
<!--OddPage-->
-<h1>
-<a name="issues" id="issues">5. Compatibility Issues</a>
-</h1>
+<h1><a name="issues" id="issues">5. Compatibility Issues</a></h1>
+
<p>Although there is no requirement for XHTML 1.0 documents to be
compatible with existing user agents, in practice this is easy to
accomplish. Guidelines for creating compatible documents can be
found in <a href="#guidelines">Appendix C</a>.</p>
-<h2>
-<a name="media" id="media">5.1 Internet Media Type</a>
-</h2>
+
+<h2><a name="media" id="media">5.1 Internet Media Type</a></h2>
<p>As of the publication of this recommendation, the general
recommended MIME labeling for XML-based applications
has yet to be resolved.</p>
+
<p>However, XHTML Documents which follow the guidelines set forth
in <a href="#guidelines">Appendix C</a>, "HTML Compatibility Guidelines" may be
labeled with the Internet Media Type "text/html", as they
are compatible with most HTML browsers. This document
makes no recommendation about MIME labeling of other
XHTML documents.</p>
+
<!--OddPage-->
-<h1>
-<a name="future" id="future">6. Future Directions</a>
-</h1>
+<h1><a name="future" id="future">6. Future Directions</a></h1>
+
<p>XHTML 1.0 provides the basis for a family of document types
that will extend and subset XHTML, in order to support a wide
range of new devices and applications, by defining modules and
specifying a mechanism for combining these modules. This
mechanism will enable the extension and sub-setting of XHTML 1.0
in a uniform way through the definition of new modules.</p>
-<h2>
-<a name="mods" id="mods">6.1 Modularizing HTML</a>
-</h2>
+
+<h2><a name="mods" id="mods">6.1 Modularizing HTML</a></h2>
+
<p>As the use of XHTML moves from the traditional desktop user
agents to other platforms, it is clear that not all of the XHTML
elements will be required on all platforms. For example, a
handheld device or a cell phone may support only a subset of XHTML
elements.</p>
+
<p>The process of modularization breaks XHTML up into a series of
smaller element sets. These elements can then be recombined to
meet the needs of different communities.</p>
+
<p>These modules will be defined in a later W3C document.</p>
-<h2>
-<a name="extensions" id="extensions">6.2 Subsets and
-Extensibility</a>
-</h2>
+
+<h2><a name="extensions" id="extensions">6.2 Subsets and
+Extensibility</a></h2>
+
<p>Modularization brings with it several advantages:</p>
+
<ul>
<li>
<p>It provides a formal mechanism for sub-setting XHTML.</p>
</li>
+
<li>
<p>It provides a formal mechanism for extending XHTML.</p>
</li>
+
<li>
<p>It simplifies the transformation between document types.</p>
</li>
+
<li>
<p>It promotes the reuse of modules in new document types.</p>
</li>
</ul>
-<h2>
-<a name="profiles" id="profiles">6.3 Document
-Profiles</a>
-</h2>
+
+<h2><a name="profiles" id="profiles">6.3 Document
+Profiles</a></h2>
+
<p>A document profile specifies the syntax and semantics of a set
of documents. Conformance to a document profile provides a basis
for interoperability guarantees. The document profile specifies
the facilities required to process documents of that type, e.g.
which image formats can be used, levels of scripting, style sheet
support, and so on.</p>
+
<p>For product designers this enables various groups to define
their own standard profile.</p>
+
<p>For authors this will obviate the need to write several
different versions of documents for different clients.</p>
+
<p>For special groups such as chemists, medical doctors, or
mathematicians this allows a special profile to be built using
standard HTML elements plus a group of elements geared to the
specialist's needs.</p>
+
<!--OddPage-->
<h1><a name="appendices" id="appendices"/>
<a name="dtds" id="dtds">Appendix A. DTDs</a></h1>
-<p>
-<b>This appendix is normative.</b>
-</p>
+
+<p><b>This appendix is normative.</b></p>
+
<p>These DTDs and entity sets form a normative part of this
specification. The complete set of DTD files together with an XML
declaration and SGML Open Catalog is included in the <a href="xhtml1.zip">zip file</a> for this specification.</p>
-<h2>
-<a name="h-A1" id="h-A1">A.1 Document Type
-Definitions</a>
-</h2>
+
+<h2><a name="h-A1" id="h-A1">A.1 Document Type
+Definitions</a></h2>
+
<p>These DTDs approximate the HTML 4.0 DTDs. It is likely that
when the DTDs are modularized, a method of DTD construction will
be employed that corresponds more closely to HTML 4.0.</p>
+
<ul>
<li>
<p><a href="DTD/xhtml1-strict.dtd" type="text/plain">
XHTML-1.0-Strict</a></p>
</li>
+
<li>
<p><a href="DTD/xhtml1-transitional.dtd" type="text/plain">
XHTML-1.0-Transitional</a></p>
</li>
+
<li>
<p><a href="DTD/xhtml1-frameset.dtd" type="text/plain">
XHTML-1.0-Frameset</a></p>
</li>
</ul>
-<h2>
-<a name="h-A2" id="h-A2">A.2 Entity Sets</a>
-</h2>
+
+<h2><a name="h-A2" id="h-A2">A.2 Entity Sets</a></h2>
+
<p>The XHTML entity sets are the same as for HTML 4.0, but have
been modified to be valid XML 1.0 entity declarations. Note the
entity for the Euro currency sign (<code>&euro;</code> or
<code>&#8364;</code> or <code>&#x20AC;</code>) is defined
as part of the special characters.</p>
+
<ul>
<li>
<p><a href="DTD/xhtml-lat1.ent">Latin-1 characters</a></p>
</li>
+
<li>
<p><a href="DTD/xhtml-special.ent">Special characters</a></p>
</li>
+
<li>
<p><a href="DTD/xhtml-symbol.ent">Symbols</a></p>
</li>
</ul>
+
<!--OddPage-->
-<h1>
-<a name="prohibitions" id="prohibitions">Appendix B. Element
-Prohibitions</a>
-</h1>
-<p>
-<b>This appendix is normative.</b>
-</p>
+<h1><a name="prohibitions" id="prohibitions">Appendix B. Element
+Prohibitions</a></h1>
+
+<p><b>This appendix is normative.</b></p>
+
<p>The following elements have prohibitions on which elements
they can contain (see <a href="#h-4.9">Section 4.9</a>). This
prohibition applies to all depths of nesting, i.e. it covers
all the descendant elements.</p>
-<dl>
-<dt>
-<code class="tag">a</code>
-</dt>
+
+<dl><dt><code class="tag">a</code></dt>
<dd>
cannot contain other <code>a</code> elements.</dd>
-<dt>
-<code class="tag">pre</code>
-</dt>
+<dt><code class="tag">pre</code></dt>
<dd>cannot contain the <code>img</code>, <code>object</code>,
<code>big</code>, <code>small</code>, <code>sub</code>, or <code>
sup</code> elements.</dd>
-<dt>
-<code class="tag">button</code>
-</dt>
+
+<dt><code class="tag">button</code></dt>
<dd>cannot contain the <code>input</code>, <code>select</code>,
<code>textarea</code>, <code>label</code>, <code>button</code>,
<code>form</code>, <code>fieldset</code>, <code>iframe</code> or
<code>isindex</code> elements.</dd>
-<dt>
-<code class="tag">label</code>
-</dt>
+<dt><code class="tag">label</code></dt>
<dd>cannot contain other <code class="tag">label</code> elements.</dd>
-<dt>
-<code class="tag">form</code>
-</dt>
+<dt><code class="tag">form</code></dt>
<dd>cannot contain other <code>form</code> elements.</dd>
</dl>
+
<!--OddPage-->
-<h1>
-<a name="guidelines" id="guidelines">Appendix C.
-HTML Compatibility Guidelines</a>
-</h1>
-<p>
-<b>This appendix is informative.</b>
-</p>
+<h1><a name="guidelines" id="guidelines">Appendix C.
+HTML Compatibility Guidelines</a></h1>
+
+<p><b>This appendix is informative.</b></p>
+
<p>This appendix summarizes design guidelines for authors who
wish their XHTML documents to render on existing HTML user
agents.</p>
+
<h2>C.1 Processing Instructions</h2>
<p>Be aware that processing instructions are rendered on some
user agents. However, also note that when the XML declaration is not included
in a document, the document can only use the default character encodings UTF-8
or UTF-16.</p>
+
<h2>C.2 Empty Elements</h2>
<p>Include a space before the trailing <code>/</code> and <code>
></code> of empty elements, e.g. <code class="greenmono">
@@ -986,12 +1156,14 @@
src="karen.jpg" alt="Karen" /></code>. Also, use the
minimized tag syntax for empty elements, e.g. <code class="greenmono"><br /></code>, as the alternative syntax <code class="greenmono"><br></br></code> allowed by XML
gives uncertain results in many existing user agents.</p>
+
<h2>C.3 Element Minimization and Empty Element Content</h2>
<p>Given an empty instance of an element whose content model is
not <code>EMPTY</code> (for example, an empty title or paragraph)
do not use the minimized form (e.g. use <code class="greenmono">
<p> </p></code> and not <code class="greenmono">
<p /></code>).</p>
+
<h2>C.4 Embedded Style Sheets and Scripts</h2>
<p>Use external style sheets if your style sheet uses <code>
<</code> or <code>&</code> or <code>]]></code> or <code>--</code>. Use
@@ -1001,18 +1173,22 @@
practice of "hiding" scripts and style sheets within comments to make the
documents backward compatible is unlikely to work as expected in XML-based
implementations.</p>
+
<h2>C.5 Line Breaks within Attribute Values</h2>
<p>Avoid line breaks and multiple whitespace characters within
attribute values. These are handled inconsistently by user
agents.</p>
+
<h2>C.6 Isindex</h2>
<p>Don't include more than one <code>isindex</code> element in
the document <code>head</code>. The <code>isindex</code> element
is deprecated in favor of the <code>input</code> element.</p>
+
<h2>C.7 The <code>lang</code> and <code>xml:lang</code> Attributes</h2>
<p>Use both the <code>lang</code> and <code>xml:lang</code>
attributes when specifying the language of an element. The value
of the <code>xml:lang</code> attribute takes precedence.</p>
+
<h2>C.8 Fragment Identifiers</h2>
<p>In XML, <abbr title="Uniform Resource Identifiers">URIs</abbr> [<a href="#ref-rfc2396">RFC2396</a>] that end with fragment identifiers of the form
<code>"#foo"</code> do not refer to elements with an attribute
@@ -1022,6 +1198,7 @@
support the use of <code>ID</code>-type attributes in this way,
so identical values may be supplied for both of these attributes to ensure
maximum forward and backward compatibility (e.g., <code class="greenmono"><a id="foo" name="foo">...</a></code>).</p>
+
<p>Further, since the set of
legal values for attributes of type <code>ID</code> is much smaller than
for those of type <code>CDATA</code>, the type of the <code>name</code>
@@ -1039,6 +1216,7 @@
<code>name</code> attribute of the <code>a</code>, <code>applet</code>, <code>frame</code>, <code>iframe</code>, <code>img</code>, and <code>map</code>
elements, and it will be
removed from XHTML in subsequent versions.</p>
+
<h2>C.9 Character Encoding</h2>
<p>To specify a character encoding in the document, use both the
encoding attribute specification on the xml declaration (e.g.
@@ -1048,6 +1226,7 @@
content='text/html; charset="EUC-JP"' /></code>). The
value of the encoding attribute of the xml processing instruction
takes precedence.</p>
+
<h2>C.10 Boolean Attributes</h2>
<p>Some HTML user agents are unable to interpret boolean
attributes when these appear in their full (non-minimized) form,
@@ -1058,6 +1237,7 @@
checked</code>, <code>disabled</code>, <code>readonly</code>,
<code>multiple</code>, <code>selected</code>, <code>
noresize</code>, <code>defer</code>.</p>
+
<h2>C.11 Document Object Model and XHTML</h2>
<p>
The Document Object Model level 1 Recommendation [<a href="#ref-dom">DOM</a>]
@@ -1089,6 +1269,7 @@
Applications need to adapt to this
accordingly.</li>
</ol>
+
<h2>C.12 Using Ampersands in Attribute Values</h2>
<p>
When an attribute value contains an ampersand, it must be expressed as a character
@@ -1101,44 +1282,51 @@
rather than as
<code>http://my.site.dom/cgi-bin/myscript.pl?class=guest&name=user</code>.
</p>
+
<h2>C.13 Cascading Style Sheets (CSS) and XHTML</h2>
+
<p>The Cascading Style Sheets level 2 Recommendation [<a href="#ref-css2">CSS2</a>] defines style
properties which are applied to the parse tree of the HTML or XML
document. Differences in parsing will produce different visual or
aural results, depending on the selectors used. The following hints
will reduce this effect for documents which are served without
modification as both media types:</p>
+
<ol>
<li>
CSS style sheets for XHTML should use lower case element and
attribute names.</li>
+
+
<li>In tables, the tbody element will be inferred by the parser of an
HTML user agent, but not by the parser of an XML user agent. Therefore
you should always explicitely add a tbody element if it is referred to
in a CSS selector.</li>
+
<li>Within the XHTML name space, user agents are expected to
recognize the "id" attribute as an attribute of type ID.
Therefore, style sheets should be able to continue using the
shorthand "#" selector syntax even if the user agent does not read
the DTD.</li>
+
<li>Within the XHTML name space, user agents are expected to
recognize the "class" attribute. Therefore, style sheets should be
able to continue using the shorthand "." selector syntax.</li>
+
<li>
CSS defines different conformance rules for HTML and XML documents;
be aware that the HTML rules apply to XHTML documents delivered as
HTML and the XML rules apply to XHTML documents delivered as XML.</li>
</ol>
<!--OddPage-->
-<h1>
-<a name="acks" id="acks">Appendix D.
-Acknowledgements</a>
-</h1>
-<p>
-<b>This appendix is informative.</b>
-</p>
+<h1><a name="acks" id="acks">Appendix D.
+Acknowledgements</a></h1>
+
+<p><b>This appendix is informative.</b></p>
+
<p>This specification was written with the participation of the
members of the W3C HTML working group:</p>
+
<dl>
<dd>Steven Pemberton, CWI (HTML Working Group Chair)<br/>
Murray Altheim, Sun Microsystems<br/>
@@ -1165,108 +1353,87 @@
Ted Wugofski, Gateway 2000<br/>
Dan Zigmond, WebTV Networks</dd>
</dl>
+
<!--OddPage-->
-<h1>
-<a name="refs" id="refs">Appendix E. References</a>
-</h1>
-<p>
-<b>This appendix is informative.</b>
-</p>
+<h1><a name="refs" id="refs">Appendix E. References</a></h1>
+
+<p><b>This appendix is informative.</b></p>
+
<dl>
-<dt>
-<a name="ref-css2" id="ref-css2">
-<b>[CSS2]</b>
-</a>
-</dt>
+
+<dt><a name="ref-css2" id="ref-css2"><b>[CSS2]</b></a></dt>
+
<dd><a href="http://www.w3.org/TR/REC-CSS2">"Cascading Style Sheets, level 2 (CSS2) Specification"</a>, B.
Bos, H. W. Lie, C. Lilley, I. Jacobs, 12 May 1998.<br/>
Available at: <a href="http://www.w3.org/TR/REC-CSS2">
http://www.w3.org/TR/REC-CSS2</a></dd>
-<dt>
-<a name="ref-dom" id="ref-dom">
-<b>[DOM]</b>
-</a>
-</dt>
+
+<dt><a name="ref-dom" id="ref-dom"><b>[DOM]</b></a></dt>
+
<dd><a href="http://www.w3.org/TR/REC-DOM-Level-1">"Document Object Model (DOM) Level 1 Specification"</a>, Lauren
Wood <i>et al.</i>, 1 October 1998.<br/>
Available at: <a href="http://www.w3.org/TR/REC-DOM-Level-1">
http://www.w3.org/TR/REC-DOM-Level-1</a></dd>
-<dt>
-<a name="ref-html4" id="ref-html4">
-<b>[HTML]</b>
-</a>
-</dt>
+
+<dt><a name="ref-html4" id="ref-html4"><b>[HTML]</b></a></dt>
+
<dd><a href="http://www.w3.org/TR/1999/PR-html40-19990824">"HTML 4.01 Specification"</a>, D. Raggett, A. Le Hors, I.
Jacobs, 24 August 1999.<br/>
Available at: <a href="http://www.w3.org/TR/1999/PR-html40-19990824">
http://www.w3.org/TR/1999/PR-html40-19990824</a></dd>
-<dt>
-<a name="ref-posix" id="ref-posix">
-<b>[POSIX.1]</b>
-</a>
-</dt>
+
+<dt><a name="ref-posix" id="ref-posix"><b>[POSIX.1]</b></a></dt>
+
<dd>"ISO/IEC 9945-1:1990 Information Technology - Portable
Operating System Interface (POSIX) - Part 1: System Application
Program Interface (API) [C Language]", Institute of Electrical
and Electronics Engineers, Inc, 1990.</dd>
-<dt>
-<a name="ref-rfc2046" id="ref-rfc2046">
-<b>
-[RFC2046]</b>
-</a>
-</dt>
+
+<dt><a name="ref-rfc2046" id="ref-rfc2046"><b>
+[RFC2046]</b></a></dt>
+
<dd><a href="http://www.ietf.org/rfc/rfc2046.txt">"RFC2046: Multipurpose Internet Mail Extensions (MIME) Part
Two: Media Types"</a>, N. Freed and N. Borenstein, November
1996.<br/>
Available at <a href="http://www.ietf.org/rfc/rfc2046.txt">
http://www.ietf.org/rfc/rfc2046.txt</a>. Note that this RFC
obsoletes RFC1521, RFC1522, and RFC1590.</dd>
-<dt>
-<a name="ref-rfc2119" id="ref-rfc2119">
-<b>
-[RFC2119]</b>
-</a>
-</dt>
+
+<dt><a name="ref-rfc2119" id="ref-rfc2119"><b>
+[RFC2119]</b></a></dt>
+
<dd><a href="http://www.ietf.org/rfc/rfc2119.txt">"RFC2119: Key words for use in RFCs to Indicate Requirement
Levels"</a>, S. Bradner, March 1997.<br/>
Available at: <a href="http://www.ietf.org/rfc/rfc2119.txt">
http://www.ietf.org/rfc/rfc2119.txt</a></dd>
-<dt>
-<a name="ref-rfc2376" id="ref-rfc2376">
-<b>
-[RFC2376]</b>
-</a>
-</dt>
+
+<dt><a name="ref-rfc2376" id="ref-rfc2376"><b>
+[RFC2376]</b></a></dt>
+
<dd><a href="http://www.ietf.org/rfc/rfc2376.txt">"RFC2376: XML Media Types"</a>, E. Whitehead, M. Murata, July
1998.<br/>
Available at: <a href="http://www.ietf.org/rfc/rfc2376.txt">
http://www.ietf.org/rfc/rfc2376.txt</a></dd>
-<dt>
-<a name="ref-rfc2396" id="ref-rfc2396">
-<b>
-[RFC2396]</b>
-</a>
-</dt>
+
+<dt><a name="ref-rfc2396" id="ref-rfc2396"><b>
+[RFC2396]</b></a></dt>
+
<dd><a href="http://www.ietf.org/rfc/rfc2396.txt">"RFC2396: Uniform Resource Identifiers (URI): Generic
Syntax"</a>, T. Berners-Lee, R. Fielding, L. Masinter, August
1998.<br/>
This document updates RFC1738 and RFC1808.<br/>
Available at: <a href="http://www.ietf.org/rfc/rfc2396.txt">
http://www.ietf.org/rfc/rfc2396.txt</a></dd>
-<dt>
-<a name="ref-xml" id="ref-xml">
-<b>[XML]</b>
-</a>
-</dt>
+
+<dt><a name="ref-xml" id="ref-xml"><b>[XML]</b></a></dt>
+
<dd><a href="http://www.w3.org/TR/REC-xml">"Extensible Markup Language (XML) 1.0 Specification"</a>, T.
Bray, J. Paoli, C. M. Sperberg-McQueen, 10 February 1998.<br/>
Available at: <a href="http://www.w3.org/TR/REC-xml">
http://www.w3.org/TR/REC-xml</a></dd>
-<dt>
-<a name="ref-xmlns" id="ref-xmlns">
-<b>[XMLNAMES]</b>
-</a>
-</dt>
+
+<dt><a name="ref-xmlns" id="ref-xmlns"><b>[XMLNAMES]</b></a></dt>
+
<dd><a href="http://www.w3.org/TR/REC-xml-names">"Namespaces in XML"</a>, T. Bray, D. Hollander, A. Layman, 14
January 1999.<br/>
XML namespaces provide a simple method for qualifying names used
@@ -1274,11 +1441,10 @@
by URI.<br/>
Available at: <a href="http://www.w3.org/TR/REC-xml-names">
http://www.w3.org/TR/REC-xml-names</a></dd>
+
</dl>
-<p>
-<a href="http://www.w3.org/WAI/WCAG1AAA-Conformance" title="Explanation of Level Triple-A Conformance">
-<img height="32" width="88" src="wcag1AAA.gif" alt="Level Triple-A conformance icon, W3C-WAI Web Content Accessibility Guidelines 1.0"/></a>
-</p>
+<p><a href="http://www.w3.org/WAI/WCAG1AAA-Conformance" title="Explanation of Level Triple-A Conformance">
+<img height="32" width="88" src="wcag1AAA.gif" alt="Level Triple-A conformance icon, W3C-WAI Web Content Accessibility Guidelines 1.0"/></a></p>
<div class="navbar">
<hr/>
<a href="#toc">table of contents</a>
diff --git a/result/valid/xlink.xml b/result/valid/xlink.xml
index baab153..70096cd 100644
--- a/result/valid/xlink.xml
+++ b/result/valid/xlink.xml
@@ -9,217 +9,347 @@
<?xml-stylesheet href="file:///C|/Program%20Files/SoftQuad/XMetaL%201/display/xmlspec.css"
type="text/css"?>
<spec>
+
<!-- Last edited: 27 May 1999 by bent -->
-<header>
-<?Pub Dtl?>
-<title>XML Linking Language (XLink)</title>
-<version>Version 1.0</version>
-<w3c-designation><!-- &doc-type;-&iso6.doc.date; --> WD-xlink-19990527</w3c-designation>
-<w3c-doctype>World Wide Web Consortium Working Draft</w3c-doctype>
-<pubdate>
-<day>29</day>
-<month>May</month>
-<year>1999</year>
-</pubdate>
-<notice>
-<p>This draft is for public discussion.</p>
-</notice>
-<publoc>
-<loc href="http://www.w3.org/XML/Group/1999/05/WD-xlink-current">http://www.w3.org/XML/Group/1999/05/WD-xlink-current</loc>
-</publoc>
-<prevlocs>
-<!--Check: was it actually August?-->
-<loc href="http://www.w3.org/XML/Group/1999/05/WD-xlink-19990527">http://www.w3.org/XML/Group/1999/05/WD-xlink-19990527</loc>
-<loc href="http://www.w3.org/XML/Group/1999/05/WD-xlink-19990505">http://www.w3.org/XML/Group/1999/05/WD-xlink-19990505</loc>
-<loc href="http://www.w3.org/TR/1998/WD-xlink-19980303">http://www.w3.org/TR/1998/WD-xlink-19980303</loc>
-<loc href="http://www.w3.org/TR/WD-xml-link-970630">http://www.w3.org/TR/WD-xml-link-970630</loc>
-</prevlocs>
-<authlist>
-<!--Updated author hrefs dorchard-->
-<!-- Update Steve's email - bent -->
-<author>
-<name>Steve DeRose</name>
-<affiliation>Inso Corp. and Brown University</affiliation>
-<email href="mailto:Steven_DeRose@Brown.edu">Steven_DeRose@Brown.edu</email>
-</author>
-<author>
-<name>David Orchard</name>
-<affiliation>IBM Corp.</affiliation>
-<email href="mailto:dorchard@ca.ibm.com">dorchard@ca.ibm.com</email>
-</author>
-<author>
-<name>Ben Trafford</name>
-<affiliation>Invited Expert</affiliation>
-<email href="mailto:bent@exemplary.net">bent@exemplary.net</email>
-</author>
-<!-- I suggest we move Eve and Tim down to the Acknowledgements section. We
+<header><?Pub Dtl?>
+ <title>XML Linking Language (XLink)</title>
+ <version>Version 1.0</version>
+ <w3c-designation><!-- &doc-type;-&iso6.doc.date; --> WD-xlink-19990527</w3c-designation>
+ <w3c-doctype>World Wide Web Consortium Working Draft</w3c-doctype>
+ <pubdate><day>29</day><month>May</month><year>1999</year></pubdate>
+ <notice>
+ <p>This draft is for public discussion.</p>
+ </notice>
+ <publoc><loc href="http://www.w3.org/XML/Group/1999/05/WD-xlink-current">http://www.w3.org/XML/Group/1999/05/WD-xlink-current</loc></publoc>
+ <prevlocs>
+ <!--Check: was it actually August?-->
+ <loc href="http://www.w3.org/XML/Group/1999/05/WD-xlink-19990527">http://www.w3.org/XML/Group/1999/05/WD-xlink-19990527</loc>
+ <loc href="http://www.w3.org/XML/Group/1999/05/WD-xlink-19990505">http://www.w3.org/XML/Group/1999/05/WD-xlink-19990505</loc>
+ <loc href="http://www.w3.org/TR/1998/WD-xlink-19980303">http://www.w3.org/TR/1998/WD-xlink-19980303</loc>
+ <loc href="http://www.w3.org/TR/WD-xml-link-970630">http://www.w3.org/TR/WD-xml-link-970630</loc></prevlocs>
+
+ <authlist>
+ <!--Updated author hrefs dorchard-->
+ <!-- Update Steve's email - bent -->
+ <author>
+ <name>Steve DeRose</name>
+ <affiliation>Inso Corp. and Brown University</affiliation>
+ <email href="mailto:Steven_DeRose@Brown.edu">Steven_DeRose@Brown.edu</email>
+ </author>
+ <author>
+ <name>David Orchard</name>
+ <affiliation>IBM Corp.</affiliation>
+ <email href="mailto:dorchard@ca.ibm.com">dorchard@ca.ibm.com</email>
+ </author>
+ <author>
+ <name>Ben Trafford</name>
+ <affiliation>Invited Expert</affiliation>
+ <email href="mailto:bent@exemplary.net">bent@exemplary.net</email>
+ </author>
+ <!-- I suggest we move Eve and Tim down to the Acknowledgements section. We
also ought to add Gabe Beged-Dov there, as well. bent
how shall we cite Tim? sjd What about with an Acknowledgments section?
-elm <AUTHOR> <NAME>Tim Bray</NAME> <AFFILIATION>Textuality</AFFILIATION>
<EMAIL>tbray@textuality.com</EMAIL> </AUTHOR>-->
-</authlist>
-<status>
-<p>This is a W3C Working Draft for review by W3C members and other interested parties. It is a draft document and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C working drafts can be found at <loc href="http://www.w3.org/TR">http://www.w3.org/TR</loc>.</p>
-<p><emph>Note:</emph> Since working drafts are subject to frequent change, you are advised to reference the above URI, rather than the URIs for working drafts themselves. Some of the work remaining is described in <specref ref="unfinished"/>. </p>
-<p>This work is part of the W3C XML Activity (for current status, see <loc href="http://www.w3.org/MarkUp/SGML/Activity">http://www.w3.org/XML/Activity </loc>). For information about the XPointer language which is expected to be used with XLink, see <loc href="http://www.w3.org/MarkUp/SGML/Activity">http://www.w3.org/TR/WD-xptr</loc>.
+ </authlist>
+
+ <status>
+ <p>This is a W3C Working Draft for review by W3C members and other interested parties. It is a draft document and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C working drafts can be found at <loc href="http://www.w3.org/TR">http://www.w3.org/TR</loc>.</p>
+ <p><emph>Note:</emph> Since working drafts are subject to frequent change, you are advised to reference the above URI, rather than the URIs for working drafts themselves. Some of the work remaining is described in <specref ref="unfinished"/>. </p>
+ <p>This work is part of the W3C XML Activity (for current status, see <loc href="http://www.w3.org/MarkUp/SGML/Activity">http://www.w3.org/XML/Activity </loc>). For information about the XPointer language which is expected to be used with XLink, see <loc href="http://www.w3.org/MarkUp/SGML/Activity">http://www.w3.org/TR/WD-xptr</loc>.
</p>
-<p>See <loc href="http://www.w3.org/TR/NOTE-xlink-principles">http://www.w3.org/TR/NOTE-xlink-principles </loc> for additional background on the design principles informing XLink.</p>
-<p>Also see <loc href="http://www.w3.org/TR/NOTE-xlink-req/">http://www.w3.org/TR/NOTE-xlink-req/</loc> for the XLink requirements that this document attempts to satisfy.</p>
-</status>
-<abstract>
-<!-- edited the abstract for further clarity - bent -->
-<p>This specification defines constructs that may be inserted into XML DTDs, schemas and document instances to describe links between objects. It uses XML syntax to create structures that can describe the simple unidirectional hyperlinks of today's HTML as well as more sophisticated links.</p>
-</abstract>
-<pubstmt>
-<p>Burlington, Seekonk, et al.: World-Wide Web Consortium, XML Working Group, 1998.</p>
-</pubstmt>
-<sourcedesc>
-<p>Created in electronic form.</p>
-</sourcedesc>
-<langusage>
-<language id="en">English</language>
-<language id="ebnf">Extended Backus-Naur Form (formal grammar)</language>
-</langusage>
-<revisiondesc>
-<slist>
-<sitem>1997-01-15 : Skeleton draft by TB</sitem>
-<sitem>1997-01-24 : Fleshed out by sjd</sitem>
-<sitem>1997-04-08 : Substantive draft</sitem>
-<sitem>1997-06-30 : Public draft</sitem>
-<sitem>1997-08-01 : Public draft</sitem>
-<sitem>1997-08-05 : Prose/organization work by sjd</sitem>
-<sitem>1997-10-14: Conformance and design principles; a bit of cleanup by elm</sitem>
-<sitem>1997-11-07: Update for editorial issues per issues doc, by sjd.</sitem>
-<sitem>1997-12-01: Update for editorial issues per issues doc in preparation for F2F meeting, by sjd.</sitem>
-<sitem>1998-01-13: Editorial cleanup, addition of new design principles, by elm.</sitem>
-<sitem>1998-02-27: Splitting out of XLink and XPointer, by elm.</sitem>
-<sitem>1998-03-03: Moved most of the XPointer locator stuff here. elm</sitem>
-<sitem>1999-04-24: Editorial rewrites to represent new ideas on XLink, especially the inclusion of arcs. bent</sitem>
-<sitem>1999-05-05: Prose/organization work by dorchard. Moved much of the semantics section around, from: locators, link semantics, remote resource semantics, local resource semantics; to: resource semantics, locators, behavior semantics, link semantics, arc semantics</sitem>
-<sitem>1999-05-12: Prose/organization work. Re-organized some of the sections, removed XML constructs from the document, added descriptive prose, edited document text for clarity. Rewrote the link recognition section. bent</sitem>
-<sitem>1999-05-17: Further prose work. Added non-normative examples. Clarified arcs. bent</sitem>
-<sitem>1999-05-23: Edited for grammar and clarity. bent</sitem>
-<sitem>1999-05-27: Final once-over before sending to group. Fixed sjd's email address. bent</sitem>
-</slist>
-</revisiondesc>
+ <p>See <loc href="http://www.w3.org/TR/NOTE-xlink-principles">http://www.w3.org/TR/NOTE-xlink-principles </loc> for additional background on the design principles informing XLink.</p>
+ <p>Also see <loc href="http://www.w3.org/TR/NOTE-xlink-req/">http://www.w3.org/TR/NOTE-xlink-req/</loc> for the XLink requirements that this document attempts to satisfy.</p>
+ </status>
+
+ <abstract>
+ <!-- edited the abstract for further clarity - bent -->
+ <p>This specification defines constructs that may be inserted into XML DTDs, schemas and document instances to describe links between objects. It uses XML syntax to create structures that can describe the simple unidirectional hyperlinks of today's HTML as well as more sophisticated links.</p>
+ </abstract>
+
+ <pubstmt>
+ <p>Burlington, Seekonk, et al.: World-Wide Web Consortium, XML Working Group, 1998.</p>
+ </pubstmt>
+
+ <sourcedesc>
+ <p>Created in electronic form.</p>
+ </sourcedesc>
+
+ <langusage>
+ <language id="en">English</language>
+ <language id="ebnf">Extended Backus-Naur Form (formal grammar)</language>
+ </langusage>
+
+ <revisiondesc>
+ <slist>
+ <sitem>1997-01-15 : Skeleton draft by TB</sitem>
+ <sitem>1997-01-24 : Fleshed out by sjd</sitem>
+ <sitem>1997-04-08 : Substantive draft</sitem>
+ <sitem>1997-06-30 : Public draft</sitem>
+ <sitem>1997-08-01 : Public draft</sitem>
+ <sitem>1997-08-05 : Prose/organization work by sjd</sitem>
+ <sitem>1997-10-14: Conformance and design principles; a bit of cleanup by elm</sitem>
+ <sitem>1997-11-07: Update for editorial issues per issues doc, by sjd.</sitem>
+ <sitem>1997-12-01: Update for editorial issues per issues doc in preparation for F2F meeting, by sjd.</sitem>
+ <sitem>1998-01-13: Editorial cleanup, addition of new design principles, by elm.</sitem>
+ <sitem>1998-02-27: Splitting out of XLink and XPointer, by elm.</sitem>
+ <sitem>1998-03-03: Moved most of the XPointer locator stuff here. elm</sitem>
+ <sitem>1999-04-24: Editorial rewrites to represent new ideas on XLink, especially the inclusion of arcs. bent</sitem>
+ <sitem>1999-05-05: Prose/organization work by dorchard. Moved much of the semantics section around, from: locators, link semantics, remote resource semantics, local resource semantics; to: resource semantics, locators, behavior semantics, link semantics, arc semantics</sitem>
+ <sitem>1999-05-12: Prose/organization work. Re-organized some of the sections, removed XML constructs from the document, added descriptive prose, edited document text for clarity. Rewrote the link recognition section. bent</sitem>
+ <sitem>1999-05-17: Further prose work. Added non-normative examples. Clarified arcs. bent</sitem>
+ <sitem>1999-05-23: Edited for grammar and clarity. bent</sitem>
+ <sitem>1999-05-27: Final once-over before sending to group. Fixed sjd's email address. bent</sitem>
+ </slist>
+ </revisiondesc>
</header>
+
<body>
-<div1>
-<?Pub Dtl?>
-<head>Introduction</head>
-<p>This specification defines constructs that may be inserted into XML DTDs, schemas, and document instances to describe links between objects. A <termref def="dt-link">link</termref>, as the term is used here, is an explicit relationship between two or more data objects or portions of data objects. This specification is concerned with the syntax used to assert link existence and describe link characteristics. Implicit (unasserted) relationships, for example that of one word to the next or that of a word in a text to its entry in an on-line dictionary are obviously important, but outside its scope.</p>
-<p>Links are asserted by <xtermref href="WD-xml-lang.html#dt-element">elements </xtermref> contained in <xtermref href="WD-xml-lang.html#dt-xml-doc">XML document instances</xtermref>. The simplest case is very like an HTML <code>A</code> link, and has these characteristics:
- <ulist><item><p>The link is expressed at one of its ends (similar to the <code>A</code> element in some document)</p></item><item><p>Users can only initiate travel from that end to the other</p></item><item><p>The link's effect on windows, frames, go-back lists, stylesheets in use, and so on is mainly determined by browsers, not by the link itself. For example, traveral of <code>A</code> links normally replaces the current view, perhaps with a user option to open a new window.</p></item><item><p>The link goes to only one destination (although a server may have great freedom in finding or dynamically creating that destination).</p></item></ulist>
+ <div1><?Pub Dtl?>
+ <head>Introduction</head>
+ <p>This specification defines constructs that may be inserted into XML DTDs, schemas, and document instances to describe links between objects. A <termref def="dt-link">link</termref>, as the term is used here, is an explicit relationship between two or more data objects or portions of data objects. This specification is concerned with the syntax used to assert link existence and describe link characteristics. Implicit (unasserted) relationships, for example that of one word to the next or that of a word in a text to its entry in an on-line dictionary are obviously important, but outside its scope.</p>
+ <p>Links are asserted by <xtermref href="WD-xml-lang.html#dt-element">elements </xtermref> contained in <xtermref href="WD-xml-lang.html#dt-xml-doc">XML document instances</xtermref>. The simplest case is very like an HTML <code>A</code> link, and has these characteristics:
+ <ulist>
+ <item><p>The link is expressed at one of its ends (similar to the <code>A</code> element in some document)</p></item>
+ <item><p>Users can only initiate travel from that end to the other</p></item>
+ <item><p>The link's effect on windows, frames, go-back lists, stylesheets in use, and so on is mainly determined by browsers, not by the link itself. For example, traveral of <code>A</code> links normally replaces the current view, perhaps with a user option to open a new window.</p></item>
+ <item><p>The link goes to only one destination (although a server may have great freedom in finding or dynamically creating that destination).</p></item>
+ </ulist>
</p>
-<p>While this set of characteristics is already very powerful and obviously has proven itself highly useful and effective, each of these assumptions also limits the range of hypertext functionality. The linking model defined here provides ways to create links that go beyond each of these specific characteristics, thus providing features previously available mostly in dedicated hypermedia systems.
+ <p>While this set of characteristics is already very powerful and obviously has proven itself highly useful and effective, each of these assumptions also limits the range of hypertext functionality. The linking model defined here provides ways to create links that go beyond each of these specific characteristics, thus providing features previously available mostly in dedicated hypermedia systems.
</p>
+
<div2>
-<head>Origin and Goals</head>
-<p>Following is a summary of the design principles governing XLink:
- <olist><item><p>XLink must be straightforwardly usable over the Internet. </p></item><item><p>XLink must be usable by a wide variety of link usage domains and classes of linking application software.</p></item><item><p>XLink must support HTML 4.0 linking constructs.</p></item><item><p>The XLink expression language must be XML.</p></item><item><p>The XLink design must be formal, concise, and illustrative.</p></item><item><p>XLinks must be human-readable and human-writable.</p></item><item><p>XLinks may reside within or outside the documents in which the
- participating resources reside. </p></item><item><p>XLink must represent the abstract structure and significance of links.</p></item><item><p>XLink must be feasible to implement.</p></item><item><p>XLink must be informed by knowledge of established hypermedia systems and standards.</p></item></olist>
+ <head>Origin and Goals</head>
+ <p>Following is a summary of the design principles governing XLink:
+ <olist>
+ <item><p>XLink must be straightforwardly usable over the Internet. </p></item>
+ <item><p>XLink must be usable by a wide variety of link usage domains and classes of linking application software.</p></item>
+ <item><p>XLink must support HTML 4.0 linking constructs.</p></item>
+ <item><p>The XLink expression language must be XML.</p></item>
+ <item><p>The XLink design must be formal, concise, and illustrative.</p></item>
+ <item><p>XLinks must be human-readable and human-writable.</p></item>
+ <item><p>XLinks may reside within or outside the documents in which the
+ participating resources reside. </p></item>
+ <item><p>XLink must represent the abstract structure and significance of links.</p></item>
+ <item><p>XLink must be feasible to implement.</p></item>
+ <item><p>XLink must be informed by knowledge of established hypermedia systems and standards.</p></item>
+ </olist>
</p>
</div2>
<!--Changed the list of requirements to reflect current XLink requirements
document. bent-->
+
<div2>
-<head>Relationship to Existing Standards</head>
-<p>Three standards have been especially influential:
- <ulist><item><p><emph>HTML:</emph> Defines several SGML element types that represent links.</p></item><item><p><emph>HyTime:</emph> Defines inline and out-of-line link structures and some semantic features, including traversal control and presentation of objects. <!--Changed from "placement of objects into a display or other space" -elm-->
- </p></item><item><p><emph>Text Encoding Initiative Guidelines (TEI P3):</emph> Provides structures for creating links, aggregate objects, and link collections out of them.</p></item></ulist>
+ <head>Relationship to Existing Standards</head>
+ <p>Three standards have been especially influential:
+ <ulist>
+ <item><p><emph>HTML:</emph> Defines several SGML element types that represent links.</p></item>
+ <item><p><emph>HyTime:</emph> Defines inline and out-of-line link structures and some semantic features, including traversal control and presentation of objects. <!--Changed from "placement of objects into a display or other space" -elm-->
+ </p></item>
+ <item><p><emph>Text Encoding Initiative Guidelines (TEI P3):</emph> Provides structures for creating links, aggregate objects, and link collections out of them.</p></item>
+ </ulist>
</p>
-<p>Many other linking systems have also informed this design, especially Dexter, FRESS, MicroCosm, and InterMedia.</p>
+ <p>Many other linking systems have also informed this design, especially Dexter, FRESS, MicroCosm, and InterMedia.</p>
</div2>
+
<div2>
-<head>Terminology</head>
-<p>The following basic terms apply in this document. <!--<IMG
+ <head>Terminology</head>
+ <p>The following basic terms apply in this document. <!--<IMG
SRC="local://./linkdiag.gif">(figure to be inserted)-->
- <glist><gitem><label><termdef id="dt-arc" term="Arc">arc</termdef></label><def><p>A symbolic representation of traversal behavior in links, especially the direction, context and timing of traversal.</p></def></gitem><gitem><label><termdef id="dt-eltree" term="Element Tree">element tree</termdef></label><def><p>A representation of the relevant structure specified by the tags and attributes in an XML document, based on "groves" as defined in the ISO DSSSL standard. </p></def></gitem><gitem><label><termdef id="dt-inline" term="In-Line Link">inline link</termdef></label><def><p>Abstractly, a <termref def="dt-link">link</termref> which serves as one of its own <termref def="dt-resource">resources</termref>. Concretely, a link where the content of the <termref def="dt-linkel">linking element</termref> serves as a <termref def="dt-particip-resource">participating resource</termref>.
+ <glist>
+ <gitem>
+ <label><termdef id="dt-arc" term="Arc">arc</termdef></label>
+ <def><p>A symbolic representation of traversal behavior in links, especially the direction, context and timing of traversal.</p></def>
+ </gitem>
+ <gitem>
+ <label><termdef id="dt-eltree" term="Element Tree">element tree</termdef></label>
+ <def><p>A representation of the relevant structure specified by the tags and attributes in an XML document, based on "groves" as defined in the ISO DSSSL standard. </p></def>
+ </gitem>
+ <gitem>
+ <label><termdef id="dt-inline" term="In-Line Link">inline link</termdef></label>
+ <def><p>Abstractly, a <termref def="dt-link">link</termref> which serves as one of its own <termref def="dt-resource">resources</termref>. Concretely, a link where the content of the <termref def="dt-linkel">linking element</termref> serves as a <termref def="dt-particip-resource">participating resource</termref>.
HTML <code>A</code>, HyTime <code>clink</code>, and TEI <code>XREF</code>
- are all inline links.</p></def></gitem><gitem><label><termdef id="dt-link" term="Link">link</termdef></label><def><p>An explicit relationship between two or more data objects or portions of data objects.</p></def></gitem><gitem><label><termdef id="dt-linkel" term="Linking Element">linking element </termdef></label><def><p>An <xtermref href="WD-xml-lang.html#dt-element">element</xtermref> that asserts the existence and describes the characteristics of a <termref def="dt-link"> link</termref>.</p></def></gitem><gitem><label><termdef id="dt-local-resource" term="Local Resource">local resource</termdef></label><def><p>The content of an <termref def="dt-inline">inline</termref>linking element. Note that the content of the linking element could be explicitly pointed to by means of a regular <termref def="dt-locator">locator</termref> in the same linking element, in which case the resource is considered <termref def="dt-remote-resource"> remote</termref>, not local.</p></def></gitem><gitem><label><termdef id="dt-locator" term="Locator">locator</termdef> </label><def><p>Data, provided as part of a link, which identifies a
- <termref def="dt-resource">resource</termref>.</p></def></gitem><gitem><label><termdef id="dt-multidir" term="Multi-Directional Link">multidirectional link</termdef></label><def><p>A <termref def="dt-link">link</termref> whose <termref def="dt-traversal"> traversal</termref> can be initiated from more than one of its <termref def="dt-particip-resource"> participating resources</termref>. Note that being able to "go back" after following a one-directional link does not make the link multidirectional.</p></def></gitem><gitem><label><termdef id="dt-outofline" term="Out-of-line Link">out-of-line link</termdef></label><def><p>A <termref def="dt-link">link</termref> whose content does not serve as one of the link's <termref def="dt-particip-resource">participating resources </termref>. Such links presuppose a notion like <termref def="dt-xlg">extended link groups</termref>, which instruct application software where to look for links. Out-of-line links are generally required for supporting multidirectional <termref def="dt-traversal">traversal</termref> and for allowing read-only resources to have outgoing links.</p></def></gitem><gitem><label><termdef id="dt-parsedq" term="Parsed">parsed</termdef></label><def><p>In the context of link behavior, a parsed link is any link whose content is transcluded into the document where the link originated. The use of the term "parsed" directly refers to the concept in XML of a
- parsed entity.</p></def></gitem><gitem><label><termdef id="dt-particip-resource" term="Participating Resource"> participating resource</termdef></label><def><p>A <termref def="dt-resource">resource</termref> that belongs to a link. All resources are potential contributors to a link; participating resources are the actual contributors to a particular link.</p></def></gitem><gitem><label><termdef id="dt-remote-resource" term="Remote Resource">remote resource</termdef></label><def><p>Any participating resource of a link that is pointed to with a locator. </p></def></gitem><gitem><label><termdef id="dt-resource" term="Resource">resource</termdef></label><def><p>In the abstract sense, an addressable unit of information or service that is participating in a <termref def="dt-link">link</termref>. Examples include files, images, documents, programs, and query results. Concretely, anything reachable by the use of a <termref def="dt-locator">locator</termref> in some <termref def="dt-linkel">linking element</termref>. Note that this term and its definition are taken from the basic specifications governing the World Wide Web. <!--Joel notes: need link here. bent asks: A link?-->
- </p></def></gitem><gitem><label><termdef id="dt-subresource" term="sub-Resource">sub-resource</termdef></label><def><p>A portion of a resource, pointed to as the precise destination of a link. As one example, a link might specify that an entire document be retrieved and displayed, but that some specific part(s) of it is the specific linked data, to be treated in an application-appropriate manner such as indication by highlighting, scrolling, etc.</p></def></gitem><gitem><label><termdef id="dt-traversal" term="Traversal">traversal</termdef></label><def><p>The action of using a <termref def="dt-link">link</termref>; that is, of accessing a <termref def="dt-resource">resource</termref>. Traversal may be initiated by a user action (for example, clicking on the displayed content of a <termref def="dt-linkel">linking element</termref>) or occur under program control.</p></def></gitem></glist>
+ are all inline links.</p></def>
+ </gitem>
+ <gitem>
+ <label><termdef id="dt-link" term="Link">link</termdef></label>
+ <def><p>An explicit relationship between two or more data objects or portions of data objects.</p></def>
+ </gitem>
+ <gitem>
+ <label><termdef id="dt-linkel" term="Linking Element">linking element </termdef></label>
+ <def><p>An <xtermref href="WD-xml-lang.html#dt-element">element</xtermref> that asserts the existence and describes the characteristics of a <termref def="dt-link"> link</termref>.</p></def>
+ </gitem>
+ <gitem>
+ <label><termdef id="dt-local-resource" term="Local Resource">local resource</termdef></label>
+ <def><p>The content of an <termref def="dt-inline">inline</termref>linking element. Note that the content of the linking element could be explicitly pointed to by means of a regular <termref def="dt-locator">locator</termref> in the same linking element, in which case the resource is considered <termref def="dt-remote-resource"> remote</termref>, not local.</p></def>
+ </gitem>
+ <gitem>
+ <label><termdef id="dt-locator" term="Locator">locator</termdef> </label>
+ <def><p>Data, provided as part of a link, which identifies a
+ <termref def="dt-resource">resource</termref>.</p></def>
+ </gitem>
+ <gitem>
+ <label><termdef id="dt-multidir" term="Multi-Directional Link">multidirectional link</termdef></label>
+ <def><p>A <termref def="dt-link">link</termref> whose <termref def="dt-traversal"> traversal</termref> can be initiated from more than one of its <termref def="dt-particip-resource"> participating resources</termref>. Note that being able to "go back" after following a one-directional link does not make the link multidirectional.</p></def>
+ </gitem>
+ <gitem>
+ <label><termdef id="dt-outofline" term="Out-of-line Link">out-of-line link</termdef></label>
+ <def><p>A <termref def="dt-link">link</termref> whose content does not serve as one of the link's <termref def="dt-particip-resource">participating resources </termref>. Such links presuppose a notion like <termref def="dt-xlg">extended link groups</termref>, which instruct application software where to look for links. Out-of-line links are generally required for supporting multidirectional <termref def="dt-traversal">traversal</termref> and for allowing read-only resources to have outgoing links.</p></def>
+ </gitem>
+ <gitem>
+ <label><termdef id="dt-parsedq" term="Parsed">parsed</termdef></label> <def><p>In the context of link behavior, a parsed link is any link whose content is transcluded into the document where the link originated. The use of the term "parsed" directly refers to the concept in XML of a
+ parsed entity.</p></def>
+ </gitem>
+ <gitem>
+ <label><termdef id="dt-particip-resource" term="Participating Resource"> participating resource</termdef></label>
+ <def><p>A <termref def="dt-resource">resource</termref> that belongs to a link. All resources are potential contributors to a link; participating resources are the actual contributors to a particular link.</p></def>
+ </gitem>
+ <gitem>
+ <label><termdef id="dt-remote-resource" term="Remote Resource">remote resource</termdef></label>
+ <def><p>Any participating resource of a link that is pointed to with a locator. </p></def>
+ </gitem>
+ <gitem>
+ <label><termdef id="dt-resource" term="Resource">resource</termdef></label>
+ <def><p>In the abstract sense, an addressable unit of information or service that is participating in a <termref def="dt-link">link</termref>. Examples include files, images, documents, programs, and query results. Concretely, anything reachable by the use of a <termref def="dt-locator">locator</termref> in some <termref def="dt-linkel">linking element</termref>. Note that this term and its definition are taken from the basic specifications governing the World Wide Web. <!--Joel notes: need link here. bent asks: A link?-->
+ </p></def>
+ </gitem>
+ <gitem>
+ <label><termdef id="dt-subresource" term="sub-Resource">sub-resource</termdef></label>
+ <def><p>A portion of a resource, pointed to as the precise destination of a link. As one example, a link might specify that an entire document be retrieved and displayed, but that some specific part(s) of it is the specific linked data, to be treated in an application-appropriate manner such as indication by highlighting, scrolling, etc.</p></def>
+ </gitem>
+ <gitem>
+ <label><termdef id="dt-traversal" term="Traversal">traversal</termdef></label>
+ <def><p>The action of using a <termref def="dt-link">link</termref>; that is, of accessing a <termref def="dt-resource">resource</termref>. Traversal may be initiated by a user action (for example, clicking on the displayed content of a <termref def="dt-linkel">linking element</termref>) or occur under program control.</p></def>
+ </gitem>
+ </glist>
</p>
</div2>
+
<div2>
-<head>Notation</head>
-<p>The formal grammar for <termref def="dt-locator">locators</termref> is given using a simple Extended Backus-Naur Form (EBNF) location, as described in <xspecref href="http://www.w3.org/TR/REC-xml#sec-notation">the XML specification</xspecref>.</p>
-<!-- fixed link to XML spec - bent -->
+ <head>Notation</head>
+ <p>The formal grammar for <termref def="dt-locator">locators</termref> is given using a simple Extended Backus-Naur Form (EBNF) location, as described in <xspecref href="http://www.w3.org/TR/REC-xml#sec-notation">the XML specification</xspecref>.</p>
+ <!-- fixed link to XML spec - bent -->
</div2>
</div1>
-<div1 id="addressing">
-<?Pub Dtl?>
-<head>Locator Syntax</head>
-<p>The locator for a <termref def="dt-resource">resource</termref> is typically provided by means of a Uniform Resource Identifier, or URI. XPointers can be used in conjunction with the URI structure, as fragment identifiers, to specify a more precise sub-resource. </p>
-<!-- Removed the discussion of queries from the previous paragraph, due to contention within the WG. bent -->
-<p>A locator generally contains a URI, as described in IETF RFCs <bibref ref="rfc1738"/> and <bibref ref="rfc1808"/>. As these RFCs state, the URI may include a trailing <emph>query</emph> (marked by a leading "<code>?</code>"), and be followed by a "<code>#</code>" and a <emph>fragment identifier</emph>, with the query interpreted by the host providing the indicated resource, and the interpretation of the fragment identifier dependent on the data type of the indicated resource.</p>
-<!--Is there some restriction on URNs having queries and/or fragment identifiers? Since these RFCs don't mention URIs explicitly, should the wording here lead from URLs to URIs more explicitly? -elm-->
-<p>In order to locate XML documents and portions of documents, a locator value may contain either a <xtermref href="http://www.w3.org/Addressing/rfc1738.txt"> URI</xtermref> or a fragment identifier, or both. Any fragment identifier for pointing into XML must be an <xtermref href="http://www.w3.org/TR/WD-xptr#dt-xpointer"> XPointer</xtermref>.</p>
-<p>Special syntax may be used to request the use of particular processing models in accessing the locator's resource. This is designed to reflect the realities of network operation, where it may or may not be desirable to exercise fine control over the distribution of work between local and remote processors.
- <scrap id="locator" lang="ebnf"><head>Locator</head><prod id="nt-locator"><lhs>Locator</lhs><rhs><nt def="nt-uri">URI</nt></rhs><rhs>| <nt def="nt-connector">Connector</nt> (<xnt href="http://www.w3.org/TR/WD-xptr">XPointer</xnt> | <xnt href="WD-xml-lang.html#NT-Name">Name</xnt>)</rhs><rhs>| <nt def="nt-uri">URI</nt> <nt def="nt-connector">Connector</nt> (<xnt href="http://www.w3.org/TR/WD-xptr">XPointer</xnt> | <xnt href="WD-xml-lang.html#NT-Name">Name</xnt>)</rhs></prod><prod id="nt-connector"><lhs>Connector</lhs><rhs>'#' | '|'</rhs></prod><prod id="nt-uri"><lhs>URI</lhs><rhs><xnt href="WD-xml-lang.html#NT-URLchar">URIchar*</xnt></rhs></prod></scrap>
+
+<div1 id="addressing"><?Pub Dtl?>
+ <head>Locator Syntax</head>
+ <p>The locator for a <termref def="dt-resource">resource</termref> is typically provided by means of a Uniform Resource Identifier, or URI. XPointers can be used in conjunction with the URI structure, as fragment identifiers, to specify a more precise sub-resource. </p>
+ <!-- Removed the discussion of queries from the previous paragraph, due to contention within the WG. bent -->
+ <p>A locator generally contains a URI, as described in IETF RFCs <bibref ref="rfc1738"/> and <bibref ref="rfc1808"/>. As these RFCs state, the URI may include a trailing <emph>query</emph> (marked by a leading "<code>?</code>"), and be followed by a "<code>#</code>" and a <emph>fragment identifier</emph>, with the query interpreted by the host providing the indicated resource, and the interpretation of the fragment identifier dependent on the data type of the indicated resource.</p>
+ <!--Is there some restriction on URNs having queries and/or fragment identifiers? Since these RFCs don't mention URIs explicitly, should the wording here lead from URLs to URIs more explicitly? -elm-->
+ <p>In order to locate XML documents and portions of documents, a locator value may contain either a <xtermref href="http://www.w3.org/Addressing/rfc1738.txt"> URI</xtermref> or a fragment identifier, or both. Any fragment identifier for pointing into XML must be an <xtermref href="http://www.w3.org/TR/WD-xptr#dt-xpointer"> XPointer</xtermref>.</p>
+ <p>Special syntax may be used to request the use of particular processing models in accessing the locator's resource. This is designed to reflect the realities of network operation, where it may or may not be desirable to exercise fine control over the distribution of work between local and remote processors.
+ <scrap id="locator" lang="ebnf">
+ <head>Locator</head>
+ <prod id="nt-locator">
+ <lhs>Locator</lhs>
+ <rhs><nt def="nt-uri">URI</nt></rhs>
+ <rhs>| <nt def="nt-connector">Connector</nt> (<xnt href="http://www.w3.org/TR/WD-xptr">XPointer</xnt> | <xnt href="WD-xml-lang.html#NT-Name">Name</xnt>)</rhs>
+ <rhs>| <nt def="nt-uri">URI</nt> <nt def="nt-connector">Connector</nt> (<xnt href="http://www.w3.org/TR/WD-xptr">XPointer</xnt> | <xnt href="WD-xml-lang.html#NT-Name">Name</xnt>)</rhs>
+ </prod>
+ <prod id="nt-connector">
+ <lhs>Connector</lhs><rhs>'#' | '|'</rhs>
+ </prod>
+ <prod id="nt-uri">
+ <lhs>URI</lhs><rhs><xnt href="WD-xml-lang.html#NT-URLchar">URIchar*</xnt></rhs>
+ </prod>
+ </scrap>
</p>
-<p><termdef id="dt-designated" term="Designated Resource">In this discussion, the term <term>designated resource</term> refers to the resource which an entire locator serves to locate.</termdef> The following rules apply:
- <ulist><item><p><termdef id="dt-containing-resource" term="Containing Resource"> The URI, if provided, locates a resource called the <term>containing resource</term>.</termdef></p></item><item><p>If the URI is not provided, the containing resource is considered to be the document in which the linking element is contained.
- </p></item><item><p><termdef id="dt-sub-resource" term="Sub-Resource">If an XPointer is provided, the designated resource is a <term>sub-resource</term>
+ <p><termdef id="dt-designated" term="Designated Resource">In this discussion, the term <term>designated resource</term> refers to the resource which an entire locator serves to locate.</termdef> The following rules apply:
+ <ulist>
+ <item>
+ <p><termdef id="dt-containing-resource" term="Containing Resource"> The URI, if provided, locates a resource called the <term>containing resource</term>.</termdef></p>
+ </item>
+ <item>
+ <p>If the URI is not provided, the containing resource is considered to be the document in which the linking element is contained.
+ </p></item>
+ <item>
+ <p><termdef id="dt-sub-resource" term="Sub-Resource">If an XPointer is provided, the designated resource is a <term>sub-resource</term>
of the containing resource; otherwise the designated resource is the
- containing resource.</termdef></p></item><!--Is this now incorrect, given the nature of the switch from here() to origin()? -elm
- Oy, yes, i think so. it will require some fun wording, though, so i haven't fixed it yet here -sjd--><item><p>If the <nt def="nt-connector">Connector</nt> is followed directly by a <xnt href="http://www.w3.org/TR/REC-xml#NT-Name">Name</xnt>, the <xnt href="http://www.w3.org/TR/REC-xml#NT-Name">Name</xnt> is shorthand for the XPointer"<code>id(Name)</code>"; that is, the sub-resource is the element in the containing resource that has an XML <xtermref href="http://www.w3.org/TR/REC-xml#sec-attrtypes">ID attribute</xtermref> whose value <xtermref href="http://www.w3.org/TR/REC-xml#dt-match">matches</xtermref> the <xnt href="http://www.w3.org/TR/REC-xml#NT-Name">Name</xnt>. This shorthand is to encourage use of the robust <code>id</code> addressing mode.</p></item><!-- fixed links to the XML recommendation - bent --><item><p>If the connector is "<code>#</code>", this signals an intent that the containing resource is to be fetched as a whole from the host that provides it, and that the XPointer processing to extract the sub-resource
- is to be performed on the client, that is to say on the same system where the linking element is recognized and processed.</p></item><item><p>If the connector is "<code>|</code>", no intent is signaled as to what processing model is to be used to go about accessing the designated resource.</p></item></ulist>
+ containing resource.</termdef></p>
+ </item>
+ <!--Is this now incorrect, given the nature of the switch from here() to origin()? -elm
+ Oy, yes, i think so. it will require some fun wording, though, so i haven't fixed it yet here -sjd-->
+ <item>
+ <p>If the <nt def="nt-connector">Connector</nt> is followed directly by a <xnt href="http://www.w3.org/TR/REC-xml#NT-Name">Name</xnt>, the <xnt href="http://www.w3.org/TR/REC-xml#NT-Name">Name</xnt> is shorthand for the XPointer"<code>id(Name)</code>"; that is, the sub-resource is the element in the containing resource that has an XML <xtermref href="http://www.w3.org/TR/REC-xml#sec-attrtypes">ID attribute</xtermref> whose value <xtermref href="http://www.w3.org/TR/REC-xml#dt-match">matches</xtermref> the <xnt href="http://www.w3.org/TR/REC-xml#NT-Name">Name</xnt>. This shorthand is to encourage use of the robust <code>id</code> addressing mode.</p>
+ </item>
+ <!-- fixed links to the XML recommendation - bent -->
+ <item>
+ <p>If the connector is "<code>#</code>", this signals an intent that the containing resource is to be fetched as a whole from the host that provides it, and that the XPointer processing to extract the sub-resource
+ is to be performed on the client, that is to say on the same system where the linking element is recognized and processed.</p>
+ </item>
+ <item>
+ <p>If the connector is "<code>|</code>", no intent is signaled as to what processing model is to be used to go about accessing the designated resource.</p>
+ </item>
+ </ulist>
</p>
-<p>Note that the definition of a URI includes an optional query component. </p>
-<p>In the case where the URI contains a query (to be interpreted by the server), information providers and authors of server software are urged to use queries as follows:
- <scrap id="querysyntax" lang="ebnf"><head>Query</head><prod id="nt-query"><lhs>Query</lhs><rhs>'XML-XPTR=' (<xnt href="http://www.w3.org/TR/WD-xptr"> XPointer</xnt> | <xnt href="http://www.w3.org/TR/REC-xml#NT-Name">Name</xnt>)</rhs></prod></scrap>
+ <p>Note that the definition of a URI includes an optional query component. </p>
+ <p>In the case where the URI contains a query (to be interpreted by the server), information providers and authors of server software are urged to use queries as follows:
+ <scrap id="querysyntax" lang="ebnf">
+ <head>Query</head>
+ <prod id="nt-query">
+ <lhs>Query</lhs><rhs>'XML-XPTR=' (<xnt href="http://www.w3.org/TR/WD-xptr"> XPointer</xnt> | <xnt href="http://www.w3.org/TR/REC-xml#NT-Name">Name</xnt>)</rhs>
+ </prod>
+ </scrap>
</p>
-<!-- fixed link to XML recommendation - bent -->
+ <!-- fixed link to XML recommendation - bent -->
</div1>
-<div1>
-<?Pub Dtl?>
-<head>Link Recognition</head>
-<p>The existence of a <termref def="dt-link">link</termref> is asserted by a <termref def="dt-linkel">linking element</termref>. Linking elements must be recognized reliably by application software in order to provide appropriate display and behavior. There are several ways link recognition could be accomplished: for example, reserving element type names, reserving attribute names, leaving the matter of recognition entirely up to stylesheets and application software, or using the XLink <xtermref href="http://www.w3.org/TR/REC-xml-names/">namespace</xtermref> to specify element names and attribute names that would be recognized by namespace and XLink-aware processors. Using element and attribute names within the XLink namespace provides a balance between giving users control of their own markup language design and keeping the identification of linking elements simple and unambiguous.</p>
-<p>The two approaches to identifying linking elements are relatively simple to implement. For example, here's how the HTML <code>A</code> element would be declared using attributes within the XLink namespace, and then how an element within the XLink namespace might do the same:
+
+<div1><?Pub Dtl?>
+ <head>Link Recognition</head>
+ <p>The existence of a <termref def="dt-link">link</termref> is asserted by a <termref def="dt-linkel">linking element</termref>. Linking elements must be recognized reliably by application software in order to provide appropriate display and behavior. There are several ways link recognition could be accomplished: for example, reserving element type names, reserving attribute names, leaving the matter of recognition entirely up to stylesheets and application software, or using the XLink <xtermref href="http://www.w3.org/TR/REC-xml-names/">namespace</xtermref> to specify element names and attribute names that would be recognized by namespace and XLink-aware processors. Using element and attribute names within the XLink namespace provides a balance between giving users control of their own markup language design and keeping the identification of linking elements simple and unambiguous.</p>
+ <p>The two approaches to identifying linking elements are relatively simple to implement. For example, here's how the HTML <code>A</code> element would be declared using attributes within the XLink namespace, and then how an element within the XLink namespace might do the same:
<eg><A xlink:type="simple" xlink:href="http://www.w3.org/TR/wd-xlink/"
xlink:title="The Xlink Working Draft">The XLink Working Draft.</A></eg>
<eg><xlink:simple href="http://www.w3.org/TR/wd-xlink/"
title="The XLink Working Draft">The XLink Working Draft</xlink:simple></eg>
Any arbitrary element can be made into an XLink by using the <code>xlink:type</code> attribute. And, of course, the explicit XLink elements may be used, as well. This document will go on to describe the linking attributes that are associated with linking elements. It may be assumed by the reader that these attributes would require the <code>xlink</code> namespace prefix if they existed within an arbitrary element, or that they may be used directly if they exist within an explicit Xlink element.</p>
-<!-- heavily modified this section to accommodate namespace-aware link recognition - bent -->
+ <!-- heavily modified this section to accommodate namespace-aware link recognition - bent -->
</div1>
+
<!-- Rewrote this entire section. - bent -->
<div1>
-<head>Linking Attributes</head>
-<p>XLink has several attributes associated with the variety of links it may represent. These attributes define four main concepts: locators, arcs, behaviors, and semantics. <emph>Locators</emph> define where the actual resource is located. <emph>Arcs</emph> define the traversal of links. Where does the link come from? Where does it go to? All this information can be stored in the arc attributes. <emph>Behaviors</emph> define how the link is activated, and what the application should do with the resource being linked to. <emph>Semantics</emph> define useful information that the application may use, and enable the link for such specialized targets as constricted devices and accessibility software.</p>
-<div2 id="link-locators">
-<head>Locator Attributes</head>
-<p>The only locator attribute at this time is <code>href</code>. This attribute must contain either a string in the form of a URI that defines the remote resource being linked to, a string containing a fragment identifier that links to a local resource, or a string containing a URI with a fragment identifier concatenated onto it.</p>
-</div2>
-<div2 id="link-arcs">
-<head>Arc Attributes</head>
-<p>Arcs contain two attributes, <code>from</code> and <code>to</code>. The <code>from</code> attribute may contain a string containing the content of a <code>role</code> attribute from the resource being linked from. The purpose of the <code>from</code> attribute is to define where this link is being actuated from.</p>
-<p>The <code>to</code> attribute may contain a string containing the content of a <code>role</code> attribute from the resource being linked to. The purpose of the <code>to</code> attribute is to define where this link traverses to.</p>
-<p>The application may use this information in a number of ways, especially in a complex hypertext system, but it is mainly useful in providing context for application behavior.</p>
-<!-- I'm at a loss as to how to describe arcs more clearly than this. I don't want to devolve into discussions of directed graphs and n-ary links. -bent -->
-</div2>
-<div2 id="link-behaviors">
-<head>Behavior Attributes</head>
-<p>There are two attributes associated with behavior: <code>show</code> and <code>actuate</code>. The <code>show</code> attribute defines how the remote resource is to be revealed to the user. It has three options: <code>new</code>, <code>parsed</code>, and <code>replace</code>. The <code>new</code> option indicates that the remote resource should be shown in a new window (or other device context) without replacing the previous content. The <code>parsed</code> option, relating directly to the XML concept of a parsed entity, indicates that the content should be integrated into the document from which the link was actuated. The <code>replace</code> option is the one most commonly seen on the World Wide Web, where the document being linked from is entirely replaced by the object being linked to.</p>
-<p>The <code>actuate</code> attribute defines how the link is initiated. It has two options: <code>user</code> and <code>auto</code>. The <code>user</code> option indicates that the link must be initiated by some sort of human-initiated selection, such as clicking on an HTML anchor. The <code>auto</code> option indicates that the link is automatically initiated when the application deems that the user has reached the link. It then follows the behavior set out in the <code>show</code> option.</p>
-<!-- Something should be put here in terms of an example. Idea: "A" link versus automatically updating encyclopedia. -bent -->
-</div2>
-<div2 id="link-semantics">
-<head>Semantic Attributes</head>
-<p>There are two attributes associated with semantics, <code>role</code> and <code>title</code>. The <code>role</code> attribute is a generic string used to describe the function of the link's content. For example, a poem might have a link with a <code>role="stanza"</code>. The <code>role</code> is also used as an identifier for the <code>from</code> and <code>to</code> attributes of arcs.</p>
-<p>The <code>title</code> attribute is designed to provide human-readable text describing the link. It is very useful for those who have text-based applications, whether that be due to a constricted device that cannot display the link's content, or if it's being read by an application to a visually-impaired user, or if it's being used to create a table of links. The <code>title</code> attribute contains a simple, descriptive string.</p>
-</div2>
+ <head>Linking Attributes</head>
+ <p>XLink has several attributes associated with the variety of links it may represent. These attributes define four main concepts: locators, arcs, behaviors, and semantics. <emph>Locators</emph> define where the actual resource is located. <emph>Arcs</emph> define the traversal of links. Where does the link come from? Where does it go to? All this information can be stored in the arc attributes. <emph>Behaviors</emph> define how the link is activated, and what the application should do with the resource being linked to. <emph>Semantics</emph> define useful information that the application may use, and enable the link for such specialized targets as constricted devices and accessibility software.</p>
+
+ <div2 id="link-locators">
+ <head>Locator Attributes</head>
+ <p>The only locator attribute at this time is <code>href</code>. This attribute must contain either a string in the form of a URI that defines the remote resource being linked to, a string containing a fragment identifier that links to a local resource, or a string containing a URI with a fragment identifier concatenated onto it.</p>
+ </div2>
+
+ <div2 id="link-arcs">
+ <head>Arc Attributes</head>
+ <p>Arcs contain two attributes, <code>from</code> and <code>to</code>. The <code>from</code> attribute may contain a string containing the content of a <code>role</code> attribute from the resource being linked from. The purpose of the <code>from</code> attribute is to define where this link is being actuated from.</p>
+ <p>The <code>to</code> attribute may contain a string containing the content of a <code>role</code> attribute from the resource being linked to. The purpose of the <code>to</code> attribute is to define where this link traverses to.</p>
+ <p>The application may use this information in a number of ways, especially in a complex hypertext system, but it is mainly useful in providing context for application behavior.</p>
+ <!-- I'm at a loss as to how to describe arcs more clearly than this. I don't want to devolve into discussions of directed graphs and n-ary links. -bent -->
+ </div2>
+
+ <div2 id="link-behaviors">
+ <head>Behavior Attributes</head>
+ <p>There are two attributes associated with behavior: <code>show</code> and <code>actuate</code>. The <code>show</code> attribute defines how the remote resource is to be revealed to the user. It has three options: <code>new</code>, <code>parsed</code>, and <code>replace</code>. The <code>new</code> option indicates that the remote resource should be shown in a new window (or other device context) without replacing the previous content. The <code>parsed</code> option, relating directly to the XML concept of a parsed entity, indicates that the content should be integrated into the document from which the link was actuated. The <code>replace</code> option is the one most commonly seen on the World Wide Web, where the document being linked from is entirely replaced by the object being linked to.</p>
+ <p>The <code>actuate</code> attribute defines how the link is initiated. It has two options: <code>user</code> and <code>auto</code>. The <code>user</code> option indicates that the link must be initiated by some sort of human-initiated selection, such as clicking on an HTML anchor. The <code>auto</code> option indicates that the link is automatically initiated when the application deems that the user has reached the link. It then follows the behavior set out in the <code>show</code> option.</p>
+ <!-- Something should be put here in terms of an example. Idea: "A" link versus automatically updating encyclopedia. -bent -->
+ </div2>
+
+ <div2 id="link-semantics">
+ <head>Semantic Attributes</head>
+ <p>There are two attributes associated with semantics, <code>role</code> and <code>title</code>. The <code>role</code> attribute is a generic string used to describe the function of the link's content. For example, a poem might have a link with a <code>role="stanza"</code>. The <code>role</code> is also used as an identifier for the <code>from</code> and <code>to</code> attributes of arcs.</p>
+ <p>The <code>title</code> attribute is designed to provide human-readable text describing the link. It is very useful for those who have text-based applications, whether that be due to a constricted device that cannot display the link's content, or if it's being read by an application to a visually-impaired user, or if it's being used to create a table of links. The <code>title</code> attribute contains a simple, descriptive string.</p>
+ </div2>
</div1>
+
<div1 id="linking-elements">
-<head>Linking Elements</head>
-<p>There are several kinds of linking elements in XLink: <code>simple</code> links, <code>locators</code>, <code>arcs</code>, and <code>extended</code> links. These elements may be instantiated via element declarations from the XLink namespace, or they may be instantiated via attribute declarations from the XLink namespace. Both kinds of instantiation are described in the definition of each linking element.</p>
-<p>The <code>simple</code> link is used to declare a link that approximates the functionality of the HTML <code>A</code> element. It has, however, a few added features to increase its value, including the potential declaration of semantics and behavior. The <code>locator</code> elements are used to define the resource being linked to. Some links may contain multiple locators, representing a choice of potential links to be traversed. The <code>arcs</code> are used to define the traversal semantics of the link. Finally, an <code>extended</code> linking element differs from a simple link in that it can connect any number of resources, not just one local resource (optionally) and one remote resource, and in that extended links are more often out-of-line than simple links.</p>
+ <head>Linking Elements</head>
+ <p>There are several kinds of linking elements in XLink: <code>simple</code> links, <code>locators</code>, <code>arcs</code>, and <code>extended</code> links. These elements may be instantiated via element declarations from the XLink namespace, or they may be instantiated via attribute declarations from the XLink namespace. Both kinds of instantiation are described in the definition of each linking element.</p>
+ <p>The <code>simple</code> link is used to declare a link that approximates the functionality of the HTML <code>A</code> element. It has, however, a few added features to increase its value, including the potential declaration of semantics and behavior. The <code>locator</code> elements are used to define the resource being linked to. Some links may contain multiple locators, representing a choice of potential links to be traversed. The <code>arcs</code> are used to define the traversal semantics of the link. Finally, an <code>extended</code> linking element differs from a simple link in that it can connect any number of resources, not just one local resource (optionally) and one remote resource, and in that extended links are more often out-of-line than simple links.</p>
+
<div2 id="simple-links">
-<head>Simple Links</head>
-<p id="dt-simplelink"><termdef id="dt-simpleline" term="Simple Link"><term>Simple links</term> can be used for purposes that approximate the functionality of a basic HTML <code>A</code> link, but they can also support a limited amount of additional functionality. Simple links have only one locator and thus, for convenience, combine the functions of a linking element and a locator into a single element.</termdef> As a result of this combination, the simple linking element offers both a locator attribute and all the behavior and semantic attributes.</p>
-<p>The following are two examples of linking elements, each showing all the possible attributes that can be associated with a simple link. Here is the explicit XLink simple linking element.
+ <head>Simple Links</head>
+ <p id="dt-simplelink"><termdef id="dt-simpleline" term="Simple Link"><term>Simple links</term> can be used for purposes that approximate the functionality of a basic HTML <code>A</code> link, but they can also support a limited amount of additional functionality. Simple links have only one locator and thus, for convenience, combine the functions of a linking element and a locator into a single element.</termdef> As a result of this combination, the simple linking element offers both a locator attribute and all the behavior and semantic attributes.</p>
+ <p>The following are two examples of linking elements, each showing all the possible attributes that can be associated with a simple link. Here is the explicit XLink simple linking element.
<eg><!ELEMENT xlink:simple ANY>
<!ATTLIST xlink:slink
href CDATA #REQUIRED
@@ -248,13 +378,13 @@
Alternately, a simple link could be as terse as this:
<eg><foo xlink:href="#stanza1">The First Stanza.</foo></eg>
</p>
-<p>
+ <p>
There are no constraints on the contents of a simple linking element. In
the sample declaration above, it is given a content model of <code>ANY</code>
to illustrate that any content model or declared content is acceptable. In
a valid document, every element that is significant to XLink must still conform
to the constraints expressed in its governing DTD.</p>
-<p>Note that it is meaningful to have an out-of-line simple link, although
+ <p>Note that it is meaningful to have an out-of-line simple link, although
such links are uncommon. They are called "one-ended" and are typically used
to associate discrete semantic properties with locations. The properties might
be expressed by attributes on the link, the link's element type name, or in
@@ -262,19 +392,29 @@
Most out-of-line links are extended links, as these have a far wider range
of uses.</p>
</div2>
+
<div2 id="extended-link">
<head>Extended Links</head>
-<p>
-<termdef id="dt-extendedlink" term="Extended Link">An <term>extended link</term> differs from a simple link in that it can connect any number of resources, not just one local resource (optionally) and one remote resource, and in that extended links are more often out-of-line than simple links.</termdef>
-</p>
-<p>These additional capabilities of extended links are required for:
- <ulist><item><p>Enabling outgoing links in documents that cannot be modified to add an inline link</p></item><item><p>Creating links to and from resources in formats with no native support for embedded links (such as most multimedia formats)</p></item><item><p>Applying and filtering sets of relevant links on demand</p></item><item><p>Enabling other advanced hypermedia capabilities</p></item></ulist>
+ <p><termdef id="dt-extendedlink" term="Extended Link">An <term>extended link</term> differs from a simple link in that it can connect any number of resources, not just one local resource (optionally) and one remote resource, and in that extended links are more often out-of-line than simple links.</termdef></p>
+ <p>These additional capabilities of extended links are required for:
+ <ulist>
+ <item>
+ <p>Enabling outgoing links in documents that cannot be modified to add an inline link</p>
+ </item>
+ <item>
+ <p>Creating links to and from resources in formats with no native support for embedded links (such as most multimedia formats)</p>
+ </item>
+ <item>
+ <p>Applying and filtering sets of relevant links on demand</p>
+ </item>
+ <item><p>Enabling other advanced hypermedia capabilities</p></item>
+ </ulist>
</p>
-<p>Application software might be expected to provide traversal among all of a link's participating resources (subject to semantic constraints outside the scope of this specification) and to signal the fact that a given resource or sub-resource participates in one or more links when it is displayed (even though there is no markup at exactly that point to signal it).</p>
-<p>A linking element for an extended link contains a series of <xtermref href="http://www.w3.org/TR/REC-xml/#dt-parentchild">child elements</xtermref> that serve as locators and arcs. Because an extended link can have more than one remote resource, it separates out linking itself from the mechanisms used to locate each resource (whereas a simple link combines the two).</p>
-<p>The <code>xlink:type</code> attribute value for an extended link must be <code> extended</code>, if the link is being instantiated on an arbitrary element. Note that extended links introduce variants of the <code>show</code> and <code>actuate</code> behavior attributes. These attributes, the <code>showdefault</code> and <code>actuatedefault</code> define the same behavior as their counterparts. However, in this case, they are considered to define the default behavior for all the linking elements that they contain.</p>
-<p>However, when a linking element within an extended link has a <code>show</code> or <code>actuate</code> attribute of its own, that attribute overrides the defaults set on the extended linking element.</p>
-<p>The extended linking element itself retains those attributes relevant to the link as a whole, and to its local resource if any. Following are two sample declaration for an extended link. The first is an example of the explicit XLink extended link:
+ <p>Application software might be expected to provide traversal among all of a link's participating resources (subject to semantic constraints outside the scope of this specification) and to signal the fact that a given resource or sub-resource participates in one or more links when it is displayed (even though there is no markup at exactly that point to signal it).</p>
+ <p>A linking element for an extended link contains a series of <xtermref href="http://www.w3.org/TR/REC-xml/#dt-parentchild">child elements</xtermref> that serve as locators and arcs. Because an extended link can have more than one remote resource, it separates out linking itself from the mechanisms used to locate each resource (whereas a simple link combines the two).</p>
+ <p>The <code>xlink:type</code> attribute value for an extended link must be <code>extended</code> if the link is being instantiated on an arbitrary element. Note that extended links introduce variants of the <code>show</code> and <code>actuate</code> behavior attributes. These attributes, <code>showdefault</code> and <code>actuatedefault</code>, define the same behavior as their counterparts. However, in this case, they are considered to define the default behavior for all the linking elements that they contain.</p>
+ <p>However, when a linking element within an extended link has a <code>show</code> or <code>actuate</code> attribute of its own, that attribute overrides the defaults set on the extended linking element.</p>
+ <p>The extended linking element itself retains those attributes relevant to the link as a whole, and to its local resource if any. Following are two sample declarations for an extended link. The first is an example of the explicit XLink extended link:
<eg><!ELEMENT xlink:extended ((xlink:arc | xlink:locator)*)>
<!ATTLIST xlink:extended
@@ -301,20 +441,26 @@
<eg><foo xlink:type="extended" xlink:role="address book" xlink:title="Ben's Address Book" xlink:showdefault="replace" xlink:actuatedefault="user"> ... </foo></eg>
</p>
+
</div2>
+
<div2 id="xlink-arcs">
-<head>Arc Elements</head>
-<p><termdef id="dt-arc" term="Arc">An <term>arc</term> is contained within an extended link for the purpose of defining traversal behavior.</termdef> More than one arc may be associated with a link. Otherwise, arc elements function exactly as the arc attributes might lead on to expect.</p>
-<!-- More here? -bent -->
+ <head>Arc Elements</head>
+ <p><termdef id="dt-arc" term="Arc">An <term>arc</term> is contained within an extended link for the purpose of defining traversal behavior.</termdef> More than one arc may be associated with a link. Otherwise, arc elements function exactly as the arc attributes might lead one to expect.</p>
+ <!-- More here? -bent -->
</div2>
+
</div1>
<div1>
<head>Conformance</head>
-<p>An element conforms to XLink if: <olist><item><p>The element has an <code>xml:link</code> attribute whose value is
-one of the attribute values prescribed by this specification, and</p></item><item><p>the element and all of its attributes and content adhere to the
+<p>An element conforms to XLink if: <olist>
+<item><p>The element has an <code>xml:link</code> attribute whose value is
+one of the attribute values prescribed by this specification, and</p></item>
+<item><p>the element and all of its attributes and content adhere to the
syntactic
requirements imposed by the chosen <code>xml:link</code> attribute value,
-as prescribed in this specification.</p></item></olist></p>
+as prescribed in this specification.</p></item>
+</olist></p>
<p>Note that conformance is assessed at the level of individual elements,
rather than whole XML documents, because XLink and non-XLink linking mechanisms
may be used side by side in any one document.</p>
@@ -326,8 +472,7 @@
conformance language will have to address the different
levels of support. -elm--> </p>
</div1>
-</body>
-<back>
+</body><back>
<div1 id="unfinished">
<head>Unfinished Work</head>
<div2>
@@ -375,8 +520,6 @@
Reprinted in <titleref>Text Encoding Initiative: Background and
Context</titleref>,
ed. Nancy Ide and Jean ronis <!-- fix this name -->, ISBN 0-7923-3704-2. </bibl>
-</blist>
-</div1>
-</back>
-</spec>
+</blist></div1>
+</back></spec>
<?Pub *0000052575?>
diff --git a/result/valid/xlink.xml.err b/result/valid/xlink.xml.err
index 48c3b0c..c1358cd 100644
--- a/result/valid/xlink.xml.err
+++ b/result/valid/xlink.xml.err
@@ -1,6 +1,6 @@
-./test/valid/xlink.xml:450: validity error: ID dt-arc already defined
+./test/valid/xlink.xml:816: validity error: ID dt-arc already defined
<p><termdef id="dt-arc" term="Arc">An <term>arc</term> is contained within an
^
-./test/valid/xlink.xml:530: validity error: IDREF attribute def reference an unknown ID "dt-xlg"
+./test/valid/xlink.xml:956: validity error: IDREF attribute def reference an unknown ID "dt-xlg"
^