<?xml version="1.0" encoding="iso-8859-1"?> | |
<!DOCTYPE article [ | |
<!-- ELEMENT declarations work around MSXML bug. --> | |
<!ELEMENT section ANY> | |
<!ATTLIST section id ID #IMPLIED> | |
<!ELEMENT appendix ANY> | |
<!ATTLIST appendix id ID #IMPLIED> | |
<!ELEMENT bibliomixed ANY> | |
<!ATTLIST bibliomixed id ID #IMPLIED> | |
]> | |
<article status="Committee Specification" xmlns:p="http://relaxng.org/ns/proofsystem"> | |
<articleinfo> | |
<releaseinfo>$Id: spec.xml,v 1.159 2001/12/02 12:12:12 jjc Exp $</releaseinfo> | |
<title>RELAX NG Specification</title> | |
<authorgroup> | |
<editor> | |
<firstname>James</firstname><surname>Clark</surname> | |
<affiliation> | |
<address><email>jjc@jclark.com</email></address> | |
</affiliation> | |
</editor> | |
<editor> | |
<surname>MURATA</surname><firstname>Makoto</firstname> | |
<affiliation> | |
<address><email>EB2M-MRT@asahi-net.or.jp</email></address> | |
</affiliation> | |
</editor> | |
</authorgroup> | |
<pubdate>3 December 2001</pubdate> | |
<releaseinfo role="meta"> | |
$Id: spec.xml,v 1.159 2001/12/02 12:12:12 jjc Exp $ | |
</releaseinfo> | |
<copyright><year>2001</year><holder>OASIS</holder></copyright> | |
<legalnotice> | |
<para>Copyright © The Organization for the Advancement of | |
Structured Information Standards [OASIS] 2001. All Rights | |
Reserved.</para> | |
<para>This document and translations of it may be copied and furnished | |
to others, and derivative works that comment on or otherwise explain | |
it or assist in its implementation may be prepared, copied, published | |
and distributed, in whole or in part, without restriction of any kind, | |
provided that the above copyright notice and this paragraph are | |
included on all such copies and derivative works. However, this | |
document itself may not be modified in any way, such as by removing | |
the copyright notice or references to OASIS, except as needed for the | |
purpose of developing OASIS specifications, in which case the | |
procedures for copyrights defined in the OASIS Intellectual Property | |
Rights document must be followed, or as required to translate it into | |
languages other than English.</para> | |
<para>The limited permissions granted above are perpetual and will not | |
be revoked by OASIS or its successors or assigns.</para> | |
<para>This document and the information contained herein is provided | |
on an <quote>AS IS</quote> basis and OASIS DISCLAIMS ALL WARRANTIES, | |
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE | |
USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY | |
IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR | |
PURPOSE.</para> | |
</legalnotice> | |
<legalnotice role="status"><title>Status of this Document</title> | |
<para>This Committee Specification was approved for publication by the | |
OASIS RELAX NG technical committee. It is a stable document which | |
represents the consensus of the committee. Comments on this document | |
may be sent to <ulink | |
url="mailto:relax-ng-comment@lists.oasis-open.org" | |
>relax-ng-comment@lists.oasis-open.org</ulink>.</para> | |
<para>A list of known errors in this document is available at <ulink | |
url="http://www.oasis-open.org/committees/relax-ng/spec-20011203-errata.html" | |
>http://www.oasis-open.org/committees/relax-ng/spec-20011203-errata.html</ulink | |
>.</para> | |
</legalnotice> | |
<abstract> | |
<para>This is the definitive specification of RELAX NG, a simple | |
schema language for XML, based on <xref linkend="relax"/> and <xref | |
linkend="trex"/>. A RELAX NG schema specifies a pattern for the | |
structure and content of an XML document. A RELAX NG schema is itself | |
an XML document.</para> | |
</abstract> | |
<revhistory> | |
<revision> | |
<revnumber>Committee Specification</revnumber> | |
<date>3 December 2001</date> | |
</revision> | |
<revision> | |
<revnumber>Committee Specification</revnumber> | |
<date>11 August 2001</date> | |
</revision> | |
</revhistory> | |
</articleinfo> | |
<section> | |
<title>Introduction</title> | |
<para>This document specifies</para> | |
<itemizedlist> | |
<listitem><para>when an XML document is a correct RELAX NG | |
schema</para></listitem> | |
<listitem><para>when an XML document is valid with respect to a | |
correct RELAX NG schema</para></listitem> | |
</itemizedlist> | |
<para>An XML document that is being validated with respect to a RELAX NG | |
schema is referred to as an instance.</para> | |
<para>The structure of this document is as follows. <xref | |
linkend="data-model"/> describes the data model, which is the | |
abstraction of an XML document used throughout the rest of the | |
document. <xref linkend="full-syntax"/> describes the syntax of a | |
RELAX NG schema; any correct RELAX NG schema must conform to this | |
syntax. <xref linkend="simplification"/> describes a sequence of | |
transformations that are applied to simplify a RELAX NG schema; | |
applying the transformations also involves checking certain | |
restrictions that must be satisfied by a correct RELAX NG | |
schema. <xref linkend="simple-syntax"/> describes the syntax that | |
results from applying the transformations; this simple syntax is a | |
subset of the full syntax. <xref linkend="semantics"/> describes the | |
semantics of a correct RELAX NG schema that uses the simple syntax; | |
the semantics specify when an element is valid with respect to a RELAX | |
NG schema. <xref linkend="restriction"/> describes restrictions in | |
terms of the simple syntax; a correct RELAX NG schema must be such | |
that, after transformation into the simple form, it satisfies these | |
restrictions. Finally, <xref linkend="conformance"/> describes | |
conformance requirements for RELAX NG validators.</para> | |
<para>A tutorial is available separately (see <xref | |
linkend="tutorial"/>).</para> | |
</section> | |
<section id="data-model"> | |
<title>Data model</title> | |
<para>RELAX NG deals with XML documents representing both schemas and | |
instances through an abstract data model. XML documents representing | |
schemas and instances must be well-formed in conformance with <xref | |
linkend="xml-rec"/> and must conform to the constraints of <xref | |
linkend="xml-names"/>.</para> | |
<para>An XML document is represented by an element. An element consists | |
of</para> | |
<itemizedlist> | |
<listitem><para>a name</para></listitem> | |
<listitem><para>a context</para></listitem> | |
<listitem><para>a set of attributes</para></listitem> | |
<listitem><para>an ordered sequence of zero or more children; each | |
child is either an element or a non-empty string; the sequence never contains | |
two consecutive strings</para></listitem> | |
</itemizedlist> | |
<para>A name consists of</para> | |
<itemizedlist> | |
<listitem><para>a string representing the namespace URI; the empty | |
string has special significance, representing the absence of any | |
namespace</para></listitem> | |
<listitem><para>a string representing the local name; this string matches the NCName | |
production of <xref linkend="xml-names"/></para></listitem> | |
</itemizedlist> | |
<para>A context consists of</para> | |
<itemizedlist> | |
<listitem><para>a base URI</para></listitem> | |
<listitem><para>a namespace map; this maps prefixes to namespace URIs, | |
and also may specify a default namespace URI (as declared | |
by the <literal>xmlns</literal> attribute)</para></listitem> | |
</itemizedlist> | |
<para>An attribute consists of</para> | |
<itemizedlist> | |
<listitem><para>a name</para></listitem> | |
<listitem><para>a string representing the value</para></listitem> | |
</itemizedlist> | |
<para>A string consists of a sequence of zero or more characters, | |
where a character is as defined in <xref linkend="xml-rec"/>.</para> | |
<para>The element for an XML document is constructed from an instance | |
of the <xref linkend="infoset"/> as follows. We use the notation | |
[<replaceable>x</replaceable>] to refer to the value of the | |
<replaceable>x</replaceable> property of an information item. An | |
element is constructed from a document information item by | |
constructing an element from the [document element]. An element is | |
constructed from an element information item by constructing the name | |
from the [namespace name] and [local name], the context from the [base | |
URI] and [in-scope namespaces], the attributes from the [attributes], | |
and the children from the [children]. The attributes of an element | |
are constructed from the unordered set of attribute information items | |
by constructing an attribute for each attribute information item. The | |
children of an element are constructed from the list of child | |
information items first by removing information items other than | |
element information items and character information items, and then by | |
constructing an element for each element information item in the list | |
and a string for each maximal sequence of character information items. | |
An attribute is constructed from an attribute information item by | |
constructing the name from the [namespace name] and [local name], and | |
the value from the [normalized value]. When constructing the name of | |
an element or attribute from the [namespace name] and [local name], if | |
the [namespace name] property is not present, then the name is | |
constructed from an empty string and the [local name]. A string is | |
constructed from a sequence of character information items by | |
constructing a character from the [character code] of each character | |
information item.</para> | |
<para>It is possible for there to be multiple distinct infosets for a | |
single XML document. This is because XML parsers are not required to | |
process all DTD declarations or expand all external parsed general | |
entities. Amongst these multiple infosets, there is exactly one | |
infoset for which [all declarations processed] is true and which does | |
not contain any unexpanded entity reference information items. This | |
is the infoset that is the basis for defining the RELAX NG data | |
model.</para> | |
<section id="data-model-example"> | |
<title>Example</title> | |
<para>Suppose the document | |
<literal>http://www.example.com/doc.xml</literal> is as | |
follows:</para> | |
<programlisting><![CDATA[<?xml version="1.0"?>
<foo><pre1:bar1 xmlns:pre1="http://www.example.com/n1"/><pre2:bar2
  xmlns:pre2="http://www.example.com/n2"/></foo>
]]></programlisting>
<para>The element representing this document has</para> | |
<itemizedlist> | |
<listitem><para>a name which has</para> | |
<itemizedlist> | |
<listitem><para>the empty string as the namespace URI, representing | |
the absence of any namespace</para></listitem> | |
<listitem><para><literal>foo</literal> as the local | |
name</para></listitem> | |
</itemizedlist> | |
</listitem> | |
<listitem><para>a context which has</para> | |
<itemizedlist> | |
<listitem><para><literal>http://www.example.com/doc.xml</literal> as the base | |
URI</para></listitem> | |
<listitem><para>a namespace map which</para> | |
<itemizedlist> | |
<listitem><para>maps the prefix <literal>xml</literal> to the | |
namespace URI | |
<literal>http://www.w3.org/XML/1998/namespace</literal> | |
(the <literal>xml</literal> prefix is implicitly declared | |
by every XML document)</para></listitem> | |
<listitem><para>specifies the empty string as the default namespace | |
URI</para></listitem> | |
</itemizedlist> | |
</listitem> | |
</itemizedlist> | |
</listitem> | |
<listitem><para>an empty set of attributes</para></listitem> | |
<listitem><para>a sequence of children consisting | |
of an element which has</para> | |
<itemizedlist> | |
<listitem><para>a name which has</para> | |
<itemizedlist> | |
<listitem><para><literal>http://www.example.com/n1</literal> as the | |
namespace URI</para></listitem> | |
<listitem><para><literal>bar1</literal> as the local | |
name</para></listitem> | |
</itemizedlist> | |
</listitem> | |
<listitem><para>a context which has</para> | |
<itemizedlist> | |
<listitem><para><literal>http://www.example.com/doc.xml</literal> as the base | |
URI</para></listitem> | |
<listitem><para>a namespace map which</para> | |
<itemizedlist> | |
<listitem><para>maps the prefix <literal>pre1</literal> to the | |
namespace URI | |
<literal>http://www.example.com/n1</literal></para></listitem> | |
<listitem><para>maps the prefix <literal>xml</literal> to the | |
namespace URI | |
<literal>http://www.w3.org/XML/1998/namespace</literal></para></listitem> | |
<listitem><para>specifies the empty string as the default namespace | |
URI</para></listitem> | |
</itemizedlist> | |
</listitem> | |
</itemizedlist> | |
</listitem> | |
<listitem><para>an empty set of attributes</para></listitem> | |
<listitem><para>an empty sequence of children</para></listitem> | |
</itemizedlist> | |
<para>followed by an element which has</para> | |
<itemizedlist> | |
<listitem><para>a name which has</para> | |
<itemizedlist> | |
<listitem><para><literal>http://www.example.com/n2</literal> as the | |
namespace URI</para></listitem> | |
<listitem><para><literal>bar2</literal> as the local | |
name</para></listitem> | |
</itemizedlist> | |
</listitem> | |
<listitem><para>a context which has</para> | |
<itemizedlist> | |
<listitem><para><literal>http://www.example.com/doc.xml</literal> as the base | |
URI</para></listitem> | |
<listitem><para>a namespace map which</para> | |
<itemizedlist> | |
<listitem><para>maps the prefix <literal>pre2</literal> to the | |
namespace URI | |
<literal>http://www.example.com/n2</literal></para></listitem> | |
<listitem><para>maps the prefix <literal>xml</literal> to the | |
namespace URI | |
<literal>http://www.w3.org/XML/1998/namespace</literal></para></listitem> | |
<listitem><para>specifies the empty string as the default namespace | |
URI</para></listitem> | |
</itemizedlist> | |
</listitem> | |
</itemizedlist> | |
</listitem> | |
<listitem><para>an empty set of attributes</para></listitem> | |
<listitem><para>an empty sequence of children</para></listitem> | |
</itemizedlist> | |
</listitem> | |
</itemizedlist> | |
</section> | |
</section> | |
<section id="full-syntax"> | |
<title>Full syntax</title> | |
<para>The following grammar summarizes the syntax of RELAX NG. | |
Although we use a notation based on the XML representation of a RELAX
NG schema as a sequence of characters, the grammar must be understood | |
as operating at the data model level. For example, although the | |
syntax uses <literal><![CDATA[<text/>]]></literal>, an instance or | |
schema can use <literal><![CDATA[<text></text>]]></literal> instead, | |
because they both represent the same element at the data model level. | |
All elements shown in the grammar are qualified with the namespace | |
URI:</para> | |
<programlisting>http://relaxng.org/ns/structure/1.0</programlisting> | |
<para>The symbols QName and NCName are defined in <xref | |
linkend="xml-names"/>. The anyURI symbol has the same meaning as the | |
anyURI datatype of <xref linkend="xmlschema-2"/>: it indicates a | |
string that, after escaping of disallowed values as described in | |
Section 5.4 of <xref linkend="xlink"/>, is a URI reference as defined | |
in <xref linkend="rfc2396"/> (as modified by <xref | |
linkend="rfc2732"/>). The symbol string matches any string.</para> | |
<para>In addition to the attributes shown explicitly, any element can | |
have an <literal>ns</literal> attribute and any element can have a | |
<literal>datatypeLibrary</literal> attribute. The | |
<literal>ns</literal> attribute can have any value. The value of the | |
<literal>datatypeLibrary</literal> attribute must match the anyURI | |
symbol as described in the previous paragraph; in addition, it must | |
not use the relative form of URI reference and must not have a | |
fragment identifier; as an exception to this, the value may be the | |
empty string.</para> | |
<para>Any element can also have foreign attributes in addition to the | |
attributes shown in the grammar. A foreign attribute is an attribute | |
with a name whose namespace URI is neither the empty string nor the | |
RELAX NG namespace URI. Any element that cannot have string children | |
(that is, any element other than <literal>value</literal>, <literal>param</literal> | |
and <literal>name</literal>) may have foreign child elements in addition | |
to the child elements shown in the grammar. A foreign element is an | |
element with a name whose namespace URI is not the RELAX NG namespace | |
URI. There are no constraints on the relative position of foreign | |
child elements with respect to other child elements.</para> | |
<para>Any element can also have as children strings that consist | |
entirely of whitespace characters, where a whitespace character is one | |
of #x20, #x9, #xD or #xA. There are no constraints on the relative | |
position of whitespace string children with respect to child | |
elements.</para> | |
<para>Leading and trailing whitespace is allowed for the value of each
<literal>name</literal>, <literal>type</literal> and | |
<literal>combine</literal> attribute and for the content of each | |
<literal>name</literal> element.</para> | |
<grammarref src="full.rng"/> | |
<section id="full-syntax-example"> | |
<title>Example</title> | |
<para>Here is an example of a schema in the full syntax for the | |
document in <xref linkend="data-model-example"/>.</para> | |
<programlisting><![CDATA[<?xml version="1.0"?>
<element name="foo"
         xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:a="http://relaxng.org/ns/annotation/1.0"
         xmlns:ex1="http://www.example.com/n1"
         xmlns:ex2="http://www.example.com/n2">
  <a:documentation>A foo element.</a:documentation>
  <element name="ex1:bar1">
    <empty/>
  </element>
  <element name="ex2:bar2">
    <empty/>
  </element>
</element>]]></programlisting>
</section> | |
</section> | |
<section id="simplification"> | |
<title>Simplification</title> | |
<para>The full syntax given in the previous section is transformed | |
into a simpler syntax by applying the following transformation rules | |
in order. The effect must be as if each rule was applied to all | |
elements in the schema before the next rule is applied. A | |
transformation rule may also specify constraints that must be | |
satisfied by a correct schema. The transformation rules are applied | |
at the data model level. Before the transformations are applied, the | |
schema is parsed into an instance of the data model.</para> | |
<section> | |
<title>Annotations</title> | |
<para>Foreign attributes and elements are removed.</para> | |
<note><para>It is safe to remove <literal>xml:base</literal> | |
attributes at this stage because <literal>xml:base</literal> | |
attributes are used in determining the [base URI] of an element | |
information item, which is in turn used to construct the base URI of | |
the context of an element. Thus, after a document has been parsed | |
into an instance of the data model, <literal>xml:base</literal> | |
attributes can be discarded.</para></note> | |
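<para>As a non-normative illustration, consider the schema in <xref
linkend="full-syntax-example"/>. Its <literal>a:documentation</literal>
child is a foreign element, so after this step the schema would look
roughly as follows. The namespace declarations are kept only so that
the listing is well-formed; they are not attributes in the data
model.</para>
<programlisting><![CDATA[<element name="foo"
         xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:ex1="http://www.example.com/n1"
         xmlns:ex2="http://www.example.com/n2">
  <element name="ex1:bar1">
    <empty/>
  </element>
  <element name="ex2:bar2">
    <empty/>
  </element>
</element>]]></programlisting>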
</section> | |
<section> | |
<title>Whitespace</title> | |
<para>For each element other than <literal>value</literal> and | |
<literal>param</literal>, each child that is a string containing only | |
whitespace characters is removed.</para> | |
<para>Leading and trailing whitespace characters are removed from the | |
value of each <literal>name</literal>, <literal>type</literal> and | |
<literal>combine</literal> attribute and from the content of each | |
<literal>name</literal> element.</para> | |
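<para>The following sketch is illustrative only; the pattern and its
values are invented for this example. Given</para>
<programlisting><![CDATA[<element name=" status " xmlns="http://relaxng.org/ns/structure/1.0">
  <choice>
    <value> </value>
    <value>draft</value>
  </choice>
</element>]]></programlisting>
<para>the whitespace-only strings between child elements are removed
and the <literal>name</literal> attribute is trimmed, but the content
of the first <literal>value</literal> element (a single space) is
preserved:</para>
<programlisting><![CDATA[<element name="status" xmlns="http://relaxng.org/ns/structure/1.0"><choice><value> </value><value>draft</value></choice></element>]]></programlisting>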
</section> | |
<section> | |
<title><literal>datatypeLibrary</literal> attribute</title> | |
<para>The value of each <literal>datatypeLibrary</literal> attribute is
transformed by escaping disallowed characters as specified in Section | |
5.4 of <xref linkend="xlink"/>.</para> | |
<para>For any <literal>data</literal> or <literal>value</literal> | |
element that does not have a <literal>datatypeLibrary</literal> | |
attribute, a <literal>datatypeLibrary</literal> attribute is | |
added. The value of the added <literal>datatypeLibrary</literal> | |
attribute is the value of the <literal>datatypeLibrary</literal> | |
attribute of the nearest ancestor element that has a | |
<literal>datatypeLibrary</literal> attribute, or the empty string if | |
there is no such ancestor. Then, any <literal>datatypeLibrary</literal> | |
attribute that is on an element other than <literal>data</literal> or | |
<literal>value</literal> is removed.</para> | |
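<para>As a non-normative sketch (the element and attribute names are
invented for this example), the pattern</para>
<programlisting><![CDATA[<element name="price"
         xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <attribute name="currency">
    <value>EUR</value>
  </attribute>
  <data type="decimal"/>
</element>]]></programlisting>
<para>would become the following after this step: the attribute is
copied down to each <literal>data</literal> and
<literal>value</literal> element and removed from the
<literal>element</literal> element.</para>
<programlisting><![CDATA[<element name="price" xmlns="http://relaxng.org/ns/structure/1.0">
  <attribute name="currency">
    <value datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">EUR</value>
  </attribute>
  <data type="decimal"
        datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"/>
</element>]]></programlisting>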
</section> | |
<section> | |
<title><literal>type</literal> attribute of <literal>value</literal> element</title> | |
<para>For any <literal>value</literal> element that does not have a | |
<literal>type</literal> attribute, a <literal>type</literal> attribute | |
is added with value <literal>token</literal> and the value of the | |
<literal>datatypeLibrary</literal> attribute is changed to the empty | |
string.</para> | |
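<para>For instance, in this illustrative fragment (the value
<literal>EUR</literal> is invented for the example), a
<literal>value</literal> element that reaches this step as</para>
<programlisting><![CDATA[<value datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">EUR</value>]]></programlisting>
<para>would be rewritten as</para>
<programlisting><![CDATA[<value type="token" datatypeLibrary="">EUR</value>]]></programlisting>
<para>A <literal>value</literal> element that already has a
<literal>type</literal> attribute is left unchanged by this
step.</para>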
</section> | |
<section id="href"> | |
<title><literal>href</literal> attribute</title> | |
<para>The value of the <literal>href</literal> attribute on an | |
<literal>externalRef</literal> or <literal>include</literal> element | |
is first transformed by escaping disallowed characters as specified in | |
Section 5.4 of <xref linkend="xlink"/>. The URI reference is then | |
resolved into an absolute form as described in section 5.2 of <xref | |
linkend="rfc2396"/> using the base URI from the context of the element | |
that bears the <literal>href</literal> attribute.</para> | |
<para>The value of the <literal>href</literal> attribute will be used | |
to construct an element (as specified in <xref | |
linkend="data-model"/>). This must be done as follows. The URI | |
reference consists of the URI itself and an optional fragment | |
identifier. The resource identified by the URI is retrieved. The | |
result is a MIME entity: a sequence of bytes labeled with a MIME | |
media type. The media type determines how an element is constructed | |
from the MIME entity and optional fragment identifier. When the media | |
type is <literal>application/xml</literal> or | |
<literal>text/xml</literal>, the MIME entity must be parsed as an XML | |
document in accordance with the applicable RFC (at the time of writing
<xref linkend="rfc3023"/>) and an element constructed from the result | |
of the parse as specified in <xref linkend="data-model"/>. In | |
particular, the <literal>charset</literal> parameter must be handled | |
as specified by the RFC. This specification does not define the | |
handling of media types other than <literal>application/xml</literal> | |
and <literal>text/xml</literal>. The <literal>href</literal> attribute | |
must not include a fragment identifier unless the registration of the | |
media type of the resource identified by the attribute defines the | |
interpretation of fragment identifiers for that media type.</para> | |
<note><para><xref linkend="rfc3023"/> does not define the | |
interpretation of fragment identifiers for | |
<literal>application/xml</literal> or | |
<literal>text/xml</literal>.</para></note> | |
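<para>As a non-normative sketch of the first paragraph of this
subsection, assume that the base URI of the context is the
hypothetical location
<literal>http://www.example.com/schemas/main.rng</literal>. An element
written as</para>
<programlisting><![CDATA[<externalRef href="common parts/inline.rng"/>]]></programlisting>
<para>would first have the space escaped as <literal>%20</literal> and
would then have the relative reference resolved against the base URI,
giving</para>
<programlisting><![CDATA[<externalRef href="http://www.example.com/schemas/common%20parts/inline.rng"/>]]></programlisting>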
</section> | |
<section> | |
<title><literal>externalRef</literal> element</title> | |
<para>An <literal>externalRef</literal> element is transformed as | |
follows. An element is constructed using the URI reference that is | |
the value of the <literal>href</literal> attribute as specified in <xref
linkend="href"/>. This element must match the syntax for pattern. The | |
element is transformed by recursively applying the rules from this | |
subsection and from previous subsections of this section. This must | |
not result in a loop. In other words, the transformation of the | |
referenced element must not require the dereferencing of an | |
<literal>externalRef</literal> element with an
<literal>href</literal> attribute with the same value.</para> | |
<para>Any <literal>ns</literal> attribute on the | |
<literal>externalRef</literal> element is transferred to the | |
referenced element if the referenced element does not already have an | |
<literal>ns</literal> attribute. The <literal>externalRef</literal> | |
element is then replaced by the referenced element.</para> | |
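<para>The following sketch is illustrative only; the URI
<literal>http://www.example.com/schemas/inline.rng</literal> and its
content are assumptions made for this example. Suppose the referencing
schema is</para>
<programlisting><![CDATA[<element name="foo" xmlns="http://relaxng.org/ns/structure/1.0">
  <externalRef href="http://www.example.com/schemas/inline.rng"
               ns="http://www.example.com/n1"/>
</element>]]></programlisting>
<para>and the referenced resource contains</para>
<programlisting><![CDATA[<element name="bar1" xmlns="http://relaxng.org/ns/structure/1.0">
  <empty/>
</element>]]></programlisting>
<para>Because the referenced element has no <literal>ns</literal>
attribute of its own, it receives the one from the
<literal>externalRef</literal> element, and the result of this step
would be</para>
<programlisting><![CDATA[<element name="foo" xmlns="http://relaxng.org/ns/structure/1.0">
  <element name="bar1" ns="http://www.example.com/n1">
    <empty/>
  </element>
</element>]]></programlisting>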
</section> | |
<section> | |
<title><literal>include</literal> element</title> | |
<para>An <literal>include</literal> element is transformed as follows. | |
An element is constructed using the URI reference that is the value of the
<literal>href</literal> attribute as specified in <xref | |
linkend="href"/>. This element must be a <literal>grammar</literal> | |
element, matching the syntax for grammar.</para> | |
<para>This <literal>grammar</literal> element is transformed by | |
recursively applying the rules from this subsection and from previous | |
subsections of this section. This must not result in a loop. In other | |
words, the transformation of the <literal>grammar</literal> element | |
must not require the dereferencing of an <literal>include</literal> | |
element with an <literal>href</literal> attribute with the same
value.</para> | |
<para>Define the <firstterm>components</firstterm> of an element to | |
be the children of the element together with the components of any | |
<literal>div</literal> child elements. If the | |
<literal>include</literal> element has a <literal>start</literal> | |
component, then the <literal>grammar</literal> element must have a | |
<literal>start</literal> component. If the <literal>include</literal> | |
element has a <literal>start</literal> component, then all | |
<literal>start</literal> components are removed from the | |
<literal>grammar</literal> element. If the <literal>include</literal> | |
element has a <literal>define</literal> component, then the | |
<literal>grammar</literal> element must have a | |
<literal>define</literal> component with the same name. For every | |
<literal>define</literal> component of the <literal>include</literal> | |
element, all <literal>define</literal> components with the same name | |
are removed from the <literal>grammar</literal> element.</para> | |
<para>The <literal>include</literal> element is transformed into a | |
<literal>div</literal> element. The attributes of the | |
<literal>div</literal> element are the attributes of the | |
<literal>include</literal> element other than the | |
<literal>href</literal> attribute. The children of the | |
<literal>div</literal> element are the <literal>grammar</literal> | |
element (after the removal of the <literal>start</literal> and | |
<literal>define</literal> components described by the preceding | |
paragraph) followed by the children of the <literal>include</literal> | |
element. The <literal>grammar</literal> element is then renamed to | |
<literal>div</literal>.</para> | |
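<para>As a non-normative sketch, assume a hypothetical resource
<literal>http://www.example.com/schemas/base.rng</literal> containing</para>
<programlisting><![CDATA[<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element name="doc"><ref name="content"/></element>
  </start>
  <define name="content"><empty/></define>
</grammar>]]></programlisting>
<para>and an including grammar that overrides the
<literal>content</literal> definition:</para>
<programlisting><![CDATA[<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <include href="http://www.example.com/schemas/base.rng">
    <define name="content"><text/></define>
  </include>
</grammar>]]></programlisting>
<para>The <literal>define</literal> named <literal>content</literal>
is removed from the included grammar, and the result of this step
would be</para>
<programlisting><![CDATA[<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <div>
    <div>
      <start>
        <element name="doc"><ref name="content"/></element>
      </start>
    </div>
    <define name="content"><text/></define>
  </div>
</grammar>]]></programlisting>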
</section> | |
<section> | |
<title><literal>name</literal> attribute of <literal>element</literal> | |
and <literal>attribute</literal> elements</title> | |
<para>The <literal>name</literal> attribute on an | |
<literal>element</literal> or <literal>attribute</literal> element is | |
transformed into a <literal>name</literal> child element.</para> | |
<para>If an <literal>attribute</literal> element has a | |
<literal>name</literal> attribute but no <literal>ns</literal> | |
attribute, then an <literal>ns=""</literal> attribute is added to the | |
<literal>name</literal> child element.</para> | |
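<para>For example, in this illustrative fragment (the names are
invented for the example),</para>
<programlisting><![CDATA[<element name="foo" xmlns="http://relaxng.org/ns/structure/1.0">
  <attribute name="id"/>
</element>]]></programlisting>
<para>would be transformed to</para>
<programlisting><![CDATA[<element xmlns="http://relaxng.org/ns/structure/1.0">
  <name>foo</name>
  <attribute>
    <name ns="">id</name>
  </attribute>
</element>]]></programlisting>
<para>The explicit <literal>ns=""</literal> keeps the attribute name
from inheriting a namespace URI from an ancestor in the following
step.</para>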
</section> | |
<section> | |
<title><literal>ns</literal> attribute</title> | |
<para>For any <literal>name</literal>, <literal>nsName</literal> or | |
<literal>value</literal> element that does not have an | |
<literal>ns</literal> attribute, an <literal>ns</literal> attribute is | |
added. The value of the added <literal>ns</literal> attribute is the | |
value of the <literal>ns</literal> attribute of the nearest ancestor | |
element that has an <literal>ns</literal> attribute, or the empty | |
string if there is no such ancestor. Then, any <literal>ns</literal> | |
attribute that is on an element other than <literal>name</literal>, | |
<literal>nsName</literal> or <literal>value</literal> is | |
removed.</para> | |
<note><para>The value of the <literal>ns</literal> attribute is | |
<emphasis role="strong">not</emphasis> transformed either by escaping | |
disallowed characters, or in any other way, because the value of the | |
<literal>ns</literal> attribute is compared against namespace URIs in | |
the instance, which are not subject to any | |
transformation.</para></note> | |
<note><para>Since <literal>include</literal> and | |
<literal>externalRef</literal> elements are resolved after | |
<literal>datatypeLibrary</literal> attributes are added but before | |
<literal>ns</literal> attributes are added, <literal>ns</literal> | |
attributes are inherited into external schemas but | |
<literal>datatypeLibrary</literal> attributes are not.</para></note> | |
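<para>As a non-normative sketch (the names are invented for the
example), a pattern that reaches this step as</para>
<programlisting><![CDATA[<element ns="http://www.example.com/n1"
         xmlns="http://relaxng.org/ns/structure/1.0">
  <name>bar1</name>
  <attribute>
    <name ns="">id</name>
  </attribute>
</element>]]></programlisting>
<para>would be transformed to</para>
<programlisting><![CDATA[<element xmlns="http://relaxng.org/ns/structure/1.0">
  <name ns="http://www.example.com/n1">bar1</name>
  <attribute>
    <name ns="">id</name>
  </attribute>
</element>]]></programlisting>
<para>The element name inherits the namespace URI from its parent,
whereas the attribute name keeps the <literal>ns</literal> attribute
it already had.</para>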
</section> | |
<section> | |
<title>QNames</title> | |
<para>For any <literal>name</literal> element containing a prefix, the | |
prefix is removed and an <literal>ns</literal> attribute is added | |
replacing any existing <literal>ns</literal> attribute. The value of | |
the added <literal>ns</literal> attribute is the value to which the | |
namespace map of the context of the <literal>name</literal> element | |
maps the prefix. The context must have a mapping for the | |
prefix.</para> | |
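<para>For example, in this illustrative fragment, the context of the
<literal>name</literal> element maps the prefix
<literal>ex1</literal> to
<literal>http://www.example.com/n1</literal>, so</para>
<programlisting><![CDATA[<element xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:ex1="http://www.example.com/n1">
  <name ns="">ex1:bar1</name>
  <empty/>
</element>]]></programlisting>
<para>would be transformed to</para>
<programlisting><![CDATA[<element xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:ex1="http://www.example.com/n1">
  <name ns="http://www.example.com/n1">bar1</name>
  <empty/>
</element>]]></programlisting>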
</section> | |
<section> | |
<title><literal>div</literal> element</title> | |
<para>Each <literal>div</literal> element is replaced by its | |
children.</para> | |
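<para>As a non-normative sketch (the definition name
<literal>doc</literal> is invented for the example), the grammar</para>
<programlisting><![CDATA[<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <div>
    <start><ref name="doc"/></start>
    <div>
      <define name="doc">
        <element><name ns="">doc</name><text/></element>
      </define>
    </div>
  </div>
</grammar>]]></programlisting>
<para>would be flattened to</para>
<programlisting><![CDATA[<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start><ref name="doc"/></start>
  <define name="doc">
    <element><name ns="">doc</name><text/></element>
  </define>
</grammar>]]></programlisting>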
</section> | |
<section id="number-child-elements"> | |
<title>Number of child elements</title> | |
<para>A <literal>define</literal>, <literal>oneOrMore</literal>, | |
<literal>zeroOrMore</literal>, <literal>optional</literal>, <literal>list</literal> or | |
<literal>mixed</literal> element is transformed so that it has exactly | |
one child element. If it has more than one child element, then its | |
child elements are wrapped in a <literal>group</literal> | |
element. Similarly, an <literal>element</literal> element is transformed so | |
that it has exactly two child elements, the first being a name class | |
and the second being a pattern. If it has more than two child elements, | |
then the child elements other than the first are wrapped in a | |
<literal>group</literal> element.</para> | |
<para>An <literal>except</literal> element is transformed
so that it has exactly one child element. If it has more | |
than one child element, then its child elements are wrapped | |
in a <literal>choice</literal> element.</para> | |
<para>If an <literal>attribute</literal> element has only one child | |
element (a name class), then a <literal>text</literal> element is | |
added.</para> | |
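<para>The following sketch is illustrative only; the names are
invented for the example. An <literal>element</literal> element with a
name class and two further children, one of which is an
<literal>attribute</literal> element with only a name class,</para>
<programlisting><![CDATA[<element xmlns="http://relaxng.org/ns/structure/1.0">
  <name ns="">card</name>
  <attribute><name ns="">id</name></attribute>
  <element><name ns="">title</name><text/></element>
</element>]]></programlisting>
<para>would be transformed to</para>
<programlisting><![CDATA[<element xmlns="http://relaxng.org/ns/structure/1.0">
  <name ns="">card</name>
  <group>
    <attribute><name ns="">id</name><text/></attribute>
    <element><name ns="">title</name><text/></element>
  </group>
</element>]]></programlisting>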
<para>A <literal>choice</literal>, <literal>group</literal> or | |
<literal>interleave</literal> element is transformed so that it has | |
exactly two child elements. If it has one child element, then it is | |
replaced by its child element. If it has more than two child | |
elements, then the first two child elements are combined into a new | |
element with the same name as the parent element and with the first | |
two child elements as its children. For example,</para> | |
<programlisting>&lt;choice&gt; <replaceable>p1</replaceable> <replaceable>p2</replaceable> <replaceable>p3</replaceable> &lt;/choice&gt;</programlisting>
<para>is transformed to</para>
<programlisting>&lt;choice&gt; &lt;choice&gt; <replaceable>p1</replaceable> <replaceable>p2</replaceable> &lt;/choice&gt; <replaceable>p3</replaceable> &lt;/choice&gt;</programlisting>
<para>This reduces the number of child elements by one. The | |
transformation is applied repeatedly until there are exactly two child | |
elements.</para> | |
</section> | |
<section> | |
<title><literal>mixed</literal> element</title> | |
<para>A <literal>mixed</literal> element is transformed into an | |
interleaving with a <literal>text</literal> element:</para> | |
<programlisting>&lt;mixed&gt; <replaceable>p</replaceable> &lt;/mixed&gt;</programlisting>
<para>is transformed into</para> | |
<programlisting>&lt;interleave&gt; <replaceable>p</replaceable> &lt;text/&gt; &lt;/interleave&gt;</programlisting>
</section> | |
<section> | |
<title><literal>optional</literal> element</title> | |
<para>An <literal>optional</literal> element is transformed into | |
a choice with <literal>empty</literal>:</para> | |
<programlisting>&lt;optional&gt; <replaceable>p</replaceable> &lt;/optional&gt;</programlisting>
<para>is transformed into</para> | |
<programlisting>&lt;choice&gt; <replaceable>p</replaceable> &lt;empty/&gt; &lt;/choice&gt;</programlisting>
</section> | |
<section> | |
<title><literal>zeroOrMore</literal> element</title> | |
<para>A <literal>zeroOrMore</literal> element is transformed into a choice | |
between <literal>oneOrMore</literal> and | |
<literal>empty</literal>:</para> | |
<programlisting>&lt;zeroOrMore&gt; <replaceable>p</replaceable> &lt;/zeroOrMore&gt;</programlisting>
<para>is transformed into</para> | |
<programlisting>&lt;choice&gt; &lt;oneOrMore&gt; <replaceable>p</replaceable> &lt;/oneOrMore&gt; &lt;empty/&gt; &lt;/choice&gt;</programlisting>
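<para>As an illustrative sketch of how this rule composes with the
rule for <literal>optional</literal> in the preceding section
(the definition name <literal>item</literal> is hypothetical), the
pattern</para>
<programlisting><![CDATA[<optional>
  <zeroOrMore>
    <ref name="item"/>
  </zeroOrMore>
</optional>]]></programlisting>
<para>would, once both rules have been applied, take the form</para>
<programlisting><![CDATA[<choice>
  <choice>
    <oneOrMore>
      <ref name="item"/>
    </oneOrMore>
    <empty/>
  </choice>
  <empty/>
</choice>]]></programlisting>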
</section> | |
<section id="constraints"> | |
<title>Constraints</title> | |
<para>In this rule, no transformation is performed, but various | |
constraints are checked.</para> | |
<note><para>The constraints in this section, unlike the constraints | |
specified in <xref linkend="restriction"/>, can be checked without | |
resolving any <literal>ref</literal> elements, and are accordingly | |
applied even to patterns that will disappear during later stages of | |
simplification because they are not reachable (see <xref | |
linkend="define-ref"/>) or because of <literal>notAllowed</literal> | |
(see <xref linkend="notAllowed"/>).</para></note> | |
<para>An <literal>except</literal> element that is a child of an | |
<literal>anyName</literal> element must not have any | |
<literal>anyName</literal> descendant elements. An | |
<literal>except</literal> element that is a child of an | |
<literal>nsName</literal> element must not have any | |
<literal>nsName</literal> or <literal>anyName</literal> descendant | |
elements.</para> | |
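<para>For instance, under the first of these constraints a name class
such as the following would not be allowed, because the
<literal>except</literal> child of the outer
<literal>anyName</literal> element has an <literal>anyName</literal>
descendant (the namespace URI is only an illustrative
placeholder):</para>
<programlisting><![CDATA[<anyName>
  <except>
    <choice>
      <nsName ns="http://www.example.com/n1"/>
      <anyName/>
    </choice>
  </except>
</anyName>]]></programlisting>
<para>whereas the corresponding name class without the inner
<literal>anyName</literal> element would satisfy the
constraint:</para>
<programlisting><![CDATA[<anyName>
  <except>
    <nsName ns="http://www.example.com/n1"/>
  </except>
</anyName>]]></programlisting>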
<para>A <literal>name</literal> element that occurs as the first child | |
of an <literal>attribute</literal> element or as the descendant of the | |
first child of an <literal>attribute</literal> element and that has an | |
<literal>ns</literal> attribute with value equal to the empty string | |
must not have content equal to <literal>xmlns</literal>.</para> | |
<para>A <literal>name</literal> or <literal>nsName</literal> element | |
that occurs as the first child of an <literal>attribute</literal> | |
element or as the descendant of the first child of an | |
<literal>attribute</literal> element must not have an | |
<literal>ns</literal> attribute with value | |
<literal>http://www.w3.org/2000/xmlns</literal>.</para> | |
<note><para>The <xref linkend="infoset"/> defines the namespace URI of | |
namespace declaration attributes to be | |
<literal>http://www.w3.org/2000/xmlns</literal>.</para></note> | |
<para>A <literal>data</literal> or <literal>value</literal> element | |
must be correct in its use of datatypes. Specifically, the | |
<literal>type</literal> attribute must identify a datatype within the | |
datatype library identified by the value of the | |
<literal>datatypeLibrary</literal> attribute. For a | |
<literal>data</literal> element, the parameter list must be one that | |
is allowed by the datatype (see <xref | |
linkend="data-pattern"/>).</para> | |
</section> | |
<section> | |
<title><literal>combine</literal> attribute</title> | |
<para>For each <literal>grammar</literal> element, all | |
<literal>define</literal> elements with the same name are combined | |
together. For any name, there must not be more than one | |
<literal>define</literal> element with that name that does not have a | |
<literal>combine</literal> attribute. For any name, if there is a | |
<literal>define</literal> element with that name that has a | |
<literal>combine</literal> attribute with the value | |
<literal>choice</literal>, then there must not also be a | |
<literal>define</literal> element with that name that has a | |
<literal>combine</literal> attribute with the value | |
<literal>interleave</literal>. Thus, for any name, if there is more | |
than one <literal>define</literal> element with that name, then there | |
is a unique value for the <literal>combine</literal> attribute for | |
that name. After determining this unique value, the | |
<literal>combine</literal> attributes are removed. A pair of | |
definitions</para> | |
<programlisting>&lt;define name="<replaceable>n</replaceable>"&gt;
  <replaceable>p1</replaceable>
&lt;/define&gt;
&lt;define name="<replaceable>n</replaceable>"&gt;
  <replaceable>p2</replaceable>
&lt;/define&gt;</programlisting>
<para>is combined into</para> | |
<programlisting>&lt;define name="<replaceable>n</replaceable>"&gt;
  &lt;<replaceable>c</replaceable>&gt;
    <replaceable>p1</replaceable>
    <replaceable>p2</replaceable>
  &lt;/<replaceable>c</replaceable>&gt;
&lt;/define&gt;</programlisting>
<para>where <replaceable>c</replaceable> is the value of the | |
<literal>combine</literal> attribute. Pairs of definitions are | |
combined until there is exactly one <literal>define</literal> element | |
for each name.</para> | |
<para>Similarly, for each <literal>grammar</literal> element all | |
<literal>start</literal> elements are combined together. There must | |
not be more than one <literal>start</literal> element that does not | |
have a <literal>combine</literal> attribute. If there is a | |
<literal>start</literal> element that has a <literal>combine</literal> | |
attribute with the value <literal>choice</literal>, there must not | |
also be a <literal>start</literal> element that has a | |
<literal>combine</literal> attribute with the value | |
<literal>interleave</literal>.</para> | |
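<para>As a concrete sketch of combining <literal>define</literal>
elements (the names <literal>inline</literal>, <literal>em</literal>
and <literal>code</literal> are hypothetical), consider a pair of
definitions in which only one carries a <literal>combine</literal>
attribute, so that the unique value for the name is
<literal>choice</literal>:</para>
<programlisting><![CDATA[<define name="inline" combine="choice">
  <ref name="em"/>
</define>
<define name="inline">
  <ref name="code"/>
</define>]]></programlisting>
<para>This pair would be combined into</para>
<programlisting><![CDATA[<define name="inline">
  <choice>
    <ref name="em"/>
    <ref name="code"/>
  </choice>
</define>]]></programlisting>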
</section> | |
<section> | |
<title><literal>grammar</literal> element</title> | |
<para>In this rule, the schema is transformed so that its top-level | |
element is <literal>grammar</literal> and so that it has no other | |
<literal>grammar</literal> elements.</para> | |
<para>Define the <firstterm>in-scope grammar</firstterm> for an | |
element to be the nearest ancestor <literal>grammar</literal> element. A | |
<literal>ref</literal> element <firstterm>refers to</firstterm> a | |
<literal>define</literal> element if the value of their | |
<literal>name</literal> attributes is the same and their in-scope | |
grammars are the same. A <literal>parentRef</literal> element | |
<firstterm>refers to</firstterm> a <literal>define</literal> element | |
if the value of their <literal>name</literal> attributes is the same | |
and the in-scope grammar of the in-scope grammar of the | |
<literal>parentRef</literal> element is the same as the in-scope | |
grammar of the <literal>define</literal> element. Every | |
<literal>ref</literal> or <literal>parentRef</literal> element must | |
refer to a <literal>define</literal> element. A | |
<literal>grammar</literal> must have a <literal>start</literal> child | |
element.</para> | |
<para>First, transform the top-level pattern | |
<replaceable>p</replaceable> into | |
<literal>&lt;grammar&gt;&lt;start&gt;<replaceable>p</replaceable>&lt;/start&gt;&lt;/grammar&gt;</literal>.
Next, rename <literal>define</literal> elements so that no two | |
<literal>define</literal> elements anywhere in the schema have the | |
same name. To rename a <literal>define</literal> element, change the | |
value of its <literal>name</literal> attribute and change the value of | |
the <literal>name</literal> attribute of all <literal>ref</literal> | |
and <literal>parentRef</literal> elements that refer to that | |
<literal>define</literal> element. Next, move all | |
<literal>define</literal> elements to be children of the top-level | |
<literal>grammar</literal> element, replace each nested | |
<literal>grammar</literal> element by the child of its | |
<literal>start</literal> element and rename each | |
<literal>parentRef</literal> element to <literal>ref</literal>.</para> | |
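<para>The following sketch illustrates these steps on a schema with one
nested <literal>grammar</literal> element. The element name
<literal>doc</literal> and the definition name <literal>body</literal>
are hypothetical, and the new name <literal>body.1</literal> is an
arbitrary choice, since this rule does not prescribe how new names are
generated. Before the transformation:</para>
<programlisting><![CDATA[<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element>
      <name ns="">doc</name>
      <grammar>
        <start>
          <ref name="body"/>
        </start>
        <define name="body">
          <parentRef name="body"/>
        </define>
      </grammar>
    </element>
  </start>
  <define name="body">
    <text/>
  </define>
</grammar>]]></programlisting>
<para>Here the <literal>ref</literal> element refers to the inner
definition and the <literal>parentRef</literal> element refers to the
outer one. After renaming the inner definition, moving it to the
top-level <literal>grammar</literal> element, replacing the nested
<literal>grammar</literal> element and renaming the
<literal>parentRef</literal> element, the result would be:</para>
<programlisting><![CDATA[<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element>
      <name ns="">doc</name>
      <ref name="body.1"/>
    </element>
  </start>
  <define name="body">
    <text/>
  </define>
  <define name="body.1">
    <ref name="body"/>
  </define>
</grammar>]]></programlisting>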
</section> | |
<section id="define-ref"> | |
<title><literal>define</literal> and <literal>ref</literal> elements</title> | |
<para>In this rule, the grammar is transformed so that every | |
<literal>element</literal> element is the child of a | |
<literal>define</literal> element, and the child of every | |
<literal>define</literal> element is an <literal>element</literal> | |
element.</para> | |
<para>First, remove any <literal>define</literal> element that is not | |
<firstterm>reachable</firstterm>. A <literal>define</literal> element | |
is reachable if there is a reachable <literal>ref</literal> element
referring to it. A <literal>ref</literal> element is reachable if it | |
is the descendant of the <literal>start</literal> element or of a | |
reachable <literal>define</literal> element. Now, for | |
each <literal>element</literal> element that is not the child of a | |
<literal>define</literal> element, add a <literal>define</literal> | |
element to the <literal>grammar</literal> element, and replace the | |
<literal>element</literal> element by a <literal>ref</literal> element | |
referring to the added <literal>define</literal> element. The value of | |
the <literal>name</literal> attribute of the added | |
<literal>define</literal> element must be different from the value of the
<literal>name</literal> attribute of all other | |
<literal>define</literal> elements. The child of the added | |
<literal>define</literal> element is the <literal>element</literal> | |
element.</para> | |
<para>Define a <literal>ref</literal> element to be | |
<firstterm>expandable</firstterm> if it refers to a | |
<literal>define</literal> element whose child is not an | |
<literal>element</literal> element. For each <literal>ref</literal> | |
element that is expandable and is a descendant of a | |
<literal>start</literal> element or an <literal>element</literal> | |
element, expand it by replacing the <literal>ref</literal> element by | |
the child of the <literal>define</literal> element to which it refers and | |
then recursively expanding any expandable <literal>ref</literal> | |
elements in this replacement. This must not result in a loop.
In other words, expanding the replacement of a
<literal>ref</literal> element having a <literal>name</literal> with
value <replaceable>n</replaceable> must not require the expansion of a
<literal>ref</literal> element also having a <literal>name</literal>
with value <replaceable>n</replaceable>. Finally, remove any
<literal>define</literal> element whose child is not an | |
<literal>element</literal> element.</para> | |
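<para>As a sketch of the whole rule, consider the following grammar.
All names in it are hypothetical, including
<literal>a.element</literal>, which stands for whatever unused name is
chosen for the added <literal>define</literal> element:</para>
<programlisting><![CDATA[<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <ref name="content"/>
  </start>
  <define name="content">
    <choice>
      <element>
        <name ns="">a</name>
        <empty/>
      </element>
      <ref name="more"/>
    </choice>
  </define>
  <define name="more">
    <element>
      <name ns="">b</name>
      <empty/>
    </element>
  </define>
  <define name="unused">
    <text/>
  </define>
</grammar>]]></programlisting>
<para>The definition <literal>unused</literal> is not reachable and is
removed. The element <literal>a</literal> is not the child of a
<literal>define</literal> element, so it is moved into an added
definition. The reference to <literal>content</literal> is expandable
and is replaced by the child of that definition, after which the
definition <literal>content</literal> is removed. The result would
be:</para>
<programlisting><![CDATA[<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <choice>
      <ref name="a.element"/>
      <ref name="more"/>
    </choice>
  </start>
  <define name="more">
    <element>
      <name ns="">b</name>
      <empty/>
    </element>
  </define>
  <define name="a.element">
    <element>
      <name ns="">a</name>
      <empty/>
    </element>
  </define>
</grammar>]]></programlisting>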
</section> | |
<section id="notAllowed"> | |
<title><literal>notAllowed</literal> element</title> | |
<para>In this rule, the grammar is transformed so that a | |
<literal>notAllowed</literal> element occurs only as the child of | |
a <literal>start</literal> or <literal>element</literal> element. An | |
<literal>attribute</literal>, <literal>list</literal>, | |
<literal>group</literal>, <literal>interleave</literal>, | |
or <literal>oneOrMore</literal> element that has a | |
<literal>notAllowed</literal> child element is transformed into a | |
<literal>notAllowed</literal> element. A <literal>choice</literal> | |
element that has two <literal>notAllowed</literal> child elements is | |
transformed into a <literal>notAllowed</literal> element. A | |
<literal>choice</literal> element that has one | |
<literal>notAllowed</literal> child element is transformed into its | |
other child element. An <literal>except</literal> element that has a | |
<literal>notAllowed</literal> child element is removed. | |
The preceding transformations are applied | |
repeatedly until none of them is applicable any more. | |
Any <literal>define</literal> element that is no longer reachable | |
is removed.</para> | |
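<para>As an illustration (the names <literal>a</literal> and
<literal>b.element</literal> are hypothetical), consider</para>
<programlisting><![CDATA[<element>
  <name ns="">a</name>
  <choice>
    <group>
      <ref name="b.element"/>
      <notAllowed/>
    </group>
    <text/>
  </choice>
</element>]]></programlisting>
<para>The <literal>group</literal> element has a
<literal>notAllowed</literal> child and so becomes
<literal>notAllowed</literal>; the <literal>choice</literal> element
then has one <literal>notAllowed</literal> child and becomes its other
child. The result would be</para>
<programlisting><![CDATA[<element>
  <name ns="">a</name>
  <text/>
</element>]]></programlisting>
<para>If no other reachable <literal>ref</literal> element refers to
the definition <literal>b.element</literal>, that definition is then
removed.</para>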
</section> | |
<section> | |
<title><literal>empty</literal> element</title> | |
<para>In this rule, the grammar is transformed so that an | |
<literal>empty</literal> element does not occur as a child of a | |
<literal>group</literal>, <literal>interleave</literal>, or | |
<literal>oneOrMore</literal> element or as the second child of | |
a <literal>choice</literal> element. A <literal>group</literal>, | |
<literal>interleave</literal> or <literal>choice</literal> element | |
that has two <literal>empty</literal> child elements is transformed | |
into an <literal>empty</literal> element. A <literal>group</literal> | |
or <literal>interleave</literal> element that has one | |
<literal>empty</literal> child element is transformed into its other | |
child element. A <literal>choice</literal> element whose | |
second child element is an <literal>empty</literal> element is | |
transformed by interchanging its two child elements. A | |
<literal>oneOrMore</literal> element that has an | |
<literal>empty</literal> child element is transformed into an | |
<literal>empty</literal> element. The preceding transformations are applied | |
repeatedly until none of them is applicable any more.</para> | |
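<para>As an illustration (the definition name <literal>a</literal> is
hypothetical), consider</para>
<programlisting><![CDATA[<choice>
  <group>
    <ref name="a"/>
    <empty/>
  </group>
  <empty/>
</choice>]]></programlisting>
<para>The <literal>group</literal> element has one
<literal>empty</literal> child and becomes its other child; the
<literal>choice</literal> element then has <literal>empty</literal> as
its second child, so its children are interchanged. The result would
be</para>
<programlisting><![CDATA[<choice>
  <empty/>
  <ref name="a"/>
</choice>]]></programlisting>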
</section> | |
</section> | |
<section id="simple-syntax"> | |
<title>Simple syntax</title> | |
<para>After applying all the rules in <xref | |
linkend="simplification"/>, the schema will match the following | |
grammar:</para> | |
<grammarref src="simple.rng"/> | |
<para>With this grammar, no elements or attributes are allowed other | |
than those explicitly shown.</para> | |
<section id="simple-syntax-example"> | |
<title>Example</title> | |
<para>The following is an example of how the schema in <xref | |
linkend="full-syntax-example"/> can be transformed into the simple | |
syntax:</para> | |
<programlisting><![CDATA[<?xml version="1.0"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <ref name="foo.element"/>
  </start>
  <define name="foo.element">
    <element>
      <name ns="">foo</name>
      <group>
        <ref name="bar1.element"/>
        <ref name="bar2.element"/>
      </group>
    </element>
  </define>
  <define name="bar1.element">
    <element>
      <name ns="http://www.example.com/n1">bar1</name>
      <empty/>
    </element>
  </define>
  <define name="bar2.element">
    <element>
      <name ns="http://www.example.com/n2">bar2</name>
      <empty/>
    </element>
  </define>
</grammar>]]></programlisting>
<note><para>Strictly speaking, the result of simplification is an | |
instance of the data model rather than an XML document. For | |
convenience, we use an XML document to represent an instance of the | |
data model.</para></note> | |
</section> | |
</section> | |
<section id="semantics"> | |
<title>Semantics</title> | |
<para>In this section, we define the semantics of a correct RELAX NG | |
schema that has been transformed into the simple syntax. The | |
semantics of a RELAX NG schema consist of a specification of what XML | |
documents are valid with respect to that schema. The semantics are | |
described formally. The formalism uses axioms and inference rules. | |
Axioms are propositions that are provable unconditionally. An | |
inference rule consists of one or more antecedents and exactly one | |
consequent. An antecedent is either positive or negative. If all the | |
positive antecedents of an inference rule are provable and none of the | |
negative antecedents are provable, then the consequent of the | |
inference rule is provable. An XML document is valid with respect to a | |
RELAX NG schema if and only if the proposition that it is valid is | |
provable in the formalism specified in this section.</para> | |
<note><para>This kind of formalism is similar to a proof system. | |
However, a traditional proof system only has positive | |
antecedents.</para></note> | |
<para>The notation for inference rules separates the antecedents from | |
the consequent by a horizontal line: the antecedents are above the | |
line; the consequent is below the line. If an antecedent is of the | |
form not(<replaceable>p</replaceable>), then it is a negative | |
antecedent; otherwise, it is a positive antecedent. Both axioms and
inference rules may use variables. A variable has a name and optionally a
subscript. The name of a variable is italicized. Each variable has a | |
range that is determined by its name. Axioms and inference rules are | |
implicitly universally quantified over the variables they contain. We | |
explain this further below.</para> | |
<para>The possibility that an inference rule or axiom may contain more | |
than one occurrence of a particular variable requires that an identity | |
relation be defined on each kind of object over which a variable can | |
range. The identity relation for all kinds of object is value-based. | |
Two objects of a particular kind are identical if the constituents of | |
the objects are identical. For example, two attributes are considered | |
the same if they have the same name and the same value. Two characters | |
are identical if their Unicode character codes are the same.</para> | |
<section id="name-classes"> | |
<title>Name classes</title> | |
<para>The main semantic concept for name classes is that of a name | |
belonging to a name class. A name class is an element that matches the | |
production nameClass. A name is as defined in <xref | |
linkend="data-model"/>: it consists of a namespace URI and a local | |
name.</para> | |
<para>We use the following notation:</para> | |
<variablelist> | |
<varlistentry><term><p:var range="name"/></term><listitem><para>is a variable | |
that ranges over names</para></listitem></varlistentry> | |
<varlistentry><term><p:var range="nameClass"/></term><listitem><para>ranges over name classes</para></listitem></varlistentry> | |
<varlistentry><term><p:judgement name="belongs"> | |
<p:var range="name"/> | |
<p:var range="nameClass"/> | |
</p:judgement></term><listitem><para> | |
asserts that name <p:var range="name"/> is a member of name class <p:var range="nameClass"/> | |
</para></listitem></varlistentry> | |
</variablelist> | |
<para>We are now ready for our first axiom, which is called "anyName | |
1":</para> | |
<p:proofSystem> | |
<p:rule name="anyName 1"> | |
<p:judgement name="belongs"> | |
<p:var range="name"/> | |
<p:element name="anyName"/> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
<para>This says that for any name <p:var range="name"/>, <p:var
range="name"/> belongs to the name class <p:element name="anyName"/>;
in other words, <p:element name="anyName"/> matches any name. Note the
effect of the implicit universal quantification over the variables in
the axiom: this is what makes the axiom apply for any name <p:var
range="name"/>.</para>
<para>Our first inference rule is almost as simple:</para> | |
<p:proofSystem> | |
<p:rule name="anyName 2"> | |
<p:not> | |
<p:judgement name="belongs"> | |
<p:var range="name"/> | |
<p:var range="nameClass"/> | |
</p:judgement> | |
</p:not> | |
<p:judgement name="belongs"> | |
<p:var range="name"/> | |
<p:element name="anyName"> | |
<p:element name="except"> | |
<p:var range="nameClass"/> | |
</p:element> | |
</p:element> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
<para>This says that for any name <p:var range="name"/> | |
and for any name class <p:var range="nameClass"/>, | |
if <p:var range="name"/> does not belong to <p:var range="nameClass"/>, | |
then <p:var range="name"/> belongs to | |
<p:element name="anyName"> | |
<p:element name="except"> | |
<p:var range="nameClass"/> | |
</p:element> | |
</p:element>. In other words, <p:element name="anyName"> | |
<p:element name="except"> | |
<p:var range="nameClass"/> | |
</p:element> | |
</p:element> matches any name that does not match <p:var range="nameClass"/>.</para> | |
<para>We now need the following additional notation:</para> | |
<variablelist> | |
<varlistentry><term><p:var range="ncname"/></term> | |
<listitem><para>ranges over local names; a local name is a string that | |
matches the NCName production of <xref linkend="xml-names"/>, that is, | |
a name with no colons</para></listitem> | |
</varlistentry> | |
<varlistentry><term><p:var range="uri"/></term><listitem><para>ranges over URIs</para></listitem></varlistentry> | |
<varlistentry> | |
<term> | |
<p:function name="name"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
</p:function> | |
</term> | |
<listitem><para>constructs a name with URI <p:var range="uri"/> and local | |
name <p:var range="ncname"/></para></listitem> | |
</varlistentry> | |
</variablelist> | |
<para>The remaining axioms and inference rules for name classes are as | |
follows:</para> | |
<p:proofSystem> | |
<p:rule name="nsName 1"> | |
<p:judgement name="belongs"> | |
<p:function name="name"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
</p:function> | |
<p:element name="nsName"> | |
<p:attribute name="ns"> | |
<p:var range="uri"/> | |
</p:attribute> | |
</p:element> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="nsName 2"> | |
<p:not> | |
<p:judgement name="belongs"> | |
<p:function name="name"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
</p:function> | |
<p:var range="nameClass"/> | |
</p:judgement> | |
</p:not> | |
<p:judgement name="belongs"> | |
<p:function name="name"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
</p:function> | |
<p:element name="nsName"> | |
<p:attribute name="ns"> | |
<p:var range="uri"/> | |
</p:attribute> | |
<p:element name="except"> | |
<p:var range="nameClass"/> | |
</p:element> | |
</p:element> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="name"> | |
<p:judgement name="belongs"> | |
<p:function name="name"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
</p:function> | |
<p:element name="name"> | |
<p:attribute name="ns"> | |
<p:var range="uri"/> | |
</p:attribute> | |
<p:var range="ncname"/> | |
</p:element> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="name choice 1"> | |
<p:judgement name="belongs"> | |
<p:var range="name"/> | |
<p:var range="nameClass" sub="1"/> | |
</p:judgement> | |
<p:judgement name="belongs"> | |
<p:var range="name"/> | |
<p:element name="choice"> | |
<p:var range="nameClass" sub="1"/> | |
<p:var range="nameClass" sub="2"/> | |
</p:element> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="name choice 2"> | |
<p:judgement name="belongs"> | |
<p:var range="name"/> | |
<p:var range="nameClass" sub="2"/> | |
</p:judgement> | |
<p:judgement name="belongs"> | |
<p:var range="name"/> | |
<p:element name="choice"> | |
<p:var range="nameClass" sub="1"/> | |
<p:var range="nameClass" sub="2"/> | |
</p:element> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
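<para>As an informal illustration of how these rules combine, consider
the following name class. The namespace URI and the local names are
illustrative placeholders:</para>
<programlisting><![CDATA[<choice>
  <name ns="">foo</name>
  <nsName ns="http://www.example.com/n1">
    <except>
      <name ns="http://www.example.com/n1">bar1</name>
    </except>
  </nsName>
</choice>]]></programlisting>
<para>The name with an empty namespace URI and local name
<literal>foo</literal> belongs to the first child by rule (name), and
hence to the whole name class by rule (name choice 1). The name with
namespace URI <literal>http://www.example.com/n1</literal> and local
name <literal>baz</literal> does not belong to the name class inside
the <literal>except</literal> element, so it belongs to the second
child by rule (nsName 2), and hence to the whole name class by rule
(name choice 2). The name with the same namespace URI and local name
<literal>bar1</literal> does belong to the name class inside the
<literal>except</literal> element, so the negative antecedent of rule
(nsName 2) is not satisfied; since no other rule applies, it cannot be
proved that this name belongs to the name class.</para>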
</section> | |
<section> | |
<title>Patterns</title> | |
<para>The axioms and inference rules for patterns use the following | |
notation:</para> | |
<variablelist> | |
<varlistentry><term><p:var range="context"/></term><listitem><para>ranges | |
over contexts (as defined in <xref | |
linkend="data-model"/>)</para></listitem></varlistentry> | |
<varlistentry><term><p:var range="att"/></term><listitem><para>ranges over | |
sets of attributes; a set with a single member | |
is considered the same as that member</para></listitem></varlistentry> | |
<varlistentry><term><p:var | |
range="mixed"/></term><listitem><para>ranges over sequences of | |
elements and strings; a sequence with a single member is considered | |
the same as that member; the sequences ranged over by <p:var | |
range="mixed"/> may contain consecutive strings and may contain strings | |
that are empty; thus, there are sequences ranged over by <p:var | |
range="mixed"/> that cannot occur as the children of an | |
element</para></listitem></varlistentry> | |
<varlistentry><term><p:var range="pattern"/></term><listitem><para>ranges | |
over patterns (elements matching the pattern | |
production)</para></listitem></varlistentry> | |
<varlistentry><term><p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:var range="mixed"/> | |
<p:var range="pattern"/> | |
</p:judgement></term><listitem><para> | |
asserts that with respect to context <p:var range="context"/>, the
attributes <p:var range="att"/> and the sequence of elements and
strings <p:var range="mixed"/> match the pattern <p:var
range="pattern"/></para></listitem></varlistentry>
</variablelist> | |
<section id="choice-pattern"> | |
<title><literal>choice</literal> pattern</title> | |
<para>The semantics of the <literal>choice</literal> pattern are as follows:</para> | |
<p:proofSystem> | |
<p:rule name="choice 1"> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:var range="mixed"/> | |
<p:var range="pattern" sub="1"/> | |
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:var range="mixed"/> | |
<p:element name="choice"> | |
<p:var range="pattern" sub="1"/> | |
<p:var range="pattern" sub="2"/> | |
</p:element> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="choice 2"> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:var range="mixed"/> | |
<p:var range="pattern" sub="2"/> | |
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:var range="mixed"/> | |
<p:element name="choice"> | |
<p:var range="pattern" sub="1"/> | |
<p:var range="pattern" sub="2"/> | |
</p:element> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
</section> | |
<section> | |
<title><literal>group</literal> pattern</title> | |
<para>We use the following additional notation:</para> | |
<variablelist> | |
<varlistentry><term><p:function name="append"> | |
<p:var range="mixed" sub="1"/> | |
<p:var range="mixed" sub="2"/> | |
</p:function></term><listitem> | |
<para>represents the concatenation of the sequences <p:var range="mixed" sub="1"/> and <p:var range="mixed" sub="2"/> | |
</para></listitem></varlistentry> | |
<varlistentry><term><p:function name="union"> | |
<p:var range="att" sub="1"/> | |
<p:var range="att" sub="2"/> | |
</p:function></term><listitem> | |
<para>represents the union of <p:var range="att" sub="1"/> | |
and <p:var range="att" sub="2"/></para> | |
</listitem> | |
</varlistentry> | |
</variablelist> | |
<para>The semantics of the <literal>group</literal> pattern are as follows:</para> | |
<p:proofSystem> | |
<p:rule name="group"> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att" sub="1"/> | |
<p:var range="mixed" sub="1"/> | |
<p:var range="pattern" sub="1"/> | |
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att" sub="2"/> | |
<p:var range="mixed" sub="2"/> | |
<p:var range="pattern" sub="2"/> | |
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:function name="union"> | |
<p:var range="att" sub="1"/> | |
<p:var range="att" sub="2"/> | |
</p:function> | |
<p:function name="append"> | |
<p:var range="mixed" sub="1"/> | |
<p:var range="mixed" sub="2"/> | |
</p:function> | |
<p:element name="group"> | |
<p:var range="pattern" sub="1"/> | |
<p:var range="pattern" sub="2"/> | |
</p:element> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
<note><para>The restriction in <xref linkend="attribute-restrictions"/> | |
ensures that the set of attributes constructed in the consequent will | |
not have multiple attributes with the same name.</para></note> | |
</section> | |
<section id="empty-pattern"> | |
<title><literal>empty</literal> pattern</title> | |
<para>We use the following additional notation:</para> | |
<variablelist> | |
<varlistentry><term><p:function name="emptySequence"/></term><listitem><para>represents an empty sequence</para></listitem></varlistentry> | |
<varlistentry><term><p:function name="emptySet"/></term><listitem><para>represents an empty set</para></listitem></varlistentry> | |
</variablelist> | |
<para>The semantics of the <literal>empty</literal> pattern are as follows:</para> | |
<p:proofSystem> | |
<p:rule name="empty"> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:function name="emptySet"/> | |
<p:function name="emptySequence"/> | |
<p:element name="empty"></p:element>
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
</section> | |
<section id="text-pattern"> | |
<title><literal>text</literal> pattern</title> | |
<para>We use the following additional notation:</para> | |
<variablelist> | |
<varlistentry><term><p:var range="string"/></term><listitem><para>ranges | |
over strings</para></listitem></varlistentry> | |
</variablelist> | |
<para>The semantics of the <literal>text</literal> pattern are as follows:</para> | |
<p:proofSystem> | |
<p:rule name="text 1"> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:function name="emptySet"/> | |
<p:function name="emptySequence"/> | |
<p:element name="text"></p:element>
</p:judgement> | |
</p:rule> | |
<p:rule name="text 2"> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:function name="emptySet"/> | |
<p:var range="mixed"/> | |
<p:element name="text"></p:element>
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:function name="emptySet"/> | |
<p:function name="append"> | |
<p:var range="mixed"/> | |
<p:var range="string"/> | |
</p:function> | |
<p:element name="text"></p:element>
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
<para>The effect of the above rules is that a <literal>text</literal>
element matches zero or more strings.</para>
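<para>As an informal sketch, the following derivation shows that a
sequence of two strings matches a <literal>text</literal> element. The
parenthesized notation for sequences and the strings
<literal>"a"</literal> and <literal>"b"</literal> are used here only
for illustration; the set of attributes is empty at every step:</para>
<programlisting><![CDATA[1. ()          matches <text/>   by (text 1)
2. ("a")       matches <text/>   by (text 2), from step 1
3. ("a", "b")  matches <text/>   by (text 2), from step 2]]></programlisting>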
</section> | |
<section> | |
<title><literal>oneOrMore</literal> pattern</title> | |
<para>We use the following additional notation:</para> | |
<variablelist> | |
<varlistentry><term><p:judgement name="disjoint"> | |
<p:var range="att" sub="1"/> | |
<p:var range="att" sub="2"/> | |
</p:judgement></term><listitem><para> | |
asserts that there is no name that is | |
the name of both an attribute in <p:var range="att" sub="1"/> | |
and of an attribute in <p:var range="att" sub="2"/> | |
</para></listitem></varlistentry> | |
</variablelist> | |
<para>The semantics of the <literal>oneOrMore</literal> pattern are as follows:</para> | |
<p:proofSystem> | |
<p:rule name="oneOrMore 1"> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:var range="mixed"/> | |
<p:var range="pattern"/> | |
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:var range="mixed"/> | |
<p:element name="oneOrMore"> | |
<p:var range="pattern"/> | |
</p:element> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="oneOrMore 2"> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att" sub="1"/> | |
<p:var range="mixed" sub="1"/> | |
<p:var range="pattern"/> | |
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att" sub="2"/> | |
<p:var range="mixed" sub="2"/> | |
<p:element name="oneOrMore"> | |
<p:var range="pattern"/> | |
</p:element> | |
</p:judgement> | |
<p:judgement name="disjoint"> | |
<p:var range="att" sub="1"/> | |
<p:var range="att" sub="2"/> | |
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:function name="union"> | |
<p:var range="att" sub="1"/> | |
<p:var range="att" sub="2"/> | |
</p:function> | |
<p:function name="append"> | |
<p:var range="mixed" sub="1"/> | |
<p:var range="mixed" sub="2"/> | |
</p:function> | |
<p:element name="oneOrMore"> | |
<p:var range="pattern"/> | |
</p:element> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
</section> | |
<section> | |
<title><literal>interleave</literal> pattern</title> | |
<para>We use the following additional notation:</para> | |
<variablelist> | |
<varlistentry><term><p:judgement name="interleave"> | |
<p:var range="mixed" sub="1"/> | |
<p:var range="mixed" sub="2"/> | |
<p:var range="mixed" sub="3"/> | |
</p:judgement></term><listitem><para> | |
asserts that <p:var range="mixed" sub="1"/> | |
is an interleaving of <p:var range="mixed" sub="2"/> | |
and <p:var range="mixed" sub="3"/> | |
</para></listitem></varlistentry> | |
</variablelist> | |
<para>The semantics of interleaving are defined by the following rules.</para> | |
<p:proofSystem> | |
<p:rule name="interleaves 1"> | |
<p:judgement name="interleave"> | |
<p:function name="emptySequence"/> | |
<p:function name="emptySequence"/> | |
<p:function name="emptySequence"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="interleaves 2"> | |
<p:judgement name="interleave"> | |
<p:var range="mixed" sub="1"/> | |
<p:var range="mixed" sub="2"/> | |
<p:var range="mixed" sub="3"/> | |
</p:judgement> | |
<p:judgement name="interleave"> | |
<p:function name="append"> | |
<p:var range="mixed" sub="4"/> | |
<p:var range="mixed" sub="1"/> | |
</p:function> | |
<p:function name="append"> | |
<p:var range="mixed" sub="4"/> | |
<p:var range="mixed" sub="2"/> | |
</p:function> | |
<p:var range="mixed" sub="3"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="interleaves 3"> | |
<p:judgement name="interleave"> | |
<p:var range="mixed" sub="1"/> | |
<p:var range="mixed" sub="2"/> | |
<p:var range="mixed" sub="3"/> | |
</p:judgement> | |
<p:judgement name="interleave"> | |
<p:function name="append"> | |
<p:var range="mixed" sub="4"/> | |
<p:var range="mixed" sub="1"/> | |
</p:function> | |
<p:var range="mixed" sub="2"/> | |
<p:function name="append"> | |
<p:var range="mixed" sub="4"/> | |
<p:var range="mixed" sub="3"/> | |
</p:function> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
<para>For example, the interleavings of | |
<literal><![CDATA[<a/><a/>]]></literal> and | |
<literal><![CDATA[<b/>]]></literal> are | |
<literal><![CDATA[<a/><a/><b/>]]></literal>, | |
<literal><![CDATA[<a/><b/><a/>]]></literal>, and | |
<literal><![CDATA[<b/><a/><a/>]]></literal>.</para> | |
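<para>The interleaving rules can be read as a recursive enumeration. The
following Python sketch is illustrative only and is not part of the
formal semantics: it represents each member of a mixed sequence as a
plain string, and the names <literal>interleavings</literal> and
<literal>distinct_interleavings</literal> are hypothetical. It consumes
one member at a time, whereas the rules allow an arbitrary prefix; the
two formulations generate the same sequences.</para>
<programlisting><![CDATA[
```python
from typing import Iterator, Sequence, Tuple

# Assumption: a member of a mixed sequence is modelled as a plain string.
Item = str


def interleavings(left: Sequence[Item],
                  right: Sequence[Item]) -> Iterator[Tuple[Item, ...]]:
    """Yield interleavings of left and right, keeping the order within each."""
    left, right = tuple(left), tuple(right)
    if not left and not right:
        # (interleaves 1): the empty sequence interleaves two empty sequences.
        yield ()
        return
    if left:
        # (interleaves 2): the next member comes from the first sequence.
        for rest in interleavings(left[1:], right):
            yield (left[0],) + rest
    if right:
        # (interleaves 3): the next member comes from the second sequence.
        for rest in interleavings(left, right[1:]):
            yield (right[0],) + rest


def distinct_interleavings(left, right):
    """Return the interleavings as a sorted list without duplicates."""
    return sorted(set(interleavings(left, right)))


if __name__ == "__main__":
    for seq in distinct_interleavings(["<a/>", "<a/>"], ["<b/>"]):
        print("".join(seq))
```
]]></programlisting>
<para>Duplicates are removed because different derivations can produce
the same sequence when the two inputs share members.</para>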
<para>The semantics of the <literal>interleave</literal> pattern are | |
as follows:</para> | |
<p:proofSystem> | |
<p:rule name="interleave"> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att" sub="1"/> | |
<p:var range="mixed" sub="1"/> | |
<p:var range="pattern" sub="1"/> | |
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att" sub="2"/> | |
<p:var range="mixed" sub="2"/> | |
<p:var range="pattern" sub="2"/> | |
</p:judgement> | |
<p:judgement name="interleave"> | |
<p:var range="mixed" sub="3"/> | |
<p:var range="mixed" sub="1"/> | |
<p:var range="mixed" sub="2"/> | |
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:function name="union"> | |
<p:var range="att" sub="1"/> | |
<p:var range="att" sub="2"/> | |
</p:function> | |
<p:var range="mixed" sub="3"/> | |
<p:element name="interleave"> | |
<p:var range="pattern" sub="1"/> | |
<p:var range="pattern" sub="2"/> | |
</p:element> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
<note><para>The restriction in <xref linkend="attribute-restrictions"/> | |
ensures that the set of attributes constructed in the consequent will | |
not have multiple attributes with the same name.</para></note> | |
</section> | |
<section id="element-pattern"> | |
<title><literal>element</literal> and <literal>attribute</literal> pattern</title> | |
<para>The value of an attribute is always a single string, which may | |
be empty. Thus, the empty sequence is not a possible attribute value. | |
On the other hand, the children of an element can be an empty sequence and
cannot consist of an empty string. In order to ensure that validation | |
handles attributes and elements consistently, we introduce a variant | |
of matching called <firstterm>weak matching</firstterm>. Weak | |
matching is used when matching the pattern for the value of an | |
attribute or for the attributes and children of an element. We use | |
the following notation to define weak matching.</para> | |
<variablelist> | |
<varlistentry><term><p:function | |
name="emptyString"/></term><listitem><para>represents an empty | |
string</para></listitem></varlistentry> | |
<varlistentry><term><p:var | |
range="whiteSpace"/></term><listitem><para>ranges over the empty | |
sequence and strings that consist entirely of | |
whitespace</para></listitem></varlistentry> | |
<varlistentry><term><p:judgement name="weakMatch"> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:var range="mixed"/> | |
<p:var range="pattern"/> | |
</p:judgement></term><listitem><para> | |
asserts that with respect to context <p:var range="context"/>, the
attributes <p:var range="att"/> and the sequence of elements and
strings <p:var range="mixed"/> weakly match the pattern <p:var
range="pattern"/></para></listitem></varlistentry>
</variablelist> | |
<para>The semantics of weak matching are as follows:</para> | |
<p:proofSystem> | |
<p:rule name="weak match 1"> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:var range="mixed"/> | |
<p:var range="pattern"/> | |
</p:judgement> | |
<p:judgement name="weakMatch"> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:var range="mixed"/> | |
<p:var range="pattern"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="weak match 2"> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:function name="emptySequence"/> | |
<p:var range="pattern"/> | |
</p:judgement> | |
<p:judgement name="weakMatch"> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:var range="whiteSpace"/> | |
<p:var range="pattern"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="weak match 3"> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:function name="emptyString"/> | |
<p:var range="pattern"/> | |
</p:judgement> | |
<p:judgement name="weakMatch"> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:function name="emptySequence"/> | |
<p:var range="pattern"/> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
<para>We use the following additional notation:</para> | |
<variablelist> | |
<varlistentry><term><p:function name="attribute"> | |
<p:var range="name"/> | |
<p:var range="string"/> | |
</p:function></term><listitem><para> | |
constructs an attribute with name <p:var range="name"/> | |
and value <p:var range="string"/> | |
</para></listitem></varlistentry> | |
<varlistentry><term><p:function name="element"> | |
<p:var range="name"/> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:var range="mixed"/> | |
</p:function></term><listitem><para> | |
constructs an element with name <p:var range="name"/>, | |
context <p:var range="context"/>, | |
attributes <p:var range="att"/> | |
and mixed sequence <p:var range="mixed"/> as children | |
</para></listitem></varlistentry> | |
<varlistentry><term><p:judgement name="okAsChildren"> | |
<p:var range="mixed"/> | |
</p:judgement></term><listitem><para> | |
asserts that the mixed sequence <p:var range="mixed"/> can occur as | |
the children of an element: it does not contain any member that is an | |
empty string, nor does it contain two consecutive members that are | |
both strings</para></listitem></varlistentry> | |
<varlistentry><term><p:judgement name="bind"> | |
<p:var range="ncname"/> | |
<p:var range="nameClass"/> | |
<p:var range="pattern"/> | |
</p:judgement></term><listitem><para> | |
asserts that the grammar contains | |
<p:element name="define"> | |
<p:attribute name="name"> | |
<p:var range="ncname"/> | |
</p:attribute> | |
<p:element name="element"> | |
<p:var range="nameClass"/> | |
<p:var range="pattern"/> | |
</p:element> | |
</p:element> | |
</para></listitem></varlistentry> | |
</variablelist> | |
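<para>As an informal illustration of the <literal>okAsChildren</literal>
condition, the following Python sketch checks a mixed sequence. It
assumes that strings are modelled as Python <literal>str</literal>
values and that any other object stands for an element; the function
name <literal>ok_as_children</literal> is hypothetical.</para>
<programlisting><![CDATA[
```python
def ok_as_children(mixed):
    """Return True if the mixed sequence may occur as the children of an element.

    Assumption: a str member is a string; any other member is an element.
    """
    previous_was_string = False
    for member in mixed:
        is_string = isinstance(member, str)
        if is_string and member == "":
            return False  # no member may be an empty string
        if is_string and previous_was_string:
            return False  # no two consecutive members may both be strings
        previous_was_string = is_string
    return True
```
]]></programlisting>
<para>Under this reading the empty sequence is acceptable, while a
sequence consisting of a single empty string is not.</para>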
<para>The semantics of the <literal>attribute</literal> pattern are as follows:</para> | |
<p:proofSystem> | |
<p:rule name="attribute"> | |
<p:judgement name="weakMatch"> | |
<p:var range="context"/> | |
<p:function name="emptySet"/> | |
<p:var range="string"/> | |
<p:var range="pattern"/> | |
</p:judgement> | |
<p:judgement name="belongs"> | |
<p:var range="name"/> | |
<p:var range="nameClass"/> | |
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:function name="attribute"> | |
<p:var range="name"/> | |
<p:var range="string"/> | |
</p:function> | |
<p:function name="emptySequence"/> | |
<p:element name="attribute"> | |
<p:var range="nameClass"/> | |
<p:var range="pattern"/> | |
</p:element> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
<para>The semantics of the <literal>element</literal> pattern are as follows:</para> | |
<p:proofSystem> | |
<p:rule name="element"> | |
<p:judgement name="weakMatch"> | |
<p:var range="context" sub="1"/> | |
<p:var range="att"/> | |
<p:var range="mixed"/> | |
<p:var range="pattern"/> | |
</p:judgement> | |
<p:judgement name="belongs"> | |
<p:var range="name"/> | |
<p:var range="nameClass"/> | |
</p:judgement> | |
<p:judgement name="okAsChildren"> | |
<p:var range="mixed"/> | |
</p:judgement> | |
<p:judgement name="bind"> | |
<p:var range="ncname"/> | |
<p:var range="nameClass"/> | |
<p:var range="pattern"/> | |
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context" sub="2"/> | |
<p:function name="emptySet"/> | |
<p:function name="append"> | |
<p:var range="whiteSpace" sub="1"/> | |
<p:function name="element"> | |
<p:var range="name"/> | |
<p:var range="context" sub="1"/> | |
<p:var range="att"/> | |
<p:var range="mixed"/> | |
</p:function> | |
<p:var range="whiteSpace" sub="2"/> | |
</p:function> | |
<p:element name="ref"> | |
<p:attribute name="name"> | |
<p:var range="ncname"/> | |
</p:attribute> | |
</p:element> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
</section> | |
<section id="data-pattern"> | |
<title><literal>data</literal> and <literal>value</literal> pattern</title> | |
<para>RELAX NG relies on datatype libraries to perform datatyping. | |
A datatype library is identified by a URI. A datatype within a | |
datatype library is identified by an NCName. A datatype library | |
provides two services.</para> | |
<itemizedlist> | |
<listitem><para>It can determine whether a string is a legal | |
representation of a datatype. This service accepts a list of zero or | |
more parameters. For example, a string datatype might have a parameter | |
specifying the length of a string. The datatype library determines | |
what parameters are applicable for each datatype.</para></listitem> | |
<listitem><para>It can determine whether two strings represent the | |
same value of a datatype. This service does not have any | |
parameters.</para></listitem> | |
</itemizedlist> | |
<para>Both services may make use of the context of a string. For | |
example, a datatype representing a QName would use the namespace | |
map.</para> | |
<para>We use the following additional notation:</para> | |
<variablelist> | |
<varlistentry><term><p:judgement name="datatypeAllows"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
<p:var range="params"/> | |
<p:var range="string"/> | |
<p:var range="context"/> | |
</p:judgement></term><listitem><para> | |
asserts that in the datatype library identified by URI <p:var range="uri"/>, the string <p:var range="string"/> interpreted with | |
context <p:var range="context"/> is a legal | |
value of datatype <p:var range="ncname"/> with parameters <p:var range="params"/></para></listitem></varlistentry> | |
<varlistentry><term><p:judgement name="datatypeEqual"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
<p:var range="string" sub="1"/> | |
<p:var range="context" sub="1"/> | |
<p:var range="string" sub="2"/> | |
<p:var range="context" sub="2"/> | |
</p:judgement></term><listitem><para> | |
asserts that in the datatype library identified by URI <p:var range="uri"/>, string <p:var range="string" sub="1"/> interpreted with | |
context <p:var range="context" sub="1"/> represents the same value of | |
the datatype <p:var range="ncname"/> as the string <p:var range="string" sub="2"/> interpreted with context <p:var range="context" sub="2"/>
</para></listitem></varlistentry> | |
<varlistentry><term><p:var range="params"/></term><listitem><para>ranges over sequences of parameters</para></listitem></varlistentry> | |
<varlistentry><term><p:context> | |
<p:var range="context"/> | |
</p:context></term><listitem><para> | |
within the start-tag of a pattern refers to the context | |
of the pattern element | |
</para></listitem></varlistentry> | |
<varlistentry> | |
<term> | |
<p:function name="context"> | |
<p:var range="uri"/> | |
<p:var range="context"/> | |
</p:function> | |
</term> | |
<listitem><para>constructs a context which is the same as <p:var range="context"/> | |
except that the default namespace is <p:var range="uri"/>; if <p:var | |
range="uri"/> is the empty string, then there is no default namespace | |
in the constructed context</para></listitem></varlistentry> | |
</variablelist> | |
<para>The <literal>datatypeEqual</literal> relation must be reflexive, transitive
and symmetric; that is, the following inference rules must hold:</para>
<p:proofSystem> | |
<p:rule name="datatypeEqual reflexive"> | |
<p:judgement name="datatypeAllows"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
<p:var range="params"/> | |
<p:var range="string"/> | |
<p:var range="context"/> | |
</p:judgement> | |
<p:judgement name="datatypeEqual"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
<p:var range="string"/> | |
<p:var range="context"/> | |
<p:var range="string"/> | |
<p:var range="context"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="datatypeEqual transitive"> | |
<p:judgement name="datatypeEqual"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
<p:var range="string" sub="1"/> | |
<p:var range="context" sub="1"/> | |
<p:var range="string" sub="2"/> | |
<p:var range="context" sub="2"/> | |
</p:judgement> | |
<p:judgement name="datatypeEqual"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
<p:var range="string" sub="2"/> | |
<p:var range="context" sub="2"/>
<p:var range="string" sub="3"/> | |
<p:var range="context" sub="3"/> | |
</p:judgement> | |
<p:judgement name="datatypeEqual"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
<p:var range="string" sub="1"/> | |
<p:var range="context" sub="1"/> | |
<p:var range="string" sub="3"/> | |
<p:var range="context" sub="3"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="datatypeEqual symmetric"> | |
<p:judgement name="datatypeEqual"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
<p:var range="string" sub="1"/> | |
<p:var range="context" sub="1"/> | |
<p:var range="string" sub="2"/> | |
<p:var range="context" sub="2"/> | |
</p:judgement> | |
<p:judgement name="datatypeEqual"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
<p:var range="string" sub="2"/> | |
<p:var range="context" sub="2"/> | |
<p:var range="string" sub="1"/> | |
<p:var range="context" sub="1"/> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
<para>The semantics of the <literal>data</literal> and | |
<literal>value</literal> patterns are as follows:</para> | |
<p:proofSystem> | |
<p:rule name="value"> | |
<p:judgement name="datatypeEqual"> | |
<p:var range="uri" sub="1"/> | |
<p:var range="ncname"/> | |
<p:var range="string" sub="1"/> | |
<p:var range="context" sub="1"/> | |
<p:var range="string" sub="2"/> | |
<p:function name="context"> | |
<p:var range="uri" sub="2"/> | |
<p:var range="context" sub="2"/> | |
</p:function> | |
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context" sub="1"/> | |
<p:function name="emptySet"/> | |
<p:var range="string" sub="1"/> | |
<p:element name="value"> | |
<p:attribute name="datatypeLibrary"> | |
<p:var range="uri" sub="1"/> | |
</p:attribute> | |
<p:attribute name="type"> | |
<p:var range="ncname"/> | |
</p:attribute> | |
<p:attribute name="ns"> | |
<p:var range="uri" sub="2"/> | |
</p:attribute> | |
<p:context> | |
<p:var range="context" sub="2"/> | |
</p:context> | |
<p:var range="string" sub="2"/> | |
</p:element> | |
<p:function name="emptySet"/> | |
<p:function name="emptySet"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="data 1"> | |
<p:judgement name="datatypeAllows"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
<p:var range="params"/> | |
<p:var range="string"/> | |
<p:var range="context"/> | |
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:function name="emptySet"/> | |
<p:var range="string"/> | |
<p:element name="data"> | |
<p:attribute name="datatypeLibrary"> | |
<p:var range="uri"/> | |
</p:attribute> | |
<p:attribute name="type"> | |
<p:var range="ncname"/> | |
</p:attribute> | |
<p:var range="params"/> | |
</p:element> | |
<p:function name="emptySet"/> | |
<p:function name="emptySet"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="data 2"> | |
<p:judgement name="datatypeAllows"> | |
<p:var range="uri"/> | |
<p:var range="ncname"/> | |
<p:var range="params"/> | |
<p:var range="string"/> | |
<p:var range="context"/> | |
</p:judgement> | |
<p:not> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:var range="att"/> | |
<p:var range="string"/> | |
<p:var range="pattern"/> | |
</p:judgement> | |
</p:not> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:function name="emptySet"/> | |
<p:var range="string"/> | |
<p:element name="data"> | |
<p:attribute name="datatypeLibrary"> | |
<p:var range="uri"/> | |
</p:attribute> | |
<p:attribute name="type"> | |
<p:var range="ncname"/> | |
</p:attribute> | |
<p:var range="params"/> | |
<p:element name="except"> | |
<p:var range="pattern"/> | |
</p:element> | |
</p:element> | |
<p:function name="emptySet"/> | |
<p:function name="emptySet"/> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
</section> | |
<section id="built-in-datatype"> | |
<title>Built-in datatype library</title> | |
<para>The empty URI identifies a special built-in datatype library. | |
This provides two datatypes, <literal>string</literal> and | |
<literal>token</literal>. No parameters are allowed for either of | |
these datatypes.</para> | |
<variablelist> | |
<varlistentry><term> | |
<p:judgement name="equal"> | |
<p:var range="string" sub="1"/> | |
<p:var range="string" sub="2"/> | |
</p:judgement></term> | |
<listitem><para>asserts that <p:var range="string" sub="1"/> | |
and <p:var range="string" sub="2"/> are identical</para></listitem> | |
</varlistentry> | |
<varlistentry><term> | |
<p:function name="normalizeWhiteSpace"> | |
<p:var range="string"/> | |
</p:function> | |
</term> | |
<listitem><para>returns the string <p:var range="string"/>, | |
with leading and trailing whitespace characters removed, | |
and with each other maximal sequence of whitespace characters | |
replaced by a single space character </para></listitem> | |
</varlistentry> | |
</variablelist> | |
<para>The semantics of the two built-in datatypes are as | |
follows:</para> | |
<p:proofSystem> | |
<p:rule name="string allows"> | |
<p:judgement name="datatypeAllows"> | |
<p:function name="emptyString"/> | |
<p:string>string</p:string> | |
<p:function name="emptySequence"/> | |
<p:var range="string"/> | |
<p:var range="context"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="string equal"> | |
<p:judgement name="datatypeEqual"> | |
<p:function name="emptyString"/> | |
<p:string>string</p:string> | |
<p:var range="string"/> | |
<p:var range="context" sub="1"/> | |
<p:var range="string"/> | |
<p:var range="context" sub="2"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="token allows"> | |
<p:judgement name="datatypeAllows"> | |
<p:function name="emptyString"/> | |
<p:string>token</p:string> | |
<p:function name="emptySequence"/> | |
<p:var range="string"/> | |
<p:var range="context"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="token equal"> | |
<p:judgement name="equal"> | |
<p:function name="normalizeWhiteSpace"> | |
<p:var range="string" sub="1"/> | |
</p:function> | |
<p:function name="normalizeWhiteSpace"> | |
<p:var range="string" sub="2"/> | |
</p:function> | |
</p:judgement> | |
<p:judgement name="datatypeEqual"> | |
<p:function name="emptyString"/> | |
<p:string>token</p:string> | |
<p:var range="string" sub="1"/> | |
<p:var range="context" sub="1"/> | |
<p:var range="string" sub="2"/> | |
<p:var range="context" sub="2"/> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
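<para>The equality rules for the two built-in datatypes can be sketched
in Python as follows. This is an informal illustration rather than a
normative definition: it assumes that whitespace means the characters
space, tab, line feed and carriage return, and the function names are
hypothetical. The context arguments are omitted because neither
built-in datatype depends on them.</para>
<programlisting><![CDATA[
```python
import re

# Assumption: whitespace is space, tab, line feed or carriage return.
_WHITESPACE_RUN = re.compile(r"[ \t\n\r]+")


def normalize_white_space(s: str) -> str:
    """Collapse each whitespace run to one space and trim both ends."""
    return _WHITESPACE_RUN.sub(" ", s).strip(" ")


def builtin_datatype_equal(type_name: str, s1: str, s2: str) -> bool:
    """Compare two strings as values of the built-in datatype type_name."""
    if type_name == "string":
        # (string equal): the strings must be identical.
        return s1 == s2
    if type_name == "token":
        # (token equal): compare after whitespace normalization.
        return normalize_white_space(s1) == normalize_white_space(s2)
    raise ValueError("not a built-in datatype: " + type_name)
```
]]></programlisting>
<para>For instance, two strings that differ only in the amount of
whitespace between words are equal as <literal>token</literal> values
but not as <literal>string</literal> values.</para>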
</section> | |
<section> | |
<title><literal>list</literal> pattern</title> | |
<para>We use the following additional notation:</para> | |
<variablelist> | |
<varlistentry><term><p:function name="split"> | |
<p:var range="string"/> | |
</p:function></term><listitem><para> | |
returns a sequence of strings, one for each whitespace-delimited token
of <p:var range="string"/>; each string in the returned sequence will | |
be non-empty and will not contain any | |
whitespace</para></listitem></varlistentry> | |
</variablelist> | |
<para>The semantics of the <literal>list</literal> pattern are as follows:</para> | |
<p:proofSystem> | |
<p:rule name="list"> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:function name="emptySet"/> | |
<p:function name="split"> | |
<p:var range="string"/> | |
</p:function> | |
<p:var range="pattern"/> | |
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:function name="emptySet"/> | |
<p:var range="string"/> | |
<p:element name="list"> | |
<p:var range="pattern"/> | |
</p:element> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
<note><para>It is crucial in the above inference rule that the | |
sequence that is matched against a pattern can contain consecutive | |
strings.</para></note> | |
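<para>A minimal Python sketch of the <literal>split</literal> function
follows. It assumes that whitespace means the characters space, tab,
line feed and carriage return, and it is meant only to illustrate the
description above.</para>
<programlisting><![CDATA[
```python
import re

# Assumption: whitespace is space, tab, line feed or carriage return.
_WHITESPACE_RUN = re.compile(r"[ \t\n\r]+")


def split(s: str) -> list:
    """Return the whitespace-delimited tokens of s, in order."""
    # re.split leaves empty strings at the ends when s starts or ends
    # with whitespace; dropping them keeps every token non-empty.
    return [token for token in _WHITESPACE_RUN.split(s) if token]
```
]]></programlisting>
<para>The result is a sequence of consecutive strings, which is then
matched against the pattern inside <literal>list</literal>.</para>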
</section> | |
</section> | |
<section id="validity"> | |
<title>Validity</title> | |
<para>Now we can define when an element is valid with respect to a | |
schema. We use the following additional notation:</para> | |
<variablelist> | |
<varlistentry><term><p:var range="element"/></term><listitem><para>ranges over elements</para></listitem></varlistentry> | |
<varlistentry><term><p:judgement name="valid"> | |
<p:var range="element"/> | |
</p:judgement></term><listitem><para> | |
asserts that the element <p:var range="element"/> is valid with | |
respect to the grammar</para></listitem></varlistentry> | |
<varlistentry><term><p:judgement name="start"> | |
<p:var range="pattern"/> | |
</p:judgement></term><listitem><para> | |
asserts that the grammar contains | |
<p:element name="start"><p:var range="pattern"/> </p:element></para></listitem></varlistentry> | |
</variablelist> | |
<para>An element is valid if, together with an empty set of attributes,
it matches the <literal>start</literal> pattern of the grammar.</para>
<p:proofSystem> | |
<p:rule name="valid"> | |
<p:judgement name="start"> | |
<p:var range="pattern"/> | |
</p:judgement> | |
<p:judgement name="match"> | |
<p:var range="context"/> | |
<p:function name="emptySet"/> | |
<p:var range="element"/> | |
<p:var range="pattern"/> | |
</p:judgement> | |
<p:judgement name="valid"> | |
<p:var range="element"/> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
</section> | |
<section> | |
<title>Example</title> | |
<para>Let <p:var range="element" sub="0"/> be</para> | |
<p:formula> | |
<p:function name="element"> | |
<p:function name="name"> | |
<p:function name="emptyString"/> | |
<p:string>foo</p:string> | |
</p:function> | |
<p:var range="context" sub="0"/> | |
<p:function name="emptySet"/> | |
<p:var range="mixed"/> | |
</p:function> | |
</p:formula> | |
<para>where <p:var range="mixed"/> is</para> | |
<p:formula> | |
<p:function name="append"> | |
<p:var range="element" sub="1"/> | |
<p:var range="element" sub="2"/> | |
</p:function> | |
</p:formula> | |
<para>and <p:var range="element" sub="1"/> is</para> | |
<p:formula> | |
<p:function name="element"> | |
<p:function name="name"> | |
<p:string>http://www.example.com/n1</p:string> | |
<p:string>bar1</p:string> | |
</p:function> | |
<p:var range="context" sub="1"/> | |
<p:function name="emptySet"/> | |
<p:function name="emptySequence"/> | |
</p:function> | |
</p:formula> | |
<para>and <p:var range="element" sub="2"/> is</para> | |
<p:formula> | |
<p:function name="element"> | |
<p:function name="name"> | |
<p:string>http://www.example.com/n2</p:string> | |
<p:string>bar2</p:string> | |
</p:function> | |
<p:var range="context" sub="2"/> | |
<p:function name="emptySet"/> | |
<p:function name="emptySequence"/> | |
</p:function> | |
</p:formula> | |
<para>Assuming appropriate definitions of <p:var range="context" | |
sub="0"/>, <p:var range="context" sub="1"/> and <p:var range="context" | |
sub="2"/>, this represents the document in <xref | |
linkend="data-model-example"/>.</para> | |
<para>We now show that <p:var range="element" sub="0"/> is valid
with respect to the schema in <xref
linkend="simple-syntax-example"/>. The schema is equivalent to the
following propositions:</para>
<p:formula> | |
<p:judgement name="start"> | |
<p:element name="ref"> | |
<p:attribute name="name"><p:string>foo.element</p:string></p:attribute>
</p:element> | |
</p:judgement> | |
</p:formula> | |
<p:formula> | |
<p:judgement name="bind"> | |
<p:string>foo.element</p:string> | |
<p:element name="name"> | |
<p:attribute name="ns"><p:function name="emptyString"/></p:attribute> | |
<p:string>foo</p:string> | |
</p:element> | |
<p:element name="group"> | |
<p:element name="ref">
<p:attribute name="name">
<p:string>bar1.element</p:string>
</p:attribute>
</p:element>
<p:element name="ref">
<p:attribute name="name">
<p:string>bar2.element</p:string>
</p:attribute>
</p:element>
</p:element> | |
</p:judgement> | |
</p:formula> | |
<p:formula> | |
<p:judgement name="bind"> | |
<p:string>bar1.element</p:string> | |
<p:element name="name"> | |
<p:attribute name="ns"> | |
<p:string>http://www.example.com/n1</p:string> | |
</p:attribute> | |
<p:string>bar1</p:string> | |
</p:element> | |
<p:element name="empty"/> | |
</p:judgement> | |
</p:formula> | |
<p:formula> | |
<p:judgement name="bind"> | |
<p:string>bar2.element</p:string> | |
<p:element name="name"> | |
<p:attribute name="ns"> | |
<p:string>http://www.example.com/n2</p:string> | |
</p:attribute> | |
<p:string>bar2</p:string> | |
</p:element> | |
<p:element name="empty"/> | |
</p:judgement> | |
</p:formula> | |
<para>Let name class <p:var range="nameClass" sub="1"/> be</para> | |
<p:formula> | |
<p:element name="name"> | |
<p:attribute name="ns"> | |
<p:string>http://www.example.com/n1</p:string> | |
</p:attribute> | |
<p:string>bar1</p:string> | |
</p:element> | |
</p:formula> | |
<para>and let <p:var range="nameClass" sub="2"/> be</para> | |
<p:formula> | |
<p:element name="name"> | |
<p:attribute name="ns"> | |
<p:string>http://www.example.com/n2</p:string> | |
</p:attribute> | |
<p:string>bar2</p:string> | |
</p:element> | |
</p:formula> | |
<para>Then, by the inference rule (name) in <xref | |
linkend="name-classes"/>, we have</para> | |
<p:formula> | |
<p:judgement name="belongs"> | |
<p:function name="name"> | |
<p:string>http://www.example.com/n1</p:string> | |
<p:string>bar1</p:string> | |
</p:function> | |
<p:var range="nameClass" sub="1"/> | |
</p:judgement> | |
</p:formula> | |
<para>and</para> | |
<p:formula> | |
<p:judgement name="belongs"> | |
<p:function name="name"> | |
<p:string>http://www.example.com/n2</p:string> | |
<p:string>bar2</p:string> | |
</p:function> | |
<p:var range="nameClass" sub="2"/> | |
</p:judgement> | |
</p:formula> | |
<para>By the inference rule (empty) in <xref linkend="empty-pattern"/>, | |
we have</para> | |
<p:formula> | |
<p:judgement name="match"> | |
<p:var range="context" sub="1"/> | |
<p:function name="emptySet"/> | |
<p:function name="emptySequence"/> | |
<p:element name="empty"></p:element> | |
</p:judgement> | |
</p:formula> | |
<para>and</para> | |
<p:formula> | |
<p:judgement name="match"> | |
<p:var range="context" sub="2"/> | |
<p:function name="emptySet"/> | |
<p:function name="emptySequence"/> | |
<p:element name="empty"></p:element> | |
</p:judgement> | |
</p:formula> | |
<para>Thus by the inference rule (element) in <xref | |
linkend="element-pattern"/>, we have</para> | |
<p:formula> | |
<p:judgement name="match"> | |
<p:var range="context" sub="0"/> | |
<p:function name="emptySet"/> | |
<p:var range="element" sub="1"/> | |
<p:element name="ref"> | |
<p:attribute name="name"> | |
<p:string>bar1.element</p:string>
</p:attribute> | |
</p:element> | |
</p:judgement> | |
</p:formula> | |
<para>Note that we have chosen <p:var | |
range="context" sub="0"/>, since any context is allowed.</para> | |
<para>Likewise, we have</para> | |
<p:formula> | |
<p:judgement name="match"> | |
<p:var range="context" sub="0"/> | |
<p:function name="emptySet"/> | |
<p:var range="element" sub="2"/> | |
<p:element name="ref"> | |
<p:attribute name="name"> | |
<p:string>bar2.element</p:string>
</p:attribute> | |
</p:element> | |
</p:judgement> | |
</p:formula> | |
<para>By the inference rule (group) in <xref | |
linkend="choice-pattern"/>, we have</para> | |
<p:formula> | |
<p:judgement name="match"> | |
<p:var range="context" sub="0"/> | |
<p:function name="emptySet"/> | |
<p:function name="append"> | |
<p:var range="element" sub="1"/> | |
<p:var range="element" sub="2"/> | |
</p:function> | |
<p:element name="group"> | |
<p:element name="ref">
<p:attribute name="name">
<p:string>bar1.element</p:string>
</p:attribute>
</p:element>
<p:element name="ref">
<p:attribute name="name">
<p:string>bar2.element</p:string>
</p:attribute>
</p:element>
</p:element> | |
</p:judgement> | |
</p:formula> | |
<para>By the inference rule (element) in <xref | |
linkend="element-pattern"/>, we have</para> | |
<p:formula> | |
<p:judgement name="match"> | |
<p:var range="context" sub="3"/> | |
<p:function name="emptySet"/> | |
<p:function name="element"> | |
<p:function name="name"> | |
<p:function name="emptyString"/> | |
<p:string>foo</p:string> | |
</p:function> | |
<p:var range="context" sub="0"/> | |
<p:function name="emptySet"/> | |
<p:var range="mixed"/> | |
</p:function> | |
<p:element name="ref"> | |
<p:attribute name="name"> | |
<p:string>foo.element</p:string>
</p:attribute> | |
</p:element> | |
</p:judgement> | |
</p:formula> | |
<para>Here <p:var range="context" sub="3"/> is an arbitrary | |
context.</para> | |
<para>Thus we can apply the inference rule (valid) in <xref | |
linkend="validity"/> and obtain</para> | |
<p:formula> | |
<p:judgement name="valid"> | |
<p:var range="element" sub="0"/> | |
</p:judgement> | |
</p:formula> | |
</section> | |
</section> | |
<section id="restriction"> | |
<title>Restrictions</title> | |
<para>The following constraints are all checked after the grammar has | |
been transformed to the simple form described in <xref | |
linkend="simple-syntax"/>. The purpose of these restrictions is to | |
catch user errors and to facilitate implementation.</para> | |
<section id="contextual-restriction"> | |
<title>Contextual restrictions</title> | |
<para>In this section we describe restrictions on where elements are | |
allowed in the schema based on the names of the ancestor elements. We | |
use the concept of a <firstterm>prohibited path</firstterm> to | |
describe these restrictions. A path is a sequence of NCNames separated | |
by <literal>/</literal> or <literal>//</literal>.</para> | |
<itemizedlist> | |
<listitem><para>An element matches a path | |
<replaceable>x</replaceable>, where <replaceable>x</replaceable> is an | |
NCName, if and only if the local name of the element is | |
<replaceable>x</replaceable></para></listitem> | |
<listitem><para>An element matches a path | |
<replaceable>x</replaceable><literal>/</literal><replaceable>p</replaceable>, | |
where <replaceable>x</replaceable> is an NCName and | |
<replaceable>p</replaceable> is a path, if and only if the local name | |
of the element is <replaceable>x</replaceable> and the element has a | |
child that matches <replaceable>p</replaceable></para></listitem> | |
<listitem><para>An element matches a path | |
<replaceable>x</replaceable><literal>//</literal><replaceable>p</replaceable>, | |
where <replaceable>x</replaceable> is an NCName and | |
<replaceable>p</replaceable> is a path, if and only if the local name | |
of the element is <replaceable>x</replaceable> and the element has a | |
descendant that matches <replaceable>p</replaceable></para></listitem> | |
</itemizedlist> | |
<para>For example, the element</para> | |
<programlisting><![CDATA[<foo> | |
<bar> | |
<baz/> | |
</bar> | |
</foo>]]></programlisting> | |
<para>matches the paths <literal>foo</literal>, | |
<literal>foo/bar</literal>, <literal>foo//bar</literal>, | |
<literal>foo//baz</literal>, <literal>foo/bar/baz</literal>, | |
<literal>foo/bar//baz</literal> and <literal>foo//bar/baz</literal>, | |
but not <literal>foo/baz</literal> or | |
<literal>foobar</literal>.</para> | |
<para>A correct RELAX NG schema must be such that, after | |
transformation to the simple form, it does not contain any element | |
that matches a prohibited path.</para> | |
<section> | |
<title><literal>attribute</literal> pattern</title> | |
<para>The following paths are prohibited:</para> | |
<itemizedlist> | |
<listitem><para><literal>attribute//ref</literal></para></listitem> | |
<listitem><para><literal>attribute//attribute</literal></para></listitem> | |
</itemizedlist> | |
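<para>As a non-normative illustration, the following fragment, written
in the simple form with hypothetical attribute names, matches the
prohibited path <literal>attribute//attribute</literal>, because the
outer <literal>attribute</literal> element has an
<literal>attribute</literal> descendant:</para>
<programlisting><![CDATA[<attribute>
  <name ns="">outer</name>
  <attribute>
    <name ns="">inner</name>
    <text/>
  </attribute>
</attribute>]]></programlisting>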
</section> | |
<section> | |
<title><literal>oneOrMore</literal> pattern</title> | |
<para>The following paths are prohibited:</para> | |
<itemizedlist> | |
<listitem><para><literal>oneOrMore//group//attribute</literal></para></listitem> | |
<listitem><para><literal>oneOrMore//interleave//attribute</literal></para></listitem> | |
</itemizedlist> | |
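<para>As a non-normative illustration, the following fragment, written
in the simple form with hypothetical names, matches the prohibited
path <literal>oneOrMore//group//attribute</literal>:</para>
<programlisting><![CDATA[<oneOrMore>
  <group>
    <attribute>
      <name ns="">a</name>
      <text/>
    </attribute>
    <ref name="b"/>
  </group>
</oneOrMore>]]></programlisting>
<para>If the <literal>group</literal> element in this sketch were
replaced by a <literal>choice</literal> element, the fragment would
match neither of the two paths.</para>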
</section> | |
<section id="list-restrictions"> | |
<title><literal>list</literal> pattern</title> | |
<para>The following paths are prohibited:</para> | |
<itemizedlist> | |
<listitem><para><literal>list//list</literal></para></listitem> | |
<listitem><para><literal>list//ref</literal></para></listitem> | |
<listitem><para><literal>list//attribute</literal></para></listitem> | |
<listitem><para><literal>list//text</literal></para></listitem> | |
<listitem><para><literal>list//interleave</literal></para></listitem> | |
</itemizedlist> | |
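<para>As a non-normative illustration, the following fragment in the
simple form matches the prohibited path
<literal>list//text</literal>:</para>
<programlisting><![CDATA[<list>
  <oneOrMore>
    <text/>
  </oneOrMore>
</list>]]></programlisting>
<para>Replacing the <literal>text</literal> element with a
<literal>data</literal> element, as in the following sketch that
assumes the <literal>token</literal> datatype of the built-in datatype
library, yields a fragment that matches none of the paths
above:</para>
<programlisting><![CDATA[<list>
  <oneOrMore>
    <data datatypeLibrary="" type="token"/>
  </oneOrMore>
</list>]]></programlisting>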
</section> | |
<section id="context-data-except"> | |
<title><literal>except</literal> in <literal>data</literal> pattern</title> | |
<para>The following paths are prohibited:</para> | |
<itemizedlist> | |
<listitem><para><literal>data/except//attribute</literal></para></listitem> | |
<listitem><para><literal>data/except//ref</literal></para></listitem> | |
<listitem><para><literal>data/except//text</literal></para></listitem> | |
<listitem><para><literal>data/except//list</literal></para></listitem> | |
<listitem><para><literal>data/except//group</literal></para></listitem> | |
<listitem><para><literal>data/except//interleave</literal></para></listitem> | |
<listitem><para><literal>data/except//oneOrMore</literal></para></listitem> | |
<listitem><para><literal>data/except//empty</literal></para></listitem> | |
</itemizedlist> | |
<note><para>This implies that an <literal>except</literal> element | |
with a <literal>data</literal> parent can contain only | |
<literal>data</literal>, <literal>value</literal> and | |
<literal>choice</literal> elements.</para></note> | |
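<para>As a non-normative illustration, the following fragment, written
in the simple form with hypothetical values and assuming the
<literal>token</literal> datatype of the built-in datatype library,
matches none of the paths above:</para>
<programlisting><![CDATA[<data datatypeLibrary="" type="token">
  <except>
    <choice>
      <value datatypeLibrary="" type="token" ns="">foo</value>
      <value datatypeLibrary="" type="token" ns="">bar</value>
    </choice>
  </except>
</data>]]></programlisting>
<para>Replacing either <literal>value</literal> element in this sketch
with a <literal>text</literal> element would make the fragment match
the prohibited path <literal>data/except//text</literal>.</para>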
</section> | |
<section id="context-start"> | |
<title><literal>start</literal> element</title> | |
<para>The following paths are prohibited:</para> | |
<itemizedlist> | |
<listitem><para><literal>start//attribute</literal></para></listitem> | |
<listitem><para><literal>start//data</literal></para></listitem> | |
<listitem><para><literal>start//value</literal></para></listitem> | |
<listitem><para><literal>start//text</literal></para></listitem> | |
<listitem><para><literal>start//list</literal></para></listitem> | |
<listitem><para><literal>start//group</literal></para></listitem> | |
<listitem><para><literal>start//interleave</literal></para></listitem> | |
<listitem><para><literal>start//oneOrMore</literal></para></listitem> | |
<listitem><para><literal>start//empty</literal></para></listitem> | |
</itemizedlist> | |
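<para>As a non-normative illustration, the following
<literal>start</literal> element, which uses hypothetical definition
names, contains only <literal>choice</literal> and
<literal>ref</literal> elements and so matches none of the paths
above:</para>
<programlisting><![CDATA[<start>
  <choice>
    <ref name="doc"/>
    <ref name="fragment"/>
  </choice>
</start>]]></programlisting>
<para>Combining the two <literal>ref</literal> elements with a
<literal>group</literal> element instead of a
<literal>choice</literal> element would match the prohibited path
<literal>start//group</literal>.</para>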
</section> | |
</section> | |
<section id="string-sequences"> | |
<title>String sequences</title> | |
<para>RELAX NG does not allow a pattern such as:</para> | |
<programlisting><![CDATA[<element name="foo"> | |
<group> | |
<data type="int"/> | |
<element name="bar"> | |
<empty/> | |
</element> | |
</group> | |
</element>]]></programlisting> | |
<para>Nor does it allow a pattern such as:</para> | |
<programlisting><![CDATA[<element name="foo"> | |
<group> | |
<data type="int"/> | |
<text/> | |
</group> | |
</element>]]></programlisting> | |
<para>More generally, if the pattern for the content of an element or | |
attribute contains</para> | |
<itemizedlist> | |
<listitem><para>a pattern that can match a child | |
(that is, an <literal>element</literal>, <literal>data</literal>, | |
<literal>value</literal>, <literal>list</literal> or | |
<literal>text</literal> pattern), and</para></listitem> | |
<listitem><para>a pattern that matches a single string (that is, a | |
<literal>data</literal>, <literal>value</literal> or | |
<literal>list</literal> pattern),</para></listitem> | |
</itemizedlist> | |
<para>then the two patterns must be alternatives to each other.</para> | |
<para>This rule does not apply to patterns occurring within a | |
<literal>list</literal> pattern.</para> | |
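<para>By way of contrast, this rule does not exclude a pattern along
the following lines, in which the <literal>data</literal> pattern and
the <literal>element</literal> pattern are alternatives to each other.
This is a non-normative sketch that reuses the element names from the
examples above:</para>
<programlisting><![CDATA[<element name="foo">
  <choice>
    <data type="int"/>
    <element name="bar">
      <empty/>
    </element>
  </choice>
</element>]]></programlisting>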
<para>To formalize this, we use the concept of a content-type. A | |
pattern that is allowable as the content of an element has one of | |
three content-types: empty, complex and simple. We use the following | |
notation.</para> | |
<variablelist> | |
<varlistentry> | |
<term><p:function name="empty"/></term> | |
<listitem><para>returns the empty content-type</para></listitem> | |
</varlistentry> | |
<varlistentry> | |
<term><p:function name="complex"/></term> | |
<listitem><para>returns the complex content-type</para></listitem> | |
</varlistentry> | |
<varlistentry> | |
<term><p:function name="simple"/></term> | |
<listitem><para>returns the simple content-type</para></listitem> | |
</varlistentry> | |
<varlistentry><term><p:var range="contentType"/></term> | |
<listitem><para>ranges over content-types</para></listitem> | |
</varlistentry> | |
<varlistentry><term> | |
<p:judgement name="groupable"> | |
<p:var range="contentType" sub="1"/> | |
<p:var range="contentType" sub="2"/> | |
</p:judgement> | |
</term> | |
<listitem><para>asserts that the content-types <p:var | |
range="contentType" sub="1"/> and <p:var range="contentType" sub="2"/> | |
are groupable</para></listitem> | |
</varlistentry> | |
</variablelist> | |
<para>The empty content-type is groupable with anything. In addition,
the complex content-type is groupable with itself. The simple
content-type is therefore groupable only with the empty content-type.
The following rules formalize this.</para>
<p:proofSystem> | |
<p:rule name="group empty 1"> | |
<p:judgement name="groupable"> | |
<p:function name="empty"/> | |
<p:var range="contentType"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="group empty 2"> | |
<p:judgement name="groupable"> | |
<p:var range="contentType"/> | |
<p:function name="empty"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="group complex"> | |
<p:judgement name="groupable"> | |
<p:function name="complex"/> | |
<p:function name="complex"/> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
<para>Some patterns have a content-type. We use the following | |
additional notation.</para> | |
<variablelist> | |
<varlistentry><term> | |
<p:judgement name="contentType"> | |
<p:var range="pattern"/> | |
<p:var range="contentType"/> | |
</p:judgement> | |
</term> | |
<listitem><para>asserts that pattern <p:var range="pattern"/> has | |
content-type <p:var range="contentType"/></para></listitem> | |
</varlistentry> | |
<varlistentry><term> | |
<p:function name="max"> | |
<p:var range="contentType" sub="1"/> | |
<p:var range="contentType" sub="2"/> | |
</p:function> | |
</term> | |
<listitem><para>returns the maximum of <p:var range="contentType" | |
sub="1"/> and <p:var range="contentType" sub="2"/> where the | |
content-types in increasing order are <p:function name="empty"/>, | |
<p:function name="complex"/>, <p:function | |
name="simple"/></para></listitem> | |
</varlistentry> | |
</variablelist> | |
<para>The following rules define when a pattern has a content-type and, | |
if so, what it is.</para> | |
<p:proofSystem> | |
<p:rule name="value"> | |
<p:judgement name="contentType"> | |
<p:element name="value"> | |
<p:attribute name="datatypeLibrary"> | |
<p:var range="uri" sub="1"/> | |
</p:attribute> | |
<p:attribute name="type"> | |
<p:var range="ncname"/> | |
</p:attribute> | |
<p:attribute name="ns"> | |
<p:var range="uri" sub="2"/> | |
</p:attribute> | |
<p:var range="string"/> | |
</p:element> | |
<p:function name="simple"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="data 1"> | |
<p:judgement name="contentType"> | |
<p:element name="data"> | |
<p:attribute name="datatypeLibrary"> | |
<p:var range="uri"/> | |
</p:attribute> | |
<p:attribute name="type"> | |
<p:var range="ncname"/> | |
</p:attribute> | |
<p:var range="params"/> | |
</p:element> | |
<p:function name="simple"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="data 2"> | |
<p:judgement name="contentType"> | |
<p:var range="pattern"/> | |
<p:var range="contentType"/> | |
</p:judgement> | |
<p:judgement name="contentType"> | |
<p:element name="data"> | |
<p:attribute name="datatypeLibrary"> | |
<p:var range="uri"/> | |
</p:attribute> | |
<p:attribute name="type"> | |
<p:var range="ncname"/> | |
</p:attribute> | |
<p:var range="params"/> | |
<p:element name="except"> | |
<p:var range="pattern"/> | |
</p:element> | |
</p:element> | |
<p:function name="simple"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="list"> | |
<p:judgement name="contentType"> | |
<p:element name="list"> | |
<p:var range="pattern"/> | |
</p:element> | |
<p:function name="simple"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="text"> | |
<p:judgement name="contentType"> | |
<p:element name="text"/> | |
<p:function name="complex"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="ref"> | |
<p:judgement name="contentType"> | |
<p:element name="ref"> | |
<p:attribute name="name"> | |
<p:var range="ncname"/> | |
</p:attribute> | |
</p:element> | |
<p:function name="complex"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="empty"> | |
<p:judgement name="contentType"> | |
<p:element name="empty"/> | |
<p:function name="empty"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="attribute"> | |
<p:judgement name="contentType"> | |
<p:var range="pattern"/> | |
<p:var range="contentType"/> | |
</p:judgement> | |
<p:judgement name="contentType"> | |
<p:element name="attribute"> | |
<p:var range="nameClass"/> | |
<p:var range="pattern"/> | |
</p:element> | |
<p:function name="empty"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="group"> | |
<p:judgement name="contentType"> | |
<p:var range="pattern" sub="1"/> | |
<p:var range="contentType" sub="1"/> | |
</p:judgement> | |
<p:judgement name="contentType"> | |
<p:var range="pattern" sub="2"/> | |
<p:var range="contentType" sub="2"/> | |
</p:judgement> | |
<p:judgement name="groupable"> | |
<p:var range="contentType" sub="1"/> | |
<p:var range="contentType" sub="2"/> | |
</p:judgement> | |
<p:judgement name="contentType"> | |
<p:element name="group"> | |
<p:var range="pattern" sub="1"/> | |
<p:var range="pattern" sub="2"/> | |
</p:element> | |
<p:function name="max"> | |
<p:var range="contentType" sub="1"/> | |
<p:var range="contentType" sub="2"/> | |
</p:function> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="interleave"> | |
<p:judgement name="contentType"> | |
<p:var range="pattern" sub="1"/> | |
<p:var range="contentType" sub="1"/> | |
</p:judgement> | |
<p:judgement name="contentType"> | |
<p:var range="pattern" sub="2"/> | |
<p:var range="contentType" sub="2"/> | |
</p:judgement> | |
<p:judgement name="groupable"> | |
<p:var range="contentType" sub="1"/> | |
<p:var range="contentType" sub="2"/> | |
</p:judgement> | |
<p:judgement name="contentType"> | |
<p:element name="interleave"> | |
<p:var range="pattern" sub="1"/> | |
<p:var range="pattern" sub="2"/> | |
</p:element> | |
<p:function name="max"> | |
<p:var range="contentType" sub="1"/> | |
<p:var range="contentType" sub="2"/> | |
</p:function> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="oneOrMore"> | |
<p:judgement name="contentType"> | |
<p:var range="pattern"/> | |
<p:var range="contentType"/> | |
</p:judgement> | |
<p:judgement name="groupable"> | |
<p:var range="contentType"/> | |
<p:var range="contentType"/> | |
</p:judgement> | |
<p:judgement name="contentType"> | |
<p:element name="oneOrMore"> | |
<p:var range="pattern"/> | |
</p:element> | |
<p:var range="contentType"/> | |
</p:judgement> | |
</p:rule> | |
<p:rule name="choice"> | |
<p:judgement name="contentType"> | |
<p:var range="pattern" sub="1"/> | |
<p:var range="contentType" sub="1"/> | |
</p:judgement> | |
<p:judgement name="contentType"> | |
<p:var range="pattern" sub="2"/> | |
<p:var range="contentType" sub="2"/> | |
</p:judgement> | |
<p:judgement name="contentType"> | |
<p:element name="choice"> | |
<p:var range="pattern" sub="1"/> | |
<p:var range="pattern" sub="2"/> | |
</p:element> | |
<p:function name="max"> | |
<p:var range="contentType" sub="1"/> | |
<p:var range="contentType" sub="2"/> | |
</p:function> | |
</p:judgement> | |
</p:rule> | |
</p:proofSystem> | |
<note><para>The antecedent in the (data 2) rule above is in fact | |
redundant because of the prohibited paths in <xref | |
linkend="context-data-except"/>.</para></note> | |
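<para>As an informal illustration of how these rules combine, the
comments in the following sketch give the content-type that the rules
would assign to each pattern. The attribute name is hypothetical, and
the fragment is shown in the full syntax for brevity, whereas the rules
themselves apply to the simple form:</para>
<programlisting><![CDATA[<element name="foo">
  <group>                  <!-- max(empty, simple) = simple, by (group) -->
    <attribute name="a"/>  <!-- empty, by (attribute) -->
    <data type="int"/>     <!-- simple, by (data 1) -->
  </group>
</element>]]></programlisting>
<para>The (group) rule is applicable here because the empty
content-type is groupable with the simple content-type by the rule
(group empty 1).</para>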
<para>Now we can describe the restriction. We use the following | |
notation.</para> | |
<variablelist> | |
<varlistentry><term> | |
<p:judgement name="incorrectSchema"/> | |
</term> | |
<listitem><para>asserts that the schema is incorrect</para></listitem> | |
</varlistentry> | |
</variablelist> | |
<para>All patterns occurring as the content of an element pattern must | |
have a content-type.</para> | |
<p:proofSystem> | |
<p:rule name="element"> | |
<p:judgement name="bind"> | |
<p:var range="ncname"/> | |
<p:var range="nameClass"/> | |
<p:var range="pattern"/> | |
</p:judgement> | |
<p:not> | |
<p:judgement name="contentType"> | |
<p:var range="pattern"/> | |
<p:var range="contentType"/> | |
</p:judgement> | |
</p:not> | |
<p:judgement name="incorrectSchema"/> | |
</p:rule> | |
</p:proofSystem> | |
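<para>As an informal illustration, consider again the second
disallowed example at the start of this section. The comments in the
following sketch show why its content has no content-type:</para>
<programlisting><![CDATA[<element name="foo">
  <group>               <!-- no content-type: simple and complex are not groupable -->
    <data type="int"/>  <!-- simple, by (data 1) -->
    <text/>             <!-- complex, by (text) -->
  </group>
</element>]]></programlisting>
<para>Since no content-type can be derived for the
<literal>group</literal> pattern, the (element) rule above applies and
the schema is incorrect.</para>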
</section> | |
<section id="attribute-restrictions"> | |
<title>Restrictions on attributes</title> | |
<para>Duplicate attributes are not allowed. More precisely, for a | |
pattern <literal>&lt;group&gt; <replaceable>p1</replaceable>
<replaceable>p2</replaceable> &lt;/group&gt;</literal> or
<literal>&lt;interleave&gt; <replaceable>p1</replaceable>
<replaceable>p2</replaceable> &lt;/interleave&gt;</literal>, there must
not be a name that belongs to both the name class of an | |
<literal>attribute</literal> pattern occurring in | |
<replaceable>p1</replaceable> and the name class of an | |
<literal>attribute</literal> pattern occurring in | |
<replaceable>p2</replaceable>. A pattern <replaceable>p1</replaceable> | |
is defined to <firstterm>occur in</firstterm> a pattern | |
<replaceable>p2</replaceable> if</para> | |
<itemizedlist> | |
<listitem><para><replaceable>p1</replaceable> is | |
<replaceable>p2</replaceable>, or</para></listitem> | |
<listitem><para><replaceable>p2</replaceable> is a | |
<literal>choice</literal>, <literal>interleave</literal>, | |
<literal>group</literal> or <literal>oneOrMore</literal> element and | |
<replaceable>p1</replaceable> occurs in one or more children of | |
<replaceable>p2</replaceable>.</para></listitem> | |
</itemizedlist> | |
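<para>As a non-normative illustration, the following sketch, shown in
the full syntax with hypothetical attribute names, violates this
restriction. The name <literal>id</literal> belongs to the name class
of the <literal>attribute</literal> pattern that is the first child of
the <literal>group</literal> element and also to the name class of an
<literal>attribute</literal> pattern occurring in the second
child:</para>
<programlisting><![CDATA[<group>
  <attribute name="id"/>
  <choice>
    <attribute name="id"/>
    <attribute name="ref"/>
  </choice>
</group>]]></programlisting>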
<para>Attributes using infinite name classes must be repeated. More | |
precisely, an <literal>attribute</literal> element that has an | |
<literal>anyName</literal> or <literal>nsName</literal> descendant | |
element must have a <literal>oneOrMore</literal> ancestor | |
element.</para> | |
<note><para>This restriction is necessary for closure under | |
negation.</para></note> | |
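<para>As a non-normative illustration, the following sketch in the
full syntax satisfies the requirement on infinite name classes,
because transformation to the simple form turns the
<literal>zeroOrMore</literal> element into a choice between a
<literal>oneOrMore</literal> element and an <literal>empty</literal>
element, so that the <literal>attribute</literal> element has a
<literal>oneOrMore</literal> ancestor:</para>
<programlisting><![CDATA[<zeroOrMore>
  <attribute>
    <anyName/>
  </attribute>
</zeroOrMore>]]></programlisting>
<para>The same <literal>attribute</literal> pattern appearing without
any <literal>oneOrMore</literal> ancestor in the simple form would not
be allowed.</para>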
</section> | |
<section id="interleave-restrictions"> | |
<title>Restrictions on <literal>interleave</literal></title> | |
<para>For a pattern <literal>&lt;interleave&gt;
<replaceable>p1</replaceable> <replaceable>p2</replaceable>
&lt;/interleave&gt;</literal>,</para>
<itemizedlist> | |
<listitem><para>there must not be a name that belongs to both the name | |
class of an <literal>element</literal> pattern referenced by a | |
<literal>ref</literal> pattern occurring in | |
<replaceable>p1</replaceable> and the name class of an | |
<literal>element</literal> pattern referenced by a | |
<literal>ref</literal> pattern occurring in | |
<replaceable>p2</replaceable>, and</para></listitem> | |
<listitem><para>a <literal>text</literal> pattern must not occur in | |
both <replaceable>p1</replaceable> and | |
<replaceable>p2</replaceable>.</para></listitem> | |
</itemizedlist> | |
<para><xref linkend="attribute-restrictions"/> defines when one | |
pattern is considered to occur in another pattern.</para> | |
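<para>As a non-normative illustration, the following sketch, shown in
the full syntax with hypothetical element names, violates the first
restriction. After transformation to the simple form each
<literal>element</literal> pattern is referenced by a
<literal>ref</literal> pattern, and the name <literal>a</literal>
belongs to the name class of an <literal>element</literal> pattern on
both sides of the <literal>interleave</literal>:</para>
<programlisting><![CDATA[<interleave>
  <element name="a">
    <empty/>
  </element>
  <choice>
    <element name="a">
      <text/>
    </element>
    <element name="b">
      <empty/>
    </element>
  </choice>
</interleave>]]></programlisting>
<para>The next sketch violates the second restriction, because a
<literal>text</literal> pattern occurs in both children of the
<literal>interleave</literal> element:</para>
<programlisting><![CDATA[<interleave>
  <text/>
  <group>
    <element name="b">
      <empty/>
    </element>
    <text/>
  </group>
</interleave>]]></programlisting>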
</section> | |
</section> | |
<section id="conformance"> | |
<title>Conformance</title> | |
<para>A conforming RELAX NG validator must be able to determine for | |
any XML document whether it is a correct RELAX NG schema. A | |
conforming RELAX NG validator must be able to determine for any XML | |
document and for any correct RELAX NG schema whether the document is | |
valid with respect to the schema.</para> | |
<para>However, the requirements in the preceding paragraph do not | |
apply if the schema uses a datatype library that the validator does | |
not support. A conforming RELAX NG validator is only required to | |
support the built-in datatype library described in <xref | |
linkend="built-in-datatype"/>. A validator that claims conformance to | |
RELAX NG should document which datatype libraries it supports. The | |
requirements in the preceding paragraph also do not apply if the | |
schema includes <literal>externalRef</literal> or | |
<literal>include</literal> elements and the validator is unable to | |
retrieve the resource identified by the URI or is unable to construct | |
an element from the retrieved resource. A validator that claims | |
conformance to RELAX NG should document its capabilities for handling | |
URI references.</para> | |
</section> | |
<appendix> | |
<title>RELAX NG schema for RELAX NG</title> | |
<rngref src="relaxng.rng"/> | |
</appendix> | |
<appendix> | |
<title>Changes since version 0.9</title> | |
<para>The changes in this version relative to version 0.9 | |
are as follows:</para> | |
<itemizedlist> | |
<listitem><para>in the namespace URI, <literal>0.9</literal> has been | |
changed to <literal>1.0</literal></para></listitem> | |
<listitem><para><literal>data/except//empty</literal> has been added | |
as a prohibited path (see <xref | |
linkend="context-data-except"/>)</para></listitem> | |
<listitem><para><literal>start//empty</literal> has been added | |
as a prohibited path (see <xref | |
linkend="context-start"/>)</para></listitem> | |
<listitem><para><xref linkend="number-child-elements"/> now specifies how a | |
<literal>list</literal> element with more than one child element is | |
transformed</para></listitem> | |
<listitem><para><xref linkend="notAllowed"/> now specifies how a | |
<literal>notAllowed</literal> element occurring in an | |
<literal>except</literal> element is transformed</para></listitem> | |
<listitem><para>although a relative URI is not allowed as the value of | |
the <literal>ns</literal> and <literal>datatypeLibrary</literal> | |
attributes, an empty string is allowed (see <xref | |
linkend="full-syntax"/>)</para></listitem> | |
<listitem><para>the removal of unreachable definitions in <xref | |
linkend="define-ref"/> is now correctly specified</para></listitem> | |
<listitem><para><xref linkend="notAllowed"/> now specifies that | |
<literal>define</literal> elements that are no longer reachable are | |
removed</para></listitem> | |
<listitem><para><xref linkend="constraints"/> has been added; the | |
restrictions on the contents of <literal>except</literal> in name | |
classes that are now specified in the newly added section were | |
previously specified in a subsection of <xref | |
linkend="contextual-restriction"/>, which has been | |
removed</para></listitem> | |
<listitem><para>the treatment of element and attribute values that | |
consist only of whitespace has been refined (see <xref | |
linkend="element-pattern"/> and <xref | |
linkend="data-pattern"/>)</para></listitem> | |
<listitem><para>attributes with infinite name classes are now required | |
to be repeated (see <xref | |
linkend="attribute-restrictions"/>)</para></listitem> | |
<listitem><para>restrictions have been imposed on | |
<literal>interleave</literal> (see <xref | |
linkend="interleave-restrictions"/>); <literal>list//interleave</literal> | |
has been added as a prohibited path (see <xref | |
linkend="list-restrictions"/>)</para></listitem> | |
<listitem><para>some of the prohibited paths in <xref | |
linkend="contextual-restriction"/> have been corrected to use | |
<literal>ref</literal> rather than | |
<literal>element</literal></para></listitem> | |
<listitem><para>an error in the inference rule (text 1) in <xref | |
linkend="text-pattern"/> has been corrected</para></listitem> | |
<listitem><para>the value of the <literal>ns</literal> attribute is | |
now unconstrained (see <xref | |
linkend="full-syntax"/>)</para></listitem> | |
</itemizedlist> | |
</appendix> | |
<appendix> | |
<title>RELAX NG TC (Non-Normative)</title> | |
<para>This specification was prepared and approved for publication by | |
the RELAX NG TC. The current members of the TC are:</para> | |
<itemizedlist> | |
<listitem><para>Fabio Arciniegas</para></listitem> | |
<listitem><para>James Clark</para></listitem> | |
<listitem><para>Mike Fitzgerald</para></listitem> | |
<listitem><para>KAWAGUCHI Kohsuke</para></listitem> | |
<listitem><para>Josh Lubell</para></listitem> | |
<listitem><para>MURATA Makoto</para></listitem> | |
<listitem><para>Norman Walsh</para></listitem> | |
<listitem><para>David Webber</para></listitem> | |
</itemizedlist> | |
</appendix> | |
<bibliography><title>References</title> | |
<bibliodiv><title>Normative</title> | |
<bibliomixed id="xml-rec"><abbrev>XML 1.0</abbrev>Tim Bray,
Jean Paoli,
C. M. Sperberg-McQueen, and Eve Maler, editors.
<citetitle><ulink url="http://www.w3.org/TR/REC-xml">Extensible Markup | |
Language (XML) 1.0 Second Edition</ulink></citetitle>. | |
W3C (World Wide Web Consortium), 2000.</bibliomixed> | |
<bibliomixed id="xml-names"><abbrev>XML Namespaces</abbrev>Tim Bray, | |
Dave Hollander, | |
and Andrew Layman, editors. | |
<citetitle><ulink url="http://www.w3.org/TR/REC-xml-names/">Namespaces in | |
XML</ulink></citetitle>. | |
W3C (World Wide Web Consortium), 1999.</bibliomixed> | |
<bibliomixed id="xlink"><abbrev>XLink</abbrev>Steve DeRose, Eve Maler | |
and David Orchard, editors. | |
<citetitle><ulink url="http://www.w3.org/TR/xlink/">XML Linking | |
Language (XLink) Version 1.0</ulink></citetitle>. | |
W3C (World Wide Web Consortium), 2001.</bibliomixed> | |
<bibliomixed id="infoset"><abbrev>XML Infoset</abbrev>John Cowan, Richard Tobin, | |
editors. | |
<citetitle><ulink url="http://www.w3.org/TR/xml-infoset/">XML | |
Information Set</ulink></citetitle>. | |
W3C (World Wide Web Consortium), 2001.</bibliomixed> | |
<bibliomixed id="rfc2396"><abbrev>RFC 2396</abbrev>T. Berners-Lee, R. Fielding, L. Masinter.
<citetitle><ulink url="http://www.ietf.org/rfc/rfc2396.txt">RFC 2396:
Uniform Resource Identifiers (URI): Generic
Syntax</ulink></citetitle>.
IETF (Internet Engineering Task Force), 1998.</bibliomixed>
<bibliomixed id="rfc2732"><abbrev>RFC 2732</abbrev>R. Hinden, B. Carpenter, L. Masinter. | |
<citetitle><ulink url="http://www.ietf.org/rfc/rfc2732.txt">RFC 2732: Format for Literal IPv6 Addresses in URL's</ulink></citetitle>. | |
IETF (Internet Engineering Task Force), 1999.</bibliomixed> | |
<bibliomixed id="rfc3023"><abbrev>RFC 3023</abbrev>M. Murata,
S. St.Laurent, D. Kohn. <citetitle><ulink | |
url="http://www.ietf.org/rfc/rfc3023.txt">RFC 3023: XML Media | |
Types</ulink></citetitle>. IETF (Internet Engineering Task Force), | |
2001.</bibliomixed> | |
</bibliodiv> | |
<bibliodiv><title>Non-Normative</title> | |
<bibliomixed id="xmlschema-2"><abbrev>W3C XML Schema Datatypes</abbrev>Paul V. Biron, Ashok Malhotra, editors. | |
<citetitle><ulink url="http://www.w3.org/TR/xmlschema-2/">XML Schema Part 2: Datatypes</ulink></citetitle>. | |
W3C (World Wide Web Consortium), 2001.</bibliomixed> | |
<bibliomixed id="trex"><abbrev>TREX</abbrev>James Clark. | |
<citetitle><ulink url="http://www.thaiopensource.com/trex/">TREX - Tree Regular Expressions for XML</ulink></citetitle>. | |
Thai Open Source Software Center, 2001.</bibliomixed> | |
<bibliomixed id="relax"><abbrev>RELAX</abbrev>MURATA Makoto. | |
<citetitle><ulink url="http://www.xml.gr.jp/relax/">RELAX (Regular | |
Language description for XML)</ulink></citetitle>. INSTAC | |
(Information Technology Research and Standardization Center), 2001.</bibliomixed> | |
<bibliomixed id="xsfd"><abbrev>XML Schema Formal</abbrev>Allen Brown, | |
Matthew Fuchs, Jonathan Robie, Philip Wadler, editors. | |
<citetitle><ulink url="http://www.w3.org/TR/xmlschema-formal/">XML Schema: Formal Description</ulink></citetitle>. | |
W3C (World Wide Web Consortium), 2001.</bibliomixed> | |
<bibliomixed id="tutorial"><abbrev>Tutorial</abbrev>James Clark, | |
Makoto MURATA, editors. <citetitle><ulink | |
url="http://www.oasis-open.org/committees/relax-ng/tutorial.html">RELAX | |
NG Tutorial</ulink></citetitle>. OASIS, 2001.</bibliomixed> | |
</bibliodiv> | |
</bibliography> | |
</article> |