<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html">
  <style type="text/css">
<!--
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
-->
  </style>
  <title>XML resources publication guidelines</title>
</head>

<body bgcolor="#fffacd" text="#000000">
<h1 align="center">XML resources publication guidelines</h1>

<p></p>

<p>The goal of this document is to provide a set of guidelines and tips
to help with the publication and deployment of <a
href="http://www.w3.org/XML/">XML</a> resources for the <a
href="http://www.gnome.org/">GNOME project</a>. However, it is not tied to
GNOME and might be helpful more generally. I welcome <a
href="mailto:veillard@redhat.com">feedback</a> on this document.</p>

<p>The intended audience is software developers who have started using XML
for some of the resources of their project, as a storage format or for data
exchange, checking or transformations. An increasing number of new XML
formats have been defined, but not all the steps needed to truly gain the
benefits of XML have been taken, possibly because of a lack of documentation.
These guidelines hope to improve matters and provide a better overview of
XML processing as a whole and of the associated steps needed to deploy it
successfully.</p>

<p>Table of contents:</p>
<ol>
  <li><a href="#Design">Design guidelines</a></li>
  <li><a href="#Canonical">Canonical URL</a></li>
  <li><a href="#Catalog">Catalog setup</a></li>
  <li><a href="#Package">Package integration</a></li>
</ol>

<h2><a name="Design">Design guidelines</a></h2>

<p>This part focuses on the XML format itself. It may arrive
a bit too late, since the structure of the document may already be cast in
existing and deployed code. Still, here are a few rules which might be helpful
when designing a new XML vocabulary or revising an existing
format:</p>

<h3>Reuse existing formats:</h3>

<p>This may sound a bit simplistic, but before designing your own format,
try to look up existing XML vocabularies for similar data. Ideally this allows
you to reuse them, in which case many existing resources such as DTDs, schemas
and stylesheets may already be available. If you are looking for a
documentation format, <a href="http://www.docbook.org/">DocBook</a> should
handle your needs. If reuse is not possible because some semantic or use case
aspects are too different, studying existing formats will still help you avoid
design errors like targeting the vocabulary at the wrong abstraction level.
In this format design phase, try to be concise, be sure to express the real
content of your data, and use the XML structure to express the semantics and
context of that data.</p>

<h3>DTD rules:</h3>

<p>Building a DTD (Document Type Definition) or a Schema describing the
structure allowed in instances is the core of the design process of the
vocabulary. Here are a few tips:</p>
<ul>
  <li>use meaningful words for element and attribute names.</li>
  <li>do not use attributes for general textual content: attribute values
    are normalized by the parser before reaching the application, so
    line breaks and tabs are turned into spaces.</li>
  <li>use a separate element for every string that might be subject to
    localization. The canonical way to localize XML content is to use
    sibling elements carrying different xml:lang attributes, as in the
    following:
    <pre>&lt;welcome&gt;
  &lt;msg xml:lang="en"&gt;hello&lt;/msg&gt;
  &lt;msg xml:lang="fr"&gt;bonjour&lt;/msg&gt;
&lt;/welcome&gt;</pre>
  </li>
  <li>use attributes to refine the content of an element but avoid them for
    more complex tasks: parsing an attribute is not cheaper than parsing an
    element, and it is far easier to make element content more complex later,
    while attributes have to remain very simple.</li>
</ul>

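<p>As an illustration of the sibling-element localization pattern, here is a
minimal sketch using Python's standard <code>xml.etree.ElementTree</code>
module. The helper name <code>localized_text</code> and the fallback to
English are assumptions made for the example, not part of any GNOME or
libxml2 API.</p>

```python
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this namespace, so ElementTree
# reports the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

DOC = """<welcome>
  <msg xml:lang="en">hello</msg>
  <msg xml:lang="fr">bonjour</msg>
</welcome>"""


def localized_text(parent, tag, lang, fallback="en"):
    """Return the text of the `tag` child whose xml:lang equals `lang`.

    Falls back to the `fallback` language, then to the first sibling.
    (Hypothetical helper written for this example.)
    """
    candidates = parent.findall(tag)
    by_lang = {child.get(XML_LANG): child.text for child in candidates}
    if lang in by_lang:
        return by_lang[lang]
    if fallback in by_lang:
        return by_lang[fallback]
    return candidates[0].text if candidates else None


root = ET.fromstring(DOC)
print(localized_text(root, "msg", "fr"))  # bonjour
print(localized_text(root, "msg", "de"))  # hello (no German sibling, falls back)
```

<p>Because each translation is a separate element, adding a language only
means adding a sibling; the lookup code does not change.</p>
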
<h3>Versioning:</h3>

<p>As part of the design, make sure the structure you define will be usable
for future extensions that you may not have considered for the current
version. There are two parts to this:</p>
<ul>
  <li>Make sure the instance contains a version number, which makes
    backward compatibility easy to handle. Something as simple as having a
    <code>version="1.0"</code> attribute on the root element of the instance
    is sufficient.</li>
  <li>When designing the code that analyzes the data provided by the
    XML parser, make sure it can work with unknown versions: generate a UI
    warning and process only the tags recognized by your version. Keep in
    mind that the code should not break on unknown elements when the version
    attribute is not in the recognized set.</li>
</ul>

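<p>The second point can be sketched as follows, again with Python's standard
<code>xml.etree.ElementTree</code>. The set of known versions, the tag names
and the <code>load</code> function are all hypothetical; a real application
would show the warning in its UI instead of collecting it in a list.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical reader: the version set and tag names are illustrative only.
KNOWN_VERSIONS = {"1.0"}
KNOWN_TAGS = {"title", "item"}


def load(xml_text):
    """Return (data, warnings) for an instance, tolerating unknown versions."""
    root = ET.fromstring(xml_text)
    warnings = []
    version = root.get("version")
    if version not in KNOWN_VERSIONS:
        # Surface this to the user (a UI warning in a real application).
        warnings.append(f"unknown format version {version!r}")
    data = []
    for child in root:
        if child.tag in KNOWN_TAGS:
            data.append((child.tag, child.text))
        # Unrecognized elements are skipped rather than treated as errors.
    return data, warnings


data, warnings = load('<doc version="2.0"><title>t</title><extra/></doc>')
print(data)
print(warnings)
```

<p>The reader keeps what it understands from a newer instance and reports
the version mismatch once, instead of failing on the first unknown
element.</p>
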
<h3>Other design parts:</h3>

<p>While defining your vocabulary, try to think in terms of other uses of your
data, for example how XSLT stylesheets could be used to make an HTML
view of your data, or to convert it into a different format. It is also worth
looking at XML Schemas and considering defining an XML Schema with more
complete validation and datatyping of your data structures; this helps
avoid some mistakes in the design phase.</p>

<h3>Namespace:</h3>

<p>If you expect your XML vocabulary to be used or recognized outside of your
application (for example to bind specific processing from a graphical shell
like Nautilus to an instance of your data) then you should really define an <a
href="http://www.w3.org/TR/REC-xml-names/">XML namespace</a> for your
vocabulary. A namespace name is a URL (an absolute URI, more precisely). It is
generally recommended to anchor it as an HTTP resource on a server associated
with the software project. See the next section about this. In practice this
means that XML parsers will not handle your element names as-is but as a
pair made of the namespace name and the element name. This allows tools to
recognize your vocabulary and disambiguate processing. Uniqueness of the
namespace name can be guaranteed, for the most part, by the use of the DNS
registry. A namespace name can also carry versioning information, as in:</p>

<p><code>"http://www.gnome.org/project/projectname/1.0/"</code></p>

<p>An easy way to use a namespace is to make it the default namespace on the
root element of the XML instance, as in:</p>
<pre>&lt;structure xmlns="http://www.gnome.org/project/projectname/1.0/"&gt;
  &lt;data&gt;
  ...
  &lt;/data&gt;
&lt;/structure&gt;</pre>

<p>In that document, structure and all descendant elements like data are in
the given namespace.</p>

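<p>To make the "pair of namespace name and element name" concrete, here is a
small sketch with Python's standard <code>xml.etree.ElementTree</code>, which
reports each element name as <code>{namespace}local</code>. The
<code>split_name</code> helper is written for this example only.</p>

```python
import xml.etree.ElementTree as ET

NS = "http://www.gnome.org/project/projectname/1.0/"

DOC = f"""<structure xmlns="{NS}">
  <data>payload</data>
</structure>"""


def split_name(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return None, tag


root = ET.fromstring(DOC)
names = [split_name(element.tag) for element in root.iter()]
print(names)

# A lookup by bare name misses the element; the qualified name finds it.
print(root.find("data"))
print(root.find(f"{{{NS}}}data").text)
```

<p>The bare name <code>data</code> does not match, because the element really
is the pair (namespace name, <code>data</code>); this is what lets another
tool tell your <code>data</code> element apart from someone else's.</p>
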
<h2><a name="Canonical">Canonical URL</a></h2>

<p>As seen in the previous namespace section, while XML processing is not
tied to the Web, there is a natural synergy between the two. XML was designed
to be available on the Web, and keeping the infrastructure that way helps
in deploying XML resources. The core of this issue is the notion of the
"Canonical URL" of an XML resource. The resource can be an XML document, a
DTD, a stylesheet, a schema, or even non-XML data associated with an XML
resource. The canonical URL is the URL where the "master" copy of that
resource is expected to be present on the Web. Usually when processing XML a
copy of the resource will be present on the local disk, maybe in
/usr/share/xml or /usr/share/sgml, maybe in /opt or even in C:\projectname\
(horror!). The key point is that the way to name that resource should be
independent of the actual place where it resides on disk, if it is available
there at all, and that the processing should still work if there is no local
copy (provided the machine doing the processing is connected to the
Internet).</p>

<p>What this really means is that one should never use the local name of a
resource to reference it but always use the canonical URL. For example, in a
DocBook instance the following should not be used:</p>
<pre>&lt;!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
                         "/usr/share/xml/docbook/4.2/docbookx.dtd"&gt;</pre>

<p>Instead, always reference the canonical URL for the DTD:</p>
<pre>&lt;!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
                         "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"&gt;</pre>

<p>Similarly, the document instance may reference the <a
href="http://www.w3.org/TR/xslt">XSLT</a> stylesheets needed to process it to
generate HTML, and the canonical URL should be used:</p>
<pre>&lt;?xml-stylesheet
  href="http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"
  type="text/xsl"?&gt;</pre>

<p>The canonical URLs for the resources needed should be defined following a
few simple rules similar to those used to design namespace names:</p>
<ul>
  <li>use a DNS name you know is associated with the project and will be
    available in the long term</li>
  <li>within that server space, reserve the right to the subtree where you
    intend to keep the data</li>
  <li>version the URL so that multiple concurrent versions of the resources
    can be hosted simultaneously</li>
</ul>

<h2><a name="Catalog">Catalog setup</a></h2>

<h3>How catalogs work:</h3>

<p>Catalogs are the technical mechanism which allows XML processing
tools to use a local copy of a resource if one is available, even if the
instance document references the canonical URL. <a
href="http://www.oasis-open.org/committees/entity/">XML Catalogs</a> are
anchored in the root catalog (usually <code>/etc/xml/catalog</code> or
defined by the user). They are a tree of XML documents defining the mappings
between the canonical naming space and the locally installed copies; this can
be seen as a static cache structure.</p>

<p>When the XML processor is asked to process a resource it will
automatically look for a locally available version in the catalog, starting
from the root catalog, and possibly fetching sub-catalog resources, until it
determines whether the catalog has that resource or not. If not, the default
processing of fetching the resource from the Web is done, which in most
cases allows recovery from a catalog miss. The key point is that the document
instances are totally independent of the availability of a catalog and of
the actual place where the local resources they reference may be installed.
This greatly improves the management of the documents in the long run, making
them independent of the platform or toolchain used to process them. The
figure below tries to express that mechanism:<img src="catalog.gif"
alt="Picture describing the catalog"></p>

<h3>Usual catalog setup:</h3>

<p>Usually catalogs for a project are set up as a two-level hierarchical
cache, the root catalog containing only "delegates" pointing to a separate
subcatalog dedicated to the project. The goal is to keep the root catalog
clean and simplify the maintenance of the catalog by using separate catalogs
per project. For example when creating a catalog for the <a
href="http://www.w3.org/TR/xhtml1">XHTML1</a> DTDs, only 3 items are added to
the root catalog:</p>
<pre>  &lt;delegatePublic publicIdStartString="-//W3C//DTD XHTML 1.0"
                  catalog="file:///usr/share/sgml/xhtml1/xmlcatalog"/&gt;
  &lt;delegateSystem systemIdStartString="http://www.w3.org/TR/xhtml1/DTD"
                  catalog="file:///usr/share/sgml/xhtml1/xmlcatalog"/&gt;
  &lt;delegateURI uriStartString="http://www.w3.org/TR/xhtml1/DTD"
                  catalog="file:///usr/share/sgml/xhtml1/xmlcatalog"/&gt;</pre>

<p>They are all "delegates", meaning that if the catalog system is asked to
resolve a reference matching one of them, it has to look up a subcatalog.
Here the subcatalog was installed as
<code>/usr/share/sgml/xhtml1/xmlcatalog</code> in the local tree. That
decision is left to the sysadmin or the packager for that system and may
obey different rules, but the actual place on the filesystem (or in a
resource cache on the local network) will not influence the processing as
long as it is available. The first rule indicates that if the reference uses a
PUBLIC identifier beginning with the</p>

<p><code>"-//W3C//DTD XHTML 1.0"</code></p>

<p>substring, then the catalog lookup should be limited to the specific given
subcatalog. Similarly the second and third entries give the
delegation rules for SYSTEM, DOCTYPE or normal URI references when the URL
starts with the <code>"http://www.w3.org/TR/xhtml1/DTD"</code> substring,
which indicates the location on the W3C server where the XHTML1 resources are
stored. This is the common prefix of all Canonical URLs for XHTML1 resources.
Those three rules are sufficient in practice to capture all references to
XHTML1 resources and direct the processing tools to the right subcatalog.</p>

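<p>The prefix matching done by delegate entries can be sketched in a few
lines of Python. This is only an illustration of the idea, assuming the
standard <code>xml.etree.ElementTree</code> module; it is not how libxml2
implements catalogs, and the <code>delegates_for</code> function is a name
invented for the example. Ordering matches by longest prefix first follows
the OASIS XML Catalogs rules for delegation.</p>

```python
import xml.etree.ElementTree as ET

CAT_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

ROOT_CATALOG = f"""<catalog xmlns="{CAT_NS}">
  <delegatePublic publicIdStartString="-//W3C//DTD XHTML 1.0"
                  catalog="file:///usr/share/sgml/xhtml1/xmlcatalog"/>
  <delegateSystem systemIdStartString="http://www.w3.org/TR/xhtml1/DTD"
                  catalog="file:///usr/share/sgml/xhtml1/xmlcatalog"/>
  <delegateURI uriStartString="http://www.w3.org/TR/xhtml1/DTD"
               catalog="file:///usr/share/sgml/xhtml1/xmlcatalog"/>
</catalog>"""

# Delegate element name -> attribute holding the prefix to match.
PREFIX_ATTR = {
    "delegatePublic": "publicIdStartString",
    "delegateSystem": "systemIdStartString",
    "delegateURI": "uriStartString",
}


def delegates_for(catalog_text, kind, identifier):
    """Return the subcatalogs to consult for `identifier`, longest prefix first.

    `kind` is one of the delegate element names above. An empty list means
    the root catalog delegates nothing for this identifier.
    """
    root = ET.fromstring(catalog_text)
    matches = []
    for entry in root.findall(f"{{{CAT_NS}}}{kind}"):
        prefix = entry.get(PREFIX_ATTR[kind])
        if prefix and identifier.startswith(prefix):
            matches.append((len(prefix), entry.get("catalog")))
    matches.sort(key=lambda match: match[0], reverse=True)
    return [catalog for _, catalog in matches]


print(delegates_for(ROOT_CATALOG, "delegatePublic",
                    "-//W3C//DTD XHTML 1.0 Strict//EN"))
print(delegates_for(ROOT_CATALOG, "delegateSystem",
                    "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"))
```

<p>The first lookup is sent to the XHTML1 subcatalog; the second, a DocBook
system identifier, matches no delegate, so the XHTML1 subcatalog is never
opened for it.</p>
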
<h3>A subcatalog example:</h3>

<p>Here is the complete subcatalog used for XHTML1:</p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
          "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"&gt;
  &lt;public publicId="-//W3C//DTD XHTML 1.0 Strict//EN"
          uri="xhtml1-20020801/DTD/xhtml1-strict.dtd"/&gt;
  &lt;public publicId="-//W3C//DTD XHTML 1.0 Transitional//EN"
          uri="xhtml1-20020801/DTD/xhtml1-transitional.dtd"/&gt;
  &lt;public publicId="-//W3C//DTD XHTML 1.0 Frameset//EN"
          uri="xhtml1-20020801/DTD/xhtml1-frameset.dtd"/&gt;
  &lt;rewriteSystem systemIdStartString="http://www.w3.org/TR/xhtml1/DTD"
          rewritePrefix="xhtml1-20020801/DTD"/&gt;
  &lt;rewriteURI uriStartString="http://www.w3.org/TR/xhtml1/DTD"
          rewritePrefix="xhtml1-20020801/DTD"/&gt;
&lt;/catalog&gt;</pre>

<p>There are a few things to notice:</p>
<ul>
  <li>this is an XML resource; it points to the DTD using Canonical URLs, and
    the root element defines a namespace (but based on a URN, not an HTTP
    URL).</li>
  <li>it contains 5 rules: the first 3 are direct mappings for the 3
    PUBLIC identifiers defined by the XHTML1 specification, associating
    them with the local resources containing the DTDs; the last 2 are
    rewrite rules allowing the local filename to be built for any URL based
    on "http://www.w3.org/TR/xhtml1/DTD". The local cache simplifies the
    rules by keeping the same structure as the on-line server at the
    Canonical URL.</li>
  <li>the local resources are designated using URI references (the uri or
    rewritePrefix attributes), the base being the containing subcatalog URL,
    which means that in practice the copy of the XHTML1 strict DTD is stored
    locally in
    <code>/usr/share/sgml/xhtml1/xhtml1-20020801/DTD/xhtml1-strict.dtd</code></li>
</ul>

<p>Those 5 rules are sufficient to cover all references to the resources held
at the Canonical URL for the XHTML1 DTDs.</p>

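<p>How a rewrite rule turns a canonical URL into a local one can be sketched
with Python's standard <code>urllib.parse.urljoin</code>. The constants are
taken from the subcatalog above; the <code>rewrite_system</code> function is
a hypothetical helper for this example. Note that resolving a relative
reference against the catalog URL replaces the last path segment of that URL
(the catalog file name).</p>

```python
from urllib.parse import urljoin

# Values taken from the XHTML1 subcatalog and its install location.
CATALOG_URL = "file:///usr/share/sgml/xhtml1/xmlcatalog"
START = "http://www.w3.org/TR/xhtml1/DTD"
REWRITE_PREFIX = "xhtml1-20020801/DTD"


def rewrite_system(system_id):
    """Apply the rewriteSystem rule; return None on a catalog miss."""
    if not system_id.startswith(START):
        return None
    # Resolve the relative prefix against the catalog's own URL, then
    # append whatever followed the matched start string.
    local_prefix = urljoin(CATALOG_URL, REWRITE_PREFIX)
    return local_prefix + system_id[len(START):]


print(rewrite_system("http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"))
print(rewrite_system("http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"))
```

<p>Since the local tree mirrors the layout of the server, a single prefix
substitution is enough for every file under the canonical URL.</p>
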
<h2><a name="Package">Package integration</a></h2>

<p>Creating and removing catalogs should be handled as part of the process of
(un)installing the local copy of the resources. The catalog files, being XML
resources, should be processed with XML-based tools to avoid problems with the
generated files. The xmlcatalog command shipped with libxml2 allows you to
create catalogs and to add or remove rules at that time. Here is a complete
example taken from the post-install script of the RPM for the XHTML1 DTDs.
While this example is platform and packaging specific, it can be useful as an
example in other contexts:</p>
<pre>%post
CATALOG=/usr/share/sgml/xhtml1/xmlcatalog
#
# Register it in the super catalog with the appropriate delegates
#
ROOTCATALOG=/etc/xml/catalog

if [ ! -r $ROOTCATALOG ]
then
    /usr/bin/xmlcatalog --noout --create $ROOTCATALOG
fi

if [ -w $ROOTCATALOG ]
then
    /usr/bin/xmlcatalog --noout --add "delegatePublic" \
        "-//W3C//DTD XHTML 1.0" \
        "file://$CATALOG" $ROOTCATALOG
    /usr/bin/xmlcatalog --noout --add "delegateSystem" \
        "http://www.w3.org/TR/xhtml1/DTD" \
        "file://$CATALOG" $ROOTCATALOG
    /usr/bin/xmlcatalog --noout --add "delegateURI" \
        "http://www.w3.org/TR/xhtml1/DTD" \
        "file://$CATALOG" $ROOTCATALOG
fi</pre>

<p>The XHTML1 subcatalog is not created on-the-fly in that case; it is
installed as part of the files of the package. So the only work needed is to
make sure the root catalog exists and register the delegate rules.</p>

<p>Similarly, the post-uninstall script just removes the rules from the
catalog:</p>
<pre>%postun
#
# On removal, unregister the xmlcatalog from the supercatalog
#
if [ "$1" = 0 ]; then
    CATALOG=/usr/share/sgml/xhtml1/xmlcatalog
    ROOTCATALOG=/etc/xml/catalog

    if [ -w $ROOTCATALOG ]
    then
        /usr/bin/xmlcatalog --noout --del \
            "-//W3C//DTD XHTML 1.0" $ROOTCATALOG
        /usr/bin/xmlcatalog --noout --del \
            "http://www.w3.org/TR/xhtml1/DTD" $ROOTCATALOG
        /usr/bin/xmlcatalog --noout --del \
            "http://www.w3.org/TR/xhtml1/DTD" $ROOTCATALOG
    fi
fi</pre>

<p>Note the test against $1: it is needed to avoid removing the delegate rules
when the package is upgraded.</p>

<p>Following the set of guidelines and tips provided in this document should
help deploy XML resources in the GNOME framework without much pain and
ensure a smooth evolution of the resources and instances.</p>

<p><a href="mailto:veillard@redhat.com">Daniel Veillard</a></p>

<p>$Id$</p>

<p></p>
</body>
</html>