<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<style type="text/css"><!--
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>Catalog support</title>
</head>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="smallfootonly.gif" alt="Gnome Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>Catalog support</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul>
<li><a href="index.html">Home</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XMLinfo.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="python.html">Python and bindings</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="threads.html">Thread safety</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>API Indexes</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul>
<li><a href="APIchunk0.html">Alphabetic</a></li>
<li><a href="APIconstructors.html">Constructors</a></li>
<li><a href="APIfunctions.html">Functions/Types</a></li>
<li><a href="APIfiles.html">Modules</a></li>
<li><a href="APIsymbols.html">Symbols</a></li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul>
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://phd.cs.unibo.it/gdome2/">DOM gdome2</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a href="http://www.fh-frankfurt.de/~igor/projects/libxml/">Windows binaries</a></li>
<li><a href="http://garypennington.net/libxml2/">Solaris binaries</a></li>
<li><a href="http://bugzilla.gnome.org/buglist.cgi?product=libxml">Bug Tracker</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Table of Contents:</p>
<ol>
<li><a href="#General2">General overview</a></li>
<li><a href="#definition">The definitions</a></li>
<li><a href="#Simple">Using catalogs</a></li>
<li><a href="#Some">Some examples</a></li>
<li><a href="#reference">How to tune catalog usage</a></li>
<li><a href="#validate">How to debug catalog processing</a></li>
<li><a href="#Declaring">How to create and maintain catalogs</a></li>
<li><a href="#implemento">The implementor's corner: a quick review of the
API</a></li>
<li><a href="#Other">Other resources</a></li>
</ol>
<h3><a name="General2">General overview</a></h3>
<p>What is a catalog? Basically, it is a lookup mechanism used when an entity
(a file or a remote resource) references another entity. The catalog lookup
takes place between the moment the software recognizes the reference (in an
XML parser, during stylesheet processing, or even for images referenced for
inclusion in a rendering) and the moment loading of that resource actually
starts.</p>
<p>It is basically used for three things:</p>
<ul>
<li>mapping from "logical" names, the public identifiers, to a more
concrete name usable for download (a URI). For example, it can associate
the logical name
<p>"-//OASIS//DTD DocBook XML V4.1.2//EN"</p>
<p>of the DocBook 4.1.2 XML DTD with the actual URL where it can be
downloaded</p>
<p>http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd</p>
</li>
<li>remapping from a given URL to another one, like an HTTP redirection
saying that
<p>"http://www.oasis-open.org/committees/tr.xsl"</p>
<p>should really be looked up at</p>
<p>"http://www.oasis-open.org/committees/entity/stylesheets/base/tr.xsl"</p>
</li>
<li>providing a local cache mechanism that loads the entities
associated with public identifiers or remote resources from local copies.
This is a really important feature for any significant deployment of XML or
SGML, since it avoids the hazards and delays associated with fetching remote
resources.</li>
</ul>
<h3><a name="definition">The definitions</a></h3>
<p>Libxml, as of 2.4.3, implements two kinds of catalogs:</p>
<ul>
<li>the older SGML catalogs. The official specification is SGML Open Technical
Resolution TR9401:1997, but the format is better understood by reading <a href="http://www.jclark.com/sp/catalog.htm">the SP Catalog page</a> from
James Clark. This format is relatively old and is not the preferred mode of
operation of libxml.</li>
<li>
<a href="http://www.oasis-open.org/committees/entity/spec.html">XML
Catalogs</a>,
a far more flexible and more recent format that uses an XML syntax and should
scale much better. This is the default option of libxml.</li>
</ul>
<h3><a name="Simple">Using catalogs</a></h3>
<p>In a normal environment libxml will by default check for the presence of a
catalog in /etc/xml/catalog and, assuming it has been correctly populated,
the processing is completely transparent to the document user. To take a
concrete example, suppose you are authoring a DocBook document that
starts with the following DOCTYPE definition:</p>
<pre>&lt;?xml version='1.0'?&gt;
&lt;!DOCTYPE book PUBLIC "-//Norman Walsh//DTD DocBk XML V3.1.4//EN"
          "http://nwalsh.com/docbook/xml/3.1.4/db3xml.dtd"&gt;</pre>
<p>When validating the document with libxml, the catalog will be
automatically consulted to look up the public identifier "-//Norman Walsh//DTD
DocBk XML V3.1.4//EN" and the system identifier
"http://nwalsh.com/docbook/xml/3.1.4/db3xml.dtd". If these entities have
been installed on your system and the catalogs actually point to them, libxml
will fetch them from the local disk.</p>
<p style="font-size: 10pt">
<strong>Note</strong>: Do not use this DOCTYPE in real documents. It refers
to a really old version, but it is fine as an example.</p>
<p>Libxml will check the catalog each time it is requested to load an
entity; this includes DTDs, external parsed entities, stylesheets, etc. If
your system is correctly configured, all the authoring and processing phases
should use only local files, while your document stays portable because it
uses the canonical public and system IDs, which reference the remote
document.</p>
<h3><a name="Some">Some examples:</a></h3>
<p>Here are a couple of fragments from XML Catalogs used in libxml's early
regression tests in <code>test/catalogs</code>:</p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE catalog PUBLIC
   "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
   "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"&gt;
  &lt;public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"
   uri="http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"/&gt;
...</pre>
<p>This is the beginning of a catalog for DocBook 4.1.2. XML Catalogs are
written in XML, and there is a specific namespace for catalog elements,
"urn:oasis:names:tc:entity:xmlns:xml:catalog". The first entry in this
catalog is a <code>public</code> mapping, which associates a Public
Identifier with a URI.</p>
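<p>As an illustration of what a <code>public</code> entry does, here is a
minimal Python sketch of the lookup. It is not libxml's implementation: the
<code>resolve_public</code> helper is hypothetical, it relies only on Python's
standard <code>xml.etree</code> module, and it handles nothing but
<code>public</code> entries.</p>

```python
import xml.etree.ElementTree as ET

# Namespace used by XML Catalog elements (taken from the fragment above).
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

CATALOG_XML = """<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"
          uri="http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"/>
</catalog>
"""


def resolve_public(catalog_xml, public_id):
    """Hypothetical helper: return the uri of the first <public> entry
    whose publicId equals public_id, or None when nothing matches."""
    root = ET.fromstring(catalog_xml)
    for entry in root.iter("{%s}public" % CATALOG_NS):
        if entry.get("publicId") == public_id:
            return entry.get("uri")
    return None


if __name__ == "__main__":
    print(resolve_public(CATALOG_XML, "-//OASIS//DTD DocBook XML V4.1.2//EN"))
```

<p>A real resolver also has to deal with system identifiers, rewriting,
delegation and chained catalogs, which this sketch leaves out.</p>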
<pre>...
    &lt;rewriteSystem systemIdStartString="http://www.oasis-open.org/docbook/"
                   rewritePrefix="file:///usr/share/xml/docbook/"/&gt;
...</pre>
<p>A <code>rewriteSystem</code> is a very powerful instruction: it says that
any URI starting with a given prefix should be looked up at another URI,
constructed by replacing the prefix with a new one. In effect this acts like
a cache system for a full area of the Web. In practice it is extremely useful
with a file prefix if you have installed a copy of those resources on your
local system.</p>
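<p>The prefix replacement can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: <code>rewrite_system</code> and
<code>RULES</code> are hypothetical names, and the choice of the longest
matching start string when several rules match follows the XML Catalogs
specification rather than anything stated on this page.</p>

```python
def rewrite_system(system_id, rules):
    """Hypothetical helper: apply the rule with the longest matching start
    string. Each rule is a (start_string, rewrite_prefix) pair.
    Returns None when no rule matches."""
    best = None
    for start, prefix in rules:
        if system_id.startswith(start):
            if best is None or len(start) > len(best[0]):
                best = (start, prefix)
    if best is None:
        return None
    # Replace the matched prefix and keep the rest of the identifier.
    return best[1] + system_id[len(best[0]):]


# The single rule from the catalog fragment above.
RULES = [
    ("http://www.oasis-open.org/docbook/", "file:///usr/share/xml/docbook/"),
]

if __name__ == "__main__":
    print(rewrite_system(
        "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd", RULES))
```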
<pre>...
    &lt;delegatePublic publicIdStartString="-//OASIS//DTD XML Catalog //"
                    catalog="file:///usr/share/xml/docbook.xml"/&gt;
    &lt;delegatePublic publicIdStartString="-//OASIS//ENTITIES DocBook XML"
                    catalog="file:///usr/share/xml/docbook.xml"/&gt;
    &lt;delegatePublic publicIdStartString="-//OASIS//DTD DocBook XML"
                    catalog="file:///usr/share/xml/docbook.xml"/&gt;
    &lt;delegateSystem systemIdStartString="http://www.oasis-open.org/docbook/"
                    catalog="file:///usr/share/xml/docbook.xml"/&gt;
    &lt;delegateURI uriStartString="http://www.oasis-open.org/docbook/"
                    catalog="file:///usr/share/xml/docbook.xml"/&gt;
...</pre>
<p>Delegation is the core feature that makes it possible to build a tree of
catalogs, which is easier to maintain than a single catalog. Based on Public
Identifier, System Identifier or URI prefixes, it instructs the catalog
software to look up entries in another resource. The set of entries presented
should be sufficient to redirect the resolution of all DocBook references to
the specific catalog in <code>/usr/share/xml/docbook.xml</code>. That catalog
in turn could delegate all references for DocBook 4.1.2 to a specific catalog
installed at the same time as the DocBook resources on the local machine.</p>
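<p>To make the delegation step concrete, the following Python sketch selects
the catalogs to consult for a public identifier. It is a simplified model, not
libxml code: <code>delegated_catalogs</code> is a hypothetical helper, the
second entry in <code>DELEGATES</code> (<code>docbook-4.1.2.xml</code>) is
invented for the example, and the longest-prefix-first ordering is taken from
the XML Catalogs specification.</p>

```python
def delegated_catalogs(public_id, delegates):
    """Hypothetical helper: return the catalogs to consult for public_id,
    longest matching prefix first, without duplicates.
    Each delegate is a (public_id_start_string, catalog_uri) pair."""
    matches = [(start, catalog) for start, catalog in delegates
               if public_id.startswith(start)]
    matches.sort(key=lambda match: len(match[0]), reverse=True)
    ordered = []
    for _, catalog in matches:
        if catalog not in ordered:
            ordered.append(catalog)
    return ordered


DELEGATES = [
    # From the fragment above.
    ("-//OASIS//DTD DocBook XML", "file:///usr/share/xml/docbook.xml"),
    # Invented entry, only here to show the ordering by prefix length.
    ("-//OASIS//DTD DocBook XML V4.1.2",
     "file:///usr/share/xml/docbook-4.1.2.xml"),
]

if __name__ == "__main__":
    for uri in delegated_catalogs(
            "-//OASIS//DTD DocBook XML V4.1.2//EN", DELEGATES):
        print(uri)
```

<p>Resolution then continues in the returned catalogs only, which is what
keeps each catalog in the tree small and independently maintainable.</p>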
<h3><a name="reference">How to tune catalog usage:</a></h3>
<p>Users can change the default catalog behaviour by redirecting queries
to their own set of catalogs. This is done by setting the
<code>XML_CATALOG_FILES</code> environment variable to a list of catalogs; an
empty value should deactivate loading of the default
<code>/etc/xml/catalog</code> catalog.</p>
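<p>A script that launches libxml-based tools can prepare such an environment
as sketched below. The <code>catalog_env</code> helper is hypothetical, and
the use of a space as the list separator is an assumption based on the
xmlcatalog manual page; this page only says "a list of catalogs".</p>

```python
import os


def catalog_env(catalog_paths, base_env=None):
    """Hypothetical helper: return a copy of the environment in which
    XML_CATALOG_FILES lists catalog_paths (assumed space separated).
    An empty list yields an empty value."""
    env = dict(os.environ if base_env is None else base_env)
    env["XML_CATALOG_FILES"] = " ".join(catalog_paths)
    return env


if __name__ == "__main__":
    env = catalog_env(["/home/user/catalog.xml", "/etc/xml/catalog"])
    print(env["XML_CATALOG_FILES"])
```

<p>The resulting mapping can be passed as the <code>env</code> argument of
Python's <code>subprocess.run()</code> when starting a tool such as
xmllint.</p>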
<h3><a name="validate">How to debug catalog processing:</a></h3>
<p>Setting the <code>XML_DEBUG_CATALOG</code> environment variable will
make libxml output debugging information for each catalog operation, for
example:</p>
<pre>orchis:~/XML -&gt; xmllint --memory --noout test/ent2
warning: failed to load external entity "title.xml"
orchis:~/XML -&gt; export XML_DEBUG_CATALOG=
orchis:~/XML -&gt; xmllint --memory --noout test/ent2
Failed to parse catalog /etc/xml/catalog
Failed to parse catalog /etc/xml/catalog
warning: failed to load external entity "title.xml"
Catalogs cleanup
orchis:~/XML -&gt; </pre>
<p>The test/ent2 document references an entity. Running the parser from memory
makes the base URI unavailable, so the "title.xml" entity cannot be loaded.
Setting the debug environment variable reveals that an attempt is
made to load <code>/etc/xml/catalog</code>, but since it is not present the
resolution fails.</p>
<p>The most advanced way to debug XML catalog processing is to use the
<strong>xmlcatalog</strong> command shipped with libxml2. It can load
catalogs and make resolution queries to show what is going on. It is also
used for the regression tests:</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog test/catalogs/docbook.xml \
                   "-//OASIS//DTD DocBook XML V4.1.2//EN"
http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd
orchis:~/XML -&gt; </pre>
<p>To debug what is going on, adding one <code>-v</code> flag increases the
verbosity level to indicate the processing done (adding a second flag also
indicates which elements are recognized during parsing):</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog -v test/catalogs/docbook.xml \
                   "-//OASIS//DTD DocBook XML V4.1.2//EN"
Parsing catalog test/catalogs/docbook.xml's content
Found public match -//OASIS//DTD DocBook XML V4.1.2//EN
http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd
Catalogs cleanup
orchis:~/XML -&gt; </pre>
<p>A shell interface is also available to debug and process multiple queries
(and for regression tests):</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog -shell test/catalogs/docbook.xml \
                   "-//OASIS//DTD DocBook XML V4.1.2//EN"
&gt; help
Commands available:
public PublicID: make a PUBLIC identifier lookup
system SystemID: make a SYSTEM identifier lookup
resolve PublicID SystemID: do a full resolver lookup
add 'type' 'orig' 'replace' : add an entry
del 'values' : remove values
dump: print the current catalog state
debug: increase the verbosity level
quiet: decrease the verbosity level
exit: quit the shell
&gt; public "-//OASIS//DTD DocBook XML V4.1.2//EN"
http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd
&gt; quit
orchis:~/XML -&gt; </pre>
<p>This should be sufficient for most debugging purposes; it was actually
used heavily to debug the XML Catalog implementation itself.</p>
<h3>
<a name="Declaring">How to create and maintain</a> catalogs:</h3>
<p>Basically XML Catalogs are XML files, so you can either use XML tools to
manage them or use <strong>xmlcatalog</strong>. The first step is
to create a catalog; the <code>--create</code> option provides this
facility:</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog --create tst.xml
&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
         "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"/&gt;
orchis:~/XML -&gt; </pre>
<p>By default xmlcatalog does not overwrite the original catalog and writes the
result to the standard output; this can be overridden using the
<code>--noout</code> option. The <code>--add</code> option adds entries to the
catalog:</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog --noout --create --add "public" \
  "-//OASIS//DTD DocBook XML V4.1.2//EN" \
  http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd tst.xml
orchis:~/XML -&gt; cat tst.xml
&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
         "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"&gt;
&lt;public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"
        uri="http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"/&gt;
&lt;/catalog&gt;
orchis:~/XML -&gt; </pre>
<p>The <code>--add</code> option always takes 3 parameters, even though some of
the XML Catalog constructs (like nextCatalog) have only a single
argument; in that case just pass an empty string as the third parameter, it
will be ignored.</p>
<p>Similarly, the <code>--del</code> option removes matching entries from the
catalog:</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog --del \
  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" tst.xml
&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
         "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"/&gt;
orchis:~/XML -&gt; </pre>
<p>The catalog is now empty. Note that the matching done by <code>--del</code>
is exact, and it would have worked in a similar fashion with the Public ID
string.</p>
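<p>The exact-match behaviour described above can be modelled with a short
Python sketch. It is an approximation for illustration only: the
<code>delete_entries</code> helper is hypothetical, it looks only at top-level
entries, and it treats an entry as matching when any of its attribute values
equals the given string.</p>

```python
import xml.etree.ElementTree as ET

CATALOG_XML = """<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"
          uri="http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"/>
</catalog>
"""


def delete_entries(catalog_xml, value):
    """Hypothetical helper: remove top-level entries having an attribute
    exactly equal to value. Returns (root_element, removed_count)."""
    root = ET.fromstring(catalog_xml)
    removed = 0
    for entry in list(root):
        # Exact comparison: a prefix of the identifier does not match.
        if value in entry.attrib.values():
            root.remove(entry)
            removed += 1
    return root, removed


if __name__ == "__main__":
    root, removed = delete_entries(
        CATALOG_XML, "-//OASIS//DTD DocBook XML V4.1.2//EN")
    print(removed, len(root))
```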
<p>This is rudimentary but should be sufficient to manage a not too complex
catalog tree of resources.</p>
<h3><a name="implemento">The implementor's corner: a quick review of the
API</a></h3>
<p>First, as for every other module of libxml, there is an
automatically generated <a href="html/libxml-catalog.html">API page for
catalog support</a>.</p>
<p>The header for the catalog interfaces should be included as:</p>
<pre>#include &lt;libxml/catalog.h&gt;</pre>
<p>The API is deliberately kept very simple. First, it is not obvious that
applications really need access to it, since catalog resolution is the default
behaviour of libxml. (Note: it is possible to completely override the libxml
default catalog by using <a href="html/libxml-parser.html">xmlSetExternalEntityLoader</a> to
plug in an application-specific resolver.)</p>
<p>Basically libxml supports two catalog lists:</p>
<ul>
<li>the default one, global and shared by the whole application</li>
<li>a per-document catalog, which is built if the document uses the
<code>oasis-xml-catalog</code> PIs to specify its own catalog list. It is
associated with the parser context and destroyed when the parsing context
is destroyed.</li>
</ul>
<p>The document one will be used first if it exists.</p>
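<p>The lookup order between the two lists can be pictured with the following
Python sketch. It is a deliberately reduced model: catalogs are plain
dictionaries mapping public identifiers to URIs, and
<code>resolve_in_order</code> is a hypothetical name, not a libxml
function.</p>

```python
def resolve_in_order(public_id, document_catalog, global_catalog):
    """Hypothetical helper: try the per-document catalog first, then the
    global one. A catalog may be None when it does not exist."""
    for catalog in (document_catalog, global_catalog):
        if catalog and public_id in catalog:
            return catalog[public_id]
    return None


GLOBAL_CATALOG = {
    "-//OASIS//DTD DocBook XML V4.1.2//EN":
        "file:///usr/share/xml/docbook/xml/4.1.2/docbookx.dtd",
}

# Invented per-document override, as if set up by oasis-xml-catalog PIs.
DOCUMENT_CATALOG = {
    "-//OASIS//DTD DocBook XML V4.1.2//EN": "file:///tmp/local/docbookx.dtd",
}

if __name__ == "__main__":
    dtd = "-//OASIS//DTD DocBook XML V4.1.2//EN"
    print(resolve_in_order(dtd, DOCUMENT_CATALOG, GLOBAL_CATALOG))
    print(resolve_in_order(dtd, None, GLOBAL_CATALOG))
```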
<h4>Initialization routines:</h4>
<p>xmlInitializeCatalog(), xmlLoadCatalog() and xmlLoadCatalogs() should be
used at startup to initialize the catalog. If the catalog should be
initialized with specific values, xmlLoadCatalog() or xmlLoadCatalogs()
should be called before xmlInitializeCatalog(), which would otherwise do a
default initialization first.</p>
<p>The xmlCatalogAddLocal() call is used by the parser to grow the document's
own catalog list if needed.</p>
<h4>Preferences setup:</h4>
<p>The XML Catalog spec requires the possibility to select default
preferences between public and system delegation;
xmlCatalogSetDefaultPrefer() allows this. xmlCatalogSetDefaults() and
xmlCatalogGetDefaults() control whether XML Catalog resolution should
be forbidden, allowed for the global catalog, for the document catalog, or
for both; the default is to allow both.</p>
<p>And of course xmlCatalogSetDebug() enables the generation of debug messages
(through the xmlGenericError() mechanism).</p>
<h4>Querying routines:</h4>
<p>xmlCatalogResolve(), xmlCatalogResolveSystem(), xmlCatalogResolvePublic()
and xmlCatalogResolveURI() are relatively explicit if you read the XML
Catalog specification: they correspond to the section 7 algorithms. They
should also work, with simplified semantics, if you have loaded an SGML
catalog.</p>
<p>xmlCatalogLocalResolve() and xmlCatalogLocalResolveURI() are the same but
operate on the document catalog list.</p>
<h4>Cleanup and Miscellaneous:</h4>
<p>xmlCatalogCleanup() frees the global catalog; xmlCatalogFreeLocal() is
the per-document equivalent.</p>
<p>xmlCatalogAdd() and xmlCatalogRemove() are used to dynamically modify the
first catalog in the global list, and xmlCatalogDump() dumps a catalog's
state. Those routines are primarily designed for xmlcatalog; I'm not
sure that exposing more complex interfaces (like navigation ones) would be
really useful.</p>
<p>xmlParseCatalogFile() is a function used to load XML Catalog files.
It is similar to xmlParseFile() except that it bypasses all catalog lookups;
it is provided because this functionality may be useful for client
tools.</p>
<h4>Threaded environments:</h4>
<p>Since the catalog tree is built progressively, some care has been taken to
avoid trouble in multithreaded environments. The code is now thread-safe,
assuming that the libxml library has been compiled with thread
support.</p>
<p>
<h3><a name="Other">Other resources</a></h3>
<p>The XML Catalog specification is relatively recent so there isn't much
literature to point at:</p>
<ul>
  <li>You can find a good rant from Norm Walsh about <a href="http://www.arbortext.com/Think_Tank/XML_Resources/Issue_Three/issue_three.html">the
    need for catalogs</a>; it provides a lot of context information even if
    I don't agree with everything presented.</li>
  <li>An <a href="http://home.ccil.org/~cowan/XML/XCatalog.html">old XML
    catalog proposal</a> from John Cowan</li>
  <li>The <a href="http://www.rddl.org/">Resource Directory Description
    Language</a> (RDDL), another catalog system but more oriented toward
    providing metadata for XML namespaces.</li>
  <li>The page from the OASIS Technical <a href="http://www.oasis-open.org/committees/entity/">Committee on Entity
    Resolution</a>, which maintains XML Catalog; you will find pointers to the
    specification update, some background and pointers to other tools
    providing XML Catalog support</li>
  <li>Here is a <a href="buildDocBookCatalog">shell script</a> to generate
    XML Catalogs for DocBook 4.1.2. If it can write to the /etc/xml/
    directory, it will set up /etc/xml/catalog and /etc/xml/docbook based on
    the resources found on the system. Otherwise it will just create
    ~/xmlcatalog and ~/dbkxmlcatalog, and doing:
    <p><code>export XMLCATALOG=$HOME/xmlcatalog</code></p>
    <p>should allow processing DocBook documentation without requiring
    network access for the DTD or stylesheets</p>
  </li>
  <li>I have uploaded <a href="ftp://xmlsoft.org/test/dbk412catalog.tar.gz">a
    small tarball</a> containing XML Catalogs for DocBook 4.1.2, which seems
    to work fine for me too</li>
  <li>The <a href="http://www.xmlsoft.org/xmlcatalog_man.html">xmlcatalog
    manual page</a>
  </li>
</ul>
<p>If you have suggestions for corrections or additions, simply contact
me:</p>
<p><a href="bugs.html">Daniel Veillard</a></p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>