added documentation about Catalog support, misses an API description

* doc/catalog.html doc/xml.html: added documentation about
  Catalog support, misses an API description
* doc/html/*: reextracted the API pages
Daniel
diff --git a/doc/catalog.html b/doc/catalog.html
new file mode 100644
index 0000000..a93d2f2
--- /dev/null
+++ b/doc/catalog.html
@@ -0,0 +1,315 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
+    "http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head>
+  <title>Libxml Catalog support</title>
+  <meta name="GENERATOR" content="amaya V5.0">
+  <meta http-equiv="Content-Type" content="text/html">
+</head>
+
+<body bgcolor="#ffffff">
+<h1 align="center">Libxml Catalog support</h1>
+
+<p>Location: <a
+href="http://xmlsoft.org/catalog.html">http://xmlsoft.org/catalog.html</a></p>
+
+<p>Libxml home page: <a href="http://xmlsoft.org/">http://xmlsoft.org/</a></p>
+
+<p>Mailing-list archive:  <a
+href="http://mail.gnome.org/archives/xml/">http://mail.gnome.org/archives/xml/</a></p>
+
+<p>Version: $Revision:$</p>
+
+<p>Table of Contents:</p>
+<ol>
+  <li><a href="#General">General overview</a></li>
+  <li><a href="#definition">The definitions</a></li>
+  <li><a href="#Simple">Using catalogs</a></li>
+  <li><a href="#Some">Some examples</a></li>
+  <li><a href="#reference">How to tune catalog usage</a></li>
+  <li><a href="#validate">How to debug catalog processing</a></li>
+  <li><a href="#Declaring">How to create and maintain catalogs</a></li>
+  <li><a href="#implemento">The implementor corner: quick review of the
+  API</a></li>
+  <li><a href="#Other">Other resources</a></li>
+</ol>
+
+<h2><a name="General">General overview</a></h2>
+
+<p>What is a catalog? Basically it's a lookup mechanism which is used when
+an entity (a file or a remote resource) references another entity. The catalog
+lookup is inserted between the moment the reference is recognized by the
+software (XML parser, stylesheet processing, or even images referenced for
+inclusion in a rendering) and the time when loading that resource is
+actually started.</p>
+
+<p>It is basically used for 3 things:</p>
+<ul>
+  <li>mapping from "logical" names, the public identifiers, to a more
+    concrete name usable for download (a URI). For example it can associate
+    the logical name
+    <p>"-//OASIS//DTD DocBook XML V4.1.2//EN"  </p>
+    <p>of the DocBook 4.1.2 XML DTD with the actual URL where it can be
+    downloaded</p>
+    <p>http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd  </p>
+  </li>
+  <li>remapping from a given URL to another one, like an HTTP indirection
+    saying that
+    <p>"http://www.oasis-open.org/committes/tr.xsl"</p>
+    <p>should really be looked up at</p>
+    <p>"http://www.oasis-open.org/committes/entity/stylesheets/base/tr.xsl" 
+    </p>
+  </li>
+  <li>providing a local cache mechanism that allows loading the entities
+    associated with public identifiers or remote resources. This is a really
+    important feature for any significant deployment of XML or SGML since it
+    avoids the hazards and delays associated with fetching remote
+    resources.</li>
+</ul>
+
+<h2><a name="definition">The definitions</a></h2>
+
+<p>Libxml, as of 2.4.3, implements 2 kinds of catalogs:</p>
+<ul>
+  <li>the older SGML catalogs: the official spec is SGML Open Technical
+    Resolution TR9401:1997, but it is better understood by reading <a
+    href="http://www.jclark.com/sp/catalog.htm">the SP Catalog page</a> from
+    James Clark. This is relatively old and not the preferred mode of
+    operation of libxml.</li>
+  <li><a href="http://www.oasis-open.org/committees/entity/spec.html">XML
+    Catalogs</a> are far more flexible, more recent, use an XML syntax and
+    should scale considerably better. This is the default option of libxml.</li>
+</ul>
+
+<p></p>
+
+<h2><a name="Simple">Using catalogs</a></h2>
+
+<p>In a normal environment libxml will by default check for the presence of a
+catalog in /etc/xml/catalog, and assuming it has been correctly populated,
+the processing is completely transparent to the document user. To take a
+concrete example, suppose you are authoring a DocBook document which
+starts with the following DOCTYPE definition:</p>
+<pre>&lt;?xml version='1.0'?&gt;
+&lt;!DOCTYPE book PUBLIC "-//Norman Walsh//DTD DocBk XML V3.1.4//EN"
+                         "http://nwalsh.com/docbook/xml/3.1.4/db3xml.dtd"&gt;
+
+</pre>
+
+<p>When validating the document with libxml, the catalog will be
+automatically consulted to look up the public identifier "-//Norman Walsh//DTD
+DocBk XML V3.1.4//EN" and the system identifier
+"http://nwalsh.com/docbook/xml/3.1.4/db3xml.dtd", and if these entities have
+been installed on your system and the catalogs actually point to them, libxml
+will fetch them from the local disk.</p>
+
+<p style="font-size: 10pt"><strong>Note</strong>: Do not use this DOCTYPE
+in real documents. It refers to a very old version, but is fine as an example.</p>
+
+<p>Libxml will check the catalog each time that it is requested to load an
+entity; this includes DTDs, external parsed entities, stylesheets, etc. If
+your system is correctly configured, all the authoring and processing phases
+should use only local files, while your document stays portable because it
+uses the canonical public and system IDs, which reference the remote document.</p>
+
+<h2><a name="Some">Some examples:</a></h2>
+
+<p>Here are a couple of fragments from XML Catalogs used in the early libxml
+regression tests in <code>test/catalogs</code>:</p>
+<pre>&lt;?xml version="1.0"?&gt;
+&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
+       "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
+&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"&gt;
+    &lt;public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"
+            uri="http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"/&gt;
+...</pre>
+
+<p>This is the beginning of a catalog for DocBook 4.1.2. XML Catalogs are
+written in XML, and there is a specific namespace for catalog elements:
+"urn:oasis:names:tc:entity:xmlns:xml:catalog". The first entry in this
+catalog is a <code>public</code> mapping, which associates a Public
+Identifier with a URI.</p>
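+<p>As a rough illustration only, a <code>public</code> lookup can be
+sketched in a few lines of Python using the standard library. This is not how
+libxml implements it; the <code>resolve_public</code> helper and the
+<code>CATALOG_XML</code> constant are hypothetical names introduced for this
+sketch, and the catalog is reduced to the single entry shown above.</p>

```python
import xml.etree.ElementTree as ET

# Namespace reserved for XML Catalog elements, as quoted in the text.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Minimal catalog holding only the public entry from the fragment above.
CATALOG_XML = """<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"
            uri="http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"/>
</catalog>
"""


def resolve_public(catalog_xml, public_id):
    """Hypothetical helper: return the URI mapped to public_id, or None."""
    root = ET.fromstring(catalog_xml)
    # Walk every <public> element in the catalog namespace.
    for entry in root.iter("{%s}public" % CATALOG_NS):
        if entry.get("publicId") == public_id:
            return entry.get("uri")
    return None


print(resolve_public(CATALOG_XML, "-//OASIS//DTD DocBook XML V4.1.2//EN"))
```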
+<pre>...
+    &lt;rewriteSystem systemIdStartString="http://www.oasis-open.org/docbook/"
+                   rewritePrefix="file:///usr/share/xml/docbook/"/&gt;
+...</pre>
+
+<p>A <code>rewriteSystem</code> is a very powerful instruction: it says that
+any URI starting with a given prefix should be looked up at another URI
+constructed by replacing the prefix with a new one. In effect this acts like
+a cache system for a full area of the Web. In practice it is extremely useful
+with a file prefix if you have installed a copy of those resources on your
+local system.</p>
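+<p>The prefix replacement itself is simple string handling. The Python
+sketch below shows the idea for a single rule; <code>rewrite_system</code> is
+a hypothetical helper written for this page, not a libxml function, and it
+ignores details such as choosing between several matching rules.</p>

```python
def rewrite_system(system_id, start_string, rewrite_prefix):
    """Hypothetical helper: apply one rewriteSystem rule.

    Returns the rewritten identifier, or None when the rule does not apply.
    """
    if system_id.startswith(start_string):
        # Keep everything after the matched prefix and graft it
        # onto the replacement prefix.
        return rewrite_prefix + system_id[len(start_string):]
    return None


# Values taken from the rewriteSystem fragment above.
result = rewrite_system(
    "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd",
    "http://www.oasis-open.org/docbook/",
    "file:///usr/share/xml/docbook/",
)
print(result)  # file:///usr/share/xml/docbook/xml/4.1.2/docbookx.dtd
```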
+<pre>...
+&lt;delegatePublic publicIdStartString="-//OASIS//DTD XML Catalog //"
+                catalog="file:///usr/share/xml/docbook.xml"/&gt;
+&lt;delegatePublic publicIdStartString="-//OASIS//ENTITIES DocBook XML"
+                catalog="file:///usr/share/xml/docbook.xml"/&gt;
+&lt;delegatePublic publicIdStartString="-//OASIS//DTD DocBook XML"
+                catalog="file:///usr/share/xml/docbook.xml"/&gt;
+&lt;delegateSystem systemIdStartString="http://www.oasis-open.org/docbook/"
+                catalog="file:///usr/share/xml/docbook.xml"/&gt;
+&lt;delegateURI uriStartString="http://www.oasis-open.org/docbook/"
+                catalog="file:///usr/share/xml/docbook.xml"/&gt;
+...</pre>
+
+<p>Delegation is the core feature which allows building a tree of catalogs,
+which is easier to maintain than a single catalog. Based on Public Identifier,
+System Identifier or URI prefixes, it instructs the catalog software to look
+up entries in another resource. The
+set of entries presented should be sufficient to redirect the resolution of
+all DocBook references to the specific catalog in
+<code>/usr/share/xml/docbook.xml</code>. This one in turn could delegate all
+references for DocBook 4.1.2 to a specific catalog installed at the same time
+as the DocBook resources on the local machine.</p>
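+<p>To make the selection step concrete, here is a minimal Python sketch that
+picks the catalogs to consult for a given Public Identifier. The
+<code>delegated_catalogs</code> helper and the <code>DELEGATES</code> list
+are hypothetical, and trying the longest matching prefix first is an
+assumption of this sketch rather than a description of libxml internals.</p>

```python
# (prefix, catalog) pairs copied from the delegatePublic entries above.
DELEGATES = [
    ("-//OASIS//DTD XML Catalog //", "file:///usr/share/xml/docbook.xml"),
    ("-//OASIS//ENTITIES DocBook XML", "file:///usr/share/xml/docbook.xml"),
    ("-//OASIS//DTD DocBook XML", "file:///usr/share/xml/docbook.xml"),
]


def delegated_catalogs(public_id, delegates):
    """Hypothetical helper: list the catalogs to consult for public_id."""
    # Keep only the entries whose prefix matches the identifier.
    matches = [(prefix, cat) for prefix, cat in delegates
               if public_id.startswith(prefix)]
    # Assumption of this sketch: longest matching prefix is tried first.
    matches.sort(key=lambda match: len(match[0]), reverse=True)
    catalogs = []
    for _, cat in matches:
        if cat not in catalogs:  # consult each catalog only once
            catalogs.append(cat)
    return catalogs


print(delegated_catalogs("-//OASIS//DTD DocBook XML V4.1.2//EN", DELEGATES))
```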
+
+<h2><a name="reference">How to tune catalog usage:</a></h2>
+
+<p>Users can change the default catalog behaviour by redirecting queries
+to their own set of catalogs. This can be done by setting the
+<code>XML_CATALOG_FILES</code> environment variable to a list of catalogs; an
+empty value should deactivate loading of the default
+<code>/etc/xml/catalog</code> catalog.</p>
+
+<p>@@More options are likely to be provided in the future@@</p>
+
+<h2><a name="validate">How to debug catalog processing:</a></h2>
+
+<p>Setting the <code>XML_DEBUG_CATALOG</code> environment variable will
+make libxml output debugging information for each catalog operation, for
+example:</p>
+<pre>orchis:~/XML -&gt; xmllint --memory --noout test/ent2
+warning: failed to load external entity "title.xml"
+orchis:~/XML -&gt; export XML_DEBUG_CATALOG=
+orchis:~/XML -&gt; xmllint --memory --noout test/ent2
+Failed to parse catalog /etc/xml/catalog
+Failed to parse catalog /etc/xml/catalog
+warning: failed to load external entity "title.xml"
+Catalogs cleanup
+orchis:~/XML -&gt; </pre>
+
+<p>The test/ent2 file references an entity; running the parser from memory
+makes the base URI unavailable and the "title.xml" entity cannot be loaded.
+Setting the debug environment variable reveals that an attempt is
+made to load <code>/etc/xml/catalog</code>, but since it's not present the
+resolution fails.</p>
+
+<p>But the most advanced way to debug XML catalog processing is to use the
+<strong>xmlcatalog</strong> command shipped with libxml2. It can load
+catalogs and make resolution queries to see what is going on. This is also
+used for the regression tests:</p>
+<pre>orchis:~/XML -&gt; ./xmlcatalog test/catalogs/docbook.xml "-//OASIS//DTD DocBook XML V4.1.2//EN"
+http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd
+orchis:~/XML -&gt; </pre>
+
+<p>For debugging what is going on, adding one -v flag increases the verbosity
+level to indicate the processing done (adding a second flag also indicates
+what elements are recognized at parsing):</p>
+<pre>orchis:~/XML -&gt; ./xmlcatalog -v test/catalogs/docbook.xml "-//OASIS//DTD DocBook XML V4.1.2//EN"
+Parsing catalog test/catalogs/docbook.xml's content
+Found public match -//OASIS//DTD DocBook XML V4.1.2//EN
+http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd
+Catalogs cleanup
+orchis:~/XML -&gt; </pre>
+
+<p>A shell interface is also available to debug and process multiple queries
+(and for regression tests):</p>
+<pre>orchis:~/XML -&gt; ./xmlcatalog -shell test/catalogs/docbook.xml "-//OASIS//DTD DocBook XML V4.1.2//EN"
+&gt; help   
+Commands available:
+public PublicID: make a PUBLIC identifier lookup
+system SystemID: make a SYSTEM identifier lookup
+resolve PublicID SystemID: do a full resolver lookup
+add 'type' 'orig' 'replace' : add an entry
+del 'values' : remove values
+dump: print the current catalog state
+debug: increase the verbosity level
+quiet: decrease the verbosity level
+exit:  quit the shell
+&gt; public "-//OASIS//DTD DocBook XML V4.1.2//EN"
+http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd
+&gt; quit
+orchis:~/XML -&gt; </pre>
+
+<p>This should be sufficient for most debugging purposes; it was actually
+used heavily to debug the XML Catalog implementation itself.</p>
+
+<h2><a name="Declaring">How to create and maintain catalogs:</a></h2>
+
+<p>Basically XML Catalogs are XML files; you can either use XML tools to
+manage them or use <strong>xmlcatalog</strong> for this. The basic step is
+to create a catalog; the <code>--create</code> option provides this facility:</p>
+<pre>orchis:~/XML -&gt; ./xmlcatalog --create tst.xml
+&lt;?xml version="1.0"?&gt;
+&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
+         "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
+&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"/&gt;
+orchis:~/XML -&gt; </pre>
+
+<p>By default xmlcatalog does not overwrite the original catalog and saves the
+result on the standard output; this can be overridden using the
+<code>--noout</code> option. The <code>--add</code> option adds entries to the
+catalog:</p>
+<pre>orchis:~/XML -&gt; ./xmlcatalog --noout --create --add "public" "-//OASIS//DTD DocBook XML V4.1.2//EN" http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd tst.xml
+orchis:~/XML -&gt; cat tst.xml
+&lt;?xml version="1.0"?&gt;
+&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN" "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
+&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"&gt;
+&lt;public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"
+        uri="http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"/&gt;
+&lt;/catalog&gt;
+orchis:~/XML -&gt; </pre>
+
+<p>The <code>--add</code> option always takes 3 parameters, even though some of
+the XML Catalog constructs (like nextCatalog) have only a single
+argument; just pass an empty string as the third one, it will be ignored.</p>
+
+<p>Similarly the <code>--del</code> option removes matching entries from the
+catalog:</p>
+<pre>orchis:~/XML -&gt; ./xmlcatalog --del "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" tst.xml
+&lt;?xml version="1.0"?&gt;
+&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN" "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
+&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"/&gt;
+orchis:~/XML -&gt; </pre>
+
+<p>The catalog is now empty. Note that the matching of <code>--del</code> is
+exact and would have worked in a similar fashion with the Public ID
+string.</p>
+
+<p> This is rudimentary but should be sufficient to manage a not too complex
+catalog tree of resources. </p>
+
+<h2><a name="implemento">The implementor corner: quick review of the
+API:</a></h2>
+
+<p>@@TODO@@</p>
+
+<h2><a name="Other">Other resources</a></h2>
+
+<p>The XML Catalog specification is relatively recent so there isn't much
+literature to point at:</p>
+<ul>
+  <li>You can find a good rant from Norm Walsh about <a
+    href="http://www.arbortext.com/Think_Tank/XML_Resources/Issue_Three/issue_three.html">the
+    need for catalogs</a>; it provides a lot of context information even if
+    I don't agree with everything presented.</li>
+  <li>An <a href="http://home.ccil.org/~cowan/XML/XCatalog.html">old XML
+    catalog proposal</a> from John Cowan</li>
+  <li>The <a href="http://www.rddl.org/">Resource Directory Description
+    Language</a> (RDDL), another catalog system but more oriented toward
+    providing metadata for XML namespaces.</li>
+  <li>The page from the OASIS Technical <a
+    href="http://www.oasis-open.org/committees/entity/">Committee on Entity
+    Resolution</a>, which maintains XML Catalog. You will find pointers to the
+    specification update, some background and pointers to other tools
+    providing XML Catalog support.</li>
+</ul>
+
+<p>If you have suggestions for corrections or additions, simply contact
+me:</p>
+
+<p><a href="mailto:daniel@veillard.com">Daniel Veillard</a></p>
+
+<p>$Id:$</p>
+</body>
+</html>