<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <title>The XML C parser and toolkit of Gnome</title>
  <meta name="GENERATOR" content="amaya 8.8.5, see http://www.w3.org/Amaya/">
  <meta http-equiv="content-type" content="text/html">
</head>

<body bgcolor="#ffffff">
<h1 align="center">The XML C parser and toolkit of Gnome</h1>

<h1>Note: this is the flat content of the <a
href="index.html">website</a></h1>

<h1 style="text-align: center">libxml, a.k.a. gnome-xml</h1>

<p></p>

<p
style="text-align: right; font-style: italic; font-size: 10pt">"Programming with libxml2
is like the thrilling embrace of an exotic stranger." <a
href="http://diveintomark.org/archives/2004/02/18/libxml2">Mark Pilgrim</a></p>

<p>Libxml2 is the XML C parser and toolkit developed for the Gnome project
(but usable outside of the Gnome platform). It is free software available
under the <a
href="http://www.opensource.org/licenses/mit-license.html">MIT License</a>.
XML itself is a metalanguage for designing markup languages, i.e. text
languages where semantics and structure are added to the content using extra
"markup" information enclosed between angle brackets. HTML is the most
well-known markup language. Though the library is written in C, <a
href="python.html">a variety of language bindings</a> make it available in
other environments.</p>

<p>Libxml2 is known to be very portable: the library should build and work
without serious trouble on a variety of systems (Linux, Unix, Windows,
CygWin, MacOS, MacOS X, RISC Os, OS/2, VMS, QNX, MVS, ...).</p>

<p>Libxml2 implements a number of existing standards related to markup
languages:</p>
<ul>
  <li>the XML standard: <a
    href="http://www.w3.org/TR/REC-xml">http://www.w3.org/TR/REC-xml</a></li>
  <li>Namespaces in XML: <a
    href="http://www.w3.org/TR/REC-xml-names/">http://www.w3.org/TR/REC-xml-names/</a></li>
  <li>XML Base: <a
    href="http://www.w3.org/TR/xmlbase/">http://www.w3.org/TR/xmlbase/</a></li>
  <li><a
    href="http://www.cis.ohio-state.edu/rfc/rfc2396.txt">RFC 2396</a>: Uniform
    Resource Identifiers <a
    href="http://www.ietf.org/rfc/rfc2396.txt">http://www.ietf.org/rfc/rfc2396.txt</a></li>
  <li>XML Path Language (XPath) 1.0: <a
    href="http://www.w3.org/TR/xpath">http://www.w3.org/TR/xpath</a></li>
  <li>HTML4 parser: <a
    href="http://www.w3.org/TR/html401/">http://www.w3.org/TR/html401/</a></li>
  <li>XML Pointer Language (XPointer) Version 1.0: <a
    href="http://www.w3.org/TR/xptr">http://www.w3.org/TR/xptr</a></li>
  <li>XML Inclusions (XInclude) Version 1.0: <a
    href="http://www.w3.org/TR/xinclude/">http://www.w3.org/TR/xinclude/</a></li>
  <li>ISO-8859-x encodings, as well as <a
    href="http://www.cis.ohio-state.edu/rfc/rfc2044.txt">rfc2044</a> [UTF-8] and <a
    href="http://www.cis.ohio-state.edu/rfc/rfc2781.txt">rfc2781</a> [UTF-16] Unicode
    encodings, and more if using iconv support</li>
  <li>part of SGML Open Technical Resolution TR9401:1997</li>
  <li>XML Catalogs Working Draft 06 August 2001: <a
    href="http://www.oasis-open.org/committees/entity/spec-2001-08-06.html">http://www.oasis-open.org/committees/entity/spec-2001-08-06.html</a></li>
  <li>Canonical XML Version 1.0: <a
    href="http://www.w3.org/TR/xml-c14n">http://www.w3.org/TR/xml-c14n</a> and the
    Exclusive XML Canonicalization CR draft <a
    href="http://www.w3.org/TR/xml-exc-c14n">http://www.w3.org/TR/xml-exc-c14n</a></li>
  <li>Relax NG, ISO/IEC 19757-2:2003, <a
    href="http://www.oasis-open.org/committees/relax-ng/spec-20011203.html">http://www.oasis-open.org/committees/relax-ng/spec-20011203.html</a></li>
  <li>W3C XML Schemas Part 2: Datatypes <a
    href="http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/">REC
    02 May 2001</a></li>
  <li>W3C <a href="http://www.w3.org/TR/xml-id/">xml:id</a> Working
    Draft 7 April 2004</li>
</ul>

<p>In most cases libxml2 tries to implement the specifications in a
relatively strictly compliant way. As of release 2.4.16, libxml2 passed all
1800+ tests from the <a
href="http://www.oasis-open.org/committees/xml-conformance/">OASIS XML Tests
Suite</a>.</p>

<p>To some extent libxml2 provides support for the following additional
specifications but doesn't claim to implement them completely:</p>
<ul>
  <li>Document Object Model (DOM) <a
    href="http://www.w3.org/TR/DOM-Level-2-Core/">http://www.w3.org/TR/DOM-Level-2-Core/</a>:
    the document model, but it doesn't implement the API itself; gdome2 does
    this on top of libxml2</li>
  <li><a
    href="http://www.cis.ohio-state.edu/rfc/rfc959.txt">RFC 959</a>: libxml2
    implements basic FTP client code</li>
  <li><a
    href="http://www.cis.ohio-state.edu/rfc/rfc1945.txt">RFC 1945</a>: HTTP/1.0,
    again basic HTTP client code</li>
  <li>SAX: a SAX2-like interface and a minimal SAX1 implementation compatible
    with early expat versions</li>
</ul>

<p>A partial implementation of <a
href="http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/">XML Schemas Part 1:
Structure</a> is being worked on, but it would be far too early to make any
conformance statement about it at the moment.</p>

<p>Separate documents:</p>
<ul>
  <li><a href="http://xmlsoft.org/XSLT/">the libxslt page</a> providing an
    implementation of XSLT 1.0 and common extensions like EXSLT for
    libxml2</li>
  <li><a href="http://www.cs.unibo.it/~casarini/gdome2/">the gdome2
    page</a>: a standard DOM2 implementation for libxml2</li>
  <li><a href="http://www.aleksey.com/xmlsec/">the XMLSec page</a>: an
    implementation of <a
    href="http://www.w3.org/TR/xmldsig-core/">W3C XML Digital Signature</a> for
    libxml2</li>
  <li>also check the related links section below for more related and active
    projects.</li>
</ul>
<!----------------<p>Results of the <a
href="http://xmlbench.sourceforge.net/results/benchmark/index.html">xmlbench
benchmark</a> on sourceforge February 2004 (smaller is better):</p>

<p align="center"><img src="benchmark.png"
alt="benchmark results for Expat Xerces libxml2 Oracle and Sun toolkits"></p>
-------------->

<p>Logo designed by <a href="mailto:liyanage@access.ch">Marc Liyanage</a>.</p>

<h2><a name="Introducti">Introduction</a></h2>

<p>This document describes libxml, the <a
href="http://www.w3.org/XML/">XML</a> C parser and toolkit developed for the <a
href="http://www.gnome.org/">Gnome</a> project. <a
href="http://www.w3.org/XML/">XML is a standard</a> for building tag-based
structured documents/data.</p>

<p>Here are some key points about libxml:</p>
<ul>
  <li>Libxml2 exports Push (progressive) and Pull (blocking) type parser
    interfaces for both XML and HTML.</li>
  <li>Libxml2 can do DTD validation at parse time, using a parsed document
    instance, or with an arbitrary DTD.</li>
  <li>Libxml2 includes complete <a
    href="http://www.w3.org/TR/xpath">XPath</a>, <a
    href="http://www.w3.org/TR/xptr">XPointer</a> and <a
    href="http://www.w3.org/TR/xinclude">XInclude</a> implementations.</li>
  <li>It is written in plain C, making as few assumptions as possible, and
    sticking closely to ANSI C/POSIX for easy embedding. Works on
    Linux/Unix/Windows, ported to a number of other platforms.</li>
  <li>Basic support for HTTP and FTP client allowing applications to fetch
    remote resources.</li>
  <li>The design is modular, most of the extensions can be compiled out.</li>
  <li>The internal document representation is as close as possible to the <a
    href="http://www.w3.org/DOM/">DOM</a> interfaces.</li>
  <li>Libxml2 also has a <a
    href="http://www.megginson.com/SAX/index.html">SAX like
    interface</a>; the interface is designed to be compatible with <a
    href="http://www.jclark.com/xml/expat.html">Expat</a>.</li>
  <li>This library is released under the <a
    href="http://www.opensource.org/licenses/mit-license.html">MIT License</a>. See
    the Copyright file in the distribution for the precise wording.</li>
</ul>

<p>Warning: unless you are forced to because your application links with a
Gnome-1.X library requiring it, <strong><span
style="background-color: #FF0000">Do Not Use
libxml1</span></strong>, use libxml2</p>

<h2><a name="FAQ">FAQ</a></h2>

<p>Table of Contents:</p>
<ul>
  <li><a href="FAQ.html#License">License(s)</a></li>
  <li><a href="FAQ.html#Installati">Installation</a></li>
  <li><a href="FAQ.html#Compilatio">Compilation</a></li>
  <li><a href="FAQ.html#Developer">Developer corner</a></li>
</ul>

<h3><a name="License">License</a>(s)</h3>
<ol>
  <li><em>Licensing Terms for libxml</em>
    <p>libxml2 is released under the <a
    href="http://www.opensource.org/licenses/mit-license.html">MIT License</a>; see
    the file Copyright in the distribution for the precise wording</p>
  </li>
  <li><em>Can I embed libxml2 in a proprietary application ?</em>
    <p>Yes. The MIT License allows you to keep proprietary the changes you
    made to libxml, but it would be graceful to send back bug fixes and
    improvements as patches for possible incorporation in the main
    development tree.</p>
  </li>
</ol>

<h3><a name="Installati">Installation</a></h3>
<ol>
  <li><strong><span style="background-color: #FF0000">Do Not Use
    libxml1</span></strong>, use libxml2</li>
  <li><em>Where can I get libxml</em>?
    <p>The original distribution comes from <a
    href="ftp://xmlsoft.org/libxml2/">xmlsoft.org</a> or <a
    href="ftp://ftp.gnome.org/pub/GNOME/sources/libxml2/2.6/">gnome.org</a></p>
    <p>Most Linux and BSD distributions include libxml; this is probably the
    safer way for end-users to use libxml.</p>
    <p>David Doolin provides precompiled Windows versions at <a
    href="http://www.ce.berkeley.edu/~doolin/code/libxmlwin32/">http://www.ce.berkeley.edu/~doolin/code/libxmlwin32/</a></p>
  </li>
  <li><em>I see libxml and libxml2 releases, which one should I install ?</em>
    <ul>
      <li>If you are not constrained by backward compatibility issues with
        existing applications, install libxml2 only</li>
      <li>If you are not doing development, you can safely install both.
        Usually the packages <a
        href="http://rpmfind.net/linux/RPM/libxml.html">libxml</a> and <a
        href="http://rpmfind.net/linux/RPM/libxml2.html">libxml2</a> are
        compatible (this is not the case for development packages).</li>
      <li>If you are a developer and your system provides separate packaging
        for shared libraries and the development components, it is possible
        to install libxml and libxml2, and also <a
        href="http://rpmfind.net/linux/RPM/libxml-devel.html">libxml-devel</a> and <a
        href="http://rpmfind.net/linux/RPM/libxml2-devel.html">libxml2-devel</a> too for
        libxml2 &gt;= 2.3.0</li>
      <li>If you are developing a new application, please develop against
        libxml2(-devel)</li>
    </ul>
  </li>
  <li><em>I can't install the libxml package, it conflicts with libxml0</em>
    <p>You probably have an old libxml0 package used to provide the shared
    library for libxml.so.0; you can probably safely remove it. The libxml
    packages provided on <a
    href="ftp://xmlsoft.org/libxml2/">xmlsoft.org</a> provide libxml.so.0</p>
  </li>
  <li><em>I can't install the libxml(2) RPM package due to failed
    dependencies</em>
    <p>The most generic solution is to re-fetch the latest src.rpm, and
    rebuild it locally with</p>
    <p><code>rpm --rebuild libxml(2)-xxx.src.rpm</code>.</p>
    <p>If everything goes well it will generate two binary rpm packages (one
    providing the shared libs and xmllint, and the other one, the -devel
    package, providing includes, static libraries and scripts needed to build
    applications with libxml(2)) that you can install locally.</p>
  </li>
</ol>

<h3><a name="Compilatio">Compilation</a></h3>
<ol>
  <li><em>What is the process to compile libxml2 ?</em>
    <p>Like most UNIX libraries, libxml2 follows the "standard" process:</p>
    <p><code>gunzip -c xxx.tar.gz | tar xvf -</code></p>
    <p><code>cd libxml-xxxx</code></p>
    <p><code>./configure --help</code></p>
    <p>to see the options, then the compilation/installation proper</p>
    <p><code>./configure [possible options]</code></p>
    <p><code>make</code></p>
    <p><code>make install</code></p>
    <p>At that point you may have to rerun ldconfig or a similar utility to
    update your list of installed shared libs.</p>
  </li>
  <li><em>What other libraries are needed to compile/install libxml2 ?</em>
    <p>Libxml2 does not require any other library; the normal ANSI C API
    should be sufficient (please report any violation of this rule you may
    find).</p>
    <p>However, if found at configuration time, libxml2 will detect and use
    the following libs:</p>
    <ul>
      <li><a
        href="http://www.info-zip.org/pub/infozip/zlib/">libz</a>: a highly
        portable and widely available compression library.</li>
      <li>iconv: a powerful character encoding conversion library. It is
        included by default in recent glibc libraries, so it doesn't need to
        be installed specifically on Linux. It now seems to be <a
        href="http://www.opennc.org/onlinepubs/7908799/xsh/iconv.html">part of the
        official UNIX</a> specification. Here is one <a
        href="http://www.gnu.org/software/libiconv/">implementation of the
        library</a>, whose source can be found <a
        href="ftp://ftp.ilog.fr/pub/Users/haible/gnu/">here</a>.</li>
    </ul>
  </li>
  <li><em>Make check fails on some platforms</em>
    <p>Sometimes the regression tests' results don't completely match the
    value produced by the parser, and the makefile uses diff to print the
    delta. On some platforms the diff return value breaks the compilation
    process; if the diff is small this is probably not a serious problem.</p>
    <p>Sometimes (especially on Solaris) make checks fail due to limitations
    in make. Try using GNU-make instead.</p>
  </li>
  <li><em>I use the CVS version and there is no configure script</em>
    <p>The configure script (and other Makefiles) are generated. Use the
    autogen.sh script to regenerate the configure script and Makefiles,
    like:</p>
    <p><code>./autogen.sh --prefix=/usr --disable-shared</code></p>
  </li>
  <li><em>I have troubles when running make tests with gcc-3.0</em>
    <p>It seems the initial release of gcc-3.0 has a problem with the
    optimizer which miscompiles the URI module. Please use another
    compiler.</p>
  </li>
</ol>

<h3><a name="Developer">Developer</a> corner</h3>
<ol>
  <li><em>Troubles compiling or linking programs using libxml2</em>
    <p>Usually the problem comes from the fact that the compiler doesn't get
    the right compilation or linking flags. There is a small shell script
    <code>xml2-config</code>, installed as part of the usual libxml2 install
    process, which provides those flags. Use</p>
    <p><code>xml2-config --cflags</code></p>
    <p>to get the compilation flags and</p>
    <p><code>xml2-config --libs</code></p>
    <p>to get the linker flags. Usually this is done directly from the
    Makefile as:</p>
    <p><code>CFLAGS=`xml2-config --cflags`</code></p>
    <p><code>LIBS=`xml2-config --libs`</code></p>
  </li>
  <li><em>I want to install my own copy of libxml2 in my home directory and
    link my programs against it, but it doesn't work</em>
    <p>There are many different ways to accomplish this. Here is one way to
    do this under Linux. Suppose your home directory is
    <code>/home/user</code>. Then:</p>
    <ul>
      <li>Create a subdirectory, let's call it <code>myxml</code></li>
      <li>unpack the libxml2 distribution into that subdirectory</li>
      <li>chdir into the unpacked distribution
        (<code>/home/user/myxml/libxml2</code>)</li>
      <li>configure the library using the "<code>--prefix</code>" switch,
        specifying an installation subdirectory in
        <code>/home/user/myxml</code>, e.g.
        <p><code>./configure
        --prefix=/home/user/myxml/xmlinst</code> {other configuration
        options}</p>
      </li>
      <li>now run <code>make</code> followed by <code>make install</code></li>
      <li>At this point, the installation subdirectory contains the complete
        "private" include files, library files and binary program files
        (e.g. xmllint), located in
        <p><code>/home/user/myxml/xmlinst/include</code>,
        <code>/home/user/myxml/xmlinst/lib</code> and
        <code>/home/user/myxml/xmlinst/bin</code></p>
        respectively.</li>
      <li>In order to use this "private" library, you should first add it to
        the beginning of your default PATH (so that your own private program
        files such as xmllint will be used instead of the normal system
        ones). To do this, the Bash command would be
        <p><code>export PATH=/home/user/myxml/xmlinst/bin:$PATH</code></p>
      </li>
      <li>Now suppose you have a program <code>test1.c</code> that you would
        like to compile with your "private" library. Simply compile it using
        the command
        <p><code>gcc `xml2-config --cflags` -o test1 test1.c `xml2-config --libs`</code></p>
        Note that, because your PATH has been set with
        <code>/home/user/myxml/xmlinst/bin</code> at the beginning, the
        xml2-config program which you just installed will be used instead of
        the system default one, and this will <em>automatically</em> get the
        correct libraries linked with your program.</li>
    </ul>
  </li>

  <p></p>
  <li><em>xmlDocDump() generates output on one line.</em>
    <p>Libxml2 will not <strong>invent</strong> spaces in the content of a
    document since <strong>all spaces in the content of a document are
    significant</strong>. If you build a tree from the API and want
    indentation:</p>
    <ol>
      <li>the correct way is to generate those yourself too.</li>
      <li>the dangerous way is to ask libxml2 to add those blanks to your
        content <strong>modifying the content of your document in the
        process</strong>. The result may not be what you expect. There is
        <strong>NO</strong> way to guarantee that such a modification won't
        affect other parts of the content of your document. See <a
        href="http://xmlsoft.org/html/libxml-parser.html#xmlKeepBlanksDefault">xmlKeepBlanksDefault()</a> and <a
        href="http://xmlsoft.org/html/libxml-tree.html#xmlSaveFormatFile">xmlSaveFormatFile()</a></li>
    </ol>
  </li>
  <li>Extra nodes in the document:
    <p><em>For an XML file as below:</em></p>
    <pre>&lt;?xml version="1.0"?&gt;
&lt;PLAN xmlns="http://www.argus.ca/autotest/1.0/"&gt;
&lt;NODE CommFlag="0"/&gt;
&lt;NODE CommFlag="1"/&gt;
&lt;/PLAN&gt;</pre>
    <p><em>after parsing it with the function
    pxmlDoc=xmlParseFile(...);</em></p>
    <p><em>I want to get the content of the first node (node with the
    CommFlag="0")</em></p>
    <p><em>so I did it as following;</em></p>
    <pre>xmlNodePtr pnode;
pnode=pxmlDoc-&gt;children-&gt;children;</pre>
    <p><em>but it does not work. If I change it to</em></p>
    <pre>pnode=pxmlDoc-&gt;children-&gt;children-&gt;next;</pre>
    <p><em>then it works. Can someone explain it to me.</em></p>
    <p></p>
    <p>In XML all characters in the content of the document are significant
    <strong>including blanks and formatting line breaks</strong>.</p>
    <p>The extra nodes you are wondering about are just that, text nodes with
    the formatting spaces which are part of the document but that people tend
    to forget. There is a function <a
    href="http://xmlsoft.org/html/libxml-parser.html">xmlKeepBlanksDefault()</a> to remove
    those at parse time, but that's a heuristic, and its use should be
    limited to cases where you are certain there is no mixed-content in the
    document.</p>
  </li>
  <li><em>I get compilation errors of existing code like when accessing
    <strong>root</strong> or <strong>child fields</strong> of nodes.</em>
    <p>You are compiling code developed for libxml version 1 and using a
    libxml2 development environment. Either switch back to libxml v1 devel
    or even better fix the code to compile with libxml2 (or both) by <a
    href="upgrade.html">following the instructions</a>.</p>
  </li>
  <li><em>I get compilation errors about nonexisting
    <strong>xmlRootNode</strong> or <strong>xmlChildrenNode</strong>
    fields.</em>
    <p>The source code you are using has been <a
    href="upgrade.html">upgraded</a> to be able to compile with both libxml
    and libxml2, but you need to install a more recent version:
    libxml(-devel) &gt;= 1.8.8 or libxml2(-devel) &gt;= 2.1.0</p>
  </li>
  <li><em>XPath implementation looks seriously broken</em>
    <p>XPath implementation prior to 2.3.0 was really incomplete. Upgrade to
    a recent version, there are no known bugs in the current version.</p>
  </li>
  <li><em>The example provided in the web page does not compile.</em>
    <p>It's hard to maintain the documentation in sync with the code
    &lt;grin/&gt; ...</p>
    <p>Check the previous points 1/ and 2/ raised before, and please send
    patches.</p>
  </li>
  <li><em>Where can I get more examples and information than provided on the
    web page?</em>
    <p>Ideally a libxml2 book would be nice. I have no such plan ... But you
    can:</p>
    <ul>
      <li>check more deeply the <a
        href="html/libxml-lib.html">existing generated doc</a></li>
      <li>have a look at <a href="examples/index.html">the set of
        examples</a>.</li>
      <li>look for examples of use for libxml2 function using the Gnome code.
        For example the following will query the full Gnome CVS base for the
        use of the <strong>xmlAddChild()</strong> function:
        <p><a
        href="http://cvs.gnome.org/lxr/search?string=xmlAddChild">http://cvs.gnome.org/lxr/search?string=xmlAddChild</a></p>
        <p>This may be slow, a large hardware donation to the gnome project
        could cure this :-)</p>
      </li>
      <li><a
        href="http://cvs.gnome.org/bonsai/rview.cgi?cvsroot=/cvs/gnome&amp;dir=gnome-xml">Browse the libxml2
        source</a>, I try to write code as clean and documented as possible,
        so looking at it may be helpful. In particular the code of xmllint.c
        and of the various testXXX.c test programs should provide good
        examples of how to do things with the library.</li>
    </ul>
  </li>
  <li>What about C++ ?
    <p>libxml2 is written in pure C in order to allow easy reuse on a number
    of platforms, including embedded systems. I don't intend to convert to
    C++.</p>
    <p>There is however a C++ wrapper which may fulfill your needs:</p>
    <ul>
      <li>by Ari Johnson &lt;ari@btigate.com&gt;:
        <p>Website: <a
        href="http://libxmlplusplus.sourceforge.net/">http://libxmlplusplus.sourceforge.net/</a></p>
        <p>Download: <a
        href="http://sourceforge.net/project/showfiles.php?group_id=12999">http://sourceforge.net/project/showfiles.php?group_id=12999</a></p>
      </li>
      <!-- Website is currently unavailable as of 2003-08-02
      <li>by Peter Jones &lt;pjones@pmade.org&gt;
        <p>Website: <a
        href="http://pmade.org/pjones/software/xmlwrapp/">http://pmade.org/pjones/software/xmlwrapp/</a></p>
      </li>
      -->
    </ul>
  </li>
  <li>How to validate a document a posteriori?
    <p>It is possible to validate documents which had not been validated at
    initial parsing time, or documents which have been built from scratch
    using the API. Use the <a
    href="http://xmlsoft.org/html/libxml-valid.html#xmlValidateDtd">xmlValidateDtd()</a>
    function. It is also possible to simply add a DTD to an existing
    document:</p>
    <pre>xmlDocPtr doc; /* your existing document */
xmlDtdPtr dtd = xmlParseDTD(NULL, filename_of_dtd); /* parse the DTD */

if (dtd != NULL) { /* xmlParseDTD() returns NULL on failure */
    dtd-&gt;name = xmlStrdup((xmlChar*)"root_name"); /* use the given root */

    doc-&gt;intSubset = dtd; /* attach it as the internal subset */
    if (doc-&gt;children == NULL)
        xmlAddChild((xmlNodePtr)doc, (xmlNodePtr)dtd);
    else
        xmlAddPrevSibling(doc-&gt;children, (xmlNodePtr)dtd);
}
</pre>
  </li>
  <li>So what is this funky "xmlChar" used all the time?
    <p>It is a null terminated sequence of UTF-8 characters. And only UTF-8!
    You need to convert strings encoded in different ways to UTF-8 before
    passing them to the API. This can be accomplished with the iconv library
    for instance.</p>
  </li>
  <li>etc ...</li>
</ol>

<p></p>
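<p>The FAQ entry on <code>xmlChar</code> mentions iconv as one way to convert
input to UTF-8. A minimal sketch of that conversion follows, assuming a POSIX
<code>iconv()</code> and a source encoding in which every character takes at
least one byte (ISO-8859-1, for example). The helper name
<code>to_utf8</code> is hypothetical.</p>

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: convert a NUL-terminated string from the encoding
 * named by `from` (for example "ISO-8859-1") to UTF-8.
 * Returns a newly malloc()ed, NUL-terminated buffer that the caller must
 * free(), or NULL on failure.
 */
char *to_utf8(const char *in, const char *from)
{
    iconv_t cd;
    size_t in_left, out_size, out_left;
    char *out, *out_ptr, *in_ptr;

    if (in == NULL || from == NULL)
        return NULL;

    cd = iconv_open("UTF-8", from);
    if (cd == (iconv_t) -1)
        return NULL;

    in_left = strlen(in);
    /* Assumption: one input byte never needs more than 4 output bytes. */
    out_size = in_left * 4 + 1;
    out = malloc(out_size);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    in_ptr = (char *) in;        /* iconv() takes char **, input is not modified */
    out_ptr = out;
    out_left = out_size - 1;     /* keep one byte for the terminator */

    /* Convert the input, then flush any pending shift state. */
    if (iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left) == (size_t) -1 ||
        iconv(cd, NULL, NULL, &out_ptr, &out_left) == (size_t) -1) {
        free(out);
        iconv_close(cd);
        return NULL;
    }

    *out_ptr = '\0';
    iconv_close(cd);
    return out;
}
```

<p>The returned buffer can be cast to <code>const xmlChar *</code> when
calling libxml2 functions, and released with <code>free()</code> once the
library no longer needs it.</p>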

<h2><a name="Documentat">Developer Menu</a></h2>

<p>There are several on-line resources related to using libxml:</p>
<ol>
  <li>Use the <a href="search.php">search engine</a> to look up
    information.</li>
  <li>Check the <a href="FAQ.html">FAQ</a>.</li>
  <li>Check the <a
    href="http://xmlsoft.org/html/libxml-lib.html">extensive
    documentation</a> automatically extracted from code comments.</li>
  <li>Look at the documentation about <a href="encoding.html">libxml
    internationalization support</a>.</li>
  <li>This page provides a global overview and <a href="example.html">some
    examples</a> on how to use libxml.</li>
  <li><a href="examples/index.html">Code examples</a>.</li>
  <li>John Fleck's libxml2 tutorial: <a href="tutorial/index.html">html</a>
    or <a href="tutorial/xmltutorial.pdf">pdf</a>.</li>
  <li>If you need to parse large files, check the <a
    href="xmlreader.html">xmlReader</a> API tutorial.</li>
  <li><a href="mailto:james@daa.com.au">James Henstridge</a> wrote <a
    href="http://www.daa.com.au/~james/gnome/xml-sax/xml-sax.html">some nice
    documentation</a> explaining how to use the libxml SAX interface.</li>
  <li>George Lebl wrote <a
    href="http://www-106.ibm.com/developerworks/library/l-gnome3/">an article
    for IBM developerWorks</a> about using libxml.</li>
  <li>Check <a href="http://cvs.gnome.org/lxr/source/gnome-xml/TODO">the TODO
    file</a>.</li>
  <li>Read the <a href="upgrade.html">1.x to 2.x upgrade path</a>
    description. If you are starting a new project using libxml you should
    really use the 2.x version.</li>
  <li>And don't forget to look at the <a
    href="http://mail.gnome.org/archives/xml/">mailing-list archive</a>.</li>
</ol>

<h2><a name="Reporting">Reporting bugs and getting help</a></h2>

<p>Well, bugs or missing features are always possible, and I will make a
point of fixing them in a timely fashion. The best way to report a bug is to
use the <a
href="http://bugzilla.gnome.org/buglist.cgi?product=libxml2">Gnome bug
tracking database</a> (make sure to use the "libxml2" module name). I look at
reports there regularly and it's good to have a reminder when a bug is still
open. Be sure to specify that the bug is for the package libxml2.</p>

<p>For small problems you can try to get help on IRC: the #xml channel on
irc.gnome.org (port 6667) usually has a few people subscribed who may help
(but there is no guarantee, and if a real issue is raised it should go on the
mailing-list for archival).</p>

<p>There is also a mailing-list <a
href="mailto:xml@gnome.org">xml@gnome.org</a> for libxml, with an <a
href="http://mail.gnome.org/archives/xml/">on-line archive</a> (<a
href="http://xmlsoft.org/messages">old</a>). To subscribe to this list,
please visit the <a
href="http://mail.gnome.org/mailman/listinfo/xml">associated Web</a> page and
follow the instructions. <strong>Do not send code, I won't debug it</strong>
(but patches are really appreciated!).</p>

<p>Please note that with the current amount of virus and SPAM, sending mail
to the list without being subscribed won't work. There are *far too many
bounces* (in the order of a thousand a day!) and I cannot approve them
manually anymore. If your mail to the list bounced waiting for administrator
approval, it is LOST! Repost it and fix the problem triggering the error.
Also please note that <span
style="color: #FF0000; background-color: #FFFFFF">emails with a legal warning
asking to not copy or redistribute freely the information they contain</span>
are <strong>NOT</strong> acceptable for the mailing-list. Such mail will as
much as possible be discarded automatically, and is less likely to be
answered if it makes it to the list. <strong>DO NOT</strong> post to the list
from an email address where such legal requirements are automatically added;
get private paying support if you can't share information.</p>

<p>Check the following <strong><span
style="color: #FF0000">before posting</span></strong>:</p>
<ul>
  <li>Read the <a href="FAQ.html">FAQ</a> and <a href="search.php">use the
    search engine</a> to get information related to your problem.</li>
  <li>Make sure you are <a href="ftp://xmlsoft.org/libxml2/">using a recent
    version</a>, and that the problem still shows up in a recent
    version.</li>
  <li>Check the <a href="http://mail.gnome.org/archives/xml/">list
    archives</a> to see if the problem was reported already. In this case
    there is probably a fix available. Similarly, check the <a
    href="http://bugzilla.gnome.org/buglist.cgi?product=libxml2">registered
    open bugs</a>.</li>
  <li>Make sure you can reproduce the bug with xmllint or one of the test
    programs found in source in the distribution.</li>
  <li>Please send the command showing the error as well as the input (as an
    attachment).</li>
</ul>

<p>Then send the bug with associated information to reproduce it to the <a
href="mailto:xml@gnome.org">xml@gnome.org</a> list; if it's really libxml
related I will approve it. Please do not send mail to me directly: it makes
things really hard to track, and in some cases I am not the best person to
answer a given question. Ask on the list.</p>

<p>To <span style="color: #E50000">be really clear about support</span>:</p>
<ul>
  <li>Support or help <span style="color: #E50000">requests MUST be sent to
    the list or on bugzilla</span> in case of problems, so that the Question
    and Answers can be shared publicly. Failing to do so carries the implicit
    message "I want free support but I don't want to share the benefits with
    others" and is not welcome. I will automatically Carbon-Copy the
    xml@gnome.org mailing list for any technical reply made about libxml2 or
    libxslt.</li>
  <li>There is <span style="color: #E50000">no guarantee of support</span>.
    If your question remains unanswered after a week, repost it, making sure
    you gave all the detail needed and the information requested.</li>
  <li>Failing to provide information as requested, or to double check first
    for prior feedback, also carries the implicit message "the time of the
    library maintainers is less valuable than my time" and might not be
    welcome.</li>
</ul>

<p>Of course, bugs reported with a suggested patch for fixing them will
probably be processed faster than those without.</p>

<p>If you're looking for help, a quick look at <a
href="http://mail.gnome.org/archives/xml/">the list archive</a> may actually
provide the answer. I usually send source samples when answering libxml2
usage questions. The <a
href="http://xmlsoft.org/html/book1.html">auto-generated documentation</a> is
not as polished as I would like (I need to learn more about DocBook), but
it's a good starting point.</p>

<h2><a name="help">How to help</a></h2>

<p>You can help the project in various ways. The best thing to do first is to
subscribe to the mailing-list as explained before, and check the <a
href="http://mail.gnome.org/archives/xml/">archives</a> and the <a
href="http://bugzilla.gnome.org/buglist.cgi?product=libxml2">Gnome bug
database</a>:</p>
<ol>
  <li>Provide patches when you find problems.</li>
  <li>Provide the diffs when you port libxml2 to a new platform. They may not
    be integrated in all cases but help pinpointing portability problems
    and</li>
  <li>Provide documentation fixes (either as patches to the code comments or
    as HTML diffs).</li>
  <li>Provide new documentation pieces (translations, examples, etc.).</li>
  <li>Check the TODO file and try to close one of the items.</li>
  <li>Take one of the points raised in the archive or the bug database and
    provide a fix. <a href="mailto:daniel@veillard.com">Get in touch with
    me</a> beforehand to avoid synchronization problems and to check that the
    suggested fix will fit in nicely :-)</li>
</ol>

<h2><a name="Downloads">Downloads</a></h2>

<p>The latest versions of libxml2 can be found on the <a
href="ftp://xmlsoft.org/libxml2/">xmlsoft.org</a> server (<a
href="http://xmlsoft.org/sources/">HTTP</a>, <a
href="ftp://xmlsoft.org/libxml2/">FTP</a> and rsync are available). There are
also mirrors (<a
href="ftp://ftp.planetmirror.com/pub/xmlsoft/">Australia</a> (<a
href="http://xmlsoft.planetmirror.com/">Web</a>), <a
href="ftp://fr.rpmfind.net/pub/libxml/">France</a>), or it can be found on
the <a href="ftp://ftp.gnome.org/pub/GNOME/MIRRORS.html">Gnome FTP server</a>
as a <a href="ftp://ftp.gnome.org/pub/GNOME/sources/libxml2/2.6/">source
archive</a>. Antonin Sprinzl also provides <a
href="ftp://gd.tuwien.ac.at/pub/libxml/">a mirror in Austria</a>. (NOTE that
you need both the <a
href="http://rpmfind.net/linux/RPM/libxml2.html">libxml(2)</a> and <a
href="http://rpmfind.net/linux/RPM/libxml2-devel.html">libxml(2)-devel</a>
packages installed to compile applications using libxml.)</p>

<p>You can find all the history of libxml(2) and libxslt releases in the <a
href="http://xmlsoft.org/sources/old/">old</a> directory. The precompiled
Windows binaries made by Igor Zlatkovic are available in the <a
href="http://xmlsoft.org/sources/win32/">win32</a> directory.</p>

<p>Binary ports:</p>
<ul>
  <li>Red Hat RPMs for i386 are available directly on <a
    href="ftp://xmlsoft.org/libxml2/">xmlsoft.org</a>; the source RPM will
    compile on any architecture supported by Red Hat.</li>
  <li><a href="mailto:igor@zlatkovic.com">Igor Zlatkovic</a> is now the
    maintainer of the Windows port, <a
    href="http://www.zlatkovic.com/projects/libxml/index.html">he provides
    binaries</a>.</li>
  <li>Blastwave provides <a
    href="http://www.blastwave.org/packages.php/libxml2">Solaris
    binaries</a>.</li>
  <li><a href="mailto:Steve.Ball@explain.com.au">Steve Ball</a> provides <a
    href="http://www.explain.com.au/oss/libxml2xslt.html">Mac OS X
    binaries</a>.</li>
  <li>The HP-UX porting center provides <a
    href="http://hpux.connect.org.uk/hppd/hpux/Gnome/">HP-UX
    binaries</a>.</li>
  <li>Bull provides precompiled <a
    href="http://gnome.bullfreeware.com/new_index.html">RPMs for AIX</a> as
    part of their GNOME packages.</li>
</ul>

<p>If you know other supported binary ports, please <a
href="http://veillard.com/">contact me</a>.</p>

<p><a name="Snapshot">Snapshot:</a></p>
<ul>
  <li>Code from the W3C cvs base libxml2 module, updated hourly: <a
    href="ftp://xmlsoft.org/libxml2/libxml2-cvs-snapshot.tar.gz">libxml2-cvs-snapshot.tar.gz</a>.</li>
  <li>Docs, content of the web site, the list archive included: <a
    href="ftp://xmlsoft.org/libxml2/libxml-docs.tar.gz">libxml-docs.tar.gz</a>.</li>
</ul>

<p><a name="Contribs">Contributions:</a></p>

<p>I do accept external contributions, especially if compiling on another
platform; get in touch with the list to upload the package. Wrappers for
various languages have been provided, and can be found in the <a
href="python.html">bindings section</a>.</p>

<p>Libxml2 is also available from CVS:</p>
<ul>
  <li><p>The <a href="http://cvs.gnome.org/viewcvs/libxml2/">Gnome CVS
    base</a>. Check the <a
    href="http://developer.gnome.org/tools/cvs.html">Gnome CVS Tools</a>
    page; the CVS module is <b>libxml2</b>.</p>
  </li>
  <li>The <strong>libxslt</strong> module is also present there.</li>
</ul>

<h2><a name="News">Releases</a></h2>

<p>Items not finished and still being worked on; get in touch with the list
if you want to help with those:</p>
<ul>
  <li>More testing on RelaxNG</li>
  <li>Finishing up <a href="http://www.w3.org/TR/xmlschema-1/">XML
    Schemas</a></li>
</ul>

<p>The <a href="ChangeLog.html">change log</a> describes the recent commits
to the <a href="http://cvs.gnome.org/viewcvs/libxml2/">CVS</a> code base.</p>

<p>Here is the list of public releases:</p>

<h3>2.6.26: Jun 6 2006</h3>
<ul>
  <li>portability fixes: Python detection (Joseph Sacco), compilation error
    (William Brack and Graham Bennett), LynxOS patch (Olli Savia)</li>
  <li>bug fixes: encoding buffer problem, mix of code and data in xmlIO.c
    (Kjartan Maraas), entities in XSD validation (Kasimier Buchcik), various
    XSD validation fixes (Kasimier), memory leak in pattern (Rob Richards and
    Kasimier), attribute with colon in name (Rob Richards), XPath leak in
    error reporting (Aleksey Sanin), XInclude text include of self
    document.</li>
  <li>improvements: XPath optimizations (Kasimier), XPath object cache
    (Kasimier)</li>
</ul>

<h3>2.6.25: Jun 6 2006</h3>

<p>Do not use or package 2.6.25</p>

<h3>2.6.24: Apr 28 2006</h3>
<ul>
  <li>Portability fixes: configure on Windows, testapi compile on Windows
    (Kasimier Buchcik, venkat naidu), Borland C++ 6 compile (Eric Zurcher),
    HP-UX compiler workaround (Rick Jones), xml2-config bugfix, gcc-4.1
    cleanups, Python detection scheme (Joseph Sacco), UTF-8 file paths on
    Windows (Roland Schwingel).</li>
  <li>Improvements: xmlDOMWrapReconcileNamespaces xmlDOMWrapCloneNode
    (Kasimier Buchcik), XML catalog debugging (Rick Jones), update to Unicode
    4.01.</li>
  <li>Bug fixes: xmlParseChunk() problem in 2.6.23, xmlParseInNodeContext()
    on HTML docs, URI behaviour on Windows (Rob Richards), comment streaming
    bug, xmlParseComment (with William Brack), regexp bug fixes (DV &amp;
    Youri Golovanov), xmlGetNodePath on text/CDATA (Kasimier), one Relax-NG
    interleave bug, xmllint --path and --valid, XSD bugfixes (Kasimier),
    remove debug left in Python bindings (Nic Ferrier), xmlCatalogAdd bug
    (Martin Cole), xmlSetProp fixes (Rob Richards), HTML IDness (Rob
    Richards), a large number of cleanups and small fixes based on Coverity
    reports, bug in character ranges, Unicode tables const (Aivars Kalvans),
    schemas fix (Stefan Kost), xmlRelaxNGParse error deallocation,
    xmlSchemaAddSchemaDoc error deallocation, error handling on unallowed
    code point, xmllint --nonet to never reach the net (Gary Coady), line
    break in writer after end PI (Jason Viers).</li>
  <li>Documentation: man pages updates and cleanups (Daniel Leidert).</li>
  <li>New features: Relax NG structure error handlers.</li>
</ul>

<h3>2.6.23: Jan 5 2006</h3>
<ul>
  <li>portability fixes: Windows (Rob Richards), getaddrinfo on Windows
    (Kolja Nowak, Rob Richards), icc warnings (Kjartan Maraas),
    --with-minimum compilation fixes (William Brack), error case handling fix
    on Solaris (Albert Chin), don't use 'list' as parameter name reported by
    Samuel Diaz Garcia, more old Unices portability fixes (Albert Chin),
    MinGW compilation (Mark Junker), HP-UX compiler warnings (Rick
    Jones)</li>
  <li>code cleanup: xmlReportError (Adrian Mouat), remove xmlBufferClose
    (Geert Jansen), unreachable code (Oleksandr Kononenko), refactoring
    parsing code (Bjorn Reese)</li>
  <li>bug fixes: xmlBuildRelativeURI and empty path (William Brack),
    combinatory explosion and performances in regexp code, leak in
    xmlTextReaderReadString(), xmlStringLenDecodeEntities problem (Massimo
    Morara), Identity Constraints bugs and a segfault (Kasimier Buchcik),
    XPath pattern based evaluation bugs (DV &amp; Kasimier),
    xmlSchemaContentModelDump() memory leak (Kasimier), potential leak in
    xmlSchemaCheckCSelectorXPath(), xmlTextWriterVSprintf() misuse of
    vsnprintf (William Brack), XHTML serialization fix (Rob Richards), CRLF
    split problem (William), issues with non-namespaced attributes in
    xmlAddChild() xmlAddNextSibling() and xmlAddPrevSibling() (Rob Richards),
    HTML parsing of script, Python must not output to stdout (Nic Ferrier),
    exclusive C14N namespace visibility (Aleksey Sanin), XSD datatype
    totalDigits bug (Kasimier Buchcik), error handling when writing to an
    xmlBuffer (Rob Richards), runtest schemas error not reported (Hisashi
    Fujinaka), signed/unsigned problem in date/time code (Albert Chin), fix
    XSI driven XSD validation (Kasimier), parsing of xs:decimal (Kasimier),
    fix DTD writer output (Rob Richards), leak in xmlTextReaderReadInnerXml
    (Gary Coady), regexp bug affecting schemas (Kasimier), configuration of
    runtime debugging (Kasimier), xmlNodeBufGetContent bug on entity refs
    (Oleksandr Kononenko), xmlRegExecPushString2 bug (Sreeni Nair),
    compilation and build fixes (Michael Day), removed dependencies on
    xmlSchemaValidError (Kasimier), bug with &lt;xml:foo/&gt;, more XPath
    pattern based evaluation fixes (Kasimier)</li>
  <li>improvements: XSD Schemas redefinitions/restrictions (Kasimier
    Buchcik), node copy checks and fix for attribute (Rob Richards), counted
    transition bug in regexps, ctxt-&gt;standalone = -2 to indicate no
    standalone attribute was found, add xmlSchemaSetParserStructuredErrors()
    (Kasimier Buchcik), add xmlTextReaderSchemaValidateCtxt() to API
    (Kasimier), handle gzipped HTTP resources (Gary Coady), add
    htmlDocDumpMemoryFormat (Rob Richards)</li>
  <li>documentation: typo (Michael Day), libxml man page (Albert Chin), save
    function to XML buffer (Geert Jansen), small doc fix (Aron Stansvik)</li>
</ul>

<h3>2.6.22: Sep 12 2005</h3>
<ul>
  <li>build fixes: compile without schematron (Stéphane Bidoul)</li>
  <li>bug fixes: xmlDebugDumpNode on namespace node (Oleg Paraschenko), CDATA
    push parser bug, xmlElemDump problem with XHTML1 doc, XML_FEATURE_xxx
    clash with expat headers renamed XML_WITH_xxx, fix some output formatting
    for meta element (Rob Richards), script and style XHTML1 serialization
    (David Madore), Attribute derivation fixups in XSD (Kasimier Buchcik),
    better IDC error reports (Kasimier Buchcik)</li>
  <li>improvements: add XML_SAVE_NO_EMPTY xmlSaveOption (Rob Richards), add
    XML_SAVE_NO_XHTML xmlSaveOption, XML Schemas improvements preparing for
    derive (Kasimier Buchcik).</li>
  <li>documentation: generation of gtk-doc like docs, integration with
    devhelp.</li>
</ul>

<h3>2.6.21: Sep 4 2005</h3>
<ul>
  <li>build fixes: Cygwin portability fixes (Gerrit P. Haase), calling
    convention problems on Windows (Marcus Boerger), cleanups based on Linus'
    sparse tool, update of win32/configure.js (Rob Richards), remove warnings
    on Windows (Marcus Boerger), compilation without SAX1, detection of the
    Python binary, use $GCC instead of $CC = 'gcc' (Andrew W. Nosenko),
    compilation/link with threads and old gcc, compile problem by C370 on
    Z/OS</li>
  <li>bug fixes: http_proxy environments (Peter Breitenlohner), HTML UTF-8
    bug (Jiri Netolicky), XPath NaN compare bug (William Brack),
    htmlParseScript potential bug, Schemas regexp handling of spaces, Base64
    Schemas comparisons NIST passes, automata build error xsd:all,
    xmlGetNodePath for namespaced attributes (Alexander Pohoyda), xmlSchemas
    foreign namespaces handling, XML Schemas facet comparison (Kupriyanov
    Anatolij), xmlSchemaPSimpleTypeErr error report (Kasimier Buchcik), xml:
    namespace handling in Schemas (Kasimier), empty model group in Schemas
    (Kasimier), wildcard in Schemas (Kasimier), URI composition (William),
    xs:anyType in Schemas (Kasimier), Python resolver emitting error messages
    directly, Python xmlAttr.parent (Jakub Piotr Clapa), trying to fix the
    file path/URI conversion, xmlTextReaderGetAttribute fix (Rob Richards),
    xmlSchemaFreeAnnot memleak (Kasimier), HTML UTF-8 serialization,
    streaming XPath, Schemas determinism detection problem, XInclude bug,
    Schemas context type (Dean Hill), validation fix (Derek Poon),
    xmlTextReaderGetAttribute[Ns] namespaces (Rob Richards), Schemas type fix
    (Kuba Nowakowski), UTF-8 parser bug, error in encoding handling,
    xmlGetLineNo fixes, bug on entities handling, entity name extraction in
    error handling with XInclude, text nodes in HTML body tags (Gary Coady),
    xml:id and IDness at the tree level fixes, XPath streaming patterns
    bugs.</li>
  <li>improvements: structured interfaces for schemas and RNG error reports
    (Marcus Boerger), optimization of the char data inner loop parsing
    (thanks to Behdad Esfahbod for the idea), schematron validation though
    not finished yet, xmlSaveOption to omit XML declaration, keyref match
    error reports (Kasimier), formal expression handling code not plugged
    yet, more lax mode for the HTML parser, parser XML_PARSE_COMPACT option
    for text nodes allocation.</li>
  <li>documentation: xmllint man page had --nonet duplicated</li>
</ul>

<h3>2.6.20: Jul 10 2005</h3>
<ul>
  <li>build fixes: Windows build (Rob Richards), Mingw compilation (Igor
    Zlatkovic), Windows Makefile (Igor), gcc warnings (Kasimier and
    andriy@google.com), use gcc weak references to pthread to avoid the
    pthread dependency on Linux, compilation problem (Steve Nairn), compiling
    of subset (Morten Welinder), IPv6/ss_family compilation (William Brack),
    compilation when disabling parts of the library, standalone test
    distribution.</li>
  <li>bug fixes: bug in lang(), memory cleanup on errors (William Brack),
    HTTP query strings (Aron Stansvik), memory leak in DTD (William), integer
    overflow in XPath (William), nanoftp buffer size, pattern "." path fixup
    (Kasimier), leak in tree reported by Malcolm Rowe, replaceNode patch
    (Brent Hendricks), CDATA with NULL content (Mark Vakoc), xml:base fixup
    on XInclude (William), pattern fixes (William), attribute bug in
    exclusive c14n (Aleksey Sanin), xml:space and xml:lang with SAX2 (Rob
    Richards), namespace trouble in complex parsing (Malcolm Rowe), XSD type
    QNames fixes (Kasimier), XPath streaming fixups (William), RelaxNG bug
    (Rob Richards), Schemas for Schemas fixes (Kasimier), removal of ID (Rob
    Richards), a small RelaxNG leak, HTML parsing in push mode bug (James
    Bursa), failure to detect UTF-8 parsing bugs in CDATA sections,
    areBlanks() heuristic failure, duplicate attributes in DTD bug
    (William).</li>
  <li>improvements: lot of work on Schemas by Kasimier Buchcik both on
    conformance and streaming, Schemas validation messages (Kasimier Buchcik,
    Matthew Burgess), namespace removal at the python level (Brent
    Hendricks), Update to new Schemas regression tests from W3C/Nist
    (Kasimier), xmlSchemaValidateFile() (Kasimier), implementation of
    xmlTextReaderReadInnerXml and xmlTextReaderReadOuterXml (James Wert),
    standalone test framework and programs, new DOM import APIs
    xmlDOMWrapReconcileNamespaces() xmlDOMWrapAdoptNode() and
    xmlDOMWrapRemoveNode(), extension of xmllint capabilities for SAX and
    Schemas regression tests, xmlStopParser() available in pull mode too,
    enhancement to xmllint --shell namespaces support, Windows port of the
    standalone testing tools (Kasimier and William),
    xmlSchemaValidateStream() xmlSchemaSAXPlug() and xmlSchemaSAXUnplug() SAX
    Schemas APIs, Schemas xmlReader support.</li>
</ul>

<h3>2.6.19: Apr 02 2005</h3>
<ul>
  <li>build fixes: drop .la from RPMs, --with-minimum build fix (William
    Brack), use XML_SOCKLEN_T instead of SOCKLEN_T because it breaks with AIX
    5.3 compiler, fixed elfgcchack.h generation and PLT reduction code on
    Linux/ELF/gcc4</li>
  <li>bug fixes: schemas type decimal fixups (William Brack), xmllint return
    code (Gerry Murphy), small schemas fixes (Matthew Burgess and GUY
    Fabrice), workaround "DAV:" namespace brokenness in c14n (Aleksey Sanin),
    segfault in Schemas (Kasimier Buchcik), Schemas attribute validation
    (Kasimier), Prop related functions and xmlNewNodeEatName (Rob Richards),
    HTML serialization of name attribute on a elements, Python error handlers
    leaks and improvement (Brent Hendricks), uninitialized variable in
    encoding code, Relax-NG validation bug, potential crash if
    ignorableWhitespace is NULL, xmlSAXParseDoc and xmlParseDoc signatures,
    switched back to assuming UTF-8 in case no encoding is given at
    serialization time</li>
  <li>improvements: lot of work on Schemas by Kasimier Buchcik on facets
    checking and also mixed handling.</li>
</ul>

<h3>2.6.18: Mar 13 2005</h3>
<ul>
  <li>build fixes: warnings (Peter Breitenlohner),
    testapi.c generation, Bakefile support (Francesco Montorsi), Windows
    compilation (Joel Reed), some gcc4 fixes, HP-UX portability fixes (Rick
    Jones).</li>
  <li>bug fixes: xmlSchemaElementDump namespace (Kasimier Buchcik),
    push and xmlreader stopping on non-fatal errors, thread support
    for dictionaries reference counting (Gary Coady), internal subset and
    push problem, URL saved in xmlCopyDoc, various schemas bug fixes
    (Kasimier), Python paths fixup (Stephane Bidoul), xmlGetNodePath and
    namespaces, xmlSetNsProp fix (Mike Hommey), warning should not count as
    error (William Brack), xmlCreatePushParser empty chunk, XInclude parser
    flags (William), cleanup FTP and HTTP code to reuse the uri parsing and
    IPv6 (William), xmlTextWriterStartAttributeNS fix (Rob
    Richards), XMLLINT_INDENT being empty (William), xmlWriter bugs (Rob
    Richards), multithreading on Windows (Rich Salz), xmlSearchNsByHref fix
    (Kasimier), Python binding leak (Brent Hendricks), aliasing bug exposed by
    gcc4 on s390, xmlTextReaderNext bug (Rob Richards), Schemas decimal type
    fixes (William Brack), xmlByteConsumed static buffer (Ben Maurer).</li>
  <li>improvement: speedup parsing comments and DTDs, dictionary
    support for hash tables, Schemas Identity constraints (Kasimier),
    streaming XPath subset, xmlTextReaderReadString added (Bjorn Reese),
    Schemas canonical values handling (Kasimier), add
    xmlTextReaderByteConsumed (Aron Stansvik)</li>
  <li>Documentation: Wiki support (Joel Reed)</li>
</ul>

<h3>2.6.17: Jan 16 2005</h3>
<ul>
  <li>build fixes: Windows, warnings removal (William
    Brack), maintainer-clean dependency (William), build in a different
    directory (William), fixing --with-minimum configure build (William), BeOS
    build (Marcin Konicki), Python-2.4 detection (William), compilation on AIX
    (Dan McNichol)</li>
  <li>bug fixes: xmlTextReaderHasAttributes (Rob
    Richards), xmlCtxtReadFile() to use the catalog(s), loop on output (William
    Brack), XPath memory leak, ID deallocation problem (Steve Shepard),
    debugDumpNode crash (William), warning not using error callback (William),
    xmlStopParser bug (William), UTF-16 with BOM on DTDs (William), namespace
    bug on empty elements in push mode (Rob Richards), line and col
    computations fixups (Aleksey Sanin), xmlURIEscape fix (William),
    xmlXPathErr on bad range (William), patterns with too many steps, bug in
    RNG choice optimization, line number sometimes missing.</li>
  <li>improvements: XSD Schemas (Kasimier Buchcik), python generator (William),
    xmlUTF8Strpos speedup (William), unicode Python strings (William), XSD
    error reports (Kasimier Buchcik), Python __str__ call serialize().</li>
  <li>new APIs: added xmlDictExists(), GetLineNumber and
    GetColumnNumber for the xmlReader (Aleksey Sanin), Dynamic Shared Libraries
    APIs (mostly Joel Reed), error extraction API from regexps, new XMLSave
    option for format (Phil Shafer)</li>
  <li>documentation: site improvement (John Fleck), FAQ entries (William).</li>
</ul>

<h3>2.6.16: Nov 10 2004</h3>
<ul>
  <li>general hardening and bug fixing crossing all the API based
    on new automated regression testing</li>
  <li>build fix: IPv6 build and test on AIX (Dodji Seketeli)</li>
  <li>bug fixes: problem with XML::Libxml reported by Petr
    Pajas, encoding conversion functions return values, UTF-8 bug affecting
    XPath reported by Markus Bertheau, catalog problem with NULL entries
    (William Brack)</li>
  <li>documentation: fix to xmllint man page, some API
    function descriptions were updated.</li>
  <li>improvements: DTD validation APIs provided at the Python
    level (Brent Hendricks)</li>
</ul>

<h3>2.6.15: Oct 27 2004</h3>
<ul>
  <li>security fixes on the nanoftp and nanohttp modules</li>
  <li>build fixes: xmllint detection bug in configure, building
    outside the source tree (Thomas Fitzsimmons)</li>
  <li>bug fixes: HTML parser on broken ASCII chars in names
    (William), Python paths (Malcolm Tredinnick), xmlHasNsProp and default
    namespace (William), saving to python file objects (Malcolm Tredinnick),
    DTD lookup fix (Malcolm), save back &lt;group&gt; in catalogs (William),
    tree build fixes (DV and Rob Richards), Schemas memory bug, structured
    error handler on Python 64bits, thread local memory deallocation, memory
    leak reported by Volker Roth, xmlValidateDtd in the presence of an
    internal subset, entities and _private problem (William),
    xmlBuildRelativeURI error (William).</li>
  <li>improvements: better XInclude error reports (William),
    tree debugging module and tests, convenience functions at the Reader
    API (Graham Bennett), add support for PI in the HTML parser.</li>
</ul>

<h3>2.6.14: Sep 29 2004</h3>
<ul>
  <li>build fixes: configure paths for xmllint and
    xsltproc, compilation without HTML parser, compilation warning cleanups
    (William Brack &amp; Malcolm Tredinnick), VMS makefile update (Craig
    Berry)</li>
  <li>bug fixes: xmlGetUTF8Char (William Brack), QName
    properties (Kasimier Buchcik), XInclude testing, Notation
    serialization, UTF8ToISO8859x transcoding (Mark Itzcovitz), lots of XML
    Schemas cleanup and fixes (Kasimier), ChangeLog cleanup (Stepan Kasal),
    memory fixes (Mark Vakoc), handling of failed realloc(), out of bound array
    addressing in Schemas date handling, Python space/tabs cleanups (Malcolm
    Tredinnick), NMTOKENS E20 validation fix (Malcolm)</li>
  <li>improvements: added W3C XML Schemas testsuite (Kasimier
    Buchcik), add xmlSchemaValidateOneElement (Kasimier), Python
    exception hierarchy (Malcolm Tredinnick), Python libxml2 driver
    improvement (Malcolm Tredinnick), Schemas support
    for xsi:schemaLocation, xsi:noNamespaceSchemaLocation, xsi:type
    (Kasimier Buchcik)</li>
</ul>

<h3>2.6.13: Aug 31 2004</h3>
<ul>
  <li>build fixes: Windows and zlib (Igor Zlatkovic), -O flag with gcc, Solaris
    compiler warning, fixing RPM BuildRequires</li>
  <li>fixes: DTD loading on Windows (Igor), Schemas error
    reports APIs (Kasimier Buchcik), Schemas validation crash, xmlCheckUTF8
    (William Brack and Julius Mittenzwei), Schemas facet check (Kasimier),
    default namespace problem (William), Schemas hexbinary empty values,
    encoding error could generate a serialization loop.</li>
  <li>Improvements: Schemas validity improvements (Kasimier), added
    --path and --load-trace options to xmllint</li>
  <li>documentation: tutorial update (John Fleck)</li>
</ul>

<h3>2.6.12: Aug 22 2004</h3>
<ul>
  <li>build fixes: fix --with-minimum, elfgcchack.h
    fixes (Peter Breitenlohner), perl path lookup (William), diff on
    Solaris (Albert Chin), some 64bits cleanups.</li>
  <li>Python: avoid a warning with 2.3 (William Brack), tab and
    space mixes (William), wrapper generator fixes (William), Cygwin support
    (Gerrit P. Haase), node wrapper fix (Marc-Antoine Parent), XML
    Schemas support (Torkel Lyng)</li>
  <li>Schemas: a lot of bug fixes and improvements from Kasimier Buchcik</li>
  <li>fixes: RVT fixes (William), XPath context resets bug
    (William), memory debug (Steve Hay), catalog white space handling
    (Peter Breitenlohner), xmlReader state after attribute reading
    (William), structured error handler (William), XInclude generated xml:base
    fixup (William), Windows memory reallocation problem (Steve Hay), Out of
    Memory conditions handling (William and Olivier Andrieu), htmlNewDoc()
    charset bug, htmlReadMemory init (William), a posteriori validation
    DTD base (William), notations serialization missing,
    xmlGetNodePath (Dodji), xmlCheckUTF8 (Diego Tartara), missing line numbers
    on entity (William)</li>
  <li>improvements: DocBook catalog build script (William),
    xmlcatalog tool (Albert Chin), xmllint --c14n option, no_proxy environment
    (Mike Hommey), xmlParseInNodeContext() addition, extend xmllint --shell,
    allow XInclude to not generate start/end nodes, extend xmllint --version
    to include CVS tag (William)</li>
  <li>documentation: web pages fixes, validity API docs fixes (William),
    schemas API fix (Eric Haszlakiewicz), xmllint man page (John Fleck)</li>
</ul>

<h3>2.6.11: July 5 2004</h3>
<ul>
  <li>Schemas: a lot of changes and improvements by Kasimier
    Buchcik for attributes, namespaces and simple types.</li>
  <li>build fixes: --with-minimum (William Brack), some gcc cleanup (William),
    --with-thread-alloc (William)</li>
  <li>portability: Windows binary package change (Igor Zlatkovic), Catalog path
    on Windows</li>
  <li>documentation: update to the tutorial (John Fleck), xmllint
    return code (John Fleck), man pages (Ville Skytta)</li>
  <li>bug fixes: C14N bug serializing namespaces (Aleksey
    Sanin), testSAX properly initialize the library (William), empty node set
    in XPath (William), xmlSchemas errors (William), invalid charref
    problem pointed out by Morus Walter, XInclude xml:base generation (William),
    Relax-NG bug with div processing (William), XPointer and
    xml:base problem (William), Reader and entities, xmllint return code for
    schemas (William), reader streaming problem (Steve Ball), DTD
    serialization problem (William), libxml.m4 fixes (Mike Hommey), do not
    provide destructors as methods on Python classes, xmlReader buffer bug,
    Python bindings memory interfaces improvement (with Stéphane Bidoul), fixed
    the push parser to be back to synchronous behaviour.</li>
  <li>improvement: custom per-thread I/O enhancement (Rob
    Richards), register namespace in debug shell (Stefano Debenedetti), Python
    based regression test for non-Unix users (William), dynamically increase
    the number of XPath extension functions in Python and fix a memory
    leak (Marc-Antoine Parent and William)</li>
  <li>performance: hack done with Arjan van de Ven to reduce ELF
    footprint and generated code on Linux, plus use gcc runtime profiling to
    optimize the code generated in the RPM packages.</li>
</ul>

<h3>2.6.10: May 17 2004</h3>
<ul>
  <li>Web page generated for ChangeLog</li>
  <li>build fixes: --without-html problems, make check without make all</li>
  <li>portability: problem with xpath.c on Windows (MSC and
    Borland), memcmp vs. strncmp on Solaris, XPath tests on Windows (Mark
    Vakoc), C++ do not use "list" as parameter name, make tests work with
    Python 1.5 (Ed Davis)</li>
  <li>improvements: made xmlTextReaderMode public, small
    buffers resizing (Morten Welinder), add --maxmem option to
    xmllint, add xmlPopInputCallback() for Matt Sergeant, refactoring
    of serialization escaping, added escaping customization</li>
  <li>bugfixes: xsd:extension (Taihei Goi), assorted regexp
    bugs (William Brack), xmlReader end of stream problem, node deregistration
    with reader, URI escaping and filenames, XHTML1 formatting (Nick
    Wellnhofer), regexp transition reduction (William), various XSD Schemas
    fixes (Kasimier Buchcik), XInclude fallback problem (William), weird
    problems with DTD (William), structured error handler callback context
    (William), reverse xmlEncodeSpecialChars() behaviour back to escaping
    '"'</li>
</ul>

<h3>2.6.9: Apr 18 2004</h3>
<ul>
  <li>implement xml:id Working Draft, relaxed XPath id() checking</li>
  <li>bugfixes: xmlCtxtReset (Brent Hendricks), line number and
    CDATA (Dave Beckett), Relax-NG compilation (William Brack), Regexp
    patches (with William), xmlURIEscape (Mark Vakoc), a Relax-NG notAllowed
    problem (with William), Relax-NG name classes compares (William),
    XInclude duplicate fallback (William), external DTD encoding detection
    (William), a DTD validation bug (William), xmlReader Close() fix,
    recursive extension schemas</li>
  <li>improvements: use xmlRead* APIs in test tools (Mark
    Vakoc), indenting save optimization, better handle IIS broken HTTP
    redirect behaviour (Ian Hummel), HTML parser frameset (James Bursa),
    libxml2-python RPM dependency, XML Schemas union support (Kasimier
    Buchcik), warning removal cleanup (William), keep ChangeLog compressed when
    installing from RPMs</li>
  <li>documentation: examples and xmlDocDumpMemory docs (John
    Fleck), new example (load, xpath, modify, save), xmlCatalogDump()
    comments</li>
  <li>Windows: Borland C++ builder (Eric Zurcher), work
    around Microsoft compiler NaN handling bug (Mark Vakoc)</li>
</ul>

<h3>2.6.8: Mar 23 2004</h3>
<ul>
  <li>First step of the cleanup of the serialization code and APIs</li>
  <li>XML Schemas: mixed content (Adam Dickmeiss), QName handling
    fixes (Adam Dickmeiss), anyURI for "" (John Belmonte)</li>
  <li>Python: Canonicalization C14N support added (Anthony Carrico)</li>
  <li>xmlDocCopyNode() extension (William)</li>
  <li>Relax-NG: fix when processing XInclude results
    (William), external reference in interleave (William), missing error
    on &lt;choice&gt; failure (William), memory leak in schemas
    datatype facets.</li>
  <li>xmlWriter: patch for better DTD support (Alfred Mickautsch)</li>
  <li>bug fixes: xmlXPathLangFunction memory leak (Mike Hommey
    and William Brack), no ID errors if using HTML_PARSE_NOERROR,
    xmlcatalog fallbacks to URI on SYSTEM lookup failure, XInclude parse
    flags inheritance (William), XInclude and XPointer fixes for entities
    (William), XML parser bug reported by Holger Rauch, nanohttp fd leak
    (William), regexps char groups '-' handling (William), dictionary
    reference counting problems, do not close stderr.</li>
  <li>performance patches from Petr Pajas</li>
  <li>Documentation fixes: XML_CATALOG_FILES in man pages (Mike Hommey)</li>
  <li>compilation and portability fixes: --without-valid,
    catalog cleanups (Peter Breitenlohner), MingW patch (Roland
    Schwingel), cross-compilation to Windows (Christophe de Vienne),
    --with-html-dir fixup (Julio Merino Vidal), Windows build (Eric
    Zurcher)</li>
</ul>

<h3>2.6.7: Feb 23 2004</h3>
<ul>
  <li>documentation: tutorial updates (John Fleck), benchmark results</li>
  <li>xmlWriter: updates and fixes (Alfred Mickautsch, Lucas Brasilino)</li>
  <li>XPath optimization (Petr Pajas)</li>
  <li>DTD ID handling optimization</li>
  <li>bugfixes: xpath number with &gt; 19 fractional digits (William
    Brack), push mode with unescaped '&gt;' characters, fix xmllint --stream
    --timing, fix xmllint --memory --stream memory
    usage, xmlAttrSerializeTxtContent handling NULL, trying to fix
    Relax-NG/Perl interface.</li>
  <li>python: 2.3 compatibility, whitespace fixes (Malcolm Tredinnick)</li>
  <li>Added relaxng option to xmllint --shell</li>
</ul>

<h3>2.6.6: Feb 12 2004</h3>
<ul>
  <li>nanohttp and nanoftp: buffer overflow error on URI parsing
    (Igor and William) reported by Yuuichi Teranishi</li>
  <li>bugfixes: make test and path issues, xmlWriter
    attribute serialization (William Brack), xmlWriter indentation (William),
    schemas validation (Eric Haszlakiewicz), XInclude dictionaries issues
    (William and Oleg Paraschenko), XInclude empty fallback (William), HTML
    warnings (William), XPointer in XInclude (William), Python
    namespace serialization, isolat1ToUTF8 bound error (Alfred Mickautsch),
    output of parameter entities in internal subset (William), internal subset
    bug in push mode, &lt;xs:all&gt; fix (Alexey Sarytchev)</li>
  <li>Build: fix for automake-1.8 (Alexander Winston), warnings removal (Philip
    Ludlam), SOCKLEN_T detection fixes (Daniel Richard), fix --with-minimum
    configuration.</li>
  <li>XInclude: allow the 2001 namespace without warning.</li>
  <li>Documentation: missing example/index.html (John
    Fleck), version dependencies (John Fleck)</li>
  <li>reader API: structured error reporting (Steve Ball)</li>
  <li>Windows compilation: mingw, msys (Mikhail
    Grushinskiy), function prototype (Cameron Johnson), MSVC6 compiler
    warnings, _WINSOCKAPI_ patch</li>
  <li>Parsers: added xmlByteConsumed(ctxt) API to get the byte
    offset in input.</li>
</ul>

<h3>2.6.5: Jan 25 2004</h3>
<ul>
  <li>Bugfixes: dictionaries for schemas (William Brack),
    regexp segfault (William), xs:all problem (William), a number of
    XPointer bugfixes (William), xmllint errors go to stderr, DTD validation
    problem with namespace, memory leak (William), SAX1 cleanup and minimal
    options fixes (Mark Vakoc), parser context reset on error (Shaun McCance),
    XPath union evaluation problem (William), xmlReallocLoc with NULL
    (Aleksey Sanin), XML Schemas double free (Steve Ball), XInclude with no
    href, argument callbacks order for XPath callbacks (Frederic Peters)</li>
  <li>Documentation: python scripts (William Brack), xslt
    stylesheets (John Fleck), doc (Sven Zimmerman), I/O example.</li>
  <li>Python bindings: fixes (William), enum support
    (Stéphane Bidoul), structured error reporting (Stéphane Bidoul)</li>
  <li>XInclude: various fixes for conformance, problem related
    to dictionary references (William &amp; me), recursion (William)</li>
  <li>xmlWriter: indentation (Lucas Brasilino), memory
    leaks (Alfred Mickautsch)</li>
  <li>xmlSchemas: normalizedString datatype (John Belmonte)</li>
  <li>code cleanup for strings functions (William)</li>
  <li>Windows: compiler patches (Mark Vakoc)</li>
  <li>Parser optimizations, a few new XPath and dictionary APIs
    for future XSLT optimizations.</li>
</ul>

<h3>2.6.4: Dec 24 2003</h3>
<ul>
  <li>Windows build fixes (Igor Zlatkovic)</li>
  <li>Some serious XInclude problems reported by Oleg Paraschenko and</li>
  <li>Unix and Makefile packaging fixes (me, William Brack,</li>
  <li>Documentation improvements (John Fleck, William Brack),
    example fix (Lucas Brasilino)</li>
  <li>bugfixes: xmlTextReaderExpand() with xmlReaderWalker, XPath
    handling of NULL strings (William Brack), API building reader or
    parser from file descriptor should not close it, changed XPath sorting to
    be stable again (William Brack), xmlGetNodePath() generating
    '(null)' (William Brack), DTD validation and namespace bug (William Brack),
    XML Schemas double inclusion behaviour</li>
</ul>

<h3>2.6.3: Dec 10 2003</h3>
<ul>
  <li>documentation updates and cleanup (DV, William Brack, John Fleck)</li>
  <li>added a repository of examples, examples from Aleksey
    Sanin, Dodji Seketeli, Alfred Mickautsch</li>
  <li>Windows updates: Mark Vakoc, Igor Zlatkovic, Eric Zurcher, Mingw (Kenneth
    Haley)</li>
  <li>Unicode range checking (William Brack)</li>
  <li>code cleanup (William Brack)</li>
  <li>Python bindings: doc (John Fleck), bug fixes</li>
  <li>UTF-16 cleanup and BOM issues (William Brack)</li>
  <li>bug fixes: ID and xmlReader validation, XPath (William
    Brack), xmlWriter (Alfred Mickautsch), hash.h inclusion problem, HTML
    parser (James Bursa), attribute defaulting and validation, some
    serialization cleanups, XML_GET_LINE macro, memory debug when using threads
    (William Brack), serialization of attributes and entities content,
    xmlWriter (Daniel Schulman)</li>
  <li>XInclude bugfix, new APIs and update to the last version
    including the namespace change.</li>
  <li>XML Schemas improvements: include (Robert Stepanek), import and namespace
    handling, fixed the regression tests troubles, added examples based on Eric
    van der Vlist book, regexp fixes</li>
  <li>preliminary pattern support for streaming (needed
    for schemas constraints), added xmlTextReaderPreservePattern() to
    collect subdocument when streaming.</li>
  <li>various fixes in the structured error handling</li>
</ul>

<h3>2.6.2: Nov 4 2003</h3>
<ul>
  <li>XPath context unregistration fixes</li>
  <li>text node coalescing fixes (Mark Lilback)</li>
  <li>API to create a W3C Schemas from an existing document (Steve Ball)</li>
  <li>BeOS patches (Marcin 'Shard' Konicki)</li>
  <li>xmlStrVPrintf function added (Aleksey Sanin)</li>
  <li>compilation fixes (Mark Vakoc)</li>
  <li>stdin parsing fix (William Brack)</li>
  <li>a posteriori DTD validation fixes</li>
  <li>xmlReader bug fixes: Walker fixes, python bindings</li>
  <li>fixed xmlStopParser() to really stop the parser and errors</li>
  <li>always generate line numbers when using the new xmlReadxxx functions</li>
  <li>added XInclude support to the xmlReader interface</li>
  <li>implemented XML_PARSE_NONET parser option</li>
  <li>DocBook XSLT processing bug fixed</li>
  <li>HTML serialization for &lt;p&gt; elements (William Brack and me)</li>
  <li>XPointer failures in XInclude are now handled as resource errors</li>
  <li>fixed xmllint --html to use the HTML serializer on output (added --xmlout
    to implement the previous behaviour of saving it using
    the XML serializer)</li>
</ul>

<h3>2.6.1: Oct 28 2003</h3>
<ul>
  <li>Mostly bugfixes after the big 2.6.0 changes</li>
  <li>Unix compilation patches: libxml.m4 (Patrick Welche),
    warnings cleanup (William Brack)</li>
  <li>Windows compilation patches (Joachim Bauch, Stephane
    Bidoul, Igor Zlatkovic)</li>
  <li>xmlWriter bugfix (Alfred Mickautsch)</li>
  <li>chvalid.[ch]: couple of fixes from Stephane Bidoul</li>
  <li>context reset: error state reset, push parser reset (Graham Bennett)</li>
  <li>context reuse: generate errors if file is not readable</li>
  <li>defaulted attributes for element coming from internal
    entities (Stephane Bidoul)</li>
  <li>Python: tab and spaces mix (William Brack)</li>
  <li>Error handler could crash in DTD validation in 2.6.0</li>
  <li>xmlReader: do not use the document or element _private field</li>
  <li>testSAX.c: avoid a problem with some PIs (Massimo Morara)</li>
  <li>general bug fixes: mandatory encoding in text decl,
    serializing Document Fragment nodes, xmlSearchNs 2.6.0 problem (Kasimier
    Buchcik), XPath errors not reported, slow HTML parsing of large
    documents.</li>
</ul>

<h3>2.6.0: Oct 20 2003</h3>
<ul>
  <li>Major revision release: should be API and ABI compatible but got a lot of
    changes</li>
  <li>Increased the library modularity, far more options can be stripped out, a
    --with-minimum configuration will weigh around 160 KBytes</li>
  <li>Use per parser and per document dictionary, allocate names
    and small text nodes from the dictionary</li>
  <li>Switch to a SAX2 like parser, rewrote most of the XML
    parser core, provides namespace resolution and defaulted attributes,
    minimize memory allocations and copies, namespace checking and specific
    error handling, immutable buffers, make predefined entities static
    structures, etc...</li>
  <li>rewrote all the error handling in the library, all errors
    can be intercepted at a structured level, with
    precise information available.</li>
  <li>New simpler and more generic XML and HTML parser APIs,
    allowing to easily modify the parsing options and reuse parser context
    for multiple consecutive documents.</li>
  <li>Similar new APIs for the xmlReader, for options and reuse,
    provided new functions to access content as const strings, use them
    for Python bindings</li>
  <li>a lot of other smaller API improvements: xmlStrPrintf
    (Aleksey Sanin), Walker i.e. reader on a document tree based on Alfred
    Mickautsch code, make room in nodes for line numbers, reference counting
    and future PSVI extensions, generation of character ranges to be checked
    with faster algorithm (William), xmlParserMaxDepth (Crutcher
    Dunnavant), buffer access</li>
  <li>New xmlWriter API provided by Alfred Mickautsch</li>
  <li>Schemas: base64 support by Anthony Carrico</li>
  <li>Parser&lt;-&gt;HTTP integration fix, proper processing of
    the Mime-Type and charset information if available.</li>
  <li>Relax-NG: bug fixes including the one reported by Martijn
    Faassen and zeroOrMore, better error reporting.</li>
  <li>Python bindings (Stéphane Bidoul), never use stdout for error output</li>
  <li>Portability: all the headers have macros for export
    and calling convention definitions (Igor Zlatkovic), VMS update (Craig
    A. Berry), Windows: threads (Jesse Pelton), Borland compiler (Eric
    Zurcher, Igor), Mingw (Igor), typos (Mark Vakoc), beta version
    (Stephane Bidoul), warning cleanups on AIX and MIPS compilers (William
    Brack), BeOS (Marcin 'Shard' Konicki)</li>
  <li>Documentation fixes and README (William Brack), search
    fix (William), tutorial updates (John Fleck), namespace docs (Stefan
    Kost)</li>
  <li>Bug fixes: xmlCleanupParser (Dave Beckett),
    threading uninitialized mutexes, HTML doctype lowercase, SAX/IO
    (William), compression detection and restore (William), attribute
    declaration in DTDs (William), namespace on attribute in HTML output
    (William), input filename (Rob Richards), namespace DTD validation,
    xmlReplaceNode (Chris Ryland), I/O callbacks (Markus Keim), CDATA
    serialization (Shaun McCance), xmlReader (Peter Derr), high codepoint
    charref like &amp;#x10FFFF;, buffer access in push mode (Justin Fletcher),
    TLS threads on Windows (Jesse Pelton), XPath bug (William),
    xmlCleanupParser (Marc Liyanage), CDATA output (William), HTTP error
    handling.</li>
  <li>xmllint options: --dtdvalidfpi for Tobias Reif, --sax1
    for compat testing, --nodict for building without tree dictionary,
    --nocdata to replace CDATA by text, --nsclean to remove
    superfluous namespace declarations</li>
  <li>added xml2-config --libtool-libs option from Kevin P. Fleming</li>
  <li>a lot of profiling and tuning of the code, speedup
    patch for xmlSearchNs() by Luca Padovani. The xmlReader should do
    far less allocation and its speed should get closer to SAX. Chris
    Anderson worked on speeding and cleaning up repetitive checking code.</li>
  <li>cleanup of "make tests"</li>
  <li>libxml-2.0-uninstalled.pc from Malcolm Tredinnick</li>
  <li>deactivated the broken docBook SGML parser code and plugged
    the XML parser instead.</li>
</ul>

<h3>2.5.11: Sep 9 2003</h3>

<p>A bugfix only release:</p>
<ul>
  <li>risk of crash in Relax-NG</li>
  <li>risk of crash when using multithreaded programs</li>
</ul>

<h3>2.5.10: Aug 15 2003</h3>

<p>A bugfix only release:</p>
<ul>
  <li>Windows Makefiles (William Brack)</li>
  <li>UTF-16 support fixes (Mark Itzcovitz)</li>
  <li>Makefile and portability (William Brack): automake, Linux alpha,
    Mingw on Windows (Mikhail Grushinskiy)</li>
  <li>HTML parser (Oliver Stoeneberg)</li>
  <li>XInclude performance problem reported by Kevin Ruscoe</li>
  <li>XML parser performance problem reported by Grant Goodale</li>
  <li>xmlSAXParseDTD() bug fix from Malcolm Tredinnick</li>
  <li>and a couple of other cleanups</li>
</ul>

<h3>2.5.9: Aug 9 2003</h3>
<ul>
  <li>bugfixes: IPv6 portability, xmlHasNsProp (Markus Keim), Windows build
    (William Brack, Jesse Pelton, Igor), Schemas (Peter Sobisch), threading
    (Rob Richards), hexBinary type (), UTF-16 BOM (Dodji Seketeli),
    xmlReader, Relax-NG schemas compilation, namespace handling, EXSLT (Sean
    Griffin), HTML parsing problem (William Brack), DTD validation for mixed
    content + namespaces, HTML serialization, library initialization,
    progressive HTML parser</li>
  <li>better interfaces for Relax-NG error handling (Joachim Bauch, )</li>
  <li>adding xmlXIncludeProcessTree() for XInclud'ing in a subtree</li>
  <li>doc fixes and improvements (John Fleck)</li>
  <li>configure flag for --with-fexceptions when embedding in C++</li>
  <li>couple of new UTF-8 helper functions (William Brack)</li>
  <li>general encoding cleanup + ISO-8859-x without iconv (Peter Jacobi)</li>
  <li>xmlTextReader cleanup + enum for node types (Bjorn Reese)</li>
  <li>general compilation/warning cleanup Solaris/HP-UX/... (William
    Brack)</li>
</ul>

<h3>2.5.8: Jul 6 2003</h3>
<ul>
  <li>bugfixes: XPath, XInclude, file/URI mapping, UTF-16 save (Mark
    Itzcovitz), UTF-8 checking, URI saving, error printing (William Brack),
    PI related memleak, compilation without schemas or without xpath (Joerg
    Schmitz-Linneweber/Gary Pennington), xmlUnlinkNode problem with DTDs,
    rpm problem on , i86_64, removed a few compilation problems from 2.5.7,
    xmlIOParseDTD, and xmlSAXParseDTD (Malcolm Tredinnick)</li>
  <li>portability: DJGPP (MsDos), OpenVMS (Craig A. Berry)</li>
  <li>William Brack fixed multithreading lock problems</li>
  <li>IPv6 patch for FTP and HTTP accesses (Archana Shah/Wipro)</li>
  <li>Windows fixes (Igor Zlatkovic, Eric Zurcher), threading (Stéphane
    Bidoul)</li>
  <li>A few W3C Schemas Structure improvements</li>
  <li>W3C Schemas Datatype improvements (Charlie Bozeman)</li>
  <li>Python bindings for thread globals (Stéphane Bidoul), and method/class
    generator</li>
  <li>added --nonet option to xmllint</li>
  <li>documentation improvements (John Fleck)</li>
</ul>

<h3>2.5.7: Apr 25 2003</h3>
<ul>
  <li>Relax-NG: Compiling to regexp and streaming validation on top of the
    xmlReader interface, added to xmllint --stream</li>
  <li>xmlReader: Expand(), Next() and DOM access glue, bug fixes</li>
  <li>Support for large files: Relax-NG validated a 4.5GB instance</li>
  <li>Thread support is now configured in by default</li>
  <li>Fixes: update of the Trio code (Bjorn), WXS Date and Duration fixes
    (Charles Bozeman), DTD and namespaces (Brent Hendricks), HTML push parser
    and zero bytes handling, some missing Windows file path conversions,
    behaviour of the parser and validator in the presence of "out of memory"
    error conditions</li>
  <li>extended the API to be able to plug a garbage collecting memory
    allocator, added xmlMallocAtomic() and modified the allocations
    accordingly.</li>
  <li>Performance: removed excessive malloc() calls, speedup of the push and
    xmlReader interfaces, removed excessive thread locking</li>
  <li>Documentation: man page (John Fleck), xmlReader documentation</li>
  <li>Python: adding binding for xmlCatalogAddLocal (Brent M Hendricks)</li>
</ul>

<h3>2.5.6: Apr 1 2003</h3>
<ul>
  <li>Fixed W3C XML Schemas datatype, should be compliant now except for
    binHex and base64 which are not supported yet.</li>
  <li>bug fixes: non-ASCII IDs, HTML output, XInclude on large docs and
    XInclude entities handling, encoding detection on external subsets, XML
    Schemas bugs and memory leaks, HTML parser (James Bursa)</li>
  <li>portability: python/trio (Albert Chin), Sun compiler warnings</li>
  <li>documentation: added --relaxng option to xmllint man page (John)</li>
  <li>improved error reporting: xml:space, start/end tag mismatches, Relax
    NG errors</li>
</ul>

<h3>2.5.5: Mar 24 2003</h3>
<ul>
  <li>Lots of fixes on the Relax NG implementation. More testing including
    DocBook and TEI examples.</li>
  <li>Increased the support for W3C XML Schemas datatype</li>
  <li>Several bug fixes in the URI handling layer</li>
  <li>Bug fixes: HTML parser, xmlReader, DTD validation, XPath, encoding
    conversion, line counting in the parser.</li>
  <li>Added support for $XMLLINT_INDENT environment variable, FTP delete</li>
  <li>Fixed the RPM spec file name</li>
</ul>

<h3>2.5.4: Feb 20 2003</h3>
<ul>
  <li>Conformance testing and lots of fixes on Relax NG and XInclude
    implementation</li>
  <li>Implementation of XPointer element() scheme</li>
  <li>Bug fixes: XML parser, XInclude entities merge, validity checking on
    namespaces,
    <p>2 serialization bugs, node info generation problems, a DTD regexp
    generation problem.</p>
  </li>
  <li>Portability: windows updates and path canonicalization (Igor)</li>
  <li>A few typo fixes (Kjartan Maraas)</li>
  <li>Python bindings generator fixes (Stephane Bidoul)</li>
</ul>

<h3>2.5.3: Feb 10 2003</h3>
<ul>
  <li>RelaxNG and XML Schemas datatypes improvements, and added a first
    version of RelaxNG Python bindings</li>
  <li>Fixes: XLink (Sean Chittenden), XInclude (Sean Chittenden), API fix for
    serializing namespace nodes, encoding conversion bug, XHTML1
    serialization</li>
  <li>Portability fixes: Windows (Igor), AMD 64bits RPM spec file</li>
</ul>

<h3>2.5.2: Feb 5 2003</h3>
<ul>
  <li>First implementation of RelaxNG, added --relaxng flag to xmllint</li>
  <li>Schemas support now compiled in by default.</li>
  <li>Bug fixes: DTD validation, namespace checking, XInclude and entities,
    delegateURI in XML Catalogs, HTML parser, XML reader (Stéphane Bidoul),
    XPath parser and evaluation, UTF8ToUTF8 serialization, XML reader memory
    consumption, HTML parser, HTML serialization in the presence of
    namespaces</li>
  <li>added an HTML API to check elements and attributes.</li>
  <li>Documentation improvement, PDF for the tutorial (John Fleck), doc
    patches (Stefan Kost)</li>
  <li>Portability fixes: NetBSD (Julio Merino), Windows (Igor Zlatkovic)</li>
  <li>Added python bindings for XPointer, contextual error reporting
    (Stéphane Bidoul)</li>
  <li>URI/file escaping problems (Stefano Zacchiroli)</li>
</ul>

<h3>2.5.1: Jan 8 2003</h3>
<ul>
  <li>Fixes a memory leak and configuration/compilation problems in 2.5.0</li>
  <li>documentation updates (John)</li>
  <li>a couple of XmlTextReader fixes</li>
</ul>

<h3>2.5.0: Jan 6 2003</h3>
<ul>
  <li>New <a href="xmlreader.html">XmltextReader interface</a> based on C#
    API (with help of Stéphane Bidoul)</li>
  <li>Windows: more exports, including the new API (Igor)</li>
  <li>XInclude fallback fix</li>
  <li>Python: bindings for the new API, packaging (Stéphane Bidoul),
    drv_libxml2.py Python xml.sax driver (Stéphane Bidoul), fixes, speedup
    and iterators for Python-2.2 (Hannu Krosing)</li>
  <li>Tutorial fixes (John Fleck and Niraj Tolia), xmllint man update
    (John)</li>
  <li>Fix an XML parser bug raised by Vyacheslav Pindyura</li>
  <li>Fix for VMS serialization (Nigel Hall) and config (Craig A. Berry)</li>
  <li>Entities handling fixes</li>
  <li>new API to optionally track node creation and deletion (Lukas
    Schroeder)</li>
  <li>Added documentation for the XmltextReader interface and some <a
    href="guidelines.html">XML guidelines</a></li>
</ul>

<h3>2.4.30: Dec 12 2002</h3>
<ul>
  <li>2.4.29 broke the python bindings, rereleasing</li>
  <li>Improvement/fixes of the XML API generator, and a couple of minor code
    fixes.</li>
</ul>

<h3>2.4.29: Dec 11 2002</h3>
<ul>
  <li>Windows fixes (Igor): Windows CE port, pthread linking, python bindings
    (Stéphane Bidoul), Mingw (Magnus Henoch), and export list updates</li>
  <li>Fix for prev in python bindings (ERDI Gergo)</li>
  <li>Fix for entities handling (Marcus Clarke)</li>
  <li>Refactored the XML and HTML dumps to a single code path, fixed XHTML1
    dump</li>
  <li>Fix for URI parsing when handling URNs with fragment identifiers</li>
  <li>Fix for HTTP URL escaping problem</li>
  <li>added a TextXmlReader (C#) like API (work in progress)</li>
  <li>Rewrote the API in XML generation script, includes a C parser and saves
    more information needed for C# bindings</li>
</ul>

<h3>2.4.28: Nov 22 2002</h3>
<ul>
  <li>a couple of python binding fixes</li>
  <li>2 bug fixes in the XML push parser</li>
  <li>potential memory leak removed (Martin Stoilov)</li>
  <li>fix to the configure script for Unix (Dimitri Papadopoulos)</li>
  <li>added encoding support for XInclude parse="text"</li>
  <li>autodetection of XHTML1 and specific serialization rules added</li>
  <li>nasty threading bug fixed (William Brack)</li>
</ul>

<h3>2.4.27: Nov 17 2002</h3>
<ul>
  <li>fixes for the Python bindings</li>
  <li>a number of bug fixes: SGML catalogs, xmlParseBalancedChunkMemory(),
    HTML parser, Schemas (Charles Bozeman), document fragment support
    (Christian Glahn), xmlReconciliateNs (Brian Stafford), XPointer,
    xmlFreeNode(), xmlSAXParseMemory (Peter Jones), xmlGetNodePath (Petr
    Pajas), entities processing</li>
  <li>added grep to xmllint --shell</li>
  <li>VMS update patch from Craig A. Berry</li>
  <li>cleanup of the Windows build with support for more compilers (Igor),
    better thread support on Windows</li>
  <li>cleanup of Unix Makefiles and spec file</li>
  <li>Improvements to the documentation (John Fleck)</li>
</ul>

<h3>2.4.26: Oct 18 2002</h3>
<ul>
  <li>Patches for Windows CE port, improvements on Windows paths handling</li>
  <li>Fixes to the validation code (DTD and Schemas), xmlNodeGetPath(), HTML
    serialization, Namespace compliance, and a number of small problems</li>
</ul>

<h3>2.4.25: Sep 26 2002</h3>
<ul>
  <li>A number of bug fixes: XPath, validation, Python bindings, DOM and
    tree, xmlI/O, HTML</li>
  <li>Serious rewrite of XInclude</li>
  <li>Made XML Schemas regexp part of the default build and APIs, small fix
    and improvement of the regexp core</li>
  <li>Changed the validation code to reuse XML Schemas regexp APIs</li>
  <li>Better handling of Windows file paths, improvement of Makefiles (Igor,
    Daniel Gehriger, Mark Vakoc)</li>
  <li>Improved the python I/O bindings, the tests, added resolver and regexp
    APIs</li>
  <li>New logos from Marc Liyanage</li>
  <li>Tutorial improvements: John Fleck, Christopher Harris</li>
  <li>Makefile: Fixes for AMD x86_64 (Mandrake), DESTDIR (Christophe
    Merlet)</li>
  <li>removal of all stderr/perror use for error reporting</li>
  <li>Better error reporting: XPath and DTD validation</li>
  <li>update of the trio portability layer (Bjorn Reese)</li>
</ul>

<h3>2.4.24: Aug 22 2002</h3>
<ul>
  <li>XPath fixes (William), xf:escape-uri() (Wesley Terpstra)</li>
  <li>Python binding fixes: makefiles (William), generator, rpm build,
    x86-64 (fcrozat)</li>
  <li>HTML &lt;style&gt; and boolean attributes serializer fixes</li>
  <li>C14N improvements by Aleksey</li>
  <li>doc cleanups: Rick Jones</li>
  <li>Windows compiler makefile updates: Igor and Elizabeth Barham</li>
  <li>XInclude: implementation of fallback and xml:base fixup added</li>
</ul>

<h3>2.4.23: July 6 2002</h3>
<ul>
  <li>performance patches: Peter Jacobi</li>
  <li>c14n fixes, testsuite and performance: Aleksey Sanin</li>
  <li>added xmlDocFormatDump: Chema Celorio</li>
  <li>new tutorial: John Fleck</li>
  <li>new hash functions and performance: Sander Vesik, portability fix from
    Peter Jacobi</li>
  <li>a number of bug fixes: XPath (William Brack, Richard Jinks), XML and
    HTML parsers, ID lookup function</li>
  <li>removal of all remaining sprintf: Aleksey Sanin</li>
</ul>

<h3>2.4.22: May 27 2002</h3>
<ul>
  <li>a number of bug fixes: configure scripts, base handling, parser, memory
    usage, HTML parser, XPath, documentation (Christian Cornelssen),
    indentation, URI parsing</li>
  <li>Optimizations for XMLSec, fixing and making public some of the network
    protocol handlers (Aleksey)</li>
  <li>performance patch from Gary Pennington</li>
  <li>Charles Bozeman provided date and time support for XML Schemas
    datatypes</li>
</ul>

<h3>2.4.21: Apr 29 2002</h3>

<p>This release is both a bug fix release and also contains the early XML
Schemas <a href="http://www.w3.org/TR/xmlschema-1/">structures</a> and <a
href="http://www.w3.org/TR/xmlschema-2/">datatypes</a> code. Beware: all
interfaces are likely to change, there are huge holes, it is clearly a work
in progress, and don't even think of putting this code in a production
system; it's actually not compiled in by default. The real fixes are:</p>
<ul>
  <li>a couple of bugs or limitations introduced in 2.4.20</li>
  <li>patches for Borland C++ and MSC by Igor</li>
  <li>some fixes on XPath strings and conformance patches by Richard
    Jinks</li>
  <li>patch from Aleksey for the ExcC14N specification</li>
  <li>OSF/1 bug fix by Bjorn</li>
</ul>

<h3>2.4.20: Apr 15 2002</h3>
<ul>
  <li>bug fixes: file descriptor leak, XPath, HTML output, DTD validation</li>
  <li>XPath conformance testing by Richard Jinks</li>
  <li>Portability fixes: Solaris, MPE/iX, Windows, OSF/1, python bindings,
    libxml.m4</li>
</ul>

<h3>2.4.19: Mar 25 2002</h3>
<ul>
  <li>bug fixes: half a dozen XPath bugs, Validation, ISO-Latin to UTF8
    encoder</li>
  <li>portability fixes in the HTTP code</li>
  <li>memory allocation checks using valgrind, and profiling tests</li>
  <li>revamp of the Windows build and Makefiles</li>
</ul>

<h3>2.4.18: Mar 18 2002</h3>
<ul>
  <li>bug fixes: tree, SAX, canonicalization, validation, portability,
    XPath</li>
  <li>removed the --with-buffer option, it was becoming unmaintainable</li>
  <li>serious cleanup of the Python makefiles</li>
  <li>speedup patch to XPath, very effective for DocBook stylesheets</li>
  <li>Fixes for Windows build, cleanup of the documentation</li>
</ul>

<h3>2.4.17: Mar 8 2002</h3>
<ul>
  <li>a lot of bug fixes, including "namespace nodes have no parents in
    XPath"</li>
  <li>fixed/improved the Python wrappers, added more examples and more
    regression tests, XPath extension functions can now return node-sets</li>
  <li>added the XML Canonicalization support from Aleksey Sanin</li>
</ul>

<h3>2.4.16: Feb 20 2002</h3>
<ul>
  <li>a lot of bug fixes, most of them were triggered by the XML Testsuite
    from OASIS and W3C. Compliance has been significantly improved.</li>
  <li>a couple of portability fixes too.</li>
</ul>

<h3>2.4.15: Feb 11 2002</h3>
<ul>
  <li>Fixed the Makefiles, especially the python module ones</li>
  <li>A few bug fixes and cleanup</li>
  <li>Includes cleanup</li>
</ul>

<h3>2.4.14: Feb 8 2002</h3>
<ul>
  <li>Change of License to the <a
    href="http://www.opensource.org/licenses/mit-license.html">MIT
    License</a> basically for integration in XFree86 codebase, and removing
    confusion around the previous dual-licensing</li>
  <li>added Python bindings, beta software but should already be quite
    complete</li>
  <li>a large number of fixes and cleanups, especially for all tree
    manipulations</li>
  <li>cleanup of the headers, generation of a reference API definition in
    XML</li>
</ul>

<h3>2.4.13: Jan 14 2002</h3>
<ul>
  <li>update of the documentation: John Fleck and Charlie Bozeman</li>
  <li>cleanup of timing code from Justin Fletcher</li>
  <li>fixes for Windows and initial thread support on Win32: Igor and Serguei
    Narojnyi</li>
  <li>Cygwin patch from Robert Collins</li>
  <li>added xmlSetEntityReferenceFunc() for Keith Isdale work on xsldbg</li>
</ul>

<h3>2.4.12: Dec 7 2001</h3>
<ul>
  <li>a few bug fixes: thread (Gary Pennington), xmllint (Geert Kloosterman),
    XML parser (Robin Berjon), XPointer (Danny Jamshy), I/O cleanups
    (robert)</li>
  <li>Eric Lavigne contributed project files for MacOS</li>
  <li>some makefiles cleanups</li>
</ul>

<h3>2.4.11: Nov 26 2001</h3>
<ul>
  <li>fixed a couple of errors in the includes, fixed a few bugs, some code
    cleanups</li>
  <li>xmllint man pages improvement by Heiko Rupp</li>
  <li>updated VMS build instructions from John A Fotheringham</li>
  <li>Windows Makefiles updates from Igor</li>
</ul>

<h3>2.4.10: Nov 10 2001</h3>
<ul>
  <li>URI escaping fix (Joel Young)</li>
  <li>added xmlGetNodePath() (for paths or XPointers generation)</li>
  <li>Fixes namespace handling problems when using DTD and validation</li>
  <li>improvements on xmllint: Morus Walter patches for --format and
    --encode, Stefan Kost and Heiko Rupp improvements on the --shell</li>
  <li>fixes for xmlcatalog linking pointed out by Weiqi Gao</li>
  <li>fixes to the HTML parser</li>
</ul>

<h3>2.4.9: Nov 6 2001</h3>
<ul>
  <li>fixes more catalog bugs</li>
  <li>avoid a compilation problem, improve xmlGetLineNo()</li>
</ul>

<h3>2.4.8: Nov 4 2001</h3>
<ul>
  <li>fixed SGML catalogs broken in previous release, updated xmlcatalog
    tool</li>
  <li>fixed a compile error and some include troubles.</li>
</ul>

<h3>2.4.7: Oct 30 2001</h3>
<ul>
  <li>exported some debugging interfaces</li>
  <li>serious rewrite of the catalog code</li>
  <li>integrated Gary Pennington thread safety patch, added configure option
    and regression tests</li>
  <li>removed an HTML parser bug</li>
  <li>fixed a couple of potentially serious validation bugs</li>
  <li>integrated the SGML DocBook support in xmllint</li>
  <li>changed the nanoftp anonymous login passwd</li>
  <li>some I/O cleanup and a couple of interfaces for Perl wrapper</li>
  <li>general bug fixes</li>
  <li>updated xmllint man page by John Fleck</li>
  <li>some VMS and Windows updates</li>
</ul>

<h3>2.4.6: Oct 10 2001</h3>
<ul>
  <li>added updated man pages by John Fleck</li>
  <li>portability and configure fixes</li>
  <li>an infinite loop on the HTML parser was removed (William)</li>
  <li>Windows makefile patches from Igor</li>
  <li>fixed half a dozen bugs reported for libxml or libxslt</li>
  <li>updated xmlcatalog to be able to modify SGML super catalogs</li>
</ul>

<h3>2.4.5: Sep 14 2001</h3>
<ul>
  <li>Remove a few annoying bugs in 2.4.4</li>
  <li>forces the HTML serializer to output decimal charrefs since some
    versions of Netscape can't handle hexadecimal ones</li>
</ul>

<h3>1.8.16: Sep 14 2001</h3>
<ul>
  <li>maintenance release of the old libxml1 branch, couple of bug and
    portability fixes</li>
</ul>

<h3>2.4.4: Sep 12 2001</h3>
<ul>
  <li>added --convert to xmlcatalog, bug fixes and cleanups of XML
    Catalog</li>
  <li>a few bug fixes and some portability changes</li>
  <li>some documentation cleanups</li>
</ul>

<h3>2.4.3: Aug 23 2001</h3>
<ul>
  <li>XML Catalog support, see the doc</li>
  <li>New NaN/Infinity floating point code</li>
  <li>A few bug fixes</li>
</ul>

<h3>2.4.2: Aug 15 2001</h3>
<ul>
  <li>adds xmlLineNumbersDefault() to control line number generation</li>
  <li>lots of bug fixes</li>
  <li>the Microsoft MSC project files should now be up to date</li>
  <li>inheritance of namespaces from DTD defaulted attributes</li>
  <li>fixes a serious potential security bug</li>
  <li>added a --format option to xmllint</li>
</ul>

<h3>2.4.1: July 24 2001</h3>
<ul>
  <li>possibility to keep line numbers in the tree</li>
  <li>some computation NaN fixes</li>
  <li>extension of the XPath API</li>
  <li>cleanup for alpha and ia64 targets</li>
  <li>patch to allow saving through HTTP PUT or POST</li>
</ul>

<h3>2.4.0: July 10 2001</h3>
<ul>
  <li>Fixed a few bugs in XPath, validation, and tree handling.</li>
  <li>Fixed XML Base implementation, added a couple of examples to the
    regression tests</li>
  <li>A bit of cleanup</li>
</ul>

<h3>2.3.14: July 5 2001</h3>
<ul>
  <li>fixed some entities problems and reduced memory requirements when
    substituting them</li>
  <li>lots of improvements in the XPath queries interpreter, which can be
    substantially faster</li>
  <li>Makefiles and configure cleanups</li>
  <li>Fixes to XPath variable eval, and compare on empty node set</li>
  <li>HTML tag closing bug fixed</li>
  <li>Fixed a URI reference computation problem when validating</li>
</ul>

<h3>2.3.13: June 28 2001</h3>
<ul>
  <li>2.3.12 configure.in was broken as well as the push mode XML parser</li>
  <li>a few more fixes for compilation on Windows MSC by Yon Derek</li>
</ul>

<h3>1.8.14: June 28 2001</h3>
<ul>
  <li>Zbigniew Chyla gave a patch to use the old XML parser in push mode</li>
  <li>Small Makefile fix</li>
</ul>

<h3>2.3.12: June 26 2001</h3>
<ul>
  <li>lots of cleanup</li>
  <li>a couple of validation fixes</li>
  <li>fixed line number counting</li>
  <li>fixed serious problems in the XInclude processing</li>
  <li>added support for UTF8 BOM at beginning of entities</li>
  <li>fixed a strange gcc optimizer bug in xpath handling of float, gcc-3.0
    miscompile uri.c (William), Thomas Leitner provided a fix for the
    optimizer on Tru64</li>
  <li>incorporated Yon Derek and Igor Zlatkovic fixes and improvements for
    compilation on Windows MSC</li>
  <li>update of libxml-doc.el (Felix Natter)</li>
  <li>fixed 2 bugs in URI normalization code</li>
</ul>

<h3>2.3.11: June 17 2001</h3>
<ul>
  <li>updates to trio, Makefiles and configure should fix some portability
    problems (alpha)</li>
  <li>fixed some HTML serialization problems (pre, script, and block/inline
    handling), added encoding aware APIs, cleanup of this code</li>
  <li>added xmlHasNsProp()</li>
  <li>implemented a specific PI for encoding support in the DocBook SGML
    parser</li>
  <li>some XPath fixes (-Infinity, / as a function parameter and namespaces
    node selection)</li>
  <li>fixed a performance problem and an error in the validation code</li>
  <li>fixed XInclude routine to implement the recursive behaviour</li>
  <li>fixed xmlFreeNode problem when libxml is included statically twice</li>
  <li>added --version to xmllint for bug reports</li>
</ul>

Daniel Veillard2e4f1882001-06-01 10:11:57 +00001995<h3>2.3.10: June 1 2001</h3>
1996<ul>
1997 <li>fixed the SGML catalog support</li>
Daniel Veillardfabafd52006-06-08 08:16:33 +00001998 <li>a number of reported bugs got fixed, in XPath, iconv
1999 detection,XIncludeprocessing</li>
Daniel Veillard2e4f1882001-06-01 10:11:57 +00002000 <li>XPath string function should now handle unicode correctly</li>
2001</ul>
2002
<h3>2.3.9: May 19 2001</h3>

<p>Lots of bugfixes, and added basic SGML catalog support:</p>
<ul>
  <li>HTML push bugfix #54891 and another patch from Jonas Borgström</li>
  <li>some serious speed optimization again</li>
  <li>some documentation cleanups</li>
  <li>trying to get better linking on Solaris (-R)</li>
  <li>XPath API cleanup from Thomas Broyer</li>
  <li>Validation bug fixed #54631, added a patch from Gary Pennington, fixed
    xmlValidGetValidElements()</li>
  <li>Added an INSTALL file</li>
  <li>Attribute removal added to API: #54433</li>
  <li>added basic support for SGML catalogs</li>
  <li>fixed xmlKeepBlanksDefault(0) API</li>
  <li>bugfix in xmlNodeGetLang()</li>
  <li>fixed a small configure portability problem</li>
  <li>fixed an inversion of SYSTEM and PUBLIC identifier in HTML document</li>
</ul>

<h3>1.8.13: May 14 2001</h3>
<ul>
  <li>bugfix release of the old libxml1 branch used by Gnome</li>
</ul>

<h3>2.3.8: May 3 2001</h3>
<ul>
  <li>Integrated an SGML DocBook parser for the Gnome project</li>
  <li>Fixed a few things in the HTML parser</li>
  <li>Fixed some XPath bugs raised by XSLT use, tried to fix the floating
    point portability issue</li>
  <li>Speed improvement (8M/s for SAX, 3M/s for DOM, 1.5M/s for
    DOM+validation using the XML REC as input and a 700MHz celeron).</li>
  <li>incorporated more Windows cleanup</li>
  <li>added xmlSaveFormatFile()</li>
  <li>fixed problems in copying nodes with entities references (gdome)</li>
  <li>removed some troubles surrounding the new validation module</li>
</ul>

<h3>2.3.7: April 22 2001</h3>
<ul>
  <li>lots of small bug fixes, corrected XPointer</li>
  <li>Non deterministic content model validation support</li>
  <li>added xmlDocCopyNode for gdome2</li>
  <li>revamped the way the HTML parser handles end of tags</li>
  <li>XPath: corrections of namespaces support and number formatting</li>
  <li>Windows: Igor Zlatkovic patches for MSC compilation</li>
  <li>HTML output fixes from P C Chow and William M. Brack</li>
  <li>Noticeably improved validation speed for DocBook</li>
  <li>fixed a big bug with ID declared in external parsed entities</li>
  <li>portability fixes, update of Trio from Bjorn Reese</li>
</ul>

<h3>2.3.6: April 8 2001</h3>
<ul>
  <li>Code cleanup using extreme gcc compiler warning options, found and
    cleared half a dozen potential problems</li>
  <li>the Eazel team found an XML parser bug</li>
  <li>cleaned up the use of some of the string formatting functions, used
    the trio library code to provide the ones needed when the platform is
    missing them</li>
  <li>xpath: removed a memory leak and fixed the predicate evaluation
    problem, extended the testsuite and cleaned up the result. XPointer
    seems broken...</li>
</ul>

<h3>2.3.5: Mar 23 2001</h3>
<ul>
  <li>Biggest change is separate parsing and evaluation of XPath
    expressions, there are some new APIs for this too</li>
  <li>included a number of bug fixes (XML push parser, 51876, notations,
    52299)</li>
  <li>Fixed some portability issues</li>
</ul>

<h3>2.3.4: Mar 10 2001</h3>
<ul>
  <li>Fixed bugs #51860 and #51861</li>
  <li>Added a global variable xmlDefaultBufferSize to allow the default
    buffer size to be application tunable.</li>
  <li>Some cleanup in the validation code, still a bug left and this part
    should probably be rewritten to support ambiguous content model :-\</li>
  <li>Fixed a couple of serious bugs introduced or raised by changes in the
    2.3.3 parser</li>
  <li>Fixed another bug in xmlNodeGetContent()</li>
  <li>Bjorn fixed XPath node collection and Number formatting</li>
  <li>Fixed a loop reported in the HTML parsing</li>
  <li>blank spaces are reported even if the Dtd content model proves that
    they are formatting spaces, this is for XML conformance</li>
</ul>

<h3>2.3.3: Mar 1 2001</h3>
<ul>
  <li>small change in XPath for XSLT</li>
  <li>documentation cleanups</li>
  <li>fix in validation by Gary Pennington</li>
  <li>serious parsing performance improvements</li>
</ul>

<h3>2.3.2: Feb 24 2001</h3>
<ul>
  <li>chasing XPath bugs, found a bunch, completed some TODO</li>
  <li>fixed a Dtd parsing bug</li>
  <li>fixed a bug in xmlNodeGetContent</li>
  <li>ID/IDREF support partly rewritten by Gary Pennington</li>
</ul>

<h3>2.3.1: Feb 15 2001</h3>
<ul>
  <li>some XPath and HTML bug fixes for XSLT</li>
  <li>small extension of the hash table interfaces for DOM gdome2
    implementation</li>
  <li>A few bug fixes</li>
</ul>

<h3>2.3.0: Feb 8 2001 (2.2.12 was on 25 Jan but I didn't keep track)</h3>
<ul>
  <li>Lots of XPath bug fixes</li>
  <li>Add a mode with Dtd lookup but without validation error reporting for
    XSLT</li>
  <li>Add support for text node without escaping (XSLT)</li>
  <li>bug fixes for xmlCheckFilename</li>
  <li>validation code bug fixes from Gary Pennington</li>
  <li>Patch from Paul D. Smith correcting URI path normalization</li>
  <li>Patch to allow simultaneous install of libxml-devel and
    libxml2-devel</li>
  <li>the example Makefile is now fixed</li>
  <li>added HTML to the RPM packages</li>
  <li>tree copying bugfixes</li>
  <li>updates to Windows makefiles</li>
  <li>optimization patch from Bjorn Reese</li>
</ul>

<h3>2.2.11: Jan 4 2001</h3>
<ul>
  <li>bunch of bug fixes (memory I/O, xpath, ftp/http, ...)</li>
  <li>added htmlHandleOmittedElem()</li>
  <li>Applied Bjorn Reese's IPV6 first patch</li>
  <li>Applied Paul D. Smith patches for validation of XInclude results</li>
  <li>added XPointer xmlns() new scheme support</li>
</ul>

<h3>2.2.10: Nov 25 2000</h3>
<ul>
  <li>Fix the Windows problems of 2.2.8</li>
  <li>integrate OpenVMS patches</li>
  <li>better handling of some nasty HTML input</li>
  <li>Improved the XPointer implementation</li>
  <li>integrate a number of provided patches</li>
</ul>

<h3>2.2.9: Nov 25 2000</h3>
<ul>
  <li>erroneous release :-(</li>
</ul>

<h3>2.2.8: Nov 13 2000</h3>
<ul>
  <li>First version of <a
    href="http://www.w3.org/TR/xinclude">XInclude</a> support</li>
  <li>Patch in conditional section handling</li>
  <li>updated MS compiler project</li>
  <li>fixed some XPath problems</li>
  <li>added a URI escaping function</li>
  <li>some other bug fixes</li>
</ul>

<h3>2.2.7: Oct 31 2000</h3>
<ul>
  <li>added message redirection</li>
  <li>XPath improvements (thanks TOM !)</li>
  <li>xmlIOParseDTD() added</li>
  <li>various small fixes in the HTML, URI, HTTP and XPointer support</li>
  <li>some cleanup of the Makefile, autoconf and the distribution content</li>
</ul>

<h3>2.2.6: Oct 25 2000</h3>
<ul>
  <li>Added a hash table module, migrated a number of internal structures
    to those</li>
  <li>Fixed a posteriori validation problems</li>
  <li>HTTP module cleanups</li>
  <li>HTML parser improvements (tag errors, script/style handling, attribute
    normalization)</li>
  <li>coalescing of adjacent text nodes</li>
  <li>couple of XPath bug fixes, exported the internal API</li>
</ul>

<h3>2.2.5: Oct 15 2000</h3>
<ul>
  <li>XPointer implementation and testsuite</li>
  <li>Lots of XPath fixes, added variable and function registration, more
    tests</li>
  <li>Portability fixes, lots of enhancements toward an easy Windows build
    and release</li>
  <li>Late validation fixes</li>
  <li>Integrated a lot of contributed patches</li>
  <li>added memory management docs</li>
  <li>a performance problem when using large buffers seems fixed</li>
</ul>

<h3>2.2.4: Oct 1 2000</h3>
<ul>
  <li>main XPath problem fixed</li>
  <li>Integrated portability patches for Windows</li>
  <li>Serious bug fixes on the URI and HTML code</li>
</ul>

<h3>2.2.3: Sep 17 2000</h3>
<ul>
  <li>bug fixes</li>
  <li>cleanup of entity handling code</li>
  <li>overall review of all loops in the parsers, all sprintf usage has been
    checked too</li>
  <li>Far better handling of large DTDs. Validating against the DocBook XML
    DTD works smoothly now.</li>
</ul>

<h3>1.8.10: Sep 6 2000</h3>
<ul>
  <li>bug fix release for some Gnome projects</li>
</ul>

<h3>2.2.2: August 12 2000</h3>
<ul>
  <li>mostly bug fixes</li>
  <li>started adding routines to access xml parser context options</li>
</ul>

<h3>2.2.1: July 21 2000</h3>
<ul>
  <li>a purely bug fix release</li>
  <li>fixed an encoding support problem when parsing from a memory block</li>
  <li>fixed a DOCTYPE parsing problem</li>
  <li>removed a bug in the function allowing to override the memory
    allocation routines</li>
</ul>

<h3>2.2.0: July 14 2000</h3>
<ul>
  <li>applied a lot of portability fixes</li>
  <li>better encoding support/cleanup and saving (content is now always
    encoded in UTF-8)</li>
  <li>the HTML parser now correctly handles encodings</li>
  <li>added xmlHasProp()</li>
  <li>fixed a serious problem with &amp;#38;</li>
  <li>propagated the fix to FTP client</li>
  <li>cleanup, bugfixes, etc ...</li>
  <li>Added a page about <a href="encoding.html">libxml Internationalization
    support</a></li>
</ul>

<h3>1.8.9: July 9 2000</h3>
<ul>
  <li>fixed the spec, the RPMs should be better</li>
  <li>fixed a serious bug in the FTP implementation, released 1.8.9 to solve
    rpmfind users problem</li>
</ul>

<h3>2.1.1: July 1 2000</h3>
<ul>
  <li>fixes a couple of bugs in the 2.1.0 packaging</li>
  <li>improvements on the HTML parser</li>
</ul>

<h3>2.1.0 and 1.8.8: June 29 2000</h3>
<ul>
  <li>1.8.8 is mostly a convenience package for upgrading to libxml2
    according to the <a href="upgrade.html">new instructions</a>. It fixes a
    nasty problem with &amp;#38; charref parsing</li>
  <li>2.1.0 also eases the upgrade from libxml v1 to the recent version. It
    also contains numerous fixes and enhancements:
    <ul>
      <li>added xmlStopParser() to stop parsing</li>
      <li>greatly improved parsing speed when there are large CDATA
        blocks</li>
      <li>includes XPath patches provided by Picdar Technology</li>
      <li>tried to fix as many DTD validation and namespace related problems
        as possible</li>
      <li>output to a given encoding has been added/tested</li>
      <li>lots of various fixes</li>
    </ul>
  </li>
</ul>

<h3>2.0.0: Apr 12 2000</h3>
<ul>
  <li>First public release of libxml2. If you are using libxml, it's a good
    idea to check the 1.x to 2.x upgrade instructions. NOTE: while initially
    scheduled for Apr 3 the release occurred only on Apr 12 due to massive
    workload.</li>
  <li>The includes are now located under $prefix/include/libxml (instead of
    $prefix/include/gnome-xml), they are also referenced by
    <pre>#include &lt;libxml/xxx.h&gt;</pre>
    <p>instead of</p>
    <pre>#include "xxx.h"</pre>
  </li>
  <li>a new URI module for parsing URIs and following strictly RFC 2396</li>
  <li>the memory allocation routines used by libxml can now be overloaded
    dynamically by using xmlMemSetup()</li>
  <li>The previously CVS only tool tester has been renamed
    <strong>xmllint</strong> and is now installed as part of the libxml2
    package</li>
  <li>The I/O interface has been revamped. There are now ways to plug in
    specific I/O modules, either at the URI scheme detection level using
    xmlRegisterInputCallbacks() or by passing I/O functions when creating a
    parser context using xmlCreateIOParserCtxt()</li>
  <li>there is a C preprocessor macro LIBXML_VERSION providing the version
    number of the libxml module in use</li>
  <li>a number of optional features of libxml can now be excluded at
    configure time (FTP/HTTP/HTML/XPath/Debug)</li>
</ul>

<h3>2.0.0beta: Mar 14 2000</h3>
<ul>
  <li>This is a first Beta release of libxml version 2</li>
  <li>It's available only from <a
    href="ftp://xmlsoft.org/libxml2/">xmlsoft.org FTP</a>, it's packaged as
    libxml2-2.0.0beta and available as tar and RPMs</li>
  <li>This version is now the head in the Gnome CVS base, the old one is
    available under the tag LIB_XML_1_X</li>
  <li>This includes a very large set of changes. From a programmatic point
    of view applications should not have to be modified too much, check the
    <a href="upgrade.html">upgrade page</a></li>
  <li>Some interfaces may change (especially a bit about encoding).</li>
  <li>the updates include:
    <ul>
      <li>fix I18N support. ISO-Latin-x/UTF-8/UTF-16 (nearly) seem correctly
        handled now</li>
      <li>Better handling of entities, especially well-formedness checking
        and proper PEref extensions in external subsets</li>
      <li>DTD conditional sections</li>
      <li>Validation now correctly handles entities content</li>
      <li><a
        href="http://rpmfind.net/tools/gdome/messages/0039.html">change
        structures to accommodate DOM</a></li>
    </ul>
  </li>
  <li>Serious progress was made toward compliance, <a
    href="conf/result.html">here are the results of the test</a> against the
    OASIS testsuite (except the Japanese tests since I don't support that
    encoding yet). This URL is rebuilt every couple of hours using the CVS
    head version.</li>
</ul>

<h3>1.8.7: Mar 6 2000</h3>
<ul>
  <li>This is a bug fix release:</li>
  <li>It is possible to disable the ignorable blanks heuristic used by
    libxml-1.x, a new function xmlKeepBlanksDefault(0) will allow this. Note
    that for adherence to the XML spec, this behaviour will be disabled by
    default in 2.x. The same function will allow keeping compatibility for
    old code.</li>
  <li>Blanks in &lt;a&gt; &lt;/a&gt; constructs are not ignored anymore,
    avoiding heuristics is really the Right Way :-\</li>
  <li>The unchecked use of snprintf which was breaking libxml-1.8.6
    compilation on some platforms has been fixed</li>
  <li>nanoftp.c nanohttp.c: Fixed '#' and '?' stripping when processing
    URIs</li>
</ul>

<h3>1.8.6: Jan 31 2000</h3>
<ul>
  <li>added a nanoFTP transport module, debugged until the new version of <a
    href="http://rpmfind.net/linux/rpm2html/rpmfind.html">rpmfind</a> can
    use it without troubles</li>
</ul>

<h3>1.8.5: Jan 21 2000</h3>
<ul>
  <li>adding APIs to parse a well balanced chunk of XML (production <a
    href="http://www.w3.org/TR/REC-xml#NT-content">[43] content</a> of the
    XML spec)</li>
  <li>fixed a hideous bug in xmlGetProp pointed out by
    Rune.Djurhuus@fast.no</li>
  <li>Jody Goldberg &lt;jgoldberg@home.com&gt; provided another patch trying
    to solve the zlib checks problems</li>
  <li>The current state in gnome CVS base is expected to ship as 1.8.5 with
    gnumeric soon</li>
</ul>

<h3>1.8.4: Jan 13 2000</h3>
<ul>
  <li>bug fixes, reintroduced xmlNewGlobalNs(), fixed xmlNewNs()</li>
  <li>all exit() calls should have been removed from libxml</li>
  <li>fixed a problem with INCLUDE_WINSOCK on WIN32 platform</li>
  <li>added newDocFragment()</li>
</ul>

<h3>1.8.3: Jan 5 2000</h3>
<ul>
  <li>a Push interface for the XML and HTML parsers</li>
  <li>a shell-like interface to the document tree (try tester --shell :-)</li>
  <li>lots of bug fixes and improvements added over XMas holidays</li>
  <li>fixed the DTD parsing code to work with the xhtml DTD</li>
  <li>added xmlRemoveProp(), xmlRemoveID() and xmlRemoveRef()</li>
  <li>Fixed bugs in xmlNewNs()</li>
  <li>External entity loading code has been revamped, now it uses
    xmlLoadExternalEntity(), some fixes on entities processing were
    added</li>
  <li>cleaned up WIN32 includes of socket stuff</li>
</ul>

<h3>1.8.2: Dec 21 1999</h3>
<ul>
  <li>I got another problem with includes and C++, I hope this issue is
    fixed for good this time</li>
  <li>Added a few tree modification functions: xmlReplaceNode,
    xmlAddPrevSibling, xmlAddNextSibling, xmlNodeSetName and
    xmlDocSetRootElement</li>
  <li>Tried to improve the HTML output with help from <a
    href="mailto:clahey@umich.edu">Chris Lahey</a></li>
</ul>

<h3>1.8.1: Dec 18 1999</h3>
<ul>
  <li>various patches to avoid troubles when using libxml with C++
    compilers: the "namespace" keyword and C escaping in include files</li>
  <li>a problem in one of the core macros IS_CHAR was corrected</li>
  <li>fixed a bug introduced in 1.8.0 breaking default namespace processing,
    and more specifically the Dia application</li>
  <li>fixed a posteriori validation (validation after parsing, or by using a
    Dtd not specified in the original document)</li>
  <li>fixed a bug in</li>
</ul>

<h3>1.8.0: Dec 12 1999</h3>
<ul>
  <li>cleanup, especially memory wise</li>
  <li>the parser should be more reliable, especially the HTML one, it should
    not crash, whatever the input !</li>
  <li>Integrated various patches, especially a speedup improvement for large
    datasets from <a href="mailto:cnygard@bellatlantic.net">Carl Nygard</a>,
    configure with --with-buffers to enable them.</li>
  <li>attribute normalization, oops should have been added long ago !</li>
  <li>attributes defaulted from DTDs should be available, xmlSetProp() now
    does entities escaping by default.</li>
</ul>

<h3>1.7.4: Oct 25 1999</h3>
<ul>
  <li>Lots of HTML improvements</li>
  <li>Fixed some errors when saving both XML and HTML</li>
  <li>More examples, the regression tests should now look clean</li>
  <li>Fixed a bug with contiguous charref</li>
</ul>

<h3>1.7.3: Sep 29 1999</h3>
<ul>
  <li>portability problems fixed</li>
  <li>snprintf was used unconditionally, leading to link problems on systems
    where it's not available, fixed</li>
</ul>

<h3>1.7.1: Sep 24 1999</h3>
<ul>
  <li>The basic type for strings manipulated by libxml has been renamed in
    1.7.1 from <strong>CHAR</strong> to <strong>xmlChar</strong>. The reason
    is that CHAR was conflicting with a predefined type on Windows. However,
    on non-WIN32 environments, compatibility is provided by way of a
    <strong>#define</strong>.</li>
  <li>Changed another error: the use of a structure field called errno,
    leading to troubles on platforms where it's a macro</li>
</ul>

<h3>1.7.0: Sep 23 1999</h3>
<ul>
  <li>Added the ability to fetch remote DTDs or parsed entities, see the <a
    href="html/libxml-nanohttp.html">nanohttp</a> module.</li>
  <li>Added an errno to report errors by another means than a simple
    printf-like callback</li>
  <li>Finished ID/IDREF support and checking when validating</li>
  <li>Serious memory leaks fixed (there is now a <a
    href="html/libxml-xmlmemory.html">memory wrapper</a> module)</li>
  <li>Improvement of the <a
    href="http://www.w3.org/TR/xpath">XPath</a> implementation</li>
  <li>Added an HTML parser front-end</li>
</ul>

<h2><a name="XML">XML</a></h2>

<p><a href="http://www.w3.org/TR/REC-xml">XML is a standard</a> for
markup-based structured documents. Here is <a name="example">an example XML
document</a>:</p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;EXAMPLE prop1="gnome is great" prop2="&amp;amp; linux too"&gt;
  &lt;head&gt;
    &lt;title&gt;Welcome to Gnome&lt;/title&gt;
  &lt;/head&gt;
  &lt;chapter&gt;
    &lt;title&gt;The Linux adventure&lt;/title&gt;
    &lt;p&gt;bla bla bla ...&lt;/p&gt;
    &lt;image href="linus.gif"/&gt;
    &lt;p&gt;...&lt;/p&gt;
  &lt;/chapter&gt;
&lt;/EXAMPLE&gt;</pre>

<p>The first line specifies that it is an XML document and can carry useful
information about its encoding. The rest of the document is a text format
whose structure is specified by tags between brackets. <strong>Each tag
opened has to be closed</strong>. XML is pedantic about this. However, if a
tag is empty (no content), a single tag can serve as both the opening and
closing tag if it ends with <code>/&gt;</code> rather than with
<code>&gt;</code>. Note that, for example, the image tag has no content (just
an attribute) and is closed by ending the tag with <code>/&gt;</code>.</p>

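<p>As a rough illustration of the closing rule described above, the sketch
below checks that every opened tag is closed in the right order, treating a
tag ending in <code>/&gt;</code> as both opening and closing.
<code>tags_balanced()</code>, <code>MAX_DEPTH</code> and
<code>MAX_NAME</code> are hypothetical names chosen for this example and are
not part of libxml2. It is a toy scanner under stated assumptions (no
comments, no CDATA sections, no <code>&gt;</code> inside attribute values),
not a description of how the libxml2 parser works.</p>

```c
#include <assert.h>
#include <string.h>

#define MAX_DEPTH 64   /* deepest nesting this toy checker accepts */
#define MAX_NAME  64   /* longest tag name, including the terminator */

/*
 * Hypothetical helper, not part of libxml2: returns 1 when every opened
 * tag is closed in the right order, 0 otherwise. It ignores comments,
 * CDATA sections and '>' characters inside attribute values.
 */
int tags_balanced(const char *xml)
{
    char stack[MAX_DEPTH][MAX_NAME];   /* names of the currently open tags */
    int depth = 0;
    const char *p = xml;

    assert(xml != NULL);

    while ((p = strchr(p, '<')) != NULL) {
        const char *end = strchr(p, '>');
        int closing;
        size_t len;

        if (end == NULL)
            return 0;                  /* '<' without a matching '>' */
        p++;                           /* step past '<' */

        if (*p == '?' || *p == '!') {  /* XML declaration, PI or DOCTYPE */
            p = end + 1;
            continue;
        }

        closing = (*p == '/');
        if (closing)
            p++;

        len = strcspn(p, " \t\r\n/>"); /* name ends at a blank, '/' or '>' */
        if (len == 0 || len >= MAX_NAME)
            return 0;

        if (closing) {
            /* a closing tag must match the most recently opened one */
            if (depth == 0 || strlen(stack[depth - 1]) != len ||
                strncmp(stack[depth - 1], p, len) != 0)
                return 0;
            depth--;
        } else if (end[-1] != '/') {   /* "<name ... />" needs no closing tag */
            if (depth == MAX_DEPTH)
                return 0;
            memcpy(stack[depth], p, len);
            stack[depth][len] = '\0';
            depth++;
        }
        p = end + 1;
    }
    return depth == 0;                 /* nothing may remain open */
}
```

<p>Under these assumptions the function returns 1 for the example document
above and 0 once the closing <code>&lt;/chapter&gt;</code> tag is removed
from it.</p>
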
<p>XML can be applied successfully to a wide range of tasks, ranging from
long term structured document maintenance (where it follows the steps of
SGML) to simple data encoding mechanisms like configuration file formatting
(glade), spreadsheets (gnumeric), or even shorter lived documents such as
WebDAV where it is used to encode remote calls between a client and a
server.</p>

<h2><a name="XSLT">XSLT</a></h2>

<p>Check <a href="http://xmlsoft.org/XSLT">the separate libxslt page</a></p>

<p><a href="http://www.w3.org/TR/xslt">XSL Transformations</a> is a
language for transforming XML documents into other XML documents (or
HTML/textual output).</p>

<p>A separate library called libxslt is available implementing XSLT-1.0 for
libxml2. This "libxslt" module can also be found in the Gnome CVS base.</p>

<p>You can check the progress on the libxslt <a
href="http://xmlsoft.org/XSLT/ChangeLog.html">Changelog</a>.</p>

<h2><a name="Python">Python and bindings</a></h2>

<p>There are a number of language bindings and wrappers available for
libxml2, the list below is not exhaustive. Please contact the <a
href="http://mail.gnome.org/mailman/listinfo/xml-bindings">xml-bindings@gnome.org</a>
(<a href="http://mail.gnome.org/archives/xml-bindings/">archives</a>) in
order to get updates to this list or to discuss the specific topic of libxml2
or libxslt wrappers or bindings:</p>
<ul>
  <li><a href="http://libxmlplusplus.sourceforge.net/">Libxml++</a> seems to
    be the most up-to-date C++ binding for libxml2; check the <a
    href="http://libxmlplusplus.sourceforge.net/reference/html/hierarchy.html">documentation</a>
    and the <a
    href="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/libxmlplusplus/libxml%2b%2b/examples/">examples</a>.</li>
  <li>There is another <a href="http://libgdome-cpp.berlios.de/">C++ wrapper
    based on the gdome2 bindings</a> maintained by Tobias Peters.</li>
  <li>There is a third C++ wrapper by Peter Jones &lt;pjones@pmade.org&gt;.
      <p>Website: <a
      href="http://pmade.org/pjones/software/xmlwrapp/">http://pmade.org/pjones/software/xmlwrapp/</a></p>
  </li>
  <li><a
    href="http://mail.gnome.org/archives/xml/2001-March/msg00014.html">Matt
    Sergeant</a> developed <a
    href="http://axkit.org/download/">XML::LibXSLT</a>, a Perl wrapper for
    libxml2/libxslt, as part of the <a href="http://axkit.com/">AxKit XML
    application server</a>.</li>
  <li>If you're interested in scripting XML processing, have a look at <a
    href="http://xsh.sourceforge.net/">XSH</a>, an XML editing shell based on
    the libxml2 Perl bindings.</li>
  <li><a href="mailto:dkuhlman@cutter.rexx.com">Dave Kuhlman</a> provides an
    earlier version of the libxml/libxslt <a
    href="http://www.rexx.com/~dkuhlman">wrappers for Python</a>.</li>
  <li>Gopal.V and Peter Minten develop <a
    href="http://savannah.gnu.org/projects/libxmlsharp">libxml#</a>, a set of
    C# libxml2 bindings.</li>
  <li>Petr Kozelka provides <a
    href="http://sourceforge.net/projects/libxml2-pas">Pascal units to glue
    libxml2</a> with Kylix, Delphi and other Pascal compilers.</li>
  <li>Uwe Fechner also provides <a
    href="http://sourceforge.net/projects/idom2-pas/">idom2</a>, a DOM2
    implementation for Kylix2/D5/D6 from Borland.</li>
  <li>There are <a href="http://libxml.rubyforge.org/">bindings for
    Ruby</a>, and libxml2 bindings are also available in Ruby through the <a
    href="http://libgdome-ruby.berlios.de/">libgdome-ruby</a> module
    maintained by Tobias Peters.</li>
  <li>Steve Ball and contributors maintain <a
    href="http://tclxml.sourceforge.net/">libxml2 and libxslt bindings for
    Tcl</a>.</li>
  <li>libxml2 and libxslt are the default XML libraries for PHP5.</li>
  <li><a href="http://savannah.gnu.org/projects/classpathx/">LibxmlJ</a> is
    an effort to create a 100% JAXP-compatible Java wrapper for libxml2 and
    libxslt as part of the GNU ClasspathX project.</li>
  <li>Patrick McPhee provides Rexx bindings for libxml2 and libxslt; look for
    <a href="http://www.interlog.com/~ptjm/software.html">RexxXML</a>.</li>
  <li><a
    href="http://www.satimage.fr/software/en/xml_suite.html">Satimage</a>
    provides <a
    href="http://www.satimage.fr/software/en/downloads_osaxen.html">XMLLib
    osax</a>. This is an osax for Mac OS X with a set of commands to
    implement the XML DOM, XPath and XSLT in AppleScript. It also includes
    commands for Property-lists (Apple's fast lookup table XML format).</li>
  <li>Francesco Montorsi developed <a
    href="https://sourceforge.net/project/showfiles.php?group_id=51305&amp;package_id=45182">wxXml2</a>
    wrappers that interface libxml2, allowing wxWidgets applications to
    load/save/edit XML instances.</li>
</ul>
<p>The distribution includes a set of Python bindings, which are guaranteed
to be maintained as part of the library in the future, though the Python
interface has not yet reached the completeness of the C API.</p>

<p>Note that some Python purists dislike the default set of Python bindings.
Rather than complaining, I suggest they have a look at <a
href="http://codespeak.net/lxml/">lxml, the more pythonic bindings for
libxml2 and libxslt</a>, and <a
href="http://codespeak.net/mailman/listinfo/lxml-dev">help Martijn
Faassen</a> complete those.</p>

<p><a href="mailto:stephane.bidoul@softwareag.com">Stéphane Bidoul</a>
maintains <a href="http://users.skynet.be/sbi/libxml-python/">a Windows port
of the Python bindings</a>.</p>

<p>Note to people interested in building bindings: the API is formalized as
<a href="libxml2-api.xml">an XML API description file</a>, which makes it
possible to automate a large part of the Python bindings. It includes
function descriptions, enums, structures, typedefs, etc. The Python script
used to build the bindings is python/generator.py in the source
distribution.</p>
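<p>To give an idea of how a machine-readable API description can drive
binding generation, here is a minimal sketch using only the Python standard
library. The XML fragment is a hypothetical, heavily simplified stand-in for
libxml2-api.xml, and the naming rule is an assumption for illustration; the
real python/generator.py applies many more rules.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment in the spirit of libxml2-api.xml.
# The real file is much larger; its exact schema is not reproduced here.
API_XML = """
<api name="libxml2">
  <symbols>
    <function name="xmlParseFile" file="parser">
      <info>parse an XML file and build a tree</info>
      <return type="xmlDocPtr"/>
      <arg name="filename" type="const char *"/>
    </function>
    <function name="xmlFreeDoc" file="tree">
      <info>free up all the structures used by a document</info>
      <return type="void"/>
      <arg name="cur" type="xmlDocPtr"/>
    </function>
  </symbols>
</api>
"""


def python_name(c_name):
    """Drop the 'xml' prefix and keep the remaining casing (simplified rule)."""
    if c_name.startswith("xml") and len(c_name) > 3:
        rest = c_name[3:]
        return rest[0].lower() + rest[1:]
    return c_name


def describe(api_xml):
    """Return one 'pyname(args) -> ctype' line per described function."""
    root = ET.fromstring(api_xml)
    lines = []
    for func in root.iter("function"):
        args = ", ".join(arg.get("name") for arg in func.findall("arg"))
        ret = func.find("return").get("type")
        lines.append("%s(%s) -> %s" % (python_name(func.get("name")), args, ret))
    return lines


signatures = describe(API_XML)
for line in signatures:
    print(line)
```

<p>A real generator would emit wrapper code rather than signature strings,
but the flow is the same: read the description, map the names and types, and
write one wrapper per function.</p>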
<p>To install the Python bindings there are 2 options:</p>
<ul>
  <li>If you use an RPM based distribution, simply install the <a
    href="http://rpmfind.net/linux/rpm2html/search.php?query=libxml2-python">libxml2-python
    RPM</a> (and if needed the <a
    href="http://rpmfind.net/linux/rpm2html/search.php?query=libxslt-python">libxslt-python
    RPM</a>).</li>
  <li>Otherwise use the <a
    href="ftp://xmlsoft.org/libxml2/python/">libxml2-python module
    distribution</a> corresponding to your installed version of libxml2 and
    libxslt. Note that to install it you will need both libxml2 and libxslt
    installed, and you must run "python setup.py build install" in the module
    tree.</li>
</ul>
<p>The distribution includes a set of examples and regression tests for the
Python bindings in the <code>python/tests</code> directory. Here are some
excerpts from those tests:</p>

<h3>tst.py:</h3>

<p>This is a basic test of the file interface and DOM navigation:</p>
<pre>import libxml2, sys

doc = libxml2.parseFile("tst.xml")
if doc.name != "tst.xml":
    print("doc.name failed")
    sys.exit(1)
root = doc.children
if root.name != "doc":
    print("root.name failed")
    sys.exit(1)
child = root.children
if child.name != "foo":
    print("child.name failed")
    sys.exit(1)
doc.freeDoc()</pre>
<p>The Python module is called libxml2; parseFile is the equivalent of
xmlParseFile (most of the bindings are automatically generated, the xml
prefix is removed and the casing convention is kept). All nodes seen at the
binding level share the same subset of accessors:</p>
<ul>
  <li><code>name</code>: returns the node name</li>
  <li><code>type</code>: returns a string indicating the node type</li>
  <li><code>content</code>: returns the content of the node; it is based on
    xmlNodeGetContent() and hence is recursive.</li>
  <li><code>parent</code>, <code>children</code>, <code>last</code>,
    <code>next</code>, <code>prev</code>, <code>doc</code>,
    <code>properties</code>: pointing to the associated element in the tree;
    those may return None in case no such link exists.</li>
</ul>

<p>Also note the need to explicitly deallocate documents with freeDoc().
Reference counting for libxml2 trees would need quite a lot of work to
function properly, and rather than risk memory leaks if it is not implemented
correctly, it sounds safer to have an explicit function to free a tree. The
wrapper Python objects like doc, root or child are themselves automatically
garbage collected.</p>
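<p>Since the tree must be freed by hand, one way to avoid forgetting the call
on an error path is to wrap it in a context manager. The sketch below is not
part of the bindings: <code>freeing</code> is a hypothetical helper, and
<code>FakeDoc</code> is a stand-in object so the example runs without libxml2
installed. It only assumes the object has a freeDoc() method.</p>

```python
from contextlib import contextmanager


@contextmanager
def freeing(doc):
    """Yield doc, then call doc.freeDoc() even if the body raised."""
    try:
        yield doc
    finally:
        doc.freeDoc()


class FakeDoc:
    """Stand-in for a libxml2 document (hypothetical, for illustration)."""

    def __init__(self):
        self.freed = False

    def freeDoc(self):
        self.freed = True


fake = FakeDoc()
try:
    with freeing(fake) as doc:
        # Simulate a failure while the tree is in use.
        raise ValueError("simulated failure while using the tree")
except ValueError:
    pass

print("freed:", fake.freed)
```

<p>With the real bindings the same helper could be used as <code>with
freeing(libxml2.parseFile("tst.xml")) as doc:</code>, keeping the explicit
deallocation next to the allocation.</p>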
<h3>validate.py:</h3>

<p>This test checks the validation interfaces and the redirection of error
messages:</p>
<pre>import libxml2

# deactivate error messages from the validation
def noerr(ctx, str):
    pass

libxml2.registerErrorHandler(noerr, None)

ctxt = libxml2.createFileParserCtxt("invalid.xml")
ctxt.validate(1)
ctxt.parseDocument()
doc = ctxt.doc()
valid = ctxt.isValid()
doc.freeDoc()
if valid != 0:
    print("validity check failed")</pre>
<p>The first thing to notice is the call to registerErrorHandler(); it
defines a new error handler global to the library. It is used to avoid seeing
the error messages when trying to validate the invalid document.</p>

<p>The main interest of that test is the creation of a parser context with
createFileParserCtxt() and how the behaviour can be changed before calling
parseDocument(). Similarly, the information resulting from the parsing phase
is also available using context methods.</p>

<p>Contexts, like nodes, are defined as classes, and the libxml2 wrappers map
the C function interfaces in terms of object methods as much as possible. The
best way to get a complete view of what methods are supported is to look at
the libxml2.py module containing all the wrappers.</p>
<h3>push.py:</h3>

<p>This test shows how to activate the push parser interface:</p>
<pre>import libxml2

ctxt = libxml2.createPushParser(None, "&lt;foo", 4, "test.xml")
ctxt.parseChunk("/&gt;", 2, 1)
doc = ctxt.doc()

doc.freeDoc()</pre>

<p>The context is created with a special call based on the
xmlCreatePushParser() function from the C library. The first argument is an
optional SAX callback object, followed by the initial set of data, its
length, and the name of the resource in case URI-References need to be
computed by the parser.</p>

<p>Then the data are pushed using the parseChunk() method, the last call
setting the third argument, terminate, to 1.</p>
<h3>pushSAX.py:</h3>

<p>This test shows the use of the event-based parsing interfaces. In this
case the parser does not build a document, but provides callback information
as it makes progress analyzing the data being provided:</p>
<pre>import libxml2
log = ""

class callback:
    def startDocument(self):
        global log
        log = log + "startDocument:"

    def endDocument(self):
        global log
        log = log + "endDocument:"

    def startElement(self, tag, attrs):
        global log
        log = log + "startElement %s %s:" % (tag, attrs)

    def endElement(self, tag):
        global log
        log = log + "endElement %s:" % (tag)

    def characters(self, data):
        global log
        log = log + "characters: %s:" % (data)

    def warning(self, msg):
        global log
        log = log + "warning: %s:" % (msg)

    def error(self, msg):
        global log
        log = log + "error: %s:" % (msg)

    def fatalError(self, msg):
        global log
        log = log + "fatalError: %s:" % (msg)

handler = callback()

ctxt = libxml2.createPushParser(handler, "&lt;foo", 4, "test.xml")
chunk = " url='tst'&gt;b"
ctxt.parseChunk(chunk, len(chunk), 0)
chunk = "ar&lt;/foo&gt;"
ctxt.parseChunk(chunk, len(chunk), 1)

reference = "startDocument:startElement foo {'url': 'tst'}:" + \
            "characters: bar:endElement foo:endDocument:"
if log != reference:
    print("Error got: %s" % log)
    print("Expected: %s" % reference)</pre>
<p>The key object in that test is the handler; it provides a number of entry
points which can be called by the parser as it makes progress to indicate the
information set obtained. The full set of callbacks is larger than what the
callback class in that specific example implements (see the SAX definition
for a complete list). The wrapper will only call those supplied by the object
when activated. The startElement callback receives the name of the element
and a dictionary containing the attributes carried by this element.</p>

<p>Also note that the reference string generated from the callback shows a
single characters call even though the string "bar" is passed to the parser
in 2 different calls to parseChunk().</p>
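<p>The same chunked feeding can be tried without libxml2 installed. The
sketch below uses Python's standard <code>xml.sax</code> module as a
stand-in, not the libxml2 bindings. Since a parser is generally free to
split character data across several callbacks, the handler (a hypothetical
<code>Recorder</code> class) accumulates text and flushes it at the next
element boundary, which makes the recorded events independent of how the
input was chunked.</p>

```python
import xml.sax


class Recorder(xml.sax.ContentHandler):
    """Record SAX events, merging adjacent character data into one event."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._text = []

    def _flush(self):
        # Emit any pending text as a single 'characters' event.
        if self._text:
            self.events.append(("characters", "".join(self._text)))
            self._text = []

    def startElement(self, name, attrs):
        self._flush()
        self.events.append(("startElement", name, dict(attrs.items())))

    def characters(self, content):
        self._text.append(content)

    def endElement(self, name):
        self._flush()
        self.events.append(("endElement", name))


handler = Recorder()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)

# Same three chunks as in pushSAX.py: "bar" is split between two of them.
for chunk in ("<foo", " url='tst'>b", "ar</foo>"):
    parser.feed(chunk)
parser.close()

for event in handler.events:
    print(event)
```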
<h3>xpath.py:</h3>

<p>This is a basic test of the XPath wrapper support:</p>
<pre>import libxml2, sys

doc = libxml2.parseFile("tst.xml")
ctxt = doc.xpathNewContext()
res = ctxt.xpathEval("//*")
if len(res) != 2:
    print("xpath query: wrong node set size")
    sys.exit(1)
if res[0].name != "doc" or res[1].name != "foo":
    print("xpath query: wrong node set value")
    sys.exit(1)
doc.freeDoc()
ctxt.xpathFreeContext()</pre>

<p>This test parses a file, then creates an XPath context to evaluate XPath
expressions on it. The xpathEval() method executes an XPath query and returns
the result mapped in a Python way. Strings and numbers are natively
converted, and node sets are returned as a tuple of libxml2 Python node
wrappers. Like the document, the XPath context needs to be freed explicitly.
Also note that the result of the XPath query may point back to the document
tree, and hence the document must be freed only after the result of the query
has been used.</p>
<h3>xpathext.py:</h3>

<p>This test shows how to extend the XPath engine with functions written in
Python:</p>
<pre>import libxml2

def foo(ctx, x):
    return x + 1

doc = libxml2.parseFile("tst.xml")
ctxt = doc.xpathNewContext()
libxml2.registerXPathFunction(ctxt._o, "foo", None, foo)
res = ctxt.xpathEval("foo(1)")
if res != 2:
    print("xpath extension failure")
doc.freeDoc()
ctxt.xpathFreeContext()</pre>

<p>Note how the extension function is registered with the context (but that
part is not yet finalized; this may change slightly in the future).</p>
<h3>tstxpath.py:</h3>

<p>This test is similar to the previous one but shows how the extension
function can access the XPath evaluation context:</p>
<pre>def foo(ctx, x):
    global called

    #
    # test access to the XPath evaluation contexts
    #
    pctxt = libxml2.xpathParserContext(_obj=ctx)
    ctxt = pctxt.context()
    called = ctxt.function()
    return x + 1</pre>

<p>All the interfaces around the XPath parser (or rather evaluation) context
are not finalized, but it should be sufficient to do contextual work at the
evaluation point.</p>
<h3>Memory debugging:</h3>

<p>Last but not least, all tests start with the following prologue:</p>
<pre>#memory debug specific
libxml2.debugMemory(1)</pre>

<p>and end with the following epilogue:</p>
<pre>#memory debug specific
libxml2.cleanupParser()
if libxml2.debugMemory(1) == 0:
    print("OK")
else:
    print("Memory leak %d bytes" % (libxml2.debugMemory(1)))
    libxml2.dumpMemory()</pre>

<p>Those activate the memory debugging interface of libxml2, where all
allocated blocks in the library are tracked. The epilogue then cleans up the
library state and checks that all allocated memory has been freed. If not, it
calls dumpMemory(), which saves that list in a <code>.memdump</code>
file.</p>
<h2><a name="architecture">libxml2 architecture</a></h2>

<p>Libxml2 is made of multiple components; some of them are optional, and
most of the block interfaces are public. The main components are:</p>
<ul>
  <li>an Input/Output layer</li>
  <li>FTP and HTTP client layers (optional)</li>
  <li>an Internationalization layer managing the encodings support</li>
  <li>a URI module</li>
  <li>the XML parser and its basic SAX interface</li>
  <li>an HTML parser using the same SAX interface (optional)</li>
  <li>a SAX tree module to build an in-memory DOM representation</li>
  <li>a tree module to manipulate the DOM representation</li>
  <li>a validation module using the DOM representation (optional)</li>
  <li>an XPath module for global lookup in a DOM representation
    (optional)</li>
  <li>a debug module (optional)</li>
</ul>

<p>Graphically this gives the following:</p>

<p><img src="libxml.gif" alt="a graphical view of the various"></p>

<p></p>
<h2><a name="tree">The tree output</a></h2>

<p>The parser returns a tree built during the document analysis. The value
returned is an <strong>xmlDocPtr</strong> (i.e., a pointer to an
<strong>xmlDoc</strong> structure). This structure contains information such
as the file name, the document type, and a <strong>children</strong> pointer
which is the root of the document (or more exactly the first child under the
root which is the document). The tree is made of <strong>xmlNode</strong>s,
chained in double-linked lists of siblings and with a children&lt;-&gt;parent
relationship. An xmlNode can also carry properties (a chain of xmlAttr
structures). An attribute may have a value which is a list of TEXT or
ENTITY_REF nodes.</p>
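<p>The linking scheme described above can be pictured with a toy model. The
<code>Node</code> class below is purely illustrative Python, not the real
xmlNode layout; it reuses the accessor names exposed by the bindings
(<code>parent</code>, <code>children</code>, <code>last</code>,
<code>next</code>, <code>prev</code>), while <code>add_child</code> and
<code>child_names</code> are hypothetical helpers.</p>

```python
class Node:
    """Toy model of the links carried by a tree node (not the real C struct)."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = None  # first child, head of the sibling list
        self.last = None      # last child, tail of the sibling list
        self.next = None      # next sibling
        self.prev = None      # previous sibling

    def add_child(self, child):
        """Append child at the end of the double-linked sibling list."""
        child.parent = self
        if self.last is None:
            self.children = self.last = child
        else:
            child.prev = self.last
            self.last.next = child
            self.last = child
        return child


def child_names(node):
    """Walk the sibling chain starting at node.children."""
    names = []
    cur = node.children
    while cur is not None:
        names.append(cur.name)
        cur = cur.next
    return names


root = Node("EXAMPLE")
head = root.add_child(Node("head"))
chapter = root.add_child(Node("chapter"))
print(child_names(root))
```

<p>Note that <code>children</code> points only at the first child; the other
children are reached by following <code>next</code>, which is why code
walking a libxml2 tree is usually written as a loop over siblings.</p>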
<p>Here is an example (erroneous with respect to the XML spec since there
should be only one ELEMENT under the root):</p>

<p><img src="structure.gif" alt=" structure.gif "></p>

<p>In the source package there is a small program (not installed by default)
called <strong>xmllint</strong> which parses XML files given as arguments and
prints them back as parsed. This is useful for detecting errors both in XML
code and in the XML parser itself. It has an option <strong>--debug</strong>
which prints the actual in-memory structure of the document; here is the
result with the <a href="#example">example</a> given before:</p>
<pre>DOCUMENT
version=1.0
standalone=true
  ELEMENT EXAMPLE
    ATTRIBUTE prop1
      TEXT
      content=gnome is great
    ATTRIBUTE prop2
      ENTITY_REF
      TEXT
      content= linux too
    ELEMENT head
      ELEMENT title
        TEXT
        content=Welcome to Gnome
    ELEMENT chapter
      ELEMENT title
        TEXT
        content=The Linux adventure
      ELEMENT p
        TEXT
        content=bla bla bla ...
      ELEMENT image
        ATTRIBUTE href
          TEXT
          content=linus.gif
      ELEMENT p
        TEXT
        content=...</pre>

<p>This should be useful for learning the internal representation model.</p>
<h2><a name="interface">The SAX interface</a></h2>

<p>Sometimes the DOM tree output is just too large to fit reasonably into
memory. In that case (and if you don't expect to save back the XML document
loaded using libxml), it's better to use the SAX interface of libxml. SAX is
a <strong>callback-based interface</strong> to the parser. Before parsing,
the application layer registers a customized set of callbacks which are
called by the library as it progresses through the XML input.</p>
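<p>As a rough illustration of the callback style, the sketch below counts
elements without ever building a tree. It uses Python's standard
<code>xml.sax</code> module as a stand-in rather than libxml's own SAX
interface, and the <code>ElementCounter</code> class and the sample document
are made up for the example.</p>

```python
import xml.sax
from collections import Counter


class ElementCounter(xml.sax.ContentHandler):
    """Count element names as they are reported; nothing else is kept."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def startElement(self, name, attrs):
        self.counts[name] += 1


# Small sample document, hypothetical and only loosely based on the example.
SAMPLE = (
    b"<EXAMPLE><head><title>Welcome to Gnome</title></head>"
    b"<chapter><title>The Linux adventure</title>"
    b"<p>bla bla bla ...</p><p>...</p></chapter></EXAMPLE>"
)

counter = ElementCounter()
xml.sax.parseString(SAMPLE, counter)
print(dict(counter.counts))
```

<p>The handler only keeps one counter per element name, so its memory use
does not grow with the size of the document, which is the main reason to
prefer this style for very large inputs.</p>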
<p>To get more detailed step-by-step guidance on using the SAX interface of
libxml, see the <a
href="http://www.daa.com.au/~james/gnome/xml-sax/xml-sax.html">nice
documentation</a> written by <a href="mailto:james@daa.com.au">James
Henstridge</a>.</p>

<p>You can debug the SAX behaviour by using the <strong>testSAX</strong>
program located in the gnome-xml module (it's usually not shipped in the
binary packages of libxml, but you can find it in the tar source
distribution). Here is the sequence of callbacks that would be reported by
testSAX when parsing the example XML document shown earlier:</p>
<pre>SAX.setDocumentLocator()
SAX.startDocument()
SAX.getEntity(amp)
SAX.startElement(EXAMPLE, prop1='gnome is great', prop2='&amp;amp; linux too')
SAX.characters( , 3)
SAX.startElement(head)
SAX.characters( , 4)
SAX.startElement(title)
SAX.characters(Welcome to Gnome, 16)
SAX.endElement(title)
SAX.characters( , 3)
SAX.endElement(head)
SAX.characters( , 3)
SAX.startElement(chapter)
SAX.characters( , 4)
SAX.startElement(title)
SAX.characters(The Linux adventure, 19)
SAX.endElement(title)
SAX.characters( , 4)
SAX.startElement(p)
SAX.characters(bla bla bla ..., 15)
SAX.endElement(p)
SAX.characters( , 4)
SAX.startElement(image, href='linus.gif')
SAX.endElement(image)
SAX.characters( , 4)
SAX.startElement(p)
SAX.characters(..., 3)
SAX.endElement(p)
SAX.characters( , 3)
SAX.endElement(chapter)
SAX.characters( , 1)
SAX.endElement(EXAMPLE)
SAX.endDocument()</pre>
<p>Most of the other interfaces of libxml2 are based on the DOM tree-building
facility, so nearly everything up to the end of this document presupposes the
use of the standard DOM tree build. Note that the DOM tree itself is built by
a set of registered default callbacks, without any internal specific
interface.</p>
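<p>The idea that a tree can be produced purely by SAX callbacks is easy to
demonstrate. The sketch below again uses Python's standard
<code>xml.sax</code> module as a stand-in, with a hypothetical
<code>TreeBuilder</code> handler that keeps a stack of open elements and
builds nested dictionaries. It is not how libxml2's default callbacks are
implemented, only the same principle in miniature.</p>

```python
import xml.sax


class TreeBuilder(xml.sax.ContentHandler):
    """Build a nested dict tree from SAX events using a stack of open nodes."""

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []

    def startElement(self, name, attrs):
        node = {"name": name, "attrs": dict(attrs.items()), "children": []}
        if self._stack:
            self._stack[-1]["children"].append(node)
        else:
            self.root = node
        self._stack.append(node)

    def characters(self, content):
        # Merge with a preceding text child so split callbacks do not matter.
        children = self._stack[-1]["children"]
        if children and isinstance(children[-1], str):
            children[-1] += content
        else:
            children.append(content)

    def endElement(self, name):
        self._stack.pop()


builder = TreeBuilder()
xml.sax.parseString(
    b"<chapter><title>The Linux adventure</title>"
    b"<image href='linus.gif'/></chapter>",
    builder,
)
tree = builder.root
print(tree)
```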
<h2><a name="Validation">Validation &amp; DTDs</a></h2>

<p>Table of Contents:</p>
<ol>
  <li><a href="#General5">General overview</a></li>
  <li><a href="#definition">The definition</a></li>
  <li><a href="#Simple">Simple rules</a>
    <ol>
      <li><a href="#reference">How to reference a DTD from a document</a></li>
      <li><a href="#Declaring">Declaring elements</a></li>
      <li><a href="#Declaring1">Declaring attributes</a></li>
    </ol>
  </li>
  <li><a href="#Some">Some examples</a></li>
  <li><a href="#validate">How to validate</a></li>
  <li><a href="#Other">Other resources</a></li>
</ol>
<h3><a name="General5">General overview</a></h3>

<p>So what is validation, and what is a DTD?</p>

<p>DTD is the acronym for Document Type Definition. This is a description of
the content for a family of XML files. This is part of the XML 1.0
specification, and allows one to describe and verify that a given document
instance conforms to the set of rules detailing its structure and
content.</p>

<p>Validation is the process of checking a document against a DTD (more
generally against a set of construction rules).</p>

<p>The validation process and building DTDs are the two most difficult parts
of the XML life cycle. Briefly, a DTD defines all the possible elements to be
found within your document and the formal shape of your document tree (by
defining the allowed content of an element: either text, a regular expression
for the allowed list of children, or mixed content, i.e. both text and
children). The DTD also defines the valid attributes for all elements and the
types of those attributes.</p>
<h3><a name="definition1">The definition</a></h3>

<p>The <a href="http://www.w3.org/TR/REC-xml">W3C XML Recommendation</a> (<a
href="http://www.xml.com/axml/axml.html">Tim Bray's annotated version of
Rev1</a>):</p>
<ul>
  <li><a href="http://www.w3.org/TR/REC-xml#elemdecls">Declaring
    elements</a></li>
  <li><a href="http://www.w3.org/TR/REC-xml#attdecls">Declaring
    attributes</a></li>
</ul>

<p>Unfortunately, all this is inherited from the SGML world; the syntax is
ancient...</p>
<h3><a name="Simple1">Simple rules</a></h3>

<p>Writing DTDs can be done in many ways. The rules to build them can be
radically different depending on whether you need something permanent or
something which can evolve over time. Really complex DTDs like the DocBook
ones are flexible but considerably harder to design. I will just focus on
DTDs for formats with a fixed, simple structure. What follows is just a set
of basic rules; it is definitely not exhaustive, nor usable for complex DTD
design.</p>
<h4><a name="reference1">How to reference a DTD from a document</a>:</h4>

<p>Assuming the top element of the document is <code>spec</code> and the DTD
is placed in the file <code>mydtd</code> in the subdirectory
<code>dtds</code> of the directory from where the document was loaded:</p>

<p><code>&lt;!DOCTYPE spec SYSTEM "dtds/mydtd"&gt;</code></p>

<p>Notes:</p>
<ul>
  <li>The system string is actually a URI-Reference (as defined in <a
    href="http://www.ietf.org/rfc/rfc2396.txt">RFC 2396</a>) so you can use a
    full URL string indicating the location of your DTD on the Web. This is a
    really good thing to do if you want others to validate your
    document.</li>
  <li>It is also possible to associate a <code>PUBLIC</code> identifier (a
    magic string) so that the DTD is looked up in catalogs on the client side
    without having to locate it on the web.</li>
  <li>A DTD contains a set of element and attribute declarations, but they
    don't define what the root of the document should be. This is explicitly
    told to the parser/validator as the first element of the
    <code>DOCTYPE</code> declaration.</li>
</ul>
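<p>Because the system identifier is a URI-Reference, a relative value such as
<code>dtds/mydtd</code> is resolved against the location of the document
itself. The short sketch below shows that resolution with Python's standard
<code>urllib.parse.urljoin</code>; the document URL is a made-up example, and
this only illustrates the resolution rule, not libxml2's own code.</p>

```python
from urllib.parse import urljoin

# Hypothetical location of the document carrying the DOCTYPE declaration.
document_url = "http://example.org/docs/spec.xml"

# Relative system identifier, as in <!DOCTYPE spec SYSTEM "dtds/mydtd">.
relative_dtd = urljoin(document_url, "dtds/mydtd")

# A full URL is used as is, whatever the document location.
absolute_dtd = urljoin(document_url, "http://example.org/dtds/mydtd")

print(relative_dtd)
print(absolute_dtd)
```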
<h4><a name="Declaring2">Declaring elements</a>:</h4>

<p>The following declares an element <code>spec</code>:</p>

<p><code>&lt;!ELEMENT spec (front, body, back?)&gt;</code></p>

<p>It also expresses that the spec element contains one <code>front</code>,
one <code>body</code> and one optional <code>back</code> child element, in
this order. An element of the structure and its content are declared in a
single declaration. Similarly, the following declares <code>div1</code>
elements:</p>

<p><code>&lt;!ELEMENT div1 (head, (p | list | note)*, div2?)&gt;</code></p>

<p>which means div1 contains one <code>head</code>, then an optional series
of <code>p</code>, <code>list</code> and <code>note</code> elements, and then
an optional <code>div2</code>. And last but not least, an element can contain
text:</p>

<p><code>&lt;!ELEMENT b (#PCDATA)&gt;</code></p>

<p><code>b</code> contains text only. An element can also be of mixed content
(text and elements in no particular order):</p>

<p><code>&lt;!ELEMENT p (#PCDATA|a|ul|b|i|em)*&gt;</code></p>

<p><code>p</code> can contain text or <code>a</code>, <code>ul</code>,
<code>b</code>, <code>i</code> or <code>em</code> elements in no particular
order.</p>
Daniel Veillardb8cfbd12001-10-25 10:53:28 +00003151
<h4><a name="Declaring1">Declaring attributes</a>:</h4>

<p>Again the attribute declaration includes the definition of the
attribute content:</p>

<p><code>&lt;!ATTLIST termdef name CDATA #IMPLIED&gt;</code></p>

<p>means that the element <code>termdef</code> can have a
<code>name</code> attribute containing text (<code>CDATA</code>) and which
is optional (<code>#IMPLIED</code>). The attribute value can also be
restricted to a set of allowed values:</p>

<p><code>&lt;!ATTLIST list type (bullets|ordered|glossary)
"ordered"&gt;</code></p>

<p>means that <code>list</code> elements have a <code>type</code>
attribute with 3 allowed values, "bullets", "ordered" or "glossary", and
which defaults to "ordered" if the attribute is not explicitly
specified.</p>

<p>The content type of an attribute can be text (<code>CDATA</code>),
anchor/reference/references
(<code>ID</code>/<code>IDREF</code>/<code>IDREFS</code>), entity(ies)
(<code>ENTITY</code>/<code>ENTITIES</code>) or name(s)
(<code>NMTOKEN</code>/<code>NMTOKENS</code>). The following defines that a
<code>chapter</code> element can have an optional <code>id</code>
attribute of type <code>ID</code>, usable for reference from attributes of
type <code>IDREF</code>:</p>

<p><code>&lt;!ATTLIST chapter id ID #IMPLIED&gt;</code></p>

<p>The last value of an attribute definition can be
<code>#REQUIRED</code>, meaning that the attribute has to be given,
<code>#IMPLIED</code>, meaning that it is optional, or the default value
(possibly prefixed by <code>#FIXED</code> if it is the only value
allowed).</p>

<p>Notes:</p>
<ul>
  <li>Usually the attributes pertaining to a given element are declared in
    a single expression, but it is just a convention adopted by a lot of
    DTD writers:
    <pre>&lt;!ATTLIST termdef
          id      ID      #REQUIRED
          name    CDATA   #IMPLIED&gt;</pre>
    <p>The previous construct defines both the <code>id</code> and
    <code>name</code> attributes for the element
    <code>termdef</code>.</p>
  </li>
</ul>

<h3><a name="Some1">Some examples</a></h3>

<p>The directory <code>test/valid/dtds/</code> in the libxml2 distribution
contains some complex DTD examples. The example in the file
<code>test/valid/dia.xml</code> shows an XML file where the simple DTD is
directly included within the document.</p>

<h3><a name="validate1">How to validate</a></h3>

<p>The simplest way is to use the xmllint program included with libxml2.
The <code>--valid</code> option turns on validation of the files given as
input. For example the following validates a copy of the first revision of
the XML 1.0 specification:</p>

<p><code>xmllint --valid --noout test/valid/REC-xml-19980210.xml</code></p>

<p>The <code>--noout</code> option is used to disable output of the
resulting tree.</p>

<p>The <code>--dtdvalid dtd</code> option allows validation of the
document(s) against a given DTD.</p>

<p>Libxml2 exports an API to handle DTDs and validation, check the <a
href="http://xmlsoft.org/html/libxml-valid.html">associated
description</a>.</p>

<h3><a name="Other1">Other resources</a></h3>

<p>DTDs are as old as SGML, so there may be a number of examples on-line.
I will just list one for now, other pointers are welcome:</p>
<ul>
  <li><a href="http://www.xml101.com:8081/dtd/">XML-101 DTD</a></li>
</ul>

<p>I suggest looking at the examples found under
<code>test/valid/dtds</code> and any of the large number of books
available on XML. The dia example in <code>test/valid</code> should be
both simple and complete enough to allow you to build your own.</p>

<p></p>

<h2><a name="Memory">Memory Management</a></h2>

<p>Table of Contents:</p>
<ol>
  <li><a href="#General3">General overview</a></li>
  <li><a href="#setting">Setting libxml2 set of memory routines</a></li>
  <li><a href="#cleanup">Cleaning up after parsing</a></li>
  <li><a href="#Debugging">Debugging routines</a></li>
  <li><a href="#General4">General memory requirements</a></li>
</ol>

<h3><a name="General3">General overview</a></h3>

<p>The module <code><a
href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlmemory.h</a></code>
provides the interfaces to the libxml2 memory system:</p>
<ul>
  <li>libxml2 does not use the libc memory allocator directly but goes
    through xmlFree(), xmlMalloc() and xmlRealloc()</li>
  <li>those routines can be reassigned to a specific set of routines; by
    default they are the libc ones, i.e. free(), malloc() and
    realloc()</li>
  <li>the xmlmemory.c module includes a set of debugging routines</li>
</ul>

<h3><a name="setting">Setting libxml2 set of memory routines</a></h3>

<p>It is sometimes useful not to use the default memory allocator, either
for debugging, for analysis, or to implement a specific behaviour on
memory management (like on embedded systems). Two function calls are
available to do so:</p>
<ul>
  <li><a
    href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlMemGet()</a>,
    which returns the current set of functions in use by the parser</li>
  <li><a
    href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlMemSetup()</a>,
    which allows setting up a new set of memory allocation functions</li>
</ul>

<p>Of course a call to xmlMemSetup() should probably be done before
calling any other libxml2 routine (unless you are sure your allocation
routines are compatible with the ones already in use).</p>
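
<p>As an illustration, here is a minimal sketch of a replacement set of
routines that simply counts the blocks still allocated. The names
<code>counting_malloc</code>, <code>counting_realloc</code>,
<code>counting_free</code>, <code>counting_strdup</code> and
<code>live_blocks</code> are hypothetical; only the function shapes are
meant to match what xmlMemSetup() expects (free, malloc, realloc and strdup
routines, in that order). The sketch does not link against libxml2, so the
registration call is only shown in a comment.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping: number of blocks currently allocated. */
long live_blocks = 0;

/* Same shape as a malloc routine: void *(*)(size_t). */
void *counting_malloc(size_t size) {
    void *ptr = malloc(size);
    if (ptr != NULL)
        live_blocks++;
    return ptr;
}

/* Same shape as a realloc routine: void *(*)(void *, size_t).
 * A size of 0 is not handled specially in this sketch. */
void *counting_realloc(void *mem, size_t size) {
    void *ptr = realloc(mem, size);
    /* realloc(NULL, n) creates a new block; otherwise the count is unchanged. */
    if (mem == NULL && ptr != NULL)
        live_blocks++;
    return ptr;
}

/* Same shape as a free routine: void (*)(void *). */
void counting_free(void *mem) {
    if (mem != NULL) {
        live_blocks--;
        free(mem);
    }
}

/* Same shape as a strdup routine: char *(*)(const char *). */
char *counting_strdup(const char *str) {
    size_t len;
    char *copy;

    assert(str != NULL);
    len = strlen(str) + 1;
    copy = counting_malloc(len);
    if (copy != NULL)
        memcpy(copy, str, len);
    return copy;
}

/*
 * With libxml2 available, the set would be installed before any other
 * library call, for example:
 *     xmlMemSetup(counting_free, counting_malloc,
 *                 counting_realloc, counting_strdup);
 */
```

<p>After the last document has been freed, a non-zero
<code>live_blocks</code> would then hint at a leak or at state the library
still holds.</p>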

<h3><a name="cleanup">Cleaning up after parsing</a></h3>

<p>Libxml2 is not stateless: a few memory structures need to be allocated
before the parser is fully functional (some encoding structures for
example). This also means that once parsing is finished there is a tiny
amount of memory (a few hundred bytes) which can be reclaimed if you don't
reuse the parser immediately:</p>
<ul>
  <li><a
    href="http://xmlsoft.org/html/libxml-parser.html">xmlCleanupParser()</a>
    is a centralized routine to free the parsing state. Note that it won't
    deallocate any produced tree (use xmlFreeDoc() and related routines for
    this).</li>
  <li><a
    href="http://xmlsoft.org/html/libxml-parser.html">xmlInitParser()</a>
    is the dual routine, allowing the parsing state to be preallocated,
    which can be useful for example to avoid initialization reentrancy
    problems when using libxml2 in multithreaded applications</li>
</ul>

<p>Generally xmlCleanupParser() is safe: if needed the state will be
rebuilt at the next invocation of parser routines. But be careful of the
consequences in multithreaded applications.</p>

<h3><a name="Debugging">Debugging routines</a></h3>

<p>When configured using the --with-mem-debug flag (off by default),
libxml2 uses a set of memory allocation debugging routines keeping track
of all allocated blocks and of the location in the code where the routine
was called. A couple of other debugging routines allow dumping the memory
allocation information to a file, or calling a specific routine when a
given block number is allocated:</p>
<ul>
  <li><a
    href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlMallocLoc()</a>,
    <a
    href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlReallocLoc()</a>
    and <a
    href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlMemStrdupLoc()</a>
    are the memory debugging replacement allocation routines</li>
  <li><a
    href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlMemoryDump()</a>
    dumps all the information about the allocated memory blocks left in
    the <code>.memdump</code> file</li>
</ul>

<p>When developing libxml2, memory debugging is enabled: the test programs
call xmlMemoryDump() and the "make test" regression tests check for memory
leaks during the full regression test sequence. This helps a lot in
ensuring that libxml2 does not leak memory and in bullet-proofing its use
of memory allocations (some libc implementations are known to be far too
permissive, resulting in major portability problems!).</p>

<p>If the .memdump reports a leak, it displays the allocation function and
also tries to give some information about the content and structure of the
allocated blocks left. This is sufficient in most cases to find the
culprit, but not always. Assuming the allocation problem is reproducible,
it is possible to find it more easily:</p>
<ol>
  <li>write down the number xxxx of the block which was not freed</li>
  <li>export the environment variable XML_MEM_BREAKPOINT=xxxx; the easiest
    way when using GDB is to simply give the command
    <p><code>set environment XML_MEM_BREAKPOINT xxxx</code></p>
    <p>before running the program.</p>
  </li>
  <li>run the program under a debugger and set a breakpoint on
    xmlMallocBreakpoint(), a specific function called when this precise
    block is allocated</li>
  <li>when the breakpoint is reached you can then do a fine analysis of
    the allocation and step through the code to see the condition
    resulting in the missing deallocation.</li>
</ol>
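
<p>The mechanism behind those steps can be sketched as follows. This is
not the libxml2 implementation, only a toy allocator under the assumption
that blocks are numbered in allocation order; every name in it
(<code>my_debug_malloc</code>, <code>my_malloc_breakpoint</code>,
<code>MY_MEM_BREAKPOINT</code>, ...) is hypothetical.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* All names here are hypothetical, not libxml2's implementation. */
unsigned long block_counter = 0;   /* number given to the latest block */
unsigned long break_block = 0;     /* 0 means "no breakpoint requested" */
unsigned long breakpoint_hits = 0; /* how many times the hook was reached */

/* Set a debugger breakpoint here to stop when the watched block is allocated. */
void my_malloc_breakpoint(void) {
    breakpoint_hits++;
}

/* Read the block number to watch from the environment, if present. */
void my_debug_init(void) {
    const char *value = getenv("MY_MEM_BREAKPOINT");

    if (value != NULL)
        break_block = strtoul(value, NULL, 10);
}

/* Allocate a block, number it, and call the hook on the watched number. */
void *my_debug_malloc(size_t size) {
    void *ptr = malloc(size);

    if (ptr == NULL)
        return NULL;
    block_counter++;
    if (break_block != 0 && block_counter == break_block)
        my_malloc_breakpoint();
    return ptr;
}
```

<p>Because the numbering only depends on the order of allocations, a
reproducible run reaches the hook at exactly the allocation which was
reported as leaked.</p>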

<p>I used to use a commercial tool to debug libxml2 memory problems, but
after noticing that it was not detecting memory leaks I switched to that
simple mechanism, which has proved extremely efficient until now. Lately I
have also used <a
href="http://developer.kde.org/~sewardj/">valgrind</a> with quite some
success. It is tied to the i386 architecture since it works by emulating
the processor and instruction set; it is slow but extremely efficient,
i.e. it spots memory usage errors in a very precise way.</p>

<h3><a name="General4">General memory requirements</a></h3>

<p>How much memory does libxml2 require? It's hard to tell on average, it
depends on a number of things:</p>
<ul>
  <li>the parser itself should work in a fixed amount of memory, except
    for information maintained about the stacks of names and entity
    locations. The I/O and encoding handlers will probably account for a
    few KBytes. This is true for both the XML and HTML parsers (though the
    HTML parser needs more state).</li>
  <li>If you are generating the DOM tree then memory requirements will
    grow nearly linearly with the size of the data. In general for a
    balanced textual document the internal memory requirement is about 4
    times the size of the UTF-8 serialization of this document (for
    example the XML-1.0 recommendation is a bit more than 150 KBytes and
    takes 650 KBytes of main memory when parsed). Validation will add an
    amount of memory required for maintaining the external DTD state,
    which should be linear with the complexity of the content model
    defined by the DTD</li>
  <li>If you need to work with fixed memory requirements or don't need the
    full DOM tree then using the <a href="xmlreader.html">xmlReader
    interface</a> is probably the best way to proceed; it still allows
    validating or operating on a subset of the tree if needed.</li>
  <li>If you don't care about the advanced features of libxml2 like
    validation, DOM, XPath or XPointer, don't use entities, need to work
    with fixed memory requirements, and try to get the fastest parsing
    possible then the SAX interface should be used, but it has known
    restrictions.</li>
</ul>

<p></p>

<h2><a name="Encodings">Encodings support</a></h2>

<p>If you are not really familiar with Internationalization (the usual
shortcut is I18N), Unicode, characters and glyphs, I suggest you read a <a
href="http://www.tbray.org/ongoing/When/200x/2003/04/06/Unicode">presentation</a>
by Tim Bray on Unicode and why you should care about it.</p>

<p>If you don't understand why <b>it does not make sense to have a string
without knowing what encoding it uses</b>, then as Joel Spolsky said <a
href="http://www.joelonsoftware.com/articles/Unicode.html">please do not
write another line of code until you finish reading that article.</a> It
is a prerequisite to understanding this page, and to avoiding a lot of
problems with libxml2, XML or text processing in general.</p>

<p>Table of Contents:</p>
<ol>
  <li><a href="encoding.html#What">What does internationalization support
    mean ?</a></li>
  <li><a href="encoding.html#internal">The internal encoding, how and
    why</a></li>
  <li><a href="encoding.html#implemente">How is it implemented ?</a></li>
  <li><a href="encoding.html#Default">Default supported encodings</a></li>
  <li><a href="encoding.html#extend">How to extend the existing
    support</a></li>
</ol>

<h3><a name="What">What does internationalization support mean ?</a></h3>

<p>XML was designed from the start to allow the support of any character
set by using Unicode. Any conformant XML parser has to support the UTF-8
and UTF-16 default encodings, which can both express the full Unicode
range. UTF-8 is a variable length encoding whose greatest strengths are to
reuse the same encoding for ASCII and to save space for Western languages,
but it is a bit more complex to handle in practice. UTF-16 uses 2 bytes
per character (and sometimes combines two such 2-byte units into a
surrogate pair); it makes implementation easier, but looks a bit overkill
for encoding Western languages. Moreover the XML specification allows the
document to be encoded in other encodings, on the condition that they are
clearly labeled as such. For example the following is a well-formed XML
document encoded in ISO-8859-1 and using the accented letters that we
French like, for both markup and content:</p>
<pre>&lt;?xml version="1.0" encoding="ISO-8859-1"?&gt;
&lt;très&gt;là&lt;/très&gt;</pre>

<p>Having internationalization support in libxml2 means the following:</p>
<ul>
  <li>the document is properly parsed</li>
  <li>information about its encoding is saved</li>
  <li>it can be modified</li>
  <li>it can be saved in its original encoding</li>
  <li>it can also be saved in another encoding supported by libxml2 (for
    example straight UTF-8 or even an ASCII form)</li>
</ul>

<p>Another very important point is that the whole libxml2 API, with the
exception of a few routines to read with a specific encoding or save to a
specific encoding, is completely agnostic about the original encoding of
the document.</p>

<p>It should be noted too that the HTML parser embedded in libxml2 now
obeys the same rules: the following document will be (as of 2.2.2) handled
in an internationalized fashion by libxml2 too:</p>
<pre>&lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
                      "http://www.w3.org/TR/REC-html40/loose.dtd"&gt;
&lt;html lang="fr"&gt;
&lt;head&gt;
  &lt;META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"&gt;
&lt;/head&gt;
&lt;body&gt;
&lt;p&gt;W3C crée des standards pour le Web.&lt;/body&gt;
&lt;/html&gt;</pre>

<h3><a name="internal">The internal encoding, how and why</a></h3>

<p>One of the core decisions was to force all documents to be converted to
a default internal encoding, and that encoding to be UTF-8. Here is the
rationale for those choices:</p>
<ul>
  <li>keeping the native encoding in the internal form would force the
    libxml2 users (or the associated code) to be fully aware of the
    encoding of the original document. For example when adding a text node
    to a document, the content would have to be provided in the document
    encoding, i.e. the client code would have to check it beforehand, make
    sure it conforms to the encoding, etc. This is very hard in practice,
    though in some specific cases it may make sense.</li>
  <li>the second decision was which encoding. From the XML spec only UTF-8
    and UTF-16 really make sense, being the only two encodings for which
    there is mandatory support. UCS-4 (a 32-bit fixed size encoding) could
    be considered an intelligent choice too since it is a direct mapping
    of Unicode. I selected UTF-8 on the basis of efficiency and
    compatibility with surrounding software:
    <ul>
      <li>UTF-8, while a bit more complex to convert from/to (i.e.
        slightly more costly to import and export CPU wise), is also far
        more compact than UTF-16 (and UCS-4) for a majority of the
        documents I see it used for right now (RPM RDF catalogs, advogato
        data, various configuration file formats, etc.) and the key point
        for today's computer architecture is efficient use of caches. If
        one nearly doubles the memory requirement to store the same amount
        of data, this will thrash caches (main memory/external
        caches/internal caches) and my take is that this harms the system
        far more than the CPU requirements needed for the conversion to
        UTF-8</li>
      <li>Most libxml version 1 users were using it with straight ASCII
        most of the time; switching to an internal encoding requiring all
        their code to be rewritten was a serious show-stopper for using
        UTF-16 or UCS-4.</li>
      <li>UTF-8 is being used as the de-facto internal encoding standard
        for related code like <a href="http://www.pango.org/">pango</a>,
        the upcoming Gnome text widget, and a lot of Unix code (yet
        another place where the Unix programmer base takes a different
        approach from Microsoft, who are using UTF-16)</li>
    </ul>
  </li>
</ul>

<p>What does this mean in practice for the libxml2 user:</p>
<ul>
  <li>xmlChar, the libxml2 data type, is a byte; those bytes must be
    assembled as valid UTF-8 strings. The proper way to terminate an
    xmlChar * string is simply to append a 0 byte, as usual.</li>
  <li>One just needs to make sure that when using characters outside the
    ASCII set, the values have been properly converted to UTF-8</li>
</ul>
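
<p>To make the second point concrete, here is a minimal sketch of a check
that a 0-terminated byte string is well-formed UTF-8 before it is handed
to the library. The function name <code>is_valid_utf8</code> is
hypothetical and the code does not depend on libxml2 (which, as far as I
know, offers xmlCheckUTF8() for a similar purpose).</p>

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if the 0-terminated byte string is well-formed UTF-8, else 0. */
int is_valid_utf8(const unsigned char *s) {
    assert(s != NULL);
    while (*s != 0) {
        unsigned long cp;
        int extra, i;

        if (s[0] < 0x80) {                   /* ASCII: one byte */
            s++;
            continue;
        } else if ((s[0] & 0xE0) == 0xC0) {  /* 110xxxxx: 2-byte sequence */
            cp = s[0] & 0x1F;
            extra = 1;
        } else if ((s[0] & 0xF0) == 0xE0) {  /* 1110xxxx: 3-byte sequence */
            cp = s[0] & 0x0F;
            extra = 2;
        } else if ((s[0] & 0xF8) == 0xF0) {  /* 11110xxx: 4-byte sequence */
            cp = s[0] & 0x07;
            extra = 3;
        } else {
            return 0;                        /* stray continuation or bad lead byte */
        }
        for (i = 1; i <= extra; i++) {
            if ((s[i] & 0xC0) != 0x80)       /* also stops on the final 0 byte */
                return 0;
            cp = (cp << 6) | (s[i] & 0x3F);
        }
        /* Reject overlong forms, UTF-16 surrogates and values past U+10FFFF. */
        if ((extra == 1 && cp < 0x80) ||
            (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000) ||
            (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            return 0;
        s += extra + 1;
    }
    return 1;
}
```

<p>A string typed in an ISO-8859-1 editor, such as the bytes
<code>l 0xE0</code> for "là", fails this check, while its UTF-8 form
<code>l 0xC3 0xA0</code> passes.</p>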

<h3><a name="implemente">How is it implemented ?</a></h3>

<p>Let's describe how all this works within libxml2. Basically the I18N
(internationalization) support gets triggered only during I/O operations,
i.e. when reading a document or saving one. Let's look first at the
reading sequence:</p>
<ol>
  <li>when a document is processed, we usually don't know the encoding; a
    simple heuristic allows detecting UTF-16 and UCS-4 and telling them
    apart from encodings where the ASCII range (0-0x7F) maps to ASCII</li>
  <li>the XML declaration, if available, is parsed, including the encoding
    declaration. At that point, if the autodetected encoding is different
    from the one declared, a call to xmlSwitchEncoding() is issued.</li>
  <li>If there is no encoding declaration, then the input has to be in
    either UTF-8 or UTF-16. If it is not, then at some point when
    processing the input, the converter/checker of UTF-8 form will raise
    an encoding error. You may end up with a garbled document, or no
    document at all ! Example:
    <pre>~/XML -&gt; ./xmllint err.xml
err.xml:1: error: Input is not proper UTF-8, indicate encoding !
&lt;très&gt;là&lt;/très&gt;
   ^
err.xml:1: error: Bytes: 0xE8 0x73 0x3E 0x6C
&lt;très&gt;là&lt;/très&gt;
   ^</pre>
  </li>
  <li>xmlSwitchEncoding() does an encoding name lookup, canonicalizes it,
    and then searches the default registered encoding converters for that
    encoding. If it's not within the default set and iconv() support has
    been compiled in, it will ask iconv for such an encoder. If this fails
    then the parser will report an error and stop processing:
    <pre>~/XML -&gt; ./xmllint err2.xml
err2.xml:1: error: Unsupported encoding UnsupportedEnc
&lt;?xml version="1.0" encoding="UnsupportedEnc"?&gt;
                                             ^</pre>
  </li>
  <li>From that point the encoder progressively processes the input (it is
    plugged as a front-end to the I/O module) for that entity. It captures
    and converts on-the-fly the document to be parsed to UTF-8. The parser
    itself just does UTF-8 checking of this input and processes it
    transparently. The only difference is that the encoding information
    has been added to the parsing context (more precisely to the input
    corresponding to this entity).</li>
  <li>The result (when using DOM) is an internal form completely in UTF-8
    with just the encoding information kept on the document node.</li>
</ol>
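
<p>The detection heuristic of the first step can be sketched as follows.
This is only an approximation in the spirit of the autodetection appendix
of the XML 1.0 specification, not the code libxml2 actually runs (the
library exposes its own xmlDetectCharEncoding() for this); the type
<code>guessed_encoding</code> and the function <code>guess_encoding</code>
are hypothetical names.</p>

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    ENC_UNKNOWN,  /* ASCII-compatible: rely on the encoding declaration */
    ENC_UTF8,
    ENC_UTF16LE,
    ENC_UTF16BE,
    ENC_UCS4LE,
    ENC_UCS4BE
} guessed_encoding;

/* Guess the encoding family from the first bytes of a document. */
guessed_encoding guess_encoding(const unsigned char *in, size_t len) {
    assert(in != NULL || len == 0);
    if (len >= 4) {
        /* UCS-4 byte order marks, checked before the UTF-16 ones because
         * FF FE 00 00 starts like a little endian UTF-16 mark */
        if (in[0] == 0x00 && in[1] == 0x00 && in[2] == 0xFE && in[3] == 0xFF)
            return ENC_UCS4BE;
        if (in[0] == 0xFF && in[1] == 0xFE && in[2] == 0x00 && in[3] == 0x00)
            return ENC_UCS4LE;
        /* "<" stored on 4 bytes, no byte order mark */
        if (in[0] == 0x00 && in[1] == 0x00 && in[2] == 0x00 && in[3] == 0x3C)
            return ENC_UCS4BE;
        if (in[0] == 0x3C && in[1] == 0x00 && in[2] == 0x00 && in[3] == 0x00)
            return ENC_UCS4LE;
        /* "<?" stored on 2 bytes each, no byte order mark */
        if (in[0] == 0x00 && in[1] == 0x3C && in[2] == 0x00 && in[3] == 0x3F)
            return ENC_UTF16BE;
        if (in[0] == 0x3C && in[1] == 0x00 && in[2] == 0x3F && in[3] == 0x00)
            return ENC_UTF16LE;
    }
    if (len >= 3 && in[0] == 0xEF && in[1] == 0xBB && in[2] == 0xBF)
        return ENC_UTF8;                     /* UTF-8 byte order mark */
    if (len >= 2) {
        if (in[0] == 0xFE && in[1] == 0xFF)  /* UTF-16 byte order marks */
            return ENC_UTF16BE;
        if (in[0] == 0xFF && in[1] == 0xFE)
            return ENC_UTF16LE;
    }
    return ENC_UNKNOWN;
}
```

<p>When the result is <code>ENC_UNKNOWN</code>, the first bytes can be
read as ASCII, which is enough to parse the XML declaration and find the
declared encoding, as in the second step.</p>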

<p>OK, then what happens when saving the document (assuming you
collected/built an xmlDoc DOM-like structure) ? It depends on the function
called: xmlSaveFile() will just try to save in the original encoding,
while xmlSaveFileTo() and xmlSaveFileEnc() can optionally save to a given
encoding:</p>
<ol>
  <li>if no encoding is given, libxml2 will look for an encoding value
    associated to the document and if it exists will try to save to that
    encoding,
    <p>otherwise everything is written in the internal form, i.e. UTF-8</p>
  </li>
  <li>so if an encoding was specified, either at the API level or on the
    document, libxml2 will again canonicalize the encoding name, and look
    up a converter in the registered set or through iconv. If none is
    found the function will return an error code</li>
  <li>the converter is placed before the I/O buffer layer, as another kind
    of buffer; libxml2 will then simply push the UTF-8 serialization
    through that buffer, which will be progressively converted and pushed
    onto the I/O layer.</li>
  <li>It is possible that the converter code fails on some input, for
    example trying to push a UTF-8 encoded Chinese character through the
    UTF-8 to ISO-8859-1 converter won't work. Since the encoders are
    progressive they will just report the error and the number of bytes
    converted; at that point libxml2 will decode the offending character,
    remove it from the buffer, replace it with the associated charRef
    encoding &amp;#123; and resume the conversion. This guarantees that
    any document will be saved without losses (except for markup names
    where this is not legal; this is a problem in the current version, so
    in practice avoid using non-ASCII characters for tag or attribute
    names). A special "ascii" encoding name, used to save documents in a
    pure ASCII form, can be used when portability is really crucial</li>
</ol>
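
<p>The charRef fallback of the last step can be sketched as below. It is a
simplified, one-shot illustration, not the progressive buffer-based code
of libxml2, and it assumes its input is already valid UTF-8; the names
<code>next_code_point</code> and <code>utf8_to_latin1_charref</code> are
hypothetical.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Decode one code point and advance *p. Assumes valid UTF-8 input; a
 * truncated sequence simply stops at the terminating 0 byte. */
static unsigned long next_code_point(const unsigned char **p) {
    const unsigned char *s = *p;
    unsigned long cp;
    int extra;

    if (s[0] < 0x80)      { cp = s[0];        extra = 0; }
    else if (s[0] < 0xE0) { cp = s[0] & 0x1F; extra = 1; }
    else if (s[0] < 0xF0) { cp = s[0] & 0x0F; extra = 2; }
    else                  { cp = s[0] & 0x07; extra = 3; }
    s++;
    while (extra-- > 0 && *s != 0)
        cp = (cp << 6) | (*s++ & 0x3F);
    *p = s;
    return cp;
}

/*
 * Convert a 0-terminated UTF-8 string to ISO-8859-1, replacing every
 * character without an ISO-8859-1 equivalent by a decimal charRef.
 * Returns the number of bytes written (not counting the final 0 byte),
 * or -1 if the output buffer is too small.
 */
int utf8_to_latin1_charref(const unsigned char *in,
                           unsigned char *out, size_t outsize) {
    size_t used = 0;

    assert(in != NULL && out != NULL);
    if (outsize == 0)
        return -1;
    while (*in != 0) {
        unsigned long cp = next_code_point(&in);
        char piece[16];
        size_t n;

        if (cp < 0x100) {
            piece[0] = (char) (unsigned char) cp;  /* direct mapping */
            n = 1;
        } else {
            /* no equivalent: emit &#NNN; instead */
            n = (size_t) snprintf(piece, sizeof(piece), "&#%lu;", cp);
        }
        if (used + n + 1 > outsize)                /* keep room for the 0 byte */
            return -1;
        memcpy(out + used, piece, n);
        used += n;
    }
    out[used] = 0;
    return (int) used;
}
```

<p>With this sketch the UTF-8 bytes for "là" followed by a space and a
Chinese character come out as the single byte 0xE0 for "à" and a charRef
for the character that ISO-8859-1 cannot represent.</p>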

<p>Here are a few examples based on the same test document:</p>
<pre>~/XML -&gt; ./xmllint isolat1
&lt;?xml version="1.0" encoding="ISO-8859-1"?&gt;
&lt;très&gt;là&lt;/très&gt;
~/XML -&gt; ./xmllint --encode UTF-8 isolat1
&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;très&gt;là&lt;/très&gt;
~/XML -&gt; </pre>

<p>The same processing is applied (and reuses most of the code) for HTML
I18N processing. Looking up and modifying the content encoding is a bit
more difficult since it is located in a &lt;meta&gt; tag under the
&lt;head&gt;, so a couple of functions, htmlGetMetaEncoding() and
htmlSetMetaEncoding(), have been provided. The parser also attempts to
switch encoding on the fly when detecting such a tag on input. Except for
that the processing is the same (and again reuses the same code).</p>

3608<h3><a name="Default">Default supported encodings</a></h3>
3609
Daniel Veillardfabafd52006-06-08 08:16:33 +00003610<p>libxml2 has a set of default converters for the followingencodings(located
3611in encoding.c):</p>
Daniel Veillardb8cfbd12001-10-25 10:53:28 +00003612<ol>
3613 <li>UTF-8 is supported by default (null handlers)</li>
3614 <li>UTF-16, both little and big endian</li>
3615 <li>ISO-Latin-1 (ISO-8859-1) covering most western languages</li>
3616 <li>ASCII, useful mostly for saving</li>
Daniel Veillardfabafd52006-06-08 08:16:33 +00003617 <li>HTML, a specific handler for the conversion of UTF-8 to ASCII
3618 withHTMLpredefined entities like &amp;copy; for the Copyright sign.</li>
Daniel Veillardb8cfbd12001-10-25 10:53:28 +00003619</ol>
3620
Daniel Veillardfabafd52006-06-08 08:16:33 +00003621<p>More over when compiled on an Unix platform with iconv support the
3622fullsetof encodings supported by iconv can be instantly be used by libxml. On
3623alinuxmachine with glibc-2.1 the list of supported encodings and aliases
3624fill3 fullpages, and include UCS-4, the full set of ISO-Latin encodings, and
3625thevariousJapanese ones.</p>
Daniel Veillard67952602006-01-05 15:29:44 +00003626
Daniel Veillardfabafd52006-06-08 08:16:33 +00003627<p>To convert from the UTF-8 values returned from the API to
3628anotherencodingthen it is possible to use the function provided from <a
Daniel Veillard69839ba2006-06-06 13:27:03 +00003629href="html/libxml-encoding.html">the encoding module</a>like <a
Daniel Veillardfabafd52006-06-08 08:16:33 +00003630href="html/libxml-encoding.html#UTF8Toisolat1">UTF8Toisolat1</a>, or
3631usethePOSIX <a
3632href="http://www.opengroup.org/onlinepubs/009695399/functions/iconv.html">iconv()</a>APIdirectly.</p>

<h4>Encoding aliases</h4>

<p>From 2.2.3, libxml2 has support for registering encoding name aliases. The
goal is to be able to parse documents whose encoding is supported but where
the name differs (for example from the default set of names accepted by
iconv). The following functions allow registering and handling new aliases
for existing encodings. Once registered, libxml2 will automatically look up
the aliases when handling a document:</p>
<ul>
  <li>int xmlAddEncodingAlias(const char *name, const char *alias);</li>
  <li>int xmlDelEncodingAlias(const char *alias);</li>
  <li>const char * xmlGetEncodingAlias(const char *alias);</li>
  <li>void xmlCleanupEncodingAliases(void);</li>
</ul>

<h3><a name="extend">How to extend the existing support</a></h3>

<p>Adding support for a new encoding, or overriding one of the encoders
(assuming it is buggy), should not be hard: just write input and output
conversion routines to/from UTF-8, and register them using
xmlNewCharEncodingHandler(name, xxxToUTF8, UTF8Toxxx). They will be called
automatically if the parser(s) encounter such an encoding name (register it
in uppercase, this will help). The encoders, their arguments and expected
return values are described in the encoding.h header.</p>

<h2><a name="IO">I/O Interfaces</a></h2>

<p>Table of Contents:</p>
<ol>
  <li><a href="#General1">General overview</a></li>
  <li><a href="#basic">The basic buffer type</a></li>
  <li><a href="#Input">Input I/O handlers</a></li>
  <li><a href="#Output">Output I/O handlers</a></li>
  <li><a href="#entities">The entities loader</a></li>
  <li><a href="#Example2">Example of customized I/O</a></li>
</ol>

<h3><a name="General1">General overview</a></h3>

<p>The module <code><a
href="http://xmlsoft.org/html/libxml-xmlio.html">xmlIO.h</a></code> provides
the interfaces to the libxml2 I/O system. This consists of 4 main parts:</p>
<ul>
  <li>Entities loader: this is a routine which tries to fetch the entities
    (files) based on their PUBLIC and SYSTEM identifiers. The default loader
    doesn't look at the public identifier since libxml2 does not maintain a
    catalog. You can redefine your own entity loader by using
    <code>xmlGetExternalEntityLoader()</code> and
    <code>xmlSetExternalEntityLoader()</code>. <a href="#entities">Check the
    example</a>.</li>
  <li>Input I/O buffers, which are a convenience structure used by the
    parser(s) input layer to handle fetching the information to feed the
    parser. This provides buffering and is also a placeholder where the
    encoding converters to UTF-8 are piggy-backed.</li>
  <li>Output I/O buffers are similar to the Input ones and fulfill a similar
    task but when generating a serialization from a tree.</li>
  <li>A mechanism to register sets of I/O callbacks and associate them with
    specific naming schemes like the protocol part of the URIs.
    <p>This affects the default I/O operations and allows the use of specific
    I/O handlers for certain names.</p>
  </li>
</ul>

<p>The general mechanism used when loading http://rpmfind.net/xml.html, for
example in the HTML parser, is the following:</p>
<ol>
  <li>The default entity loader calls <code>xmlNewInputFromFile()</code> with
    the parsing context and the URI string.</li>
  <li>The URI string is checked against the existing registered handlers
    using their match() callback function. If the HTTP module was compiled
    in, it is registered and its match() function will succeed.</li>
  <li>The open() function of the handler is called and, if successful, will
    return an I/O Input buffer.</li>
  <li>The parser will then start reading from this buffer and progressively
    fetch information from the resource, calling the read() function of the
    handler until the resource is exhausted.</li>
  <li>If an encoding change is detected, it will be installed on the input
    buffer, providing buffering and efficient use of the conversion
    routines.</li>
  <li>Once the parser has finished, the close() function of the handler is
    called once and the Input buffer and associated resources are
    deallocated.</li>
</ol>

<p>The user defined callbacks are checked first to allow overriding of the
default libxml2 I/O routines.</p>

<h3><a name="basic">The basic buffer type</a></h3>

<p>All the buffer manipulation handling is done using the
<code>xmlBuffer</code> type defined in <code><a
href="http://xmlsoft.org/html/libxml-tree.html">tree.h</a></code>, which is a
resizable memory buffer. The buffer allocation strategy can be selected to be
either best-fit or exponential doubling (a CPU vs. memory use trade-off). The
values are <code>XML_BUFFER_ALLOC_EXACT</code> and
<code>XML_BUFFER_ALLOC_DOUBLEIT</code>, and can be set individually or on a
system-wide basis using <code>xmlBufferSetAllocationScheme()</code>. A number
of functions allow manipulating buffers, with names starting with the
<code>xmlBuffer...</code> prefix.</p>

<h3><a name="Input">Input I/O handlers</a></h3>

<p>An Input I/O handler is a simple structure
<code>xmlParserInputBuffer</code> containing a context associated to the
resource (file descriptor, or pointer to a protocol handler), the read() and
close() callbacks to use and an xmlBuffer. An extra xmlBuffer and a charset
encoding handler are also present to support charset conversion when
needed.</p>

<h3><a name="Output">Output I/O handlers</a></h3>

<p>An Output handler <code>xmlOutputBuffer</code> is completely similar to an
Input one except the callbacks are write() and close().</p>

<h3><a name="entities">The entities loader</a></h3>

<p>The entity loader resolves requests for new entities and creates inputs
for the parser. Creating an input from a filename or a URI string is done
through the xmlNewInputFromFile() routine. The default entity loader does not
handle the PUBLIC identifier associated with an entity (if any), so it just
calls xmlNewInputFromFile() with the SYSTEM identifier (which is mandatory in
XML).</p>

<p>If you want to hook up a catalog mechanism then you simply need to
override the default entity loader. Here is an example:</p>
<pre>#include &lt;libxml/xmlIO.h&gt;
#include &lt;libxml/parserInternals.h&gt;  /* declares xmlNewInputFromFile() */

xmlExternalEntityLoader defaultLoader = NULL;

xmlParserInputPtr
xmlMyExternalEntityLoader(const char *URL, const char *ID,
                          xmlParserCtxtPtr ctxt) {
    xmlParserInputPtr ret;
    const char *fileID = NULL;
    /* lookup for the fileID depending on ID */

    /* try our own mapping first, if the lookup found one */
    if (fileID != NULL) {
        ret = xmlNewInputFromFile(ctxt, fileID);
        if (ret != NULL)
            return(ret);
    }
    /* otherwise fall back to the loader that was installed before ours */
    if (defaultLoader != NULL)
        return(defaultLoader(URL, ID, ctxt));
    return(NULL);
}

int main(..) {
    ...

    /*
     * Install our own entity loader
     */
    defaultLoader = xmlGetExternalEntityLoader();
    xmlSetExternalEntityLoader(xmlMyExternalEntityLoader);

    ...
}</pre>

<h3><a name="Example2">Example of customized I/O</a></h3>

<p>This example comes from <a href="http://xmlsoft.org/messages/0708.html">a
real use case</a>: xmlDocDump() closes the FILE * passed by the application
and this was a problem. The <a
href="http://xmlsoft.org/messages/0711.html">solution</a> was to redefine a
new output handler with the closing call deactivated:</p>
<ol>
  <li>First define a new I/O output allocator where the output doesn't close
    the file:
    <pre>xmlOutputBufferPtr
xmlOutputBufferCreateOwn(FILE *file, xmlCharEncodingHandlerPtr encoder) {
    xmlOutputBufferPtr ret;

    if (xmlOutputCallbackInitialized == 0)
        xmlRegisterDefaultOutputCallbacks();

    if (file == NULL) return(NULL);
    ret = xmlAllocOutputBuffer(encoder);
    if (ret != NULL) {
        ret-&gt;context = file;
        ret-&gt;writecallback = xmlFileWrite;
        ret-&gt;closecallback = NULL;  /* No close callback */
    }
    return(ret);
}</pre>
  </li>
  <li>And then use it to save the document:
    <pre>FILE *f;
xmlOutputBufferPtr output;
xmlDocPtr doc;
int res;

f = ...
doc = ....

output = xmlOutputBufferCreateOwn(f, NULL);
res = xmlSaveFileTo(output, doc, NULL);
    </pre>
  </li>
</ol>

<h2><a name="Catalog">Catalog support</a></h2>

<p>Table of Contents:</p>
<ol>
  <li><a href="#General2">General overview</a></li>
  <li><a href="#definition">The definition</a></li>
  <li><a href="#Simple">Using catalogs</a></li>
  <li><a href="#Some">Some examples</a></li>
  <li><a href="#reference">How to tune catalog usage</a></li>
  <li><a href="#validate">How to debug catalog processing</a></li>
  <li><a href="#Declaring">How to create and maintain catalogs</a></li>
  <li><a href="#implemento">The implementor corner quick review of the
    API</a></li>
  <li><a href="#Other">Other resources</a></li>
</ol>

<h3><a name="General2">General overview</a></h3>

<p>What is a catalog? Basically it's a lookup mechanism used when an entity
(a file or a remote resource) references another entity. The catalog lookup
is inserted between the moment the reference is recognized by the software
(XML parser, stylesheet processing, or even images referenced for inclusion
in a rendering) and the time when loading that resource actually starts.</p>

<p>It is basically used for 3 things:</p>
<ul>
  <li>mapping from "logical" names, the public identifiers, to a more
    concrete name usable for download (a URI). For example it can associate
    the logical name
    <p>"-//OASIS//DTD DocBook XML V4.1.2//EN"</p>
    <p>of the DocBook 4.1.2 XML DTD with the actual URL where it can be
    downloaded</p>
    <p>http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd</p>
  </li>
  <li>remapping from a given URL to another one, like an HTTP indirection
    saying that
    <p>"http://www.oasis-open.org/committes/tr.xsl"</p>
    <p>should really be looked up at</p>
    <p>"http://www.oasis-open.org/committes/entity/stylesheets/base/tr.xsl"</p>
  </li>
  <li>providing a local cache mechanism allowing the entities associated
    with public identifiers or remote resources to be loaded locally. This is
    a really important feature for any significant deployment of XML or SGML
    since it avoids the hazards and delays associated with fetching remote
    resources.</li>
</ul>

<h3><a name="definition">The definitions</a></h3>

<p>Libxml, as of 2.4.3, implements 2 kinds of catalogs:</p>
<ul>
  <li>the older SGML catalogs, the official spec is SGML Open Technical
    Resolution TR9401:1997, but is better understood by reading <a
    href="http://www.jclark.com/sp/catalog.htm">the SP Catalog page</a> from
    James Clark. This is relatively old and not the preferred mode of
    operation of libxml.</li>
  <li><a href="http://www.oasis-open.org/committees/entity/spec.html">XML
    Catalogs</a> is far more flexible, more recent, uses an XML syntax and
    should scale much better. This is the default option of libxml.</li>
</ul>

<p></p>

<h3><a name="Simple">Using catalog</a></h3>

<p>In a normal environment libxml2 will by default check for the presence of
a catalog in /etc/xml/catalog, and assuming it has been correctly populated,
the processing is completely transparent to the document user. To take a
concrete example, suppose you are authoring a DocBook document which starts
with the following DOCTYPE definition:</p>
<pre>&lt;?xml version='1.0'?&gt;
&lt;!DOCTYPE book PUBLIC "-//Norman Walsh//DTD DocBk XML V3.1.4//EN"
          "http://nwalsh.com/docbook/xml/3.1.4/db3xml.dtd"&gt;</pre>

<p>When validating the document with libxml, the catalog will be
automatically consulted to look up the public identifier "-//Norman
Walsh//DTD DocBk XML V3.1.4//EN" and the system identifier
"http://nwalsh.com/docbook/xml/3.1.4/db3xml.dtd", and if these entities have
been installed on your system and the catalogs actually point to them, libxml
will fetch them from the local disk.</p>

<p style="font-size: 10pt"><strong>Note</strong>: Really don't use this
DOCTYPE example, it's a really old version, but is fine as an example.</p>

<p>Libxml2 will check the catalog each time that it is requested to load an
entity; this includes DTDs, external parsed entities, stylesheets, etc. If
your system is correctly configured, all the authoring and processing phases
should use only local files, while your document stays portable because it
uses the canonical public and system IDs, referencing the remote
document.</p>

<h3><a name="Some">Some examples:</a></h3>

<p>Here are a couple of fragments from XML Catalogs used in libxml2 early
regression tests in <code>test/catalogs</code>:</p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE catalog PUBLIC
   "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
   "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"&gt;
  &lt;public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"
   uri="http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"/&gt;
...</pre>

<p>This is the beginning of a catalog for DocBook 4.1.2. XML Catalogs are
written in XML, and there is a specific namespace for catalog elements,
"urn:oasis:names:tc:entity:xmlns:xml:catalog". The first entry in this
catalog is a <code>public</code> mapping, which associates a Public
Identifier with a URI.</p>
<pre>...
    &lt;rewriteSystem systemIdStartString="http://www.oasis-open.org/docbook/"
                   rewritePrefix="file:///usr/share/xml/docbook/"/&gt;
...</pre>

<p>A <code>rewriteSystem</code> is a very powerful instruction: it says that
any URI starting with a given prefix should be looked up at another URI
constructed by replacing the prefix with a new one. In effect this acts like
a cache system for a full area of the Web. In practice it is extremely useful
with a file prefix if you have installed a copy of those resources on your
local system.</p>
<pre>...
&lt;delegatePublic publicIdStartString="-//OASIS//DTD XML Catalog //"
                catalog="file:///usr/share/xml/docbook.xml"/&gt;
&lt;delegatePublic publicIdStartString="-//OASIS//ENTITIES DocBook XML"
                catalog="file:///usr/share/xml/docbook.xml"/&gt;
&lt;delegatePublic publicIdStartString="-//OASIS//DTD DocBook XML"
                catalog="file:///usr/share/xml/docbook.xml"/&gt;
&lt;delegateSystem systemIdStartString="http://www.oasis-open.org/docbook/"
                catalog="file:///usr/share/xml/docbook.xml"/&gt;
&lt;delegateURI uriStartString="http://www.oasis-open.org/docbook/"
                catalog="file:///usr/share/xml/docbook.xml"/&gt;
...</pre>

<p>Delegation is the core feature which allows building a tree of catalogs,
easier to maintain than a single catalog: based on Public Identifier, System
Identifier or URI prefixes, it instructs the catalog software to look up
entries in another resource. This feature allows building hierarchies of
catalogs. The set of entries presented should be sufficient to redirect the
resolution of all DocBook references to the specific catalog in
<code>/usr/share/xml/docbook.xml</code>; this one in turn could delegate all
references for DocBook 4.2.1 to a specific catalog installed at the same time
as the DocBook resources on the local machine.</p>

<h3><a name="reference">How to tune catalog usage:</a></h3>

<p>The user can change the default catalog behaviour by redirecting queries
to their own set of catalogs. This can be done by setting the
<code>XML_CATALOG_FILES</code> environment variable to a list of catalogs; an
empty one should deactivate loading the default
<code>/etc/xml/catalog</code> catalog.</p>

<h3><a name="validate">How to debug catalog processing:</a></h3>

<p>Setting up the <code>XML_DEBUG_CATALOG</code> environment variable will
make libxml2 output debugging information for each catalog operation, for
example:</p>
<pre>orchis:~/XML -&gt; xmllint --memory --noout test/ent2
warning: failed to load external entity "title.xml"
orchis:~/XML -&gt; export XML_DEBUG_CATALOG=
orchis:~/XML -&gt; xmllint --memory --noout test/ent2
Failed to parse catalog /etc/xml/catalog
Failed to parse catalog /etc/xml/catalog
warning: failed to load external entity "title.xml"
Catalogs cleanup
orchis:~/XML -&gt; </pre>

<p>The test/ent2 file references an entity; running the parser from memory
makes the base URI unavailable and the "title.xml" entity cannot be loaded.
Setting up the debug environment variable makes it possible to detect that an
attempt is made to load <code>/etc/xml/catalog</code>, but since it's not
present the resolution fails.</p>

<p>But the most advanced way to debug XML catalog processing is to use the
<strong>xmlcatalog</strong> command shipped with libxml2. It allows loading
catalogs and making resolution queries to see what is going on. This is also
used for the regression tests:</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog test/catalogs/docbook.xml \
                   "-//OASIS//DTD DocBook XML V4.1.2//EN"
http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd
orchis:~/XML -&gt; </pre>

<p>For debugging what is going on, adding one -v flag increases the verbosity
level to indicate the processing done (adding a second flag also indicates
what elements are recognized at parsing):</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog -v test/catalogs/docbook.xml \
                   "-//OASIS//DTD DocBook XML V4.1.2//EN"
Parsing catalog test/catalogs/docbook.xml's content
Found public match -//OASIS//DTD DocBook XML V4.1.2//EN
http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd
Catalogs cleanup
orchis:~/XML -&gt; </pre>

<p>A shell interface is also available to debug and process multiple queries
(and for regression tests):</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog -shell test/catalogs/docbook.xml \
                   "-//OASIS//DTD DocBook XML V4.1.2//EN"
&gt; help
Commands available:
public PublicID: make a PUBLIC identifier lookup
system SystemID: make a SYSTEM identifier lookup
resolve PublicID SystemID: do a full resolver lookup
add 'type' 'orig' 'replace' : add an entry
del 'values' : remove values
dump: print the current catalog state
debug: increase the verbosity level
quiet: decrease the verbosity level
exit:  quit the shell
&gt; public "-//OASIS//DTD DocBook XML V4.1.2//EN"
http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd
&gt; quit
orchis:~/XML -&gt; </pre>

<p>This should be sufficient for most debugging purposes; this was actually
used heavily to debug the XML Catalog implementation itself.</p>

<h3><a name="Declaring">How to create and maintain</a> catalogs:</h3>

<p>Basically XML Catalogs are XML files; you can either use XML tools to
manage them or use <strong>xmlcatalog</strong> for this. The basic step is to
create a catalog; the -create option provides this facility:</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog --create tst.xml
&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
         "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"/&gt;
orchis:~/XML -&gt; </pre>

<p>By default xmlcatalog does not overwrite the original catalog and saves
the result on the standard output; this can be overridden using the -noout
option. The <code>-add</code> command allows adding entries to the
catalog:</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog --noout --create --add "public" \
  "-//OASIS//DTD DocBook XML V4.1.2//EN" \
  http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd tst.xml
orchis:~/XML -&gt; cat tst.xml
&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
  "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"&gt;
&lt;public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"
        uri="http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"/&gt;
&lt;/catalog&gt;
orchis:~/XML -&gt; </pre>

<p>The <code>-add</code> option will always take 3 parameters even if some of
the XML Catalog constructs (like nextCatalog) have only a single argument;
just pass a third empty string, it will be ignored.</p>

<p>Similarly the <code>-del</code> option removes matching entries from the
catalog:</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog --del \
  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" tst.xml
&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
    "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"/&gt;
orchis:~/XML -&gt; </pre>

<p>The catalog is now empty. Note that the matching of <code>-del</code> is
exact and would have worked in a similar fashion with the Public ID
string.</p>

<p>This is rudimentary but should be sufficient to manage a not too complex
catalog tree of resources.</p>

<h3><a name="implemento">The implementor corner quick review of the
API:</a></h3>

<p>First, and like for every other module of libxml, there is an
automatically generated <a href="html/libxml-catalog.html">API page for
catalog support</a>.</p>

<p>The header for the catalog interfaces should be included as:</p>
<pre>#include &lt;libxml/catalog.h&gt;</pre>

<p>The API is deliberately kept very simple. First, it is not obvious that
applications really need access to it since it is the default behaviour of
libxml2 (Note: it is possible to completely override the libxml2 default
catalog by using <a
href="html/libxml-parser.html">xmlSetExternalEntityLoader</a> to plug in an
application specific resolver).</p>

<p>Basically libxml2 supports 2 catalog lists:</p>
<ul>
  <li>the default one, global and shared by the whole application</li>
  <li>a per-document catalog, this one is built if the document uses the
    <code>oasis-xml-catalog</code> PIs to specify its own catalog list. It is
    associated with the parser context and destroyed when the parsing context
    is destroyed.</li>
</ul>

<p>The document one will be used first if it exists.</p>

<h4>Initialization routines:</h4>

<p>xmlInitializeCatalog(), xmlLoadCatalog() and xmlLoadCatalogs() should be
used at startup to initialize the catalog. If the catalog should be
initialized with specific values, xmlLoadCatalog() or xmlLoadCatalogs()
should be called before xmlInitializeCatalog(), which would otherwise do a
default initialization first.</p>

<p>The xmlCatalogAddLocal() call is used by the parser to grow the document's
own catalog list if needed.</p>

<h4>Preferences setup:</h4>

<p>The XML Catalog spec requires the possibility to select default
preferences between public and system delegation;
xmlCatalogSetDefaultPrefer() allows this. xmlCatalogSetDefaults() and
xmlCatalogGetDefaults() control whether XML Catalogs resolution should be
forbidden, allowed for the global catalog, for the document catalog or both;
the default is to allow both.</p>

<p>And of course xmlCatalogSetDebug() allows generating debug messages
(through the xmlGenericError() mechanism).</p>

<h4>Querying routines:</h4>

<p>xmlCatalogResolve(), xmlCatalogResolveSystem(), xmlCatalogResolvePublic()
and xmlCatalogResolveURI() are relatively explicit if you read the XML
Catalog specification: they correspond to the section 7 algorithms. They
should also work if you have loaded an SGML catalog, with simplified
semantics.</p>

<p>xmlCatalogLocalResolve() and xmlCatalogLocalResolveURI() are the same but
operate on the document catalog list.</p>

<h4>Cleanup and Miscellaneous:</h4>

<p>xmlCatalogCleanup() frees up the global catalog; xmlCatalogFreeLocal() is
the per-document equivalent.</p>

<p>xmlCatalogAdd() and xmlCatalogRemove() are used to dynamically modify the
first catalog in the global list, and xmlCatalogDump() allows dumping a
catalog state. Those routines are primarily designed for xmlcatalog; I'm not
sure that exposing more complex interfaces (like navigation ones) would be
really useful.</p>

<p>xmlParseCatalogFile() is a function used to load XML Catalog files. It's
similar to xmlParseFile() except that it bypasses all catalog lookups; it's
provided because this functionality may be useful for client tools.</p>

<h4>threaded environments:</h4>

<p>Since the catalog tree is built progressively, some care has been taken to
try to avoid troubles in multithreaded environments. The code is now thread
safe, assuming that the libxml2 library has been compiled with threads
support.</p>

<p></p>
4185
<h3><a name="Other">Other resources</a></h3>

<p>The XML Catalog specification is relatively recent so there isn't much
literature to point at:</p>
<ul>
  <li>You can find a good rant from Norm Walsh about <a
    href="http://www.arbortext.com/Think_Tank/XML_Resources/Issue_Three/issue_three.html">the
    need for catalogs</a>; it provides a lot of contextual information even if
    I don't agree with everything presented. Norm also wrote a more recent
    article, <a
    href="http://wwws.sun.com/software/xml/developers/resolver/article/">XML
    entities and URI resolvers</a>, describing them.</li>
  <li>An <a href="http://home.ccil.org/~cowan/XML/XCatalog.html">old XML
    catalog proposal</a> from John Cowan</li>
  <li>The <a href="http://www.rddl.org/">Resource Directory Description
    Language</a> (RDDL), another catalog system but more oriented toward
    providing metadata for XML namespaces.</li>
  <li>The page from the OASIS Technical <a
    href="http://www.oasis-open.org/committees/entity/">Committee on Entity
    Resolution</a>, which maintains XML Catalog; you will find pointers to the
    specification update, some background, and pointers to other tools
    providing XML Catalog support</li>
  <li>There is a <a href="buildDocBookCatalog">shell script</a> to generate
    XML Catalogs for DocBook 4.1.2. If it can write to the /etc/xml/
    directory, it will set up /etc/xml/catalog and /etc/xml/docbook based on
    the resources found on the system. Otherwise it will just create
    ~/xmlcatalog and ~/dbkxmlcatalog, and doing:
    <p><code>export XML_CATALOG_FILES=$HOME/xmlcatalog</code></p>
    <p>should allow processing DocBook documents without requiring network
    access for the DTD or stylesheets</p>
  </li>
  <li>I have uploaded <a
    href="ftp://xmlsoft.org/libxml2/test/dbk412catalog.tar.gz">a small
    tarball</a> containing XML Catalogs for DocBook 4.1.2, which seems to work
    fine for me too</li>
  <li>The <a href="http://www.xmlsoft.org/xmlcatalog_man.html">xmlcatalog
    manual page</a></li>
</ul>

<p>If you have suggestions for corrections or additions, simply contact me:</p>

<h2><a name="library">The parser interfaces</a></h2>

<p>This section is intended to help programmers get bootstrapped using the
XML toolkit from the C language. It is not intended to be extensive. I hope
the automatically generated documents will provide the completeness
required, but as a separate set of documents. The interfaces of the XML
parser are by principle low level. Those interested in a higher-level API
should <a href="#DOM">look at DOM</a>.</p>

<p>The <a href="html/libxml-parser.html">parser interfaces for XML</a> are
separated from the <a href="html/libxml-htmlparser.html">HTML parser
interfaces</a>. Let's have a look at how the XML parser can be called:</p>

<h3><a name="Invoking">Invoking the parser: the pull method</a></h3>

<p>Usually, the first thing to do is to read an XML input. The parser accepts
documents either from in-memory strings or from files. The functions are
defined in "parser.h":</p>
<dl>
  <dt><code>xmlDocPtr xmlParseMemory(const char *buffer, int size);</code></dt>
    <dd><p>Parse an in-memory buffer of <code>size</code> bytes containing
      the document.</p>
    </dd>
</dl>
<dl>
  <dt><code>xmlDocPtr xmlParseFile(const char *filename);</code></dt>
    <dd><p>Parse an XML document contained in a (possibly compressed)
      file.</p>
    </dd>
</dl>

<p>The parser returns a pointer to the document structure (or NULL in case of
failure).</p>

<h3 id="Invoking1">Invoking the parser: the push method</h3>

<p>In order for the application to keep control while the document is being
fetched (which is common for GUI-based programs), libxml2 also provides a
push interface, as of version 1.8.3. Here are the interface functions:</p>
<pre>xmlParserCtxtPtr xmlCreatePushParserCtxt(xmlSAXHandlerPtr sax,
                                         void *user_data,
                                         const char *chunk,
                                         int size,
                                         const char *filename);
int              xmlParseChunk          (xmlParserCtxtPtr ctxt,
                                         const char *chunk,
                                         int size,
                                         int terminate);</pre>

<p>and here is a simple example showing how to use the interface:</p>
<pre>    FILE *f;
    xmlDocPtr doc = NULL;

    f = fopen(filename, "rb");
    if (f != NULL) {
        int res, size = 1024;
        char chars[1024];
        xmlParserCtxtPtr ctxt;

        /* the first 4 bytes let the parser detect the encoding */
        res = fread(chars, 1, 4, f);
        if (res &gt; 0) {
            ctxt = xmlCreatePushParserCtxt(NULL, NULL,
                        chars, res, filename);
            if (ctxt != NULL) {
                while ((res = fread(chars, 1, size, f)) &gt; 0) {
                    xmlParseChunk(ctxt, chars, res, 0);
                }
                /* an empty chunk with terminate set ends the parse */
                xmlParseChunk(ctxt, chars, 0, 1);
                doc = ctxt-&gt;myDoc;
                xmlFreeParserCtxt(ctxt);
            }
        }
        fclose(f);
    }</pre>

<p>The HTML parser embedded into libxml2 also has a push interface; the
functions are just prefixed by "html" rather than "xml".</p>

<h3 id="Invoking2">Invoking the parser: the SAX interface</h3>

<p>The tree-building interface makes the parser memory-hungry, first loading
the document in memory and then building the tree itself. Reading a document
without building the tree is possible using the SAX interfaces (see SAX.h and
<a href="http://www.daa.com.au/~james/gnome/xml-sax/xml-sax.html">James
Henstridge's documentation</a>). Note also that the push interface can be
limited to SAX: just use the first two arguments of
<code>xmlCreatePushParserCtxt()</code>.</p>

<h3><a name="Building">Building a tree from scratch</a></h3>

<p>The other way to get an XML tree in memory is by building it. Basically
there is a set of functions dedicated to building new elements. (These are
also described in &lt;libxml/tree.h&gt;.) For example, here is a piece of
code that produces the XML document used in the previous examples:</p>
<pre>    #include &lt;libxml/tree.h&gt;
    xmlDocPtr doc;
    xmlNodePtr root, tree, subtree;

    /* libxml2 strings are xmlChar *; BAD_CAST casts C string literals */
    doc = xmlNewDoc(BAD_CAST "1.0");
    root = xmlNewDocNode(doc, NULL, BAD_CAST "EXAMPLE", NULL);
    /* links the node into the document so that doc-&gt;children is set */
    xmlDocSetRootElement(doc, root);
    xmlSetProp(root, BAD_CAST "prop1", BAD_CAST "gnome is great");
    xmlSetProp(root, BAD_CAST "prop2", BAD_CAST "&amp; linux too");
    tree = xmlNewChild(root, NULL, BAD_CAST "head", NULL);
    subtree = xmlNewChild(tree, NULL, BAD_CAST "title",
                          BAD_CAST "Welcome to Gnome");
    tree = xmlNewChild(root, NULL, BAD_CAST "chapter", NULL);
    subtree = xmlNewChild(tree, NULL, BAD_CAST "title",
                          BAD_CAST "The Linux adventure");
    subtree = xmlNewChild(tree, NULL, BAD_CAST "p",
                          BAD_CAST "bla bla bla ...");
    subtree = xmlNewChild(tree, NULL, BAD_CAST "image", NULL);
    xmlSetProp(subtree, BAD_CAST "href", BAD_CAST "linus.gif");</pre>

<p>Not really rocket science ...</p>

<h3><a name="Traversing">Traversing the tree</a></h3>

<p>Basically, by <a href="html/libxml-tree.html">including "tree.h"</a> your
code has access to the internal structure of all the elements of the tree.
The names should be somewhat simple, like <strong>parent</strong>,
<strong>children</strong>, <strong>next</strong>, <strong>prev</strong>,
<strong>properties</strong>, etc. For example, still with the previous
example:</p>
<pre><code>doc-&gt;children-&gt;children-&gt;children</code></pre>

<p>points to the title element,</p>
<pre>doc-&gt;children-&gt;children-&gt;next-&gt;children-&gt;children</pre>

<p>points to the text node containing the chapter title "The Linux
adventure".</p>

<p><strong>NOTE</strong>: XML allows <em>PI</em>s and <em>comments</em> to be
present before the document root, so <code>doc-&gt;children</code> may point
to a node which is not the document Root Element; the function
<code>xmlDocGetRootElement()</code> was added for this purpose.</p>

<h3><a name="Modifying">Modifying the tree</a></h3>

<p>Functions are provided for reading and writing the document content. Here
is an excerpt from the <a href="html/libxml-tree.html">tree API</a>:</p>
<dl>
  <dt><code>xmlAttrPtr xmlSetProp(xmlNodePtr node, const xmlChar *name,
    const xmlChar *value);</code></dt>
    <dd><p>This sets (or changes) an attribute carried by an ELEMENT node.
      The value can be NULL.</p>
    </dd>
</dl>
<dl>
  <dt><code>xmlChar *xmlGetProp(xmlNodePtr node,
    const xmlChar *name);</code></dt>
    <dd><p>This function returns a pointer to a new copy of the property
      content. Note that the user must deallocate the result with
      <code>xmlFree()</code>.</p>
    </dd>
</dl>

<p>Two functions are provided for reading and writing the text associated
with elements:</p>
<dl>
  <dt><code>xmlNodePtr xmlStringGetNodeList(xmlDocPtr doc,
    const xmlChar *value);</code></dt>
    <dd><p>This function takes an "external" string and converts it to one
      text node or possibly to a list of entity and text nodes. All
      non-predefined entity references like &amp;Gnome; will be stored
      internally as entity nodes, hence the result of the function may not be
      a single node.</p>
    </dd>
</dl>
<dl>
  <dt><code>xmlChar *xmlNodeListGetString(xmlDocPtr doc, xmlNodePtr list,
    int inLine);</code></dt>
    <dd><p>This function is the inverse of
      <code>xmlStringGetNodeList()</code>. It generates a new string
      containing the content of the text and entity nodes. Note the extra
      argument inLine. If this argument is set to 1, the function will expand
      entity references. For example, instead of returning the &amp;Gnome;
      XML encoding in the string, it will substitute it with its value (say,
      "GNU Network Object Model Environment").</p>
    </dd>
</dl>

<h3><a name="Saving">Saving a tree</a></h3>

<p>Basically 3 options are possible:</p>
<dl>
  <dt><code>void xmlDocDumpMemory(xmlDocPtr cur, xmlChar **mem,
    int *size);</code></dt>
    <dd><p>Returns a buffer into which the document has been saved.</p>
    </dd>
</dl>
<dl>
  <dt><code>extern void xmlDocDump(FILE *f, xmlDocPtr doc);</code></dt>
    <dd><p>Dumps a document to an open <code>FILE</code> stream.</p>
    </dd>
</dl>
<dl>
  <dt><code>int xmlSaveFile(const char *filename, xmlDocPtr cur);</code></dt>
    <dd><p>Saves the document to a file. In this case, the compression
      interface is triggered if it has been turned on.</p>
    </dd>
</dl>

<h3><a name="Compressio">Compression</a></h3>

<p>The library transparently handles compression when doing file-based
accesses. The level of compression on saves can be set either globally or
individually for one file:</p>
<dl>
  <dt><code>int xmlGetDocCompressMode (xmlDocPtr doc);</code></dt>
    <dd><p>Gets the document compression level (0-9).</p>
    </dd>
</dl>
<dl>
  <dt><code>void xmlSetDocCompressMode (xmlDocPtr doc, int mode);</code></dt>
    <dd><p>Sets the document compression level.</p>
    </dd>
</dl>
<dl>
  <dt><code>int xmlGetCompressMode(void);</code></dt>
    <dd><p>Gets the default compression level.</p>
    </dd>
</dl>
<dl>
  <dt><code>void xmlSetCompressMode(int mode);</code></dt>
    <dd><p>Sets the default compression level.</p>
    </dd>
</dl>

<h2><a name="Entities">Entities or no entities</a></h2>

<p>Entities in principle are similar to simple C macros. An entity defines an
abbreviation for a given string that you can reuse many times throughout the
content of your document. Entities are especially useful when a given string
may occur frequently within a document, or to confine the change needed to a
document to a restricted area in the internal subset of the document (at the
beginning). Example:</p>
<pre>1 &lt;?xml version="1.0"?&gt;
2 &lt;!DOCTYPE EXAMPLE SYSTEM "example.dtd" [
3 &lt;!ENTITY xml "Extensible Markup Language"&gt;
4 ]&gt;
5 &lt;EXAMPLE&gt;
6    &amp;xml;
7 &lt;/EXAMPLE&gt;</pre>

<p>Line 3 declares the xml entity. Line 6 uses the xml entity, by prefixing
its name with '&amp;' and following it by ';' without any spaces added. There
are 5 predefined entities in libxml2 allowing you to escape characters with
predefined meaning in some parts of the xml document content:
<strong>&amp;lt;</strong> for the character '&lt;', <strong>&amp;gt;</strong>
for the character '&gt;', <strong>&amp;apos;</strong> for the character ''',
<strong>&amp;quot;</strong> for the character '"', and
<strong>&amp;amp;</strong> for the character '&amp;'.</p>

<p>One of the problems related to entities is that you may want the parser to
substitute an entity's content so that you can see the replacement text in
your application. Or you may prefer to keep entity references as such in the
content to be able to save the document back without losing this usually
precious information (if the user went through the pain of explicitly
defining entities, he may have a rather negative attitude if you blindly
substitute them at saving time). The <a
href="html/libxml-parser.html#xmlSubstituteEntitiesDefault">xmlSubstituteEntitiesDefault()</a>
function allows you to check and change the behaviour, which is to not
substitute entities by default.</p>

<p>Here is the DOM tree built by libxml2 for the previous document in the
default case:</p>
<pre>/gnome/src/gnome-xml -&gt; ./xmllint --debug test/ent1
DOCUMENT
version=1.0
   ELEMENT EXAMPLE
     TEXT
     content=
     ENTITY_REF
       INTERNAL_GENERAL_ENTITY xml
       content=Extensible Markup Language
     TEXT
     content=</pre>

<p>And here is the result when substituting entities:</p>
<pre>/gnome/src/gnome-xml -&gt; ./tester --debug --noent test/ent1
DOCUMENT
version=1.0
   ELEMENT EXAMPLE
     TEXT
     content=     Extensible Markup Language</pre>

<p>So, entities or no entities? Basically, it depends on your use case. I
suggest that you keep the non-substituting default behaviour and avoid using
entities in your XML document or data if you are not willing to handle the
entity reference elements in the DOM tree.</p>

<p>Note that at save time libxml2 enforces the conversion of the predefined
entities where necessary to prevent well-formedness problems, and will also
transparently replace those with chars (i.e. it will not generate entity
reference elements in the DOM tree or call the reference() SAX callback when
finding them in the input).</p>

<p><span style="background-color: #FF0000">WARNING</span>: handling entities
on top of the libxml2 SAX interface is difficult!!! If you plan to use
non-predefined entities in your documents, then the learning curve to handle
them using the SAX API may be long. If you plan to use complex documents, I
strongly suggest you consider using the DOM interface instead and let libxml
deal with the complexity rather than trying to do it yourself.</p>

<h2><a name="Namespaces">Namespaces</a></h2>

<p>The libxml2 library implements <a
href="http://www.w3.org/TR/REC-xml-names/">XML namespaces</a> support by
recognizing namespace constructs in the input, and does namespace lookup
automatically when building the DOM tree. A namespace declaration is
associated with an in-memory structure and all elements or attributes within
that namespace point to it. Hence testing the namespace is a simple and fast
equality operation at the user level.</p>

<p>I suggest that people using libxml2 use a namespace, and declare it in the
root element of their document as the default namespace. Then they don't need
to use the prefix in the content but we will have a basis for future semantic
refinement and merging of data from different sources. This doesn't increase
the size of the XML output significantly, but significantly increases its
value in the long-term. Example:</p>
<pre>&lt;mydoc xmlns="http://mydoc.example.org/schemas/"&gt;
   &lt;elem1&gt;...&lt;/elem1&gt;
   &lt;elem2&gt;...&lt;/elem2&gt;
&lt;/mydoc&gt;</pre>

<p>The namespace value has to be an absolute URL, but the URL doesn't have to
point to any existing resource on the Web. It will bind all the elements and
attributes with that URL. I suggest using a URL within a domain you control,
and that the URL should contain some kind of version information if possible.
For example, <code>"http://www.gnome.org/gnumeric/1.0/"</code> is a good
namespace scheme.</p>

<p>Then when you load a file, make sure that a namespace carrying the
version-independent prefix is installed on the root element of your document,
and if the version information doesn't match something you know, warn the
user and be liberal in what you accept as the input. Also do *not* try to
base namespace checking on the prefix value. &lt;foo:text&gt; may be exactly
the same as &lt;bar:text&gt; in another document. What really matters is the
URI associated with the element or the attribute, not the prefix string
(which is just a shortcut for the full URI). In libxml, elements and
attributes have an <code>ns</code> field pointing to an xmlNs structure
detailing the namespace prefix and its URI.</p>

<p>@@Interfaces@@</p>
<pre>xmlNodePtr node;
/* compare the whole name and the namespace URI, never the prefix */
if (xmlStrEqual(node-&gt;name, BAD_CAST "mytag")
    &amp;&amp; node-&gt;ns
    &amp;&amp; xmlStrEqual(node-&gt;ns-&gt;href,
                   BAD_CAST "http://www.mysite.com/myns/1.0")) {
    ...
}</pre>

<p>Usually people object to using namespaces together with validity checking.
I will try to make sure that using namespaces won't break validity checking,
so even if you plan to use or currently are using validation I strongly
suggest adding namespaces to your document. A default namespace scheme
<code>xmlns="http://...."</code> should not break validity even on less
flexible parsers. Using namespaces to mix and differentiate content coming
from multiple DTDs will certainly break current validation schemes. To check
such documents one needs to use schema validation, which is supported in
libxml2 as well. See <a href="http://www.relaxng.org/">relax-ng</a> and <a
href="http://www.w3c.org/XML/Schema">w3c-schema</a>.</p>

<h2><a name="Upgrading">Upgrading 1.x code</a></h2>

<p>Incompatible changes:</p>

<p>Version 2 of libxml2 is the first version introducing serious backward
incompatible changes. The main goals were:</p>
<ul>
  <li>A general cleanup. A number of mistakes inherited from the very early
    versions couldn't be changed due to compatibility constraints. Example:
    the "childs" element in the nodes.</li>
  <li>Uniformization of the various nodes, at least for their header and link
    parts (doc, parent, children, prev, next); the goal is a simpler
    programming model and simplifying the task of the DOM implementors.</li>
  <li>Better conformance to the XML specification. For example, version 1.x
    had a heuristic to try to detect ignorable white spaces. As a result the
    SAX events generated were ignorableWhitespace() while the spec requires
    character() in that case. This also means that a number of DOM nodes
    containing blank text may populate the DOM tree which were not present
    before.</li>
</ul>

<h3>How to fix libxml-1.x code:</h3>

<p>So client code of libxml designed to run with version 1.x may have to be
changed to compile against version 2.x of libxml. Here is a list of changes
that I have collected; they may not be sufficient, so in case you find other
changes which are required, <a href="mailto:Daniel.Veillard@w3.org">drop me a
mail</a>:</p>
<ol>
  <li>The package name has changed from libxml to libxml2; the library name
    is now -lxml2. There is a new xml2-config script which should be used to
    select the right parameters for libxml2.</li>
  <li>The node <strong>childs</strong> field has been renamed
    <strong>children</strong>, so s/childs/children/g should be applied (the
    probability of having "childs" anywhere else is close to 0).</li>
  <li>The document no longer has a <strong>root</strong> element; it has been
    replaced by <strong>children</strong> and usually you will get a list of
    elements here. For example a Dtd element for the internal subset and its
    declaration may be found in that list, as well as processing instructions
    or comments found before or after the document root element. Use
    <strong>xmlDocGetRootElement(doc)</strong> to get the root element of a
    document. Alternatively, if you are sure to not reference DTDs nor have
    PIs or comments before or after the root element,
    s/-&gt;root/-&gt;children/g will probably do it.</li>
  <li>The white space issue. This one is more complex: except in the special
    case of validating parsing, the line breaks and spaces usually used for
    indenting and formatting the document content become significant. So
    they are reported by SAX and, if you are using the DOM tree,
    corresponding nodes are generated. Two approaches can be taken:
    <ol>
      <li>The lazy one: use the compatibility call
        <strong>xmlKeepBlanksDefault(0)</strong>, but be aware that you are
        relying on a special (and possibly broken) set of heuristics of
        libxml to detect ignorable blanks. Don't complain if it breaks or
        makes your application not 100% clean w.r.t. its input.</li>
      <li>The Right Way: change your code to accept possibly insignificant
        blank characters, or have your tree populated with weird blank text
        nodes. You can spot them using the convenience function
        <strong>xmlIsBlankNode(node)</strong>, which returns 1 for such blank
        nodes.</li>
    </ol>
    <p>Note also that with the new default the output functions don't add any
    extra indentation when saving a tree, in order to be able to round-trip
    (read and save) without inflating the document with extra formatting
    chars.</p>
  </li>
  <li>The include path has changed to $prefix/libxml/ and the includes
    themselves use this new prefix in their include instructions. If you are
    using (as expected) the
    <pre>xml2-config --cflags</pre>
    <p>output to generate your compile commands, this will probably work out
    of the box.</p>
  </li>
  <li>xmlDetectCharEncoding takes an extra argument indicating the length in
    bytes of the head of the document available for character detection.</li>
</ol>

Daniel Veillardb8cfbd12001-10-25 10:53:28 +00004662<h3>Ensuring both libxml-1.x and libxml-2.x compatibility</h3>
Daniel Veillardc8eab3a1999-09-04 18:27:23 +00004663
Daniel Veillardfabafd52006-06-08 08:16:33 +00004664<p>Two new version of libxml (1.8.11) and libxml2 (2.3.4) have beenreleasedto
4665allow smooth upgrade of existing libxml v1code whileretainingcompatibility.
4666They offers the following:</p>
Daniel Veillardb8cfbd12001-10-25 10:53:28 +00004667<ol>
Daniel Veillardfabafd52006-06-08 08:16:33 +00004668 <li>similar include naming, one
4669 shoulduse<strong>#include&lt;libxml/...&gt;</strong>in both cases.</li>
4670 <li>similar identifiers defined via macros for the child and
4671 rootfields:respectively<strong>xmlChildrenNode</strong>and<strong>xmlRootNode</strong></li>
4672 <li>a new macro <strong>LIBXML_TEST_VERSION</strong>which should
4673 beinsertedonce in the client code</li>
Daniel Veillardb8cfbd12001-10-25 10:53:28 +00004674</ol>
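<p>The macro approach of the second point can be pictured with the following
self-contained sketch. It does not use the real libxml headers: the
<em>demoNode</em> structure, the <em>demoChildrenNode</em> macro and the
<em>USE_V2_LAYOUT</em> switch are hypothetical stand-ins showing how one
identifier can hide a field that is named differently in two versions of a
structure.</p>

```c
#include <stddef.h>

/*
 * Two layouts of the same node, differing only in the name of the field
 * holding the child list. The macro gives client code a single spelling.
 */
#ifdef USE_V2_LAYOUT
typedef struct demoNode {
    const char *name;
    struct demoNode *children;   /* name used by the newer layout */
    struct demoNode *next;
} demoNode;
#define demoChildrenNode children
#else
typedef struct demoNode {
    const char *name;
    struct demoNode *childs;     /* name used by the older layout */
    struct demoNode *next;
} demoNode;
#define demoChildrenNode childs
#endif

/* Client code written once: it compiles unchanged against both layouts. */
static int count_children(const demoNode *parent) {
    const demoNode *cur;
    int n = 0;

    for (cur = parent->demoChildrenNode; cur != NULL; cur = cur->next)
        n++;
    return n;
}
```

<p>Compiling the same file with and without -DUSE_V2_LAYOUT exercises both
layouts without touching <em>count_children</em>.</p>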

<p>So the roadmap to upgrade your existing libxml applications is the
following:</p>
<ol>
  <li>install the libxml-1.8.8 (and libxml-devel-1.8.8) packages</li>
  <li>find all occurrences where the xmlDoc <strong>root</strong> field is
    used and change it to <strong>xmlRootNode</strong></li>
  <li>similarly find all occurrences where the xmlNode
    <strong>childs</strong> field is used and change it to
    <strong>xmlChildrenNode</strong></li>
  <li>add a <strong>LIBXML_TEST_VERSION</strong> macro somewhere in your
    <strong>main()</strong> or in the library init entry point</li>
  <li>recompile and check compatibility; it should still work</li>
  <li>change your configure script to look first for xml2-config and fall
    back to xml-config. Use the --cflags and --libs output of the command as
    the include and linking parameters needed to use libxml.</li>
  <li>install libxml2-2.3.x and libxml2-devel-2.3.x (libxml-1.8.y and
    libxml-devel-1.8.y can be kept simultaneously)</li>
  <li>remove your config.cache, relaunch your configuration mechanism, and
    recompile; if steps 2 and 3 were done right it should compile as-is</li>
  <li>test that your application is still running correctly. If not, this may
    be due to extra empty nodes caused by formatting spaces being kept in
    libxml2, contrary to libxml1. In that case insert
    xmlKeepBlanksDefault(0) in your code before calling the parser (next to
    <strong>LIBXML_TEST_VERSION</strong> is a fine place).</li>
</ol>

<p>Following those steps should work. It worked for some of my own code.</p>

<p>Let me put some emphasis on the fact that there are far more changes from
libxml 1.x to 2.x than the ones you may have to patch for. The overall code
has been considerably cleaned up and the conformance to the XML specification
has been drastically improved too. Don't take those changes as an excuse not
to upgrade; it may cost a lot in the long term.</p>

<h2><a name="Thread">Thread safety</a></h2>

<p>Starting with 2.4.7, libxml2 makes provisions to ensure that concurrent
threads can safely work in parallel parsing different documents. There are
however a couple of things to do to ensure it:</p>
<ul>
  <li>configure the library accordingly using the --with-threads option</li>
  <li>call xmlInitParser() in the "main" thread before using any of the
    libxml2 API (except possibly selecting a different memory allocator)</li>
</ul>

<p>Note that thread safety cannot be ensured for multiple threads sharing the
same document; the locking must be done at the application level. libxml
exports a basic mutex and reentrant mutex API in
&lt;libxml/threads.h&gt;. The parts of the library checked for thread safety
are:</p>
<ul>
  <li>concurrent loading</li>
  <li>file access resolution</li>
  <li>catalog access</li>
  <li>catalog building</li>
  <li>entities lookup/accesses</li>
  <li>validation</li>
  <li>global variables per-thread override</li>
  <li>memory handling</li>
</ul>

<p>XPath is supposed to be thread safe now, but this wasn't tested
seriously.</p>

<h2><a name="DOM"></a><a name="Principles">DOM Principles</a></h2>

<p><a href="http://www.w3.org/DOM/">DOM</a> stands for the <em>Document
Object Model</em>; this is an API for accessing XML or HTML structured
documents. Native support for DOM in Gnome is on the way (module gnome-dom),
and will be based on gnome-xml. This will be a far cleaner interface to
manipulate XML files within Gnome since it won't expose the internal
structure.</p>

<p>The current DOM implementation on top of libxml2 is the <a
href="http://cvs.gnome.org/lxr/source/gdome2/">gdome2 Gnome module</a>; this
is a full DOM interface, thanks to Paolo Casarini. Check the <a
href="http://www.cs.unibo.it/~casarini/gdome2/">Gdome2 homepage</a> for more
information.</p>

<h2><a name="Example"></a><a name="real">A real example</a></h2>

<p>Here is a real-size example, where the actual content of the application
data is not kept in the DOM tree but uses internal structures. It is based on
a proposal to keep a database of jobs related to Gnome, with an XML based
storage structure. Here is an <a href="gjobs.xml">XML encoded jobs
base</a>:</p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;gjob:Helping xmlns:gjob="http://www.gnome.org/some-location"&gt;
  &lt;gjob:Jobs&gt;

    &lt;gjob:Job&gt;
      &lt;gjob:Project ID="3"/&gt;
      &lt;gjob:Application&gt;GBackup&lt;/gjob:Application&gt;
      &lt;gjob:Category&gt;Development&lt;/gjob:Category&gt;

      &lt;gjob:Update&gt;
        &lt;gjob:Status&gt;Open&lt;/gjob:Status&gt;
        &lt;gjob:Modified&gt;Mon, 07 Jun 1999 20:27:45 -0400 MET DST&lt;/gjob:Modified&gt;
        &lt;gjob:Salary&gt;USD 0.00&lt;/gjob:Salary&gt;
      &lt;/gjob:Update&gt;

      &lt;gjob:Developers&gt;
        &lt;gjob:Developer&gt;
        &lt;/gjob:Developer&gt;
      &lt;/gjob:Developers&gt;

      &lt;gjob:Contact&gt;
        &lt;gjob:Person&gt;Nathan Clemons&lt;/gjob:Person&gt;
        &lt;gjob:Email&gt;nathan@windsofstorm.net&lt;/gjob:Email&gt;
        &lt;gjob:Company&gt;
        &lt;/gjob:Company&gt;
        &lt;gjob:Organisation&gt;
        &lt;/gjob:Organisation&gt;
        &lt;gjob:Webpage&gt;
        &lt;/gjob:Webpage&gt;
        &lt;gjob:Snailmail&gt;
        &lt;/gjob:Snailmail&gt;
        &lt;gjob:Phone&gt;
        &lt;/gjob:Phone&gt;
      &lt;/gjob:Contact&gt;

      &lt;gjob:Requirements&gt;
      The program should be released as free software, under the GPL.
      &lt;/gjob:Requirements&gt;

      &lt;gjob:Skills&gt;
      &lt;/gjob:Skills&gt;

      &lt;gjob:Details&gt;
      A GNOME based system that will allow a superuser to configure
      compressed and uncompressed files and/or file systems to be backed
      up with a supported media in the system. This should be able to
      perform via find commands generating a list of files that are passed
      to tar, dd, cpio, cp, gzip, etc., to be directed to the tape machine
      or via operations performed on the filesystem itself. Email
      notification and GUI status display very important.
      &lt;/gjob:Details&gt;

    &lt;/gjob:Job&gt;

  &lt;/gjob:Jobs&gt;
&lt;/gjob:Helping&gt;</pre>

<p>While loading the XML file into an internal DOM tree is a matter of calling
only a couple of functions, browsing the tree to gather the data and generate
the internal structures is harder and more error prone.</p>

<p>The suggested principle is to be tolerant with respect to the input
structure. For example, the ordering of the attributes is not significant;
the XML specification is clear about it. It's also usually a good idea not to
depend on the order of the children of a given node, unless it really makes
things harder. Here is some code to parse the information for a person:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;libxml/tree.h&gt;

/* tracing macro used by the example */
#define DEBUG(x) printf(x)

/*
 * A person record; the strings are allocated by libxml
 * and should be released with xmlFree()
 */
typedef struct person {
    xmlChar *name;
    xmlChar *email;
    xmlChar *company;
    xmlChar *organisation;
    xmlChar *smail;
    xmlChar *webPage;
    xmlChar *phone;
} person, *personPtr;

/*
 * And the code needed to parse it
 */
personPtr parsePerson(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    personPtr ret = NULL;

    DEBUG("parsePerson\n");
    /*
     * allocate the struct
     */
    ret = (personPtr) malloc(sizeof(person));
    if (ret == NULL) {
        fprintf(stderr,"out of memory\n");
        return(NULL);
    }
    memset(ret, 0, sizeof(person));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {
        /* element names are xmlChar strings, hence xmlStrcmp */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Person")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;name = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Email")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;email = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre>

<p>Here are a couple of things to notice:</p>
<ul>
  <li>Usually a recursive parsing style is the more convenient one: XML data
    is by nature subject to repetitive constructs and usually exhibits highly
    structured patterns.</li>
  <li>The two arguments of type <em>xmlDocPtr</em> and <em>xmlNsPtr</em> are
    the pointer to the global XML document and the namespace reserved for the
    application. Document-wide information is needed, for example, to decode
    entities, and it's good coding practice to define a namespace for your
    application's set of data and test that the elements and attributes
    you're analyzing actually pertain to your application space. This is done
    by a simple equality test (cur-&gt;ns == ns).</li>
  <li>To retrieve text and attribute values, you can use the function
    <em>xmlNodeListGetString</em> to gather all the text and entity reference
    nodes generated by the DOM output and produce a single text string.</li>
</ul>

<p>Here is another piece of code used to parse another level of the
structure:</p>
<pre>#include &lt;libxml/tree.h&gt;
/*
 * a Description for a Job
 */
typedef struct job {
    xmlChar *projectID;
    xmlChar *application;
    xmlChar *category;
    personPtr contact;
    int nbDevelopers;
    personPtr developers[100]; /* using dynamic alloc is left as an exercise */
} job, *jobPtr;

/*
 * And the code needed to parse it
 * (DEBUG is the example's tracing macro, e.g. #define DEBUG(x) printf(x))
 */
jobPtr parseJob(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    jobPtr ret = NULL;

    DEBUG("parseJob\n");
    /*
     * allocate the struct
     */
    ret = (jobPtr) malloc(sizeof(job));
    if (ret == NULL) {
        fprintf(stderr,"out of memory\n");
        return(NULL);
    }
    memset(ret, 0, sizeof(job));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {

        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Project")) &amp;&amp; (cur-&gt;ns == ns)) {
            /* the ID is an attribute, not a child element */
            ret-&gt;projectID = xmlGetProp(cur, (const xmlChar *) "ID");
            if (ret-&gt;projectID == NULL) {
                fprintf(stderr, "Project has no ID\n");
            }
        }
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Application")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;application = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Category")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;category = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Contact")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;contact = parsePerson(doc, ns, cur);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre>
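<p>The fixed-size <em>developers</em> array above is flagged as an exercise.
One possible way to make it grow on demand is sketched below in plain C. The
<em>devList</em> type and its functions are hypothetical and store generic
pointers so that the sketch stays self-contained; in the example they would
hold personPtr values.</p>

```c
#include <stdlib.h>

/* A growable array of pointers (hypothetical replacement for the
 * fixed developers[100] field). */
typedef struct devList {
    void **items;
    int count;      /* number of slots in use */
    int capacity;   /* number of slots allocated */
} devList;

/* Append one entry, doubling the storage when it is full.
 * Returns 0 on success and -1 if memory could not be obtained,
 * in which case the list is left unchanged. */
static int devListAdd(devList *list, void *dev) {
    if (list->count == list->capacity) {
        int newcap = (list->capacity > 0) ? list->capacity * 2 : 4;
        void **tmp = realloc(list->items, (size_t) newcap * sizeof(*tmp));

        if (tmp == NULL)
            return -1;
        list->items = tmp;
        list->capacity = newcap;
    }
    list->items[list->count++] = dev;
    return 0;
}

/* Release the array itself; the entries remain owned by the caller. */
static void devListFree(devList *list) {
    free(list->items);
    list->items = NULL;
    list->count = 0;
    list->capacity = 0;
}
```

<p>A list must start zeroed (for instance with memset, as the parsing code
already does for its structures) before the first call to
<em>devListAdd</em>.</p>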

<p>Once you are used to it, writing this kind of code is quite simple, but
boring. Ultimately, it could be possible to write stubbers taking either C
data structure definitions, a set of XML examples or an XML DTD and produce
the code needed to import and export the content between C data and XML
storage. This is left as an exercise to the reader :-)</p>

<p>Feel free to use <a href="example/gjobread.c">the code for the full C
parsing example</a> as a template; it is also available with a Makefile in the
Gnome CVS base under gnome-xml/example.</p>

<h2><a name="Contributi">Contributions</a></h2>
<ul>
  <li>Bjorn Reese, William Brack and Thomas Broyer have provided a number of
    patches, Gary Pennington worked on the validation API, threading support
    and Solaris port.</li>
  <li>John Fleck helps maintain the documentation and man pages.</li>
  <li><a href="mailto:igor@zlatkovic.com">Igor Zlatkovic</a> is now the
    maintainer of the Windows port, <a
    href="http://www.zlatkovic.com/projects/libxml/index.html">he provides
    binaries</a></li>
  <li><a href="mailto:Gary.Pennington@sun.com">Gary Pennington</a> provides <a
    href="http://garypennington.net/libxml2/">Solaris binaries</a></li>
  <li><a
    href="http://mail.gnome.org/archives/xml/2001-March/msg00014.html">Matt
    Sergeant</a> developed <a
    href="http://axkit.org/download/">XML::LibXSLT</a>, a Perl wrapper for
    libxml2/libxslt as part of the <a href="http://axkit.com/">AxKit XML
    application server</a></li>
  <li><a href="mailto:fnatter@gmx.net">Felix Natter</a> and <a
    href="mailto:geertk@ai.rug.nl">Geert Kloosterman</a> provide <a
    href="libxml-doc.el">an emacs module</a> to look up libxml(2) functions
    documentation</li>
  <li><a href="mailto:sherwin@nlm.nih.gov">Ziying Sherwin</a> provided <a
    href="http://xmlsoft.org/messages/0488.html">man pages</a></li>
  <li>there is a module for <a
    href="http://acs-misc.sourceforge.net/nsxml.html">libxml/libxslt support
    in OpenNSD/AOLServer</a></li>
  <li><a href="mailto:dkuhlman@cutter.rexx.com">Dave Kuhlman</a> provided the
    first version of libxml/libxslt <a
    href="http://www.rexx.com/~dkuhlman">wrappers for Python</a></li>
  <li>Petr Kozelka provides <a
    href="http://sourceforge.net/projects/libxml2-pas">Pascal units to glue
    libxml2</a> with Kylix and Delphi and other Pascal compilers</li>
  <li><a href="mailto:aleksey@aleksey.com">Aleksey Sanin</a> implemented the <a
    href="http://www.w3.org/Signature/">XML Canonicalization and XML Digital
    Signature</a> <a
    href="http://www.aleksey.com/xmlsec/">implementations for libxml2</a></li>
  <li><a href="mailto:Steve.Ball@explain.com.au">Steve Ball</a> and
    contributors maintain <a href="http://tclxml.sourceforge.net/">tcl
    bindings for libxml2 and libxslt</a>, as well as <a
    href="http://tclxml.sf.net/tkxmllint.html">tkxmllint</a> a GUI for
    xmllint and <a
    href="http://tclxml.sf.net/tkxsltproc.html">tkxsltproc</a> a GUI for
    xsltproc.</li>
</ul>

<p></p>
</body>
</html>