<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<style type="text/css"><!--
TD {font-size: 10pt; font-family: Verdana,Arial,Helvetica}
BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; margin-left: 0pt; margin-right: 0pt}
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
--></style>
<title>The XML library interfaces</title>
</head>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="smallfootonly.gif" alt="Gnome Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>The XML library interfaces</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li><a href="xml.html">flat page</a></li>
</ul></td></tr>
</table></td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>This section is intended to help programmers get started with the XML
library from the C language. It is not intended to be exhaustive. I hope the
automatically generated documents will provide the completeness required, but
as a separate set of documents. The interfaces of the XML library are
deliberately low level; there is nearly zero abstraction. Those interested in
a higher level API should <a href="#DOM">look at DOM</a>.</p>
<p>The <a href="html/libxml-parser.html">parser interfaces for XML</a> are
separated from the <a href="html/libxml-htmlparser.html">HTML parser
interfaces</a>. Let's have a look at how the XML parser can be called:</p>
<h3><a name="Invoking">Invoking the parser: the pull method</a></h3>
<p>Usually, the first thing to do is to read an XML input. The parser accepts
documents either from in-memory buffers or from files. The functions are
declared in &quot;parser.h&quot;:</p>
<dl>
<dt><code>xmlDocPtr xmlParseMemory(char *buffer, int size);</code></dt>
<dd><p>Parse a document held in a memory buffer of <code>size</code>
    bytes.</p></dd>
</dl>
<dl>
<dt><code>xmlDocPtr xmlParseFile(const char *filename);</code></dt>
<dd><p>Parse an XML document contained in a (possibly compressed)
    file.</p></dd>
</dl>
<p>The parser returns a pointer to the document structure (or NULL in case of
failure).</p>
<h3 id="Invoking1">Invoking the parser: the push method</h3>
<p>So that the application keeps control while the document is being
fetched (which is common for GUI based programs), libxml also provides a push
interface, as of version 1.8.3. Here are the interface functions:</p>
<pre>xmlParserCtxtPtr xmlCreatePushParserCtxt(xmlSAXHandlerPtr sax,
                                         void *user_data,
                                         const char *chunk,
                                         int size,
                                         const char *filename);
int              xmlParseChunk          (xmlParserCtxtPtr ctxt,
                                         const char *chunk,
                                         int size,
                                         int terminate);</pre>
<p>and here is a simple example showing how to use the interface:</p>
<pre>    FILE *f;
    xmlDocPtr doc = NULL;

    f = fopen(filename, &quot;r&quot;);
    if (f != NULL) {
        int res, size = 1024;
        char chars[1024];
        xmlParserCtxtPtr ctxt;

        /* hand the first 4 bytes to the parser when creating the context */
        res = fread(chars, 1, 4, f);
        if (res &gt; 0) {
            ctxt = xmlCreatePushParserCtxt(NULL, NULL,
                        chars, res, filename);
            if (ctxt != NULL) {
                /* feed the rest of the file chunk by chunk */
                while ((res = fread(chars, 1, size, f)) &gt; 0) {
                    xmlParseChunk(ctxt, chars, res, 0);
                }
                /* an empty chunk with terminate set to 1 ends the parse */
                xmlParseChunk(ctxt, chars, 0, 1);
                doc = ctxt-&gt;myDoc;
                xmlFreeParserCtxt(ctxt);
            }
        }
        fclose(f);
    }</pre>
<p>The HTML parser embedded into libxml also has a push interface; the
functions are just prefixed by &quot;html&quot; rather than &quot;xml&quot;.</p>
<h3 id="Invoking2">Invoking the parser: the SAX interface</h3>
<p>The tree-building interface makes the parser memory-hungry, first loading
the document in memory and then building the tree itself. Reading a document
without building the tree is possible using the SAX interfaces (see SAX.h and
<a href="http://www.daa.com.au/~james/gnome/xml-sax/xml-sax.html">James
Henstridge's documentation</a>). Note also that the push interface can be
limited to SAX: just use the first two arguments of
<code>xmlCreatePushParserCtxt()</code>.</p>
<h3><a name="Building">Building a tree from scratch</a></h3>
<p>The other way to get an XML tree in memory is by building it. There is a
set of functions dedicated to building new elements. (These are
also described in &lt;libxml/tree.h&gt;.) For example, here is a piece of
code that produces the XML document used in the previous examples:</p>
<pre>    #include &lt;libxml/tree.h&gt;

    xmlDocPtr doc;
    xmlNodePtr root, tree, subtree;

    /* BAD_CAST converts C string literals to xmlChar pointers */
    doc = xmlNewDoc(BAD_CAST &quot;1.0&quot;);
    root = xmlNewDocNode(doc, NULL, BAD_CAST &quot;EXAMPLE&quot;, NULL);
    xmlDocSetRootElement(doc, root);    /* doc-&gt;children now points to root */
    xmlSetProp(root, BAD_CAST &quot;prop1&quot;, BAD_CAST &quot;gnome is great&quot;);
    xmlSetProp(root, BAD_CAST &quot;prop2&quot;, BAD_CAST &quot;&amp; linux too&quot;);
    tree = xmlNewChild(root, NULL, BAD_CAST &quot;head&quot;, NULL);
    subtree = xmlNewChild(tree, NULL, BAD_CAST &quot;title&quot;, BAD_CAST &quot;Welcome to Gnome&quot;);
    tree = xmlNewChild(root, NULL, BAD_CAST &quot;chapter&quot;, NULL);
    subtree = xmlNewChild(tree, NULL, BAD_CAST &quot;title&quot;, BAD_CAST &quot;The Linux adventure&quot;);
    subtree = xmlNewChild(tree, NULL, BAD_CAST &quot;p&quot;, BAD_CAST &quot;bla bla bla ...&quot;);
    subtree = xmlNewChild(tree, NULL, BAD_CAST &quot;image&quot;, NULL);
    xmlSetProp(subtree, BAD_CAST &quot;href&quot;, BAD_CAST &quot;linus.gif&quot;);</pre>
<p>Not really rocket science ...</p>
<h3><a name="Traversing">Traversing the tree</a></h3>
<p>By <a href="html/libxml-tree.html">including &quot;tree.h&quot;</a> your
code has access to the internal structure of all the elements of the tree.
The field names are meant to be simple: <strong>parent</strong>,
<strong>children</strong>, <strong>next</strong>, <strong>prev</strong>,
<strong>properties</strong>, etc. For example, still with the previous
example:</p>
<pre><code>doc-&gt;children-&gt;children-&gt;children</code></pre>
<p>points to the title element,</p>
<pre>doc-&gt;children-&gt;children-&gt;next-&gt;children-&gt;children</pre>
<p>points to the text node containing the chapter title &quot;The Linux
adventure&quot;.</p>
<p>
<strong>NOTE</strong>: XML allows <em>PI</em>s and <em>comments</em> to be
present before the document root, so <code>doc-&gt;children</code> may point
to a node which is not the document Root Element; the function
<code>xmlDocGetRootElement()</code> was added for this purpose.</p>
<h3><a name="Modifying">Modifying the tree</a></h3>
<p>Functions are provided for reading and writing the document content. Here
is an excerpt from the <a href="html/libxml-tree.html">tree API</a>:</p>
<dl>
<dt><code>xmlAttrPtr xmlSetProp(xmlNodePtr node, const xmlChar *name, const
    xmlChar *value);</code></dt>
<dd><p>This sets (or changes) an attribute carried by an ELEMENT node.
    The value can be NULL.</p></dd>
</dl>
<dl>
<dt><code>xmlChar *xmlGetProp(xmlNodePtr node, const xmlChar
    *name);</code></dt>
<dd><p>This function returns a pointer to a new copy of the property
    content. Note that the user must deallocate the result with
    <code>xmlFree()</code>.</p></dd>
</dl>
<p>Two functions are provided for reading and writing the text associated
with elements:</p>
<dl>
<dt><code>xmlNodePtr xmlStringGetNodeList(xmlDocPtr doc, const xmlChar
    *value);</code></dt>
<dd><p>This function takes an &quot;external&quot; string and converts it to one
    text node or possibly to a list of entity and text nodes. All
    non-predefined entity references like &amp;Gnome; will be stored
    internally as entity nodes, hence the result of the function may not be
    a single node.</p></dd>
</dl>
<dl>
<dt><code>xmlChar *xmlNodeListGetString(xmlDocPtr doc, xmlNodePtr list, int
    inLine);</code></dt>
<dd><p>This function is the inverse of
    <code>xmlStringGetNodeList()</code>. It generates a new string
    containing the content of the text and entity nodes. Note the extra
    argument inLine. If this argument is set to 1, the function will expand
    entity references. For example, instead of returning the &amp;Gnome;
    XML encoding in the string, it will substitute it with its value (say,
    &quot;GNU Network Object Model Environment&quot;).</p></dd>
</dl>
<h3><a name="Saving">Saving a tree</a></h3>
<p>Three options are available:</p>
<dl>
<dt><code>void xmlDocDumpMemory(xmlDocPtr cur, xmlChar**mem, int
    *size);</code></dt>
<dd><p>Saves the document into a newly allocated buffer, returned through
    <code>mem</code> with its length in <code>size</code>. The caller must
    free the buffer with <code>xmlFree()</code>.</p></dd>
</dl>
<dl>
<dt><code>void xmlDocDump(FILE *f, xmlDocPtr doc);</code></dt>
<dd><p>Dumps a document to an open <code>FILE</code> stream.</p></dd>
</dl>
<dl>
<dt><code>int xmlSaveFile(const char *filename, xmlDocPtr cur);</code></dt>
<dd><p>Saves the document to a file. In this case, the compression
    interface is triggered if it has been turned on.</p></dd>
</dl>
<h3><a name="Compressio">Compression</a></h3>
<p>The library transparently handles compression when doing file-based
accesses. The level of compression on saves can be set either globally
or individually for one file:</p>
<dl>
<dt><code>int xmlGetDocCompressMode (xmlDocPtr doc);</code></dt>
<dd><p>Gets the document compression level (0-9).</p></dd>
</dl>
<dl>
<dt><code>void xmlSetDocCompressMode (xmlDocPtr doc, int mode);</code></dt>
<dd><p>Sets the document compression level.</p></dd>
</dl>
<dl>
<dt><code>int xmlGetCompressMode(void);</code></dt>
<dd><p>Gets the default compression level.</p></dd>
</dl>
<dl>
<dt><code>void xmlSetCompressMode(int mode);</code></dt>
<dd><p>Sets the default compression level.</p></dd>
</dl>
<p><a href="mailto:daniel@veillard.com">Daniel Veillard</a></p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>