<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<link rel="SHORTCUT ICON" href="/favicon.ico">
<style type="text/css"><!--
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>A real example</title>
</head>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="gnome2.png" alt="Gnome2 Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a><div align="left"><a href="http://xmlsoft.org/"><img src="Libxml2-Logo-180x168.gif" alt="Made with Libxml2 Logo"></a></div>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>A real example</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd">
<form action="search.php" enctype="application/x-www-form-urlencoded" method="GET">
<input name="query" type="TEXT" size="20" value=""><input name="submit" type="submit" value="Search ...">
</form>
<ul>
<li><a href="index.html">Home</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XMLinfo.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="python.html">Python and bindings</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="threads.html">Thread safety</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="xmlreader.html">The Reader Interface</a></li>
<li><a href="tutorial/index.html">Tutorial</a></li>
<li><a href="guidelines.html">XML Guidelines</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul>
</td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul>
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://phd.cs.unibo.it/gdome2/">DOM gdome2</a></li>
<li><a href="http://www.aleksey.com/xmlsec/">XML-DSig xmlsec</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a href="http://www.zlatkovic.com/projects/libxml/">Windows binaries</a></li>
<li><a href="http://garypennington.net/libxml2/">Solaris binaries</a></li>
<li><a href="http://www.zveno.com/open_source/libxml2xslt.html">MacOsX binaries</a></li>
<li><a href="http://sourceforge.net/projects/libxml2-pas/">Pascal bindings</a></li>
<li><a href="http://bugzilla.gnome.org/buglist.cgi?product=libxml&amp;product=libxml2">Bug Tracker</a></li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>API Indexes</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul>
<li><a href="APIchunk0.html">Alphabetic</a></li>
<li><a href="APIconstructors.html">Constructors</a></li>
<li><a href="APIfunctions.html">Functions/Types</a></li>
<li><a href="APIfiles.html">Modules</a></li>
<li><a href="APIsymbols.html">Symbols</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Here is a real-size example, where the actual content of the application
data is not kept in the DOM tree but stored in the application's own internal
structures. It is based on a proposal to keep a database of jobs related to
Gnome, with an XML-based storage structure. Here is an
<a href="gjobs.xml">XML encoded jobs base</a>:</p>
<pre>&lt;?xml version=&quot;1.0&quot;?&gt;
&lt;gjob:Helping xmlns:gjob=&quot;http://www.gnome.org/some-location&quot;&gt;
  &lt;gjob:Jobs&gt;

    &lt;gjob:Job&gt;
      &lt;gjob:Project ID=&quot;3&quot;/&gt;
      &lt;gjob:Application&gt;GBackup&lt;/gjob:Application&gt;
      &lt;gjob:Category&gt;Development&lt;/gjob:Category&gt;

      &lt;gjob:Update&gt;
        &lt;gjob:Status&gt;Open&lt;/gjob:Status&gt;
        &lt;gjob:Modified&gt;Mon, 07 Jun 1999 20:27:45 -0400 MET DST&lt;/gjob:Modified&gt;
        &lt;gjob:Salary&gt;USD 0.00&lt;/gjob:Salary&gt;
      &lt;/gjob:Update&gt;

      &lt;gjob:Developers&gt;
        &lt;gjob:Developer&gt;
        &lt;/gjob:Developer&gt;
      &lt;/gjob:Developers&gt;

      &lt;gjob:Contact&gt;
        &lt;gjob:Person&gt;Nathan Clemons&lt;/gjob:Person&gt;
        &lt;gjob:Email&gt;nathan@windsofstorm.net&lt;/gjob:Email&gt;
        &lt;gjob:Company&gt;
        &lt;/gjob:Company&gt;
        &lt;gjob:Organisation&gt;
        &lt;/gjob:Organisation&gt;
        &lt;gjob:Webpage&gt;
        &lt;/gjob:Webpage&gt;
        &lt;gjob:Snailmail&gt;
        &lt;/gjob:Snailmail&gt;
        &lt;gjob:Phone&gt;
        &lt;/gjob:Phone&gt;
      &lt;/gjob:Contact&gt;

      &lt;gjob:Requirements&gt;
      The program should be released as free software, under the GPL.
      &lt;/gjob:Requirements&gt;

      &lt;gjob:Skills&gt;
      &lt;/gjob:Skills&gt;

      &lt;gjob:Details&gt;
      A GNOME based system that will allow a superuser to configure
      compressed and uncompressed files and/or file systems to be backed
      up with a supported media in the system.  This should be able to
      perform via find commands generating a list of files that are passed
      to tar, dd, cpio, cp, gzip, etc., to be directed to the tape machine
      or via operations performed on the filesystem itself. Email
      notification and GUI status display very important.
      &lt;/gjob:Details&gt;

    &lt;/gjob:Job&gt;

  &lt;/gjob:Jobs&gt;
&lt;/gjob:Helping&gt;</pre>
<p>While loading the XML file into an internal DOM tree is a matter of
calling only a couple of functions, browsing the tree to gather the data and
generate the internal structures is harder and more error-prone.</p>
<p>The suggested principle is to be tolerant with respect to the input
structure. For example, the ordering of the attributes is not significant;
the XML specification is clear about it. It's also usually a good idea not to
depend on the order of the children of a given node, unless doing so really
makes things harder. Here is some code to parse the information for a
person:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;libxml/tree.h&gt;

/* Trace helper used below; any printf-like macro will do */
#define DEBUG(x) printf(x)

/*
 * A person record; the strings come from libxml2 and are
 * therefore xmlChar *, to be released with xmlFree()
 */
typedef struct person {
    xmlChar *name;
    xmlChar *email;
    xmlChar *company;
    xmlChar *organisation;
    xmlChar *smail;
    xmlChar *webPage;
    xmlChar *phone;
} person, *personPtr;

/*
 * And the code needed to parse it
 */
personPtr parsePerson(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    personPtr ret = NULL;

    DEBUG(&quot;parsePerson\n&quot;);
    /*
     * allocate the struct
     */
    ret = (personPtr) malloc(sizeof(person));
    if (ret == NULL) {
        fprintf(stderr,&quot;out of memory\n&quot;);
        return(NULL);
    }
    memset(ret, 0, sizeof(person));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {
        /* node names are xmlChar *, so compare with xmlStrcmp */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Person&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;name = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Email&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;email = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre>
<p>Here are a couple of things to notice:</p>
<ul>
<li>Usually a recursive parsing style is the more convenient one: XML data
    is by nature subject to repetitive constructs and usually exhibits highly
    structured patterns.</li>
  <li>The function takes two arguments of type <em>xmlDocPtr</em> and
    <em>xmlNsPtr</em>, i.e. the pointer to the global XML document and the
    namespace reserved to the application. Document-wide information is
    needed, for example, to decode entities, and it's good coding practice to
    define a namespace for your application's set of data and test that the
    elements and attributes you're analyzing actually pertain to your
    application space. This is done by a simple equality test
    (cur-&gt;ns == ns).</li>
  <li>To retrieve text and attribute values, you can use the function
    <em>xmlNodeListGetString</em> to gather all the text and entity reference
    nodes generated by the DOM output and produce a single text string.</li>
</ul>
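<p>The name and namespace test repeated above can be factored into a small
helper. The following is only a sketch: <em>miniNs</em>, <em>miniNode</em>,
<em>miniPerson</em>, <em>elementIs</em> and <em>fillPerson</em> are
hypothetical names, and the two structures are stand-ins carrying just the
few fields used here, not the libxml types. It shows that a single pass over
the children fills the record whatever order they come in, and that an
element bound to a different namespace is skipped.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the few fields of xmlNs and xmlNode used here */
typedef struct miniNs {
    const char *href;            /* namespace URI */
} miniNs;

typedef struct miniNode {
    const char *name;            /* element name */
    const miniNs *ns;            /* namespace the element is bound to */
    const char *text;            /* flattened text content, may be NULL */
    struct miniNode *next;       /* next sibling */
} miniNode;

typedef struct miniPerson {
    const char *name;
    const char *email;
} miniPerson;

/* True when the node has the given name in the application namespace */
static int elementIs(const miniNode *cur, const miniNs *ns, const char *name) {
    assert(cur != NULL && name != NULL);
    return cur->ns == ns && cur->name != NULL && strcmp(cur->name, name) == 0;
}

/* Walk the sibling list once, filling fields as they are met, in any order */
static void fillPerson(miniPerson *out, const miniNs *ns, const miniNode *first) {
    const miniNode *cur;

    for (cur = first; cur != NULL; cur = cur->next) {
        if (elementIs(cur, ns, "Person"))
            out->name = cur->text;
        else if (elementIs(cur, ns, "Email"))
            out->email = cur->text;
        /* unknown elements and foreign namespaces are silently skipped */
    }
}
```

<p>With the real library the same helper would compare <em>cur-&gt;name</em>
using <em>xmlStrcmp</em> and keep the pointer comparison on
<em>cur-&gt;ns</em>.</p>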
<p>Here is another piece of code used to parse another level of the
structure:</p>
<pre>#include &lt;libxml/tree.h&gt;
/*
 * a Description for a Job
 */
typedef struct job {
    xmlChar *projectID;
    xmlChar *application;
    xmlChar *category;
    personPtr contact;
    int nbDevelopers;
    personPtr developers[100]; /* using dynamic alloc is left as an exercise */
} job, *jobPtr;

/*
 * And the code needed to parse it
 */
jobPtr parseJob(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    jobPtr ret = NULL;

    DEBUG(&quot;parseJob\n&quot;);
    /*
     * allocate the struct
     */
    ret = (jobPtr) malloc(sizeof(job));
    if (ret == NULL) {
        fprintf(stderr,&quot;out of memory\n&quot;);
        return(NULL);
    }
    memset(ret, 0, sizeof(job));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {

        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Project&quot;)) &amp;&amp; (cur-&gt;ns == ns)) {
            /* the ID is an attribute, so it is read with xmlGetProp */
            ret-&gt;projectID = xmlGetProp(cur, (const xmlChar *) &quot;ID&quot;);
            if (ret-&gt;projectID == NULL) {
                fprintf(stderr, &quot;Project has no ID\n&quot;);
            }
        }
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Application&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;application = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Category&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;category = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        /* nested structures are handed to the matching parse function */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Contact&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;contact = parsePerson(doc, ns, cur);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre>
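<p>The fixed <em>developers[100]</em> array leaves dynamic allocation as an
exercise. One possible way to do it is a growable array that doubles its
storage when full, sketched below. The names <em>devList</em>,
<em>devListAppend</em> and <em>devListClear</em> are hypothetical, and the
entries are stored as <em>void *</em> so that the sketch compiles on its own;
in the code above they would be <em>personPtr</em>.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable list, standing in for "personPtr developers[100]" */
typedef struct devList {
    void **items;        /* would be personPtr * in the page's code */
    size_t count;        /* slots in use */
    size_t capacity;     /* slots allocated */
} devList;

/*
 * Append one entry, doubling the storage when it is full.
 * Returns 0 on success, -1 if memory runs out (the list is left unchanged).
 */
static int devListAppend(devList *list, void *dev) {
    assert(list != NULL);
    if (list->count == list->capacity) {
        size_t newCap = (list->capacity == 0) ? 4 : list->capacity * 2;
        void **tmp = realloc(list->items, newCap * sizeof(*tmp));

        if (tmp == NULL)
            return(-1);
        list->items = tmp;
        list->capacity = newCap;
    }
    list->items[list->count++] = dev;
    return(0);
}

/* Release the array itself; the entries remain owned by the caller */
static void devListClear(devList *list) {
    free(list->items);
    list->items = NULL;
    list->count = 0;
    list->capacity = 0;
}
```

<p>A list initialised to all zeroes is empty and ready for use. Doubling the
capacity keeps the number of reallocations logarithmic in the number of
developers, at the cost of some unused slots.</p>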
<p>Once you are used to it, writing this kind of code is quite simple, but
boring. Ultimately, it could be possible to write stubbers taking either C
data structure definitions, a set of XML examples or an XML DTD, and
producing the code needed to import and export the content between C data
and XML storage. This is left as an exercise to the reader :-)</p>
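<p>A first step toward such generated code is to replace the chain of
<em>if</em> tests by a table mapping element names to structure fields, which
is the kind of table a stubber could emit. This is only a sketch under that
assumption: <em>contact</em>, <em>fieldMap</em>, <em>contactMap</em> and
<em>setField</em> are hypothetical names, and plain C strings are used so the
block does not depend on the library.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record, a cut-down version of the person structure */
typedef struct contact {
    const char *name;
    const char *email;
    const char *phone;
} contact;

/* One row per element: its XML name and where its text lands in the struct */
typedef struct fieldMap {
    const char *element;
    size_t offset;
} fieldMap;

static const fieldMap contactMap[] = {
    { "Person", offsetof(contact, name) },
    { "Email",  offsetof(contact, email) },
    { "Phone",  offsetof(contact, phone) },
};

/*
 * Store text in the field mapped to the given element name.
 * Returns 1 if the element is known, 0 otherwise.
 */
static int setField(contact *out, const char *element, const char *text) {
    size_t i;

    assert(out != NULL && element != NULL);
    for (i = 0; i < sizeof(contactMap) / sizeof(contactMap[0]); i++) {
        if (strcmp(contactMap[i].element, element) == 0) {
            /* locate the field from its byte offset inside the struct */
            const char **slot =
                (const char **) ((char *) out + contactMap[i].offset);

            *slot = text;
            return(1);
        }
    }
    return(0);
}
```

<p>Supporting a new element then means adding one row to the table rather
than another test in the parsing loop, and the same table could drive the
export direction.</p>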
<p>Feel free to use <a href="example/gjobread.c">the code for the full C
parsing example</a> as a template. It is also available with a Makefile in
the Gnome CVS base under gnome-xml/example.</p>
<p><a href="bugs.html">Daniel Veillard</a></p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>