<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<style type="text/css"><!--
TD {font-size: 10pt; font-family: Verdana,Arial,Helvetica}
BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; margin-left: 0pt; margin-right: 0pt}
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
--></style>
<title>A real example</title>
</head>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="smallfootonly.gif" alt="Gnome Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>A real example</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://www.cs.unibo.it/~casarini/gdome2/">DOM gdome2</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a href="http://www.fh-frankfurt.de/~igor/projects/libxml/">Windows binaries</a></li>
<li><a href="http://pages.eidosnet.co.uk/~garypen/libxml/">Solaris binaries</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Here is a real-size example, where the actual content of the application
data is not kept in the DOM tree but in the application's own internal
structures. It is based on a proposal to keep a database of jobs related to
Gnome, with an XML-based storage structure. Here is an
<a href="gjobs.xml">XML encoded jobs base</a>:</p>
<pre>&lt;?xml version=&quot;1.0&quot;?&gt;
&lt;gjob:Helping xmlns:gjob=&quot;http://www.gnome.org/some-location&quot;&gt;
  &lt;gjob:Jobs&gt;

    &lt;gjob:Job&gt;
      &lt;gjob:Project ID=&quot;3&quot;/&gt;
      &lt;gjob:Application&gt;GBackup&lt;/gjob:Application&gt;
      &lt;gjob:Category&gt;Development&lt;/gjob:Category&gt;

      &lt;gjob:Update&gt;
        &lt;gjob:Status&gt;Open&lt;/gjob:Status&gt;
        &lt;gjob:Modified&gt;Mon, 07 Jun 1999 20:27:45 -0400 MET DST&lt;/gjob:Modified&gt;
        &lt;gjob:Salary&gt;USD 0.00&lt;/gjob:Salary&gt;
      &lt;/gjob:Update&gt;

      &lt;gjob:Developers&gt;
        &lt;gjob:Developer&gt;
        &lt;/gjob:Developer&gt;
      &lt;/gjob:Developers&gt;

      &lt;gjob:Contact&gt;
        &lt;gjob:Person&gt;Nathan Clemons&lt;/gjob:Person&gt;
        &lt;gjob:Email&gt;nathan@windsofstorm.net&lt;/gjob:Email&gt;
        &lt;gjob:Company&gt;
        &lt;/gjob:Company&gt;
        &lt;gjob:Organisation&gt;
        &lt;/gjob:Organisation&gt;
        &lt;gjob:Webpage&gt;
        &lt;/gjob:Webpage&gt;
        &lt;gjob:Snailmail&gt;
        &lt;/gjob:Snailmail&gt;
        &lt;gjob:Phone&gt;
        &lt;/gjob:Phone&gt;
      &lt;/gjob:Contact&gt;

      &lt;gjob:Requirements&gt;
      The program should be released as free software, under the GPL.
      &lt;/gjob:Requirements&gt;

      &lt;gjob:Skills&gt;
      &lt;/gjob:Skills&gt;

      &lt;gjob:Details&gt;
      A GNOME based system that will allow a superuser to configure
      compressed and uncompressed files and/or file systems to be backed
      up with a supported media in the system. This should be able to
      perform via find commands generating a list of files that are passed
      to tar, dd, cpio, cp, gzip, etc., to be directed to the tape machine
      or via operations performed on the filesystem itself. Email
      notification and GUI status display very important.
      &lt;/gjob:Details&gt;

    &lt;/gjob:Job&gt;

  &lt;/gjob:Jobs&gt;
&lt;/gjob:Helping&gt;</pre>
<p>While loading the XML file into an internal DOM tree is a matter of
calling only a couple of functions, browsing the tree to gather the data and
generate the internal structures is harder and more error-prone.</p>
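<p>As a rough sketch of the loading step, the following parses a document
held in memory and checks that its root element belongs to the application
namespace. It relies on the libxml calls <em>xmlParseMemory</em>,
<em>xmlDocGetRootElement</em> and <em>xmlSearchNsByHref</em>; the helper name
<em>loadJobs</em> and its error handling are assumptions made for this
illustration, not part of the library.</p>
<pre>#include &lt;string.h&gt;
#include &lt;libxml/parser.h&gt;
#include &lt;libxml/tree.h&gt;

/*
 * Hypothetical helper: parse an in-memory document and check that its
 * root is a Helping element in the gjob namespace. On success the
 * namespace is stored in *ns and the caller must call xmlFreeDoc().
 */
static xmlDocPtr loadJobs(const char *buf, xmlNsPtr *ns) {
    xmlDocPtr doc;
    xmlNodePtr root;

    /* build the DOM tree */
    doc = xmlParseMemory(buf, (int) strlen(buf));
    if (doc == NULL)
        return(NULL);

    /* an empty document has no root element */
    root = xmlDocGetRootElement(doc);
    if (root == NULL) {
        xmlFreeDoc(doc);
        return(NULL);
    }

    /* look up the application namespace by its URI, not by its prefix */
    *ns = xmlSearchNsByHref(doc, root,
            (const xmlChar *) &quot;http://www.gnome.org/some-location&quot;);
    if ((*ns == NULL) || (root-&gt;ns != *ns) ||
        (xmlStrcmp(root-&gt;name, (const xmlChar *) &quot;Helping&quot;) != 0)) {
        xmlFreeDoc(doc);
        return(NULL);
    }
    return(doc);
}</pre>
<p>Searching by URI rather than by prefix keeps the check valid even if the
document binds the namespace to a prefix other than <em>gjob</em>.</p>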
<p>The suggested principle is to be tolerant with respect to the input
structure. For example, the ordering of the attributes is not significant;
the XML specification is clear about it. It's also usually a good idea not to
depend on the order of the children of a given node, unless doing so really
makes things harder. Here is some code to parse the information for a person:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;libxml/tree.h&gt;

/* trace macro used by the parsing functions; assumed here to just print */
#define DEBUG(x) printf(x)

/*
 * A person record
 */
typedef struct person {
    char *name;
    char *email;
    char *company;
    char *organisation;
    char *smail;
    char *webPage;
    char *phone;
} person, *personPtr;

/*
 * And the code needed to parse it
 */
personPtr parsePerson(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    personPtr ret = NULL;

    DEBUG(&quot;parsePerson\n&quot;);
    /*
     * allocate the struct
     */
    ret = (personPtr) malloc(sizeof(person));
    if (ret == NULL) {
        fprintf(stderr, &quot;out of memory\n&quot;);
        return(NULL);
    }
    memset(ret, 0, sizeof(person));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {
        /* node names are xmlChar strings, hence the casts */
        if ((!strcmp((const char *) cur-&gt;name, &quot;Person&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;name = (char *) xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!strcmp((const char *) cur-&gt;name, &quot;Email&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;email = (char *) xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre>
<p>Here are a couple of things to notice:</p>
<ul>
<li>Usually a recursive parsing style is the most convenient one: XML data
  is by nature subject to repetitive constructs and usually exhibits highly
  structured patterns.</li>
<li>The two arguments of type <em>xmlDocPtr</em> and <em>xmlNsPtr</em> are
  the pointer to the global XML document and the namespace reserved for
  the application. Document-wide information is needed, for example, to
  decode entities, and it's good coding practice to define a namespace for
  your application's set of data and test that the elements and attributes
  you're analyzing actually pertain to your application space. This is
  done by a simple equality test (cur-&gt;ns == ns).</li>
<li>To retrieve text and attribute values, you can use the function
  <em>xmlNodeListGetString</em> to gather all the text and entity reference
  nodes generated by the DOM output and produce a single text string.</li>
</ul>
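<p>The namespace test and the text retrieval described in the list above
recur for every element, so they can be factored out. The following is only
a sketch: the helper names <em>isElement</em> and <em>elementText</em> are
hypothetical and not part of libxml. The sketch also checks the node type,
since the tree may contain text nodes between the elements.</p>
<pre>#include &lt;libxml/tree.h&gt;

/*
 * Hypothetical helper: true if cur is an element called name that
 * belongs to the namespace ns.
 */
static int isElement(xmlNodePtr cur, xmlNsPtr ns, const char *name) {
    return((cur != NULL) &amp;&amp; (cur-&gt;type == XML_ELEMENT_NODE) &amp;&amp;
           (cur-&gt;ns == ns) &amp;&amp;
           (xmlStrcmp(cur-&gt;name, (const xmlChar *) name) == 0));
}

/*
 * Hypothetical helper: return the text content of an element as a newly
 * allocated string with entities substituted, or NULL if the element
 * has no children. The caller releases the result with xmlFree().
 */
static xmlChar *elementText(xmlDocPtr doc, xmlNodePtr cur) {
    return(xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1));
}</pre>
<p>With such helpers, each test in a parsing loop could be reduced to a call
like isElement(cur, ns, &quot;Person&quot;) followed by elementText(doc, cur).</p>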
<p>Here is another piece of code used to parse another level of the
structure:</p>
<pre>#include &lt;libxml/tree.h&gt;
/*
 * a Description for a Job
 */
typedef struct job {
    char *projectID;
    char *application;
    char *category;
    personPtr contact;
    int nbDevelopers;
    personPtr developers[100]; /* using dynamic alloc is left as an exercise */
} job, *jobPtr;

/*
 * And the code needed to parse it
 */
jobPtr parseJob(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    jobPtr ret = NULL;

    DEBUG(&quot;parseJob\n&quot;);
    /*
     * allocate the struct
     */
    ret = (jobPtr) malloc(sizeof(job));
    if (ret == NULL) {
        fprintf(stderr, &quot;out of memory\n&quot;);
        return(NULL);
    }
    memset(ret, 0, sizeof(job));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {

        if ((!strcmp((const char *) cur-&gt;name, &quot;Project&quot;)) &amp;&amp; (cur-&gt;ns == ns)) {
            ret-&gt;projectID = (char *) xmlGetProp(cur, (const xmlChar *) &quot;ID&quot;);
            if (ret-&gt;projectID == NULL) {
                fprintf(stderr, &quot;Project has no ID\n&quot;);
            }
        }
        if ((!strcmp((const char *) cur-&gt;name, &quot;Application&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;application = (char *) xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!strcmp((const char *) cur-&gt;name, &quot;Category&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;category = (char *) xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!strcmp((const char *) cur-&gt;name, &quot;Contact&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;contact = parsePerson(doc, ns, cur);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre>
<p>Once you are used to it, writing this kind of code is quite simple, but
boring. Ultimately, it could be possible to write stubbers that take either C
data structure definitions, a set of XML examples or an XML DTD and produce
the code needed to import and export the content between C data and XML
storage. This is left as an exercise to the reader :-)</p>
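<p>Short of a real code generator, one way to cut down on the repetition is
a table that maps element names to structure fields. The sketch below is a
hand-written, table-driven alternative, not a stubber; the names
<em>fieldMap</em>, <em>personFields</em> and <em>setPersonField</em> are
hypothetical, and the person record is repeated so that the example stands
alone. It uses only the standard C library.</p>
<pre>#include &lt;stddef.h&gt;
#include &lt;string.h&gt;

/* same layout as the person record used by parsePerson */
typedef struct person {
    char *name;
    char *email;
    char *company;
    char *organisation;
    char *smail;
    char *webPage;
    char *phone;
} person, *personPtr;

/* one row per element: its XML name and where its text is stored */
typedef struct fieldMap {
    const char *element;
    size_t offset;
} fieldMap;

static const fieldMap personFields[] = {
    { &quot;Person&quot;,       offsetof(person, name) },
    { &quot;Email&quot;,        offsetof(person, email) },
    { &quot;Company&quot;,      offsetof(person, company) },
    { &quot;Organisation&quot;, offsetof(person, organisation) },
    { &quot;Snailmail&quot;,    offsetof(person, smail) },
    { &quot;Webpage&quot;,      offsetof(person, webPage) },
    { &quot;Phone&quot;,        offsetof(person, phone) },
};

/*
 * Store value in the field matching element. Returns 1 if the element
 * is known and 0 otherwise; unknown elements are simply ignored, which
 * keeps the parser tolerant of extra input. The pointer is stored as
 * is, so ownership of value passes to the record.
 */
static int setPersonField(personPtr p, const char *element, char *value) {
    size_t i;

    for (i = 0; i &lt; sizeof(personFields) / sizeof(personFields[0]); i++) {
        if (strcmp(personFields[i].element, element) == 0) {
            *(char **) ((char *) p + personFields[i].offset) = value;
            return(1);
        }
    }
    return(0);
}</pre>
<p>A parsing loop would then call setPersonField once per child element
instead of spelling out one test per field, and the same table could drive
the export back to XML.</p>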
<p>Feel free to use <a href="example/gjobread.c">the code for the full C
parsing example</a> as a template. It is also available with a Makefile in
the Gnome CVS base under gnome-xml/example.</p>
<p><a href="mailto:daniel@veillard.com">Daniel Veillard</a></p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>