<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<style type="text/css"><!--
TD {font-size: 10pt; font-family: Verdana,Arial,Helvetica}
BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; margin-left: 0pt; margin-right: 0pt}
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
--></style>
<title>A real example</title>
</head>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="smallfootonly.gif" alt="Gnome Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>A real example</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li><a href="xml.html">flat page</a></li>
</ul></td></tr>
</table></td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Here is a real-size example, where the actual content of the application
data is not kept in the DOM tree but in internal structures. It is based on
a proposal to keep a database of jobs related to Gnome, with an XML-based
storage structure. Here is an <a href="gjobs.xml">XML-encoded jobs
base</a>:</p>
<pre>&lt;?xml version=&quot;1.0&quot;?&gt;
&lt;gjob:Helping xmlns:gjob=&quot;http://www.gnome.org/some-location&quot;&gt;
  &lt;gjob:Jobs&gt;

    &lt;gjob:Job&gt;
      &lt;gjob:Project ID=&quot;3&quot;/&gt;
      &lt;gjob:Application&gt;GBackup&lt;/gjob:Application&gt;
      &lt;gjob:Category&gt;Development&lt;/gjob:Category&gt;

      &lt;gjob:Update&gt;
        &lt;gjob:Status&gt;Open&lt;/gjob:Status&gt;
        &lt;gjob:Modified&gt;Mon, 07 Jun 1999 20:27:45 -0400 MET DST&lt;/gjob:Modified&gt;
        &lt;gjob:Salary&gt;USD 0.00&lt;/gjob:Salary&gt;
      &lt;/gjob:Update&gt;

      &lt;gjob:Developers&gt;
        &lt;gjob:Developer&gt;
        &lt;/gjob:Developer&gt;
      &lt;/gjob:Developers&gt;

      &lt;gjob:Contact&gt;
        &lt;gjob:Person&gt;Nathan Clemons&lt;/gjob:Person&gt;
        &lt;gjob:Email&gt;nathan@windsofstorm.net&lt;/gjob:Email&gt;
        &lt;gjob:Company&gt;
        &lt;/gjob:Company&gt;
        &lt;gjob:Organisation&gt;
        &lt;/gjob:Organisation&gt;
        &lt;gjob:Webpage&gt;
        &lt;/gjob:Webpage&gt;
        &lt;gjob:Snailmail&gt;
        &lt;/gjob:Snailmail&gt;
        &lt;gjob:Phone&gt;
        &lt;/gjob:Phone&gt;
      &lt;/gjob:Contact&gt;

      &lt;gjob:Requirements&gt;
      The program should be released as free software, under the GPL.
      &lt;/gjob:Requirements&gt;

      &lt;gjob:Skills&gt;
      &lt;/gjob:Skills&gt;

      &lt;gjob:Details&gt;
      A GNOME based system that will allow a superuser to configure
      compressed and uncompressed files and/or file systems to be backed
      up with a supported media in the system. This should be able to
      perform via find commands generating a list of files that are passed
      to tar, dd, cpio, cp, gzip, etc., to be directed to the tape machine
      or via operations performed on the filesystem itself. Email
      notification and GUI status display very important.
      &lt;/gjob:Details&gt;

    &lt;/gjob:Job&gt;

  &lt;/gjob:Jobs&gt;
&lt;/gjob:Helping&gt;</pre>
<p>While loading the XML file into an internal DOM tree is a matter of
calling only a couple of functions, browsing the tree to gather the data and
generate the internal structures is harder and more error-prone.</p>
<p>The suggested principle is to be tolerant with respect to the input
structure. For example, the ordering of the attributes is not significant;
the XML specification is clear about it. It's also usually a good idea not to
depend on the order of the children of a given node, unless ignoring that
order really makes things harder. Here is some code to parse the information
for a person:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;libxml/parser.h&gt;
#include &lt;libxml/tree.h&gt;

/* simple trace macro; adjust or remove as needed */
#define DEBUG(x) printf(x)

/*
 * A person record
 */
typedef struct person {
    char *name;
    char *email;
    char *company;
    char *organisation;
    char *smail;
    char *webPage;
    char *phone;
} person, *personPtr;

/*
 * And the code needed to parse it
 */
personPtr parsePerson(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    personPtr ret = NULL;

    DEBUG(&quot;parsePerson\n&quot;);
    /*
     * allocate the struct
     */
    ret = (personPtr) malloc(sizeof(person));
    if (ret == NULL) {
        fprintf(stderr,&quot;out of memory\n&quot;);
        return(NULL);
    }
    memset(ret, 0, sizeof(person));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {
        /*
         * Node names are xmlChar strings, so compare them with xmlStrcmp.
         * The strings returned here must later be released with xmlFree().
         */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Person&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;name = (char *) xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Email&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;email = (char *) xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre>
<p>Here are a couple of things to notice:</p>
<ul>
<li>Usually a recursive parsing style is the most convenient one: XML data
    is by nature subject to repetitive constructs and usually exhibits highly
    structured patterns.</li>
<li>Note the two arguments of type <em>xmlDocPtr</em> and <em>xmlNsPtr</em>,
    i.e. the pointer to the global XML document and the namespace reserved for
    the application. Document-wide information is needed, for example, to
    decode entities, and it is good coding practice to define a namespace for
    your application's set of data and test that the elements and attributes
    you are analyzing actually pertain to your application space. This is
    done by a simple equality test (cur-&gt;ns == ns).</li>
<li>To retrieve text and attribute values, you can use the function
    <em>xmlNodeListGetString</em> to gather all the text and entity reference
    nodes generated by the DOM output and produce a single text string.</li>
</ul>
<p>Here is another piece of code used to parse another level of the
structure:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;libxml/tree.h&gt;
/*
 * a Description for a Job
 */
typedef struct job {
    char *projectID;
    char *application;
    char *category;
    personPtr contact;
    int nbDevelopers;
    personPtr developers[100]; /* using dynamic alloc is left as an exercise */
} job, *jobPtr;

/*
 * And the code needed to parse it
 */
jobPtr parseJob(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    jobPtr ret = NULL;

    DEBUG(&quot;parseJob\n&quot;);
    /*
     * allocate the struct
     */
    ret = (jobPtr) malloc(sizeof(job));
    if (ret == NULL) {
        fprintf(stderr,&quot;out of memory\n&quot;);
        return(NULL);
    }
    memset(ret, 0, sizeof(job));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {

        /* Names and attribute values are xmlChar strings; free them with xmlFree() */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Project&quot;)) &amp;&amp; (cur-&gt;ns == ns)) {
            ret-&gt;projectID = (char *) xmlGetProp(cur, (const xmlChar *) &quot;ID&quot;);
            if (ret-&gt;projectID == NULL) {
                fprintf(stderr, &quot;Project has no ID\n&quot;);
            }
        }
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Application&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;application = (char *) xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Category&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;category = (char *) xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Contact&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;contact = parsePerson(doc, ns, cur);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre>
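<p>The job record above uses a fixed-size array for its developers and leaves
dynamic allocation as an exercise. One possible approach is sketched below. It
is only an illustration under stated assumptions: the names <em>devList</em>,
<em>devListAdd</em> and <em>devListFree</em> are hypothetical and not part of
the library or of the example code, and the person record is reduced to a
small stand-in so that the sketch compiles on its own.</p>

```c
#include <stdlib.h>

/* Hypothetical stand-in for the person record defined earlier. */
typedef struct person {
    char *name;
    char *email;
} person, *personPtr;

/* Hypothetical growable list that could replace the fixed-size array. */
typedef struct devList {
    personPtr *items;   /* heap-allocated array of person pointers */
    int nb;             /* number of entries in use */
    int max;            /* number of entries allocated */
} devList;

/*
 * Append a developer, doubling the storage when it is full.
 * Returns 0 on success, -1 if memory could not be allocated
 * (in which case the list is left unchanged).
 */
int devListAdd(devList *list, personPtr dev) {
    if (list->nb >= list->max) {
        int newMax = (list->max == 0) ? 4 : list->max * 2;
        personPtr *tmp = realloc(list->items, newMax * sizeof(personPtr));
        if (tmp == NULL)
            return(-1);
        list->items = tmp;
        list->max = newMax;
    }
    list->items[list->nb++] = dev;
    return(0);
}

/* Release the array itself; the person records remain owned by the caller. */
void devListFree(devList *list) {
    free(list->items);
    list->items = NULL;
    list->nb = 0;
    list->max = 0;
}
```

<p>A list must start out zeroed (items NULL, nb and max 0), which the memset
call already used when allocating a record would provide.</p>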
<p>Once you are used to it, writing this kind of code is quite simple, but
boring. Ultimately, it could be possible to write stubbers taking either C
data structure definitions, a set of XML examples or an XML DTD and producing
the code needed to import and export the content between C data and XML
storage. This is left as an exercise to the reader :-)</p>
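<p>Short of a full code generator, some of the repetition can be removed by
driving the parsing from a table that maps element names to structure fields.
The sketch below illustrates that idea only; it does not use the library. The
<em>miniNode</em> type is a hypothetical, simplified stand-in for a tree node,
and <em>fieldMap</em>, <em>personMap</em>, <em>dupString</em> and
<em>fillPerson</em> are likewise names invented for this illustration.</p>

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplified node: a stand-in for the library's tree node. */
typedef struct miniNode {
    const char *name;          /* element name */
    const char *text;          /* text content, or NULL */
    struct miniNode *next;     /* next sibling */
} miniNode;

/* Reduced person record, enough for the illustration. */
typedef struct person {
    char *name;
    char *email;
} person;

/* One table row: which element fills which string field. */
typedef struct fieldMap {
    const char *element;
    size_t offset;
} fieldMap;

static const fieldMap personMap[] = {
    { "Person", offsetof(person, name) },
    { "Email",  offsetof(person, email) },
};

/* Copy a string into freshly allocated memory (NULL on failure). */
static char *dupString(const char *s) {
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return(copy);
}

/*
 * Fill a zero-initialized person record from a sibling list.
 * Children may come in any order and unknown elements are skipped,
 * in keeping with the tolerant style suggested earlier.
 */
void fillPerson(person *ret, const miniNode *cur) {
    size_t i;

    for (; cur != NULL; cur = cur->next) {
        if (cur->name == NULL || cur->text == NULL)
            continue;
        for (i = 0; i < sizeof(personMap) / sizeof(personMap[0]); i++) {
            if (!strcmp(cur->name, personMap[i].element)) {
                char **field = (char **) ((char *) ret + personMap[i].offset);
                free(*field);  /* a repeated element replaces the earlier value */
                *field = dupString(cur->text);
            }
        }
    }
}
```

<p>Supporting a new field then means adding one row to the table rather than
another hand-written test, and a generator could emit such tables from the
structure definitions.</p>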
<p>Feel free to use <a href="example/gjobread.c">the code for the full C
parsing example</a> as a template; it is also available with a Makefile in the
Gnome CVS base under gnome-xml/example.</p>
<p><a href="mailto:daniel@veillard.com">Daniel Veillard</a></p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>