<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<link rel="SHORTCUT ICON" href="/favicon.ico">
<style type="text/css"><!--
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>Entities or no entities</title>
</head>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="gnome2.png" alt="Gnome2 Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a><div align="left"><a href="http://xmlsoft.org/"><img src="Libxml2-Logo-180x168.gif" alt="Made with Libxml2 Logo"></a></div>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>Entities or no entities</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd">
<form action="search.php" enctype="application/x-www-form-urlencoded" method="GET">
<input name="query" type="TEXT" size="20" value=""><input name="submit" type="submit" value="Search ...">
</form>
<ul>
<li><a href="index.html">Home</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XMLinfo.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="python.html">Python and bindings</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="threads.html">Thread safety</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="xmlreader.html">The Reader Interface</a></li>
<li><a href="tutorial/index.html">Tutorial</a></li>
<li><a href="guidelines.html">XML Guidelines</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul>
</td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul>
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://phd.cs.unibo.it/gdome2/">DOM gdome2</a></li>
<li><a href="http://www.aleksey.com/xmlsec/">XML-DSig xmlsec</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a href="http://www.zlatkovic.com/projects/libxml/">Windows binaries</a></li>
<li><a href="http://garypennington.net/libxml2/">Solaris binaries</a></li>
<li><a href="http://www.zveno.com/open_source/libxml2xslt.html">MacOsX binaries</a></li>
<li><a href="http://sourceforge.net/projects/libxml2-pas/">Pascal bindings</a></li>
<li><a href="http://bugzilla.gnome.org/buglist.cgi?product=libxml&amp;product=libxml2">Bug Tracker</a></li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>API Indexes</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul>
<li><a href="APIchunk0.html">Alphabetic</a></li>
<li><a href="APIconstructors.html">Constructors</a></li>
<li><a href="APIfunctions.html">Functions/Types</a></li>
<li><a href="APIfiles.html">Modules</a></li>
<li><a href="APIsymbols.html">Symbols</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Entities are, in principle, similar to simple C macros. An entity defines an
abbreviation for a given string that you can reuse many times throughout the
content of your document. Entities are especially useful when a given string
occurs frequently within a document, or when you want to confine a change to
a restricted area in the internal subset of the document (at the
beginning). Example:</p>
<pre>1 &lt;?xml version=&quot;1.0&quot;?&gt;
2 &lt;!DOCTYPE EXAMPLE SYSTEM &quot;example.dtd&quot; [
3 &lt;!ENTITY xml &quot;Extensible Markup Language&quot;&gt;
4 ]&gt;
5 &lt;EXAMPLE&gt;
6 &amp;xml;
7 &lt;/EXAMPLE&gt;</pre>
<p>Line 3 declares the xml entity. Line 6 uses the xml entity by prefixing
its name with '&amp;' and following it with ';', without any added spaces. There
are 5 predefined entities in libxml that let you escape characters with a
predefined meaning in some parts of the XML document content:
<strong>&amp;lt;</strong> for the character '&lt;', <strong>&amp;gt;</strong>
for the character '&gt;', <strong>&amp;apos;</strong> for the apostrophe ('),
<strong>&amp;quot;</strong> for the character '&quot;', and
<strong>&amp;amp;</strong> for the character '&amp;'.</p>
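<p>As a rough illustration of the mapping above, the following C sketch
replaces each of the five characters with its predefined entity reference.
The helper name escape_predefined() is hypothetical and is not part of
libxml; this is a hand-written sketch, not the routine libxml itself uses
when saving a document:</p>
<pre>#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;

/* Hypothetical helper, not a libxml function: returns a newly allocated
 * copy of in with the five special characters replaced by their predefined
 * entity references. The caller frees the result; NULL means out of memory. */
char *escape_predefined(const char *in)
{
    /* Worst case: every byte expands to a 6-byte reference such as &amp;quot; */
    char *out = malloc(strlen(in) * 6 + 1);
    char *p = out;

    if (out == NULL)
        return NULL;
    for (; *in != '\0'; in++) {
        const char *rep = NULL;

        switch (*in) {
        case '&lt;':  rep = &quot;&amp;lt;&quot;;   break;
        case '&gt;':  rep = &quot;&amp;gt;&quot;;   break;
        case '&amp;':  rep = &quot;&amp;amp;&quot;;  break;
        case '&quot;':  rep = &quot;&amp;quot;&quot;; break;
        case '\'': rep = &quot;&amp;apos;&quot;; break;
        default:   break;
        }
        if (rep != NULL) {
            size_t n = strlen(rep);
            memcpy(p, rep, n);   /* copy the entity reference */
            p += n;
        } else {
            *p++ = *in;          /* ordinary character, copied as is */
        }
    }
    *p = '\0';
    return out;
}</pre>
<p>Escaping every one of the five characters is more conservative than
strictly required in all contexts, but it keeps the sketch simple.</p>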
<p>One of the problems related to entities is that you may want the parser to
substitute an entity's content so that you can see the replacement text in
your application. Or you may prefer to keep entity references as such in the
content, so that you can save the document back without losing this usually
precious information (if the user went through the pain of explicitly
defining entities, they may have a rather negative attitude if you blindly
substitute them at saving time). The <a href="html/libxml-parser.html#XMLSUBSTITUTEENTITIESDEFAULT">xmlSubstituteEntitiesDefault()</a>
function allows you to check and change this behaviour; by default, libxml
does not substitute entities.</p>
<p>Here is the DOM tree built by libxml for the previous document in the
default case:</p>
<pre>/gnome/src/gnome-xml -&gt; ./xmllint --debug test/ent1
DOCUMENT
version=1.0
   ELEMENT EXAMPLE
     TEXT
     content=
     ENTITY_REF
       INTERNAL_GENERAL_ENTITY xml
       content=Extensible Markup Language
     TEXT
     content=</pre>
<p>And here is the result when substituting entities:</p>
<pre>/gnome/src/gnome-xml -&gt; ./xmllint --debug --noent test/ent1
DOCUMENT
version=1.0
   ELEMENT EXAMPLE
     TEXT
     content= Extensible Markup Language</pre>
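<p>The difference between the two trees can also be observed from C. The
sketch below assumes a libxml2 version that provides xmlReadMemory() and the
XML_PARSE_NOENT parser option, which requests substitution for a single parse
instead of changing the global default; the helper name count_entity_refs()
and the embedded copy of the example document are illustrative only. It
counts the ENTITY_REF nodes found directly under the root element:</p>
<pre>#include &lt;string.h&gt;
#include &lt;libxml/parser.h&gt;
#include &lt;libxml/tree.h&gt;

/* The example document from this page, without the line numbers. */
static const char example_doc[] =
    &quot;&lt;?xml version=\&quot;1.0\&quot;?&gt;\n&quot;
    &quot;&lt;!DOCTYPE EXAMPLE SYSTEM \&quot;example.dtd\&quot; [\n&quot;
    &quot;&lt;!ENTITY xml \&quot;Extensible Markup Language\&quot;&gt;\n&quot;
    &quot;]&gt;\n&quot;
    &quot;&lt;EXAMPLE&gt;\n&quot;
    &quot;  &amp;xml;\n&quot;
    &quot;&lt;/EXAMPLE&gt;\n&quot;;

/* Hypothetical helper: parse xml and count the entity reference nodes that
 * are direct children of the root element. A non-zero substitute asks the
 * parser to replace entities. Returns -1 if the document cannot be parsed. */
int count_entity_refs(const char *xml, int substitute)
{
    int options = substitute ? XML_PARSE_NOENT : 0;
    xmlDocPtr doc = xmlReadMemory(xml, (int) strlen(xml), NULL, NULL, options);
    xmlNodePtr root, cur;
    int count = 0;

    if (doc == NULL)
        return -1;
    root = xmlDocGetRootElement(doc);
    for (cur = (root != NULL) ? root-&gt;children : NULL; cur != NULL; cur = cur-&gt;next) {
        if (cur-&gt;type == XML_ENTITY_REF_NODE)
            count++;
    }
    xmlFreeDoc(doc);
    return count;
}</pre>
<p>With the example document, one would expect count_entity_refs(example_doc, 0)
to find the single reference to the xml entity and
count_entity_refs(example_doc, 1) to find none, matching the two dumps. The
external subset example.dtd is not loaded here, since no DTD loading option is
passed.</p>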
<p>So, entities or no entities? Basically, it depends on your use case. I
suggest that you keep the non-substituting default behaviour and avoid using
entities in your XML document or data if you are not willing to handle the
entity reference nodes in the DOM tree.</p>
<p>Note that at save time libxml converts characters to the predefined
entities where necessary to prevent well-formedness problems. At parse time
it transparently replaces the predefined entities with the corresponding
characters (i.e. it will not generate entity reference nodes in the DOM tree
or call the reference() SAX callback when finding them in the input).</p>
<p>
<span style="background-color: #FF0000">WARNING</span>: handling entities
on top of the libxml SAX interface is difficult! If you plan to use
non-predefined entities in your documents, then the learning curve to handle
them using the SAX API may be long. If you plan to use complex documents, I
strongly suggest you consider using the DOM interface instead and let libxml
deal with the complexity rather than trying to do it yourself.</p>
<p><a href="bugs.html">Daniel Veillard</a></p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>