TinyXML-2
=========

TinyXML-2 is a simple, small, efficient C++ XML parser that can be
easily integrated into other programs.

The master is hosted on GitHub:
https://github.com/leethomason/tinyxml2

The online HTML version of these docs:
http://grinninglizard.com/tinyxml2docs/index.html

Examples are in the "related pages" tab of the HTML docs.

What it does.
-------------

In brief, TinyXML-2 parses an XML document, and builds from that a
Document Object Model (DOM) that can be read, modified, and saved.

XML stands for "eXtensible Markup Language." It is a general-purpose,
human- and machine-readable markup language for describing arbitrary data.
All those random file formats created to store application data can
be replaced with XML. One parser for everything.

http://en.wikipedia.org/wiki/XML

There are different ways to access and interact with XML data.
TinyXML-2 uses a Document Object Model (DOM), meaning the XML data is parsed
into C++ objects that can be browsed and manipulated, and then
written to disk or another output stream. You can also construct an XML document
from scratch with C++ objects and write this to disk or another output
stream. You can even use TinyXML-2 to stream XML programmatically from
code without creating a document first.

TinyXML-2 is designed to be easy and fast to learn. It is one header and
one cpp file. Simply add these to your project and off you go.
There is an example file - xmltest.cpp - to get you started.

TinyXML-2 is released under the zlib license,
so you can use it in open source or commercial code. The details
of the license are at the top of every source file.

TinyXML-2 attempts to be a flexible parser, but with truly correct and
compliant XML output. TinyXML-2 should compile on any reasonably C++
compliant system. It does not rely on exceptions, RTTI, or the STL.

What it doesn’t do.
-------------------

TinyXML-2 doesn't parse or use DTDs (Document Type Definitions) or XSLs
(eXtensible Stylesheet Language). There are other parsers out there
that are much more fully featured. But they are also much bigger, take
longer to set up in your project, have a steeper learning curve, and often
have a more restrictive license. If you are working with browsers or have
more complete XML needs, TinyXML-2 is not the parser for you.

TinyXML-1 vs. TinyXML-2
-----------------------

Which should you use? TinyXML-2 uses a similar API to TinyXML-1 and the same
rich test cases. But the implementation of the parser is completely re-written
to make it more appropriate for use in a game. It uses less memory, is faster,
and uses far fewer memory allocations.

TinyXML-2 has no requirement for STL, but has also dropped all STL support. All
strings are queried and set as 'const char*'. This allows the use of internal
allocators, and keeps the code much simpler.

Both parsers:

1. Simple to use with similar APIs.
2. DOM based parser.
3. UTF-8 Unicode support. http://en.wikipedia.org/wiki/UTF-8

Advantages of TinyXML-2

1. The focus of all future dev.
2. Many fewer memory allocations (1/10th to 1/100th), uses less memory
   (about 40% of TinyXML-1), and faster.
3. No STL requirement.
4. More modern C++, including a proper namespace.
5. Proper and useful handling of whitespace.

Advantages of TinyXML-1

1. Can report the location of parsing errors.
2. Support for some C++ STL conventions: streams and strings.
3. Very mature and well debugged code base.

Features
--------

### Memory Model

An XMLDocument is a C++ object like any other, that can be on the stack, or
new'd and deleted on the heap.

However, any sub-node of the Document, XMLElement, XMLText, etc, can only
be created by calling the appropriate XMLDocument::NewElement, NewText, etc.
method. Although you have pointers to these objects, they are still owned
by the Document. When the Document is deleted, so are all the nodes it contains.

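The ownership rule can be illustrated with a small, self-contained sketch. The `Document` and `Node` classes below are hypothetical stand-ins written for this illustration; they are not TinyXML-2's classes and say nothing about how its internal allocators work. They only show the pattern: nodes can be created solely through a factory method on the document, and the pointer handed back is borrowed, so the caller never deletes it.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class Document;

// Hypothetical node type: its constructor is private, so only a
// Document can create one.
class Node {
public:
    const std::string& Name() const { return name_; }

private:
    friend class Document;
    explicit Node(std::string name) : name_(std::move(name)) {}
    std::string name_;
};

// Hypothetical document type that owns every node it creates.
class Document {
public:
    // Factory method: the returned pointer is borrowed. The Document
    // keeps ownership and frees the node in its own destructor.
    Node* NewElement(const std::string& name) {
        nodes_.push_back(std::unique_ptr<Node>(new Node(name)));
        return nodes_.back().get();
    }

    std::size_t NodeCount() const { return nodes_.size(); }

private:
    std::vector<std::unique_ptr<Node>> nodes_;
};
```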
### White Space

Microsoft has an excellent article on white space: http://msdn.microsoft.com/en-us/library/ms256097.aspx

By default, TinyXML-2 preserves white space in a (hopefully) sane way that is almost compliant with the
spec. (TinyXML-1 used a completely outdated model.)

As a first step, all newlines / carriage-returns / line-feeds are normalized to a
line-feed character, as required by the XML spec.

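A minimal sketch of that normalization step, assuming the input is already in memory as a `std::string`. `NormalizeNewlines` is a hypothetical helper written for illustration; it is not TinyXML-2's code, and it uses the STL, which TinyXML-2 itself avoids.

```cpp
#include <cstddef>
#include <string>

// Convert every "\r\n" pair and every lone '\r' into a single '\n'.
std::string NormalizeNewlines(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\r') {
            out.push_back('\n');
            // Skip the '\n' of a "\r\n" pair so it is not emitted twice.
            if (i + 1 < in.size() && in[i + 1] == '\n') {
                ++i;
            }
        } else {
            out.push_back(in[i]);
        }
    }
    return out;
}
```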
White space in text is preserved. For example:

    <element> Hello,  World</element>

The leading space before the "Hello" and the double space after the comma are
preserved. Line-feeds are preserved, as in this example:

    <element> Hello again,
              World</element>

However, white space between elements is **not** preserved. Although not strictly
compliant, tracking and reporting inter-element space is awkward, and not normally
valuable. TinyXML-2 sees these as the same XML:

    <document>
        <data>1</data>
        <data>2</data>
        <data>3</data>
    </document>

    <document><data>1</data><data>2</data><data>3</data></document>

#### Whitespace Collapse

For some applications, it is preferable to collapse whitespace. TinyXML-2
supports this with the 'whitespace' parameter to the XMLDocument constructor.
(The default is to preserve whitespace, as described above.)

However, you may also use COLLAPSE_WHITESPACE, which will:

* Remove leading and trailing whitespace
* Convert newlines and line-feeds into a space character
* Collapse a run of any number of space characters into a single space character

This can be useful for text documents stored in XML.

Note that (currently) there is a performance impact for using COLLAPSE_WHITESPACE.
It essentially causes the XML to be parsed twice.

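The three collapse rules listed above can be sketched in a single pass over a string. `CollapseWhitespace` is a hypothetical standalone helper, not the routine TinyXML-2 uses; it assumes that "whitespace" means whatever `std::isspace` accepts in the default C locale, which may be broader than what the parser treats as white space.

```cpp
#include <cctype>
#include <string>

// Trim leading and trailing whitespace, and replace every interior run of
// whitespace (spaces, tabs, newlines) with exactly one space character.
std::string CollapseWhitespace(const std::string& in) {
    std::string out;
    bool pendingSpace = false;
    for (unsigned char c : in) {
        if (std::isspace(c)) {
            // Remember the gap, but never before the first visible character.
            pendingSpace = !out.empty();
        } else {
            // Emit the remembered gap only now that more text follows,
            // which is what drops trailing whitespace.
            if (pendingSpace) {
                out.push_back(' ');
                pendingSpace = false;
            }
            out.push_back(static_cast<char>(c));
        }
    }
    return out;
}
```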
### Entities

TinyXML-2 recognizes the pre-defined "character entities", meaning special
characters. Namely:

    &amp;   &
    &lt;    <
    &gt;    >
    &quot;  "
    &apos;  '

These are recognized when the XML document is read, and translated to their
UTF-8 equivalents. For instance, text with the XML of:

    Far &amp; Away

will have the Value() of "Far & Away" when queried from the XMLText object,
and will be written back to the XML stream/file as the "&amp;" entity.

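The round trip described here, entity to character on read and character back to entity on write, can be sketched as follows. `DecodeEntities` and `EncodeEntities` are hypothetical helpers for illustration only. As a simplifying assumption, the encoder escapes all five characters everywhere, which is not necessarily what TinyXML-2 does for text versus attribute values.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

struct Entity {
    const char* name;   // entity as it appears in the XML
    char value;         // character it stands for
};

static const Entity kEntities[] = {
    {"&amp;", '&'}, {"&lt;", '<'}, {"&gt;", '>'}, {"&quot;", '"'}, {"&apos;", '\''}
};

// Reading: replace each pre-defined entity with its character.
// Unrecognized "&...;" sequences are copied through unchanged.
std::string DecodeEntities(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        bool matched = false;
        if (in[i] == '&') {
            for (const Entity& e : kEntities) {
                const std::size_t len = std::strlen(e.name);
                if (in.compare(i, len, e.name) == 0) {
                    out.push_back(e.value);
                    i += len;
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) {
            out.push_back(in[i]);
            ++i;
        }
    }
    return out;
}

// Writing: replace each special character with its entity.
std::string EncodeEntities(const std::string& in) {
    std::string out;
    for (char c : in) {
        const char* replacement = nullptr;
        for (const Entity& e : kEntities) {
            if (e.value == c) {
                replacement = e.name;
                break;
            }
        }
        if (replacement) {
            out += replacement;
        } else {
            out.push_back(c);
        }
    }
    return out;
}
```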
Additionally, any character can be specified by its Unicode code point:
The syntax "&#xA0;" or "&#160;" both refer to the non-breaking space character.
This is called a 'numeric character reference'. Any numeric character reference
that isn't one of the special entities above will be read, but written as a
regular code point. The output is correct, but the entity syntax isn't preserved.

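To make "written as a regular code point" concrete, here is a sketch of how a numeric character reference maps to UTF-8 bytes. `EncodeUtf8` and `DecodeNumericReference` are hypothetical helpers, not TinyXML-2 functions. The sketch assumes the whole reference, from `&#` through `;`, is passed in, and it returns an empty string for anything malformed or outside the Unicode range.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

// Encode one Unicode code point as UTF-8. Returns an empty string for
// surrogates and for values above U+10FFFF.
std::string EncodeUtf8(unsigned long cp) {
    std::string out;
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return out;  // surrogate halves are not valid characters
        }
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp <= 0x10FFFF) {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
    return out;
}

// Decode a reference of the form "&#160;" (decimal) or "&#xA0;" (hex).
std::string DecodeNumericReference(const std::string& ref) {
    if (ref.size() < 4 || ref.compare(0, 2, "&#") != 0 || ref.back() != ';') {
        return "";
    }
    const bool hex = (ref[2] == 'x');
    const std::string digits =
        ref.substr(hex ? 3 : 2, ref.size() - (hex ? 4 : 3));
    if (digits.empty()) {
        return "";
    }
    // Accept digits only, so signs, spaces, and "0x" prefixes are rejected.
    for (unsigned char c : digits) {
        if (!(hex ? std::isxdigit(c) : std::isdigit(c))) {
            return "";
        }
    }
    const unsigned long cp = std::strtoul(digits.c_str(), nullptr, hex ? 16 : 10);
    return EncodeUtf8(cp);
}
```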
### Printing

#### Print to file
You can directly use the convenience function:

    XMLDocument doc;
    ...
    doc.SaveFile( "foo.xml" );

Or the XMLPrinter class:

    // fp is a FILE* opened for writing
    XMLPrinter printer( fp );
    doc.Print( &printer );

#### Print to memory
Printing to memory is supported by the XMLPrinter.

    XMLPrinter printer;
    doc.Print( &printer );
    // printer.CStr() has a const char* to the XML

#### Print without an XMLDocument

When loading, an XML parser is very useful. However, sometimes
when saving, it just gets in the way. The code is often set up
for streaming, and constructing the DOM is just overhead.

The Printer supports the streaming case. The following code
prints out a trivially simple XML file without ever creating
an XML document.

    XMLPrinter printer( fp );
    printer.OpenElement( "foo" );
    printer.PushAttribute( "foo", "bar" );
    printer.CloseElement();

Examples
--------

#### Load and parse an XML file.

    /* ------ Example 1: Load and parse an XML file. ---- */
    {
        XMLDocument doc;
        doc.LoadFile( "dream.xml" );
    }

#### Lookup information.

    /* ------ Example 2: Lookup information. ---- */
    {
        XMLDocument doc;
        doc.LoadFile( "dream.xml" );

        // Structure of the XML file:
        // - Element "PLAY"      the root Element, which is the
        //                       FirstChildElement of the Document
        // - - Element "TITLE"   child of the root PLAY Element
        // - - - Text            child of the TITLE Element

        // Navigate to the title, using the convenience function,
        // with a dangerous lack of error checking.
        const char* title = doc.FirstChildElement( "PLAY" )->FirstChildElement( "TITLE" )->GetText();
        printf( "Name of play (1): %s\n", title );

        // Text is just another Node to TinyXML-2. The more
        // general way to get to the XMLText:
        XMLText* textNode = doc.FirstChildElement( "PLAY" )->FirstChildElement( "TITLE" )->FirstChild()->ToText();
        title = textNode->Value();
        printf( "Name of play (2): %s\n", title );
    }

Using and Installing
--------------------

There are 2 files in TinyXML-2:

* tinyxml2.cpp
* tinyxml2.h

And additionally a test file:

* xmltest.cpp

Simply compile and run. There is a Visual Studio 2010 project included, a simple Makefile,
an Xcode project, and a CMake CMakeLists.txt included to help you. The top of tinyxml2.h
even has a simple g++ command line if you are on *nix and don't want to use a build system.

Documentation
-------------

The documentation is built with Doxygen, using the 'dox'
configuration file.

License
-------

TinyXML-2 is released under the zlib license:

This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any
damages arising from the use of this software.

Permission is granted to anyone to use this software for any
purpose, including commercial applications, and to alter it and
redistribute it freely, subject to the following restrictions:

1. The origin of this software must not be misrepresented; you must
not claim that you wrote the original software. If you use this
software in a product, an acknowledgment in the product documentation
would be appreciated but is not required.
2. Altered source versions must be plainly marked as such, and
must not be misrepresented as being the original software.
3. This notice may not be removed or altered from any source
distribution.

Contributors
------------

Thanks very much to everyone who sends suggestions, bugs, ideas, and
encouragement. It all helps, and makes this project fun.

The original TinyXML-1 has many contributors, who all deserve thanks
in shaping what is a very successful library. Extra thanks to Yves
Berquin and Andrew Ellerton who were key contributors.

TinyXML-2 grew from that effort. Lee Thomason is the original author
of TinyXML-2 (and TinyXML-1) but hopefully TinyXML-2 will be improved
by many contributors.