TinyXML-2
=========
![TinyXML-2 Logo](http://www.grinninglizard.com/tinyxml2/TinyXML2_small.png)

TinyXML-2 is a simple, small, efficient, C++ XML parser that can be
easily integrated into other programs.

The master is hosted on GitHub:
https://github.com/leethomason/tinyxml2

The online HTML version of these docs:
http://grinninglizard.com/tinyxml2docs/index.html

Examples are in the "related pages" tab of the HTML docs.

What it does.
-------------

In brief, TinyXML-2 parses an XML document, and builds from that a
Document Object Model (DOM) that can be read, modified, and saved.

XML stands for "eXtensible Markup Language." It is a general-purpose,
human- and machine-readable markup language to describe arbitrary data.
All those random file formats created to store application data can
be replaced with XML. One parser for everything.

http://en.wikipedia.org/wiki/XML

There are different ways to access and interact with XML data.
TinyXML-2 uses a Document Object Model (DOM), meaning the XML data is parsed
into C++ objects that can be browsed and manipulated, and then
written to disk or another output stream. You can also construct an XML document
from scratch with C++ objects and write this to disk or another output
stream. You can even use TinyXML-2 to stream XML programmatically from
code without creating a document first.

TinyXML-2 is designed to be easy and fast to learn. It is one header and
one cpp file. Simply add these to your project and off you go.
There is an example file - xmltest.cpp - to get you started.

TinyXML-2 is released under the zlib license,
so you can use it in open source or commercial code. The details
of the license are at the top of every source file.

TinyXML-2 attempts to be a flexible parser, but with truly correct and
compliant XML output. TinyXML-2 should compile on any reasonably C++
compliant system. It does not rely on exceptions, RTTI, or the STL.

What it doesn't do.
-------------------

TinyXML-2 doesn't parse or use DTDs (Document Type Definitions) or XSLs
(eXtensible Stylesheet Language). There are other parsers out there
that are much more fully featured. But they are also much bigger, take
longer to set up in your project, have a steeper learning curve, and
often have a more restrictive license. If you are working with browsers
or have more complete XML needs, TinyXML-2 is not the parser for you.

TinyXML-1 vs. TinyXML-2
-----------------------

TinyXML-2 is now the focus of all development, well tested, and your
best choice unless you have a requirement to maintain TinyXML-1 code.

TinyXML-2 uses a similar API to TinyXML-1 and the same
rich test cases. But the implementation of the parser is completely re-written
to make it more appropriate for use in a game. It uses less memory, is faster,
and uses far fewer memory allocations.

TinyXML-2 has no requirement for STL, but has also dropped all STL support. All
strings are queried and set as 'const char*'. This allows the use of internal
allocators, and keeps the code much simpler.

Both parsers:

1. Simple to use with similar APIs.
2. DOM based parser.
3. UTF-8 Unicode support. http://en.wikipedia.org/wiki/UTF-8

Advantages of TinyXML-2

1. The focus of all future dev.
2. Far fewer memory allocations (1/10th to 1/100th), less memory used
   (about 40% of TinyXML-1), and faster.
3. No STL requirement.
4. More modern C++, including a proper namespace.
5. Proper and useful handling of whitespace.

Advantages of TinyXML-1

1. Can report the location of parsing errors.
2. Support for some C++ STL conventions: streams and strings.
3. Very mature and well debugged code base.

Features
--------

### Memory Model

An XMLDocument is a C++ object like any other: it can live on the stack, or be
new'd and deleted on the heap.

However, any sub-node of the Document (XMLElement, XMLText, etc.) can only
be created by calling the appropriate XMLDocument::NewElement, NewText, etc.
method. Although you have pointers to these objects, they are still owned
by the Document. When the Document is deleted, so are all the nodes it contains.

### White Space

#### Whitespace Preservation (default)

Microsoft has an excellent article on white space: http://msdn.microsoft.com/en-us/library/ms256097.aspx

By default, TinyXML-2 preserves white space in a (hopefully) sane way that is almost compliant with the
spec. (TinyXML-1 used a completely different model, much more similar to 'collapse', below.)

As a first step, all newlines / carriage-returns / line-feeds are normalized to a
line-feed character, as required by the XML spec.
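
To make the normalization step concrete, here is a standalone sketch of the end-of-line rule in the XML spec, where a CR LF pair and a lone CR each become a single LF. `NormalizeNewlines` is a hypothetical helper written for illustration; it is not TinyXML-2's own code and does not use the library.

```cpp
#include <string>

// Hypothetical helper, not part of TinyXML-2: rewrites CR LF pairs and lone
// CR characters as a single LF, leaving every other character untouched.
std::string NormalizeNewlines( const std::string& in )
{
    std::string out;
    out.reserve( in.size() );
    for ( std::string::size_type i = 0; i < in.size(); ++i ) {
        if ( in[i] == '\r' ) {
            out.push_back( '\n' );
            // Skip the LF of a CR LF pair so it is not emitted twice.
            if ( i + 1 < in.size() && in[i + 1] == '\n' ) {
                ++i;
            }
        } else {
            out.push_back( in[i] );
        }
    }
    return out;
}
```

For example, `NormalizeNewlines( "a\r\nb\rc" )` yields `"a\nb\nc"`.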

White space in text is preserved. For example:

    <element> Hello,  World</element>

The leading space before the "Hello" and the double space after the comma are
preserved. Line-feeds are preserved, as in this example:

    <element> Hello again,
              World</element>

However, white space between elements is **not** preserved. Although not strictly
compliant, tracking and reporting inter-element space is awkward, and not normally
valuable. TinyXML-2 sees these as the same XML:

    <document>
        <data>1</data>
        <data>2</data>
        <data>3</data>
    </document>

    <document><data>1</data><data>2</data><data>3</data></document>

#### Whitespace Collapse

For some applications, it is preferable to collapse whitespace. Collapsing
whitespace gives you "HTML-like" behavior, which is sometimes more suitable
for hand-typed documents.

TinyXML-2 supports this with the 'whitespace' parameter to the XMLDocument constructor.
(The default is to preserve whitespace, as described above.)

However, you may also use COLLAPSE_WHITESPACE, which will:

* Remove leading and trailing whitespace
* Convert newlines and line-feeds into a space character
* Collapse a run of any number of space characters into a single space character
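
The three rules above can be sketched as a standalone function. `CollapseWhitespace` is a hypothetical helper for illustration only, not the library's implementation; it also assumes that tabs and carriage returns count as whitespace, which the list above does not state.

```cpp
#include <string>

// Hypothetical helper mirroring the three rules listed above.
// Assumption: space, tab, CR and LF are all treated as whitespace.
std::string CollapseWhitespace( const std::string& in )
{
    std::string out;
    bool pendingSpace = false;
    for ( std::string::size_type i = 0; i < in.size(); ++i ) {
        const char c = in[i];
        const bool isSpace = ( c == ' ' || c == '\t' || c == '\n' || c == '\r' );
        if ( isSpace ) {
            // Remember the run, but never emit it before the first
            // non-space character (this drops leading whitespace).
            pendingSpace = !out.empty();
        } else {
            // A run is written as exactly one space, and only when more
            // text follows it (this drops trailing whitespace).
            if ( pendingSpace ) {
                out.push_back( ' ' );
            }
            pendingSpace = false;
            out.push_back( c );
        }
    }
    return out;
}
```

With this sketch, the text `"  Hello,\n   World  "` collapses to `"Hello, World"`.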

Note that (currently) there is a performance impact for using COLLAPSE_WHITESPACE.
It essentially causes the XML to be parsed twice.

### Entities

TinyXML-2 recognizes the pre-defined "character entities", meaning special
characters. Namely:

    &amp;   &
    &lt;    <
    &gt;    >
    &quot;  "
    &apos;  '

These are recognized when the XML document is read, and translated to their
UTF-8 equivalents. For instance, text with the XML of:

    Far &amp; Away

will have the Value() of "Far & Away" when queried from the XMLText object,
and the ampersand will be written back to the XML stream/file as `&amp;`.
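
The write-side half of that round trip can be sketched in standalone C++. `EscapeText` is a hypothetical helper, not TinyXML-2's own code, and it escapes all five pre-defined entities unconditionally, which may be more than the library's printer does in every context.

```cpp
#include <string>

// Hypothetical helper: replaces each of the five special characters with its
// pre-defined entity so the result can be embedded in XML text.
std::string EscapeText( const std::string& in )
{
    std::string out;
    out.reserve( in.size() );
    for ( std::string::size_type i = 0; i < in.size(); ++i ) {
        switch ( in[i] ) {
            case '&':  out += "&amp;";   break;
            case '<':  out += "&lt;";    break;
            case '>':  out += "&gt;";    break;
            case '"':  out += "&quot;";  break;
            case '\'': out += "&apos;";  break;
            default:   out.push_back( in[i] ); break;
        }
    }
    return out;
}
```

Here `EscapeText( "Far & Away" )` gives `"Far &amp; Away"`, the form shown in the XML above.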
Lee Thomason (grinliz)9c38d132012-02-24 21:50:50 -0800179
180Additionally, any character can be specified by its Unicode code point:
Lee Thomason (grinliz)28129862012-02-25 21:11:20 -0800181The syntax "&#xA0;" or "&#160;" are both to the non-breaking space characher.
182This is called a 'numeric character reference'. Any numeric character reference
183that isn't one of the special entities above, will be read, but written as a
184regular code point. The output is correct, but the entity syntax isn't preserved.
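
The read-side translation, including numeric character references, might look like the standalone sketch below. `DecodeEntities`, `ParseCodePoint` and `AppendUtf8` are hypothetical names used for illustration; this is not how TinyXML-2 is implemented internally, and the sketch does not reject surrogate code points or references the XML spec forbids.

```cpp
#include <string>

// Appends the UTF-8 encoding of code point `cp` (at most 0x10FFFF) to `out`.
static void AppendUtf8( unsigned long cp, std::string& out )
{
    if ( cp < 0x80 ) {
        out.push_back( static_cast<char>( cp ) );
    } else if ( cp < 0x800 ) {
        out.push_back( static_cast<char>( 0xC0 | ( cp >> 6 ) ) );
        out.push_back( static_cast<char>( 0x80 | ( cp & 0x3F ) ) );
    } else if ( cp < 0x10000 ) {
        out.push_back( static_cast<char>( 0xE0 | ( cp >> 12 ) ) );
        out.push_back( static_cast<char>( 0x80 | ( ( cp >> 6 ) & 0x3F ) ) );
        out.push_back( static_cast<char>( 0x80 | ( cp & 0x3F ) ) );
    } else {
        out.push_back( static_cast<char>( 0xF0 | ( cp >> 18 ) ) );
        out.push_back( static_cast<char>( 0x80 | ( ( cp >> 12 ) & 0x3F ) ) );
        out.push_back( static_cast<char>( 0x80 | ( ( cp >> 6 ) & 0x3F ) ) );
        out.push_back( static_cast<char>( 0x80 | ( cp & 0x3F ) ) );
    }
}

// Parses the text between '&' and ';' as a numeric reference, for example
// "#xA0" or "#160". Returns false if it is not a valid numeric reference.
static bool ParseCodePoint( const std::string& body, unsigned long& cp )
{
    if ( body.size() < 2 || body[0] != '#' ) return false;
    const bool hex = ( body[1] == 'x' );
    const unsigned long base = hex ? 16 : 10;
    std::string::size_type i = hex ? 2 : 1;
    if ( i >= body.size() ) return false;          // no digits at all
    cp = 0;
    for ( ; i < body.size(); ++i ) {
        const char c = body[i];
        unsigned long digit = 0;
        if ( c >= '0' && c <= '9' )      digit = static_cast<unsigned long>( c - '0' );
        else if ( c >= 'a' && c <= 'f' ) digit = static_cast<unsigned long>( c - 'a' ) + 10;
        else if ( c >= 'A' && c <= 'F' ) digit = static_cast<unsigned long>( c - 'A' ) + 10;
        else return false;
        if ( digit >= base ) return false;         // hex digit in a decimal reference
        cp = cp * base + digit;
        if ( cp > 0x10FFFF ) return false;         // outside the Unicode range
    }
    return true;
}

// Replaces the five pre-defined entities and numeric character references
// with their UTF-8 bytes. Anything unrecognized is copied through unchanged.
std::string DecodeEntities( const std::string& in )
{
    static const char* const kNames[]  = { "amp", "lt", "gt", "quot", "apos" };
    static const char        kValues[] = { '&',   '<',  '>',  '"',    '\'' };

    std::string out;
    std::string::size_type i = 0;
    while ( i < in.size() ) {
        std::string::size_type semi = std::string::npos;
        if ( in[i] == '&' ) semi = in.find( ';', i );
        if ( semi != std::string::npos ) {
            const std::string body = in.substr( i + 1, semi - i - 1 );
            unsigned long cp = 0;
            bool matched = false;
            if ( ParseCodePoint( body, cp ) ) {
                AppendUtf8( cp, out );
                matched = true;
            } else {
                for ( int n = 0; n < 5 && !matched; ++n ) {
                    if ( body == kNames[n] ) {
                        out.push_back( kValues[n] );
                        matched = true;
                    }
                }
            }
            if ( matched ) { i = semi + 1; continue; }
        }
        out.push_back( in[i] );
        ++i;
    }
    return out;
}
```

Both `"&#xA0;"` and `"&#160;"` decode to the same two UTF-8 bytes (0xC2 0xA0), which is why the original spelling of the reference cannot be recovered when the document is written back out.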

### Printing

#### Print to file
You can directly use the convenience function:

    XMLDocument doc;
    ...
    doc.SaveFile( "foo.xml" );

Or the XMLPrinter class:

    // fp is an open FILE*
    XMLPrinter printer( fp );
    doc.Print( &printer );

#### Print to memory
Printing to memory is supported by the XMLPrinter.

    XMLPrinter printer;
    doc.Print( &printer );
    // printer.CStr() has a const char* to the XML

#### Print without an XMLDocument

When loading, an XML parser is very useful. However, sometimes
when saving, it just gets in the way. The code is often set up
for streaming, and constructing the DOM is just overhead.

The Printer supports the streaming case. The following code
prints out a trivially simple XML file without ever creating
an XML document.

    XMLPrinter printer( fp );
    printer.OpenElement( "foo" );
    printer.PushAttribute( "foo", "bar" );
    printer.CloseElement();

Examples
--------

#### Load and parse an XML file.

    /* ------ Example 1: Load and parse an XML file. ---- */
    {
        XMLDocument doc;
        doc.LoadFile( "dream.xml" );
    }

#### Lookup information.

    /* ------ Example 2: Lookup information. ---- */
    {
        XMLDocument doc;
        doc.LoadFile( "dream.xml" );

        // Structure of the XML file:
        // - Element "PLAY"      the root Element, which is the
        //                       FirstChildElement of the Document
        // - - Element "TITLE"   child of the root PLAY Element
        // - - - Text            child of the TITLE Element

        // Navigate to the title, using the convenience function,
        // with a dangerous lack of error checking.
        const char* title = doc.FirstChildElement( "PLAY" )->FirstChildElement( "TITLE" )->GetText();
        printf( "Name of play (1): %s\n", title );

        // Text is just another Node to TinyXML-2. The more
        // general way to get to the XMLText:
        XMLText* textNode = doc.FirstChildElement( "PLAY" )->FirstChildElement( "TITLE" )->FirstChild()->ToText();
        title = textNode->Value();
        printf( "Name of play (2): %s\n", title );
    }

Using and Installing
--------------------

There are 2 files in TinyXML-2:
* tinyxml2.cpp
* tinyxml2.h

And additionally a test file:
* xmltest.cpp

Simply compile and run. A Visual Studio 2010 project, a simple Makefile,
an Xcode project, a Code::Blocks project, and a CMake CMakeLists.txt are included to help you.
The top of tinyxml2.h even has a simple g++ command line if you are on *nix and don't want
to use a build system.

Documentation
-------------

The documentation is built with Doxygen, using the 'dox'
configuration file.

License
-------

TinyXML-2 is released under the zlib license:

This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any
damages arising from the use of this software.

Permission is granted to anyone to use this software for any
purpose, including commercial applications, and to alter it and
redistribute it freely, subject to the following restrictions:

1. The origin of this software must not be misrepresented; you must
not claim that you wrote the original software. If you use this
software in a product, an acknowledgment in the product documentation
would be appreciated but is not required.
2. Altered source versions must be plainly marked as such, and
must not be misrepresented as being the original software.
3. This notice may not be removed or altered from any source
distribution.

Contributors
------------

Thanks very much to everyone who sends suggestions, bugs, ideas, and
encouragement. It all helps, and makes this project fun.

The original TinyXML-1 has many contributors, who all deserve thanks
for shaping what is a very successful library. Extra thanks to Yves
Berquin and Andrew Ellerton who were key contributors.

TinyXML-2 grew from that effort. Lee Thomason is the original author
of TinyXML-2 (and TinyXML-1) but hopefully TinyXML-2 will be improved
by many contributors.

Thanks to John Mackay for the TinyXML-2 logo.