pugixml documentation

From 26ab424b0302f73704c58b3b6deb62a85bfacba8 Mon Sep 17 00:00:00 2001
From: "arseny.kapoulkine"
Date: Sun, 11 Jul 2010 15:29:31 +0000
Subject: docs: Removed old documents

git-svn-id: http://pugixml.googlecode.com/svn/trunk@592 99668b35-9821-0410-8761-19e4c4f06640
---
 docs/index.html | 818 --------------------------------------------------------
 1 file changed, 818 deletions(-)
 delete mode 100644 docs/index.html

diff --git a/docs/index.html b/docs/index.html
deleted file mode 100644
index 2c710eb..0000000
--- a/docs/index.html
+++ /dev/null
@@ -1,818 +0,0 @@

- -

- - -

Introduction

pugixml is just another XML parser. It is a successor to pugxml (to be honest, the only part left untouched is the wildcard matching code; the rest was either heavily refactored or rewritten from scratch). The main features are:

- -

low memory consumption and fragmentation (the win over pugxml is ~1.3 times, TinyXML - ~2.5 times, Xerces (DOM) - ~4.3 times ¹). Exact numbers can be seen in the Comparison with existing parsers section.
extremely high parsing speed (the win over pugxml is ~6 times, TinyXML - ~10 times, Xerces-DOM - ~17.6 times ¹)
extremely high parsing speed (well, I'm repeating myself, but it's so fast that it outperforms Expat by 2.8 times on the test XML) ²
more or less standard-conformant (it will parse any standard-compliant file correctly, with the exception of DTD-related issues)
pretty much error-ignorant (it will not choke on something like <text>You & Me</text>, as Expat will; it will parse files with data in the wrong encoding; and so on)
clean interface (a heavily refactored version of pugxml's)
more or less Unicode-aware (it assumes UTF-8 encoding of the input data, though it will readily work with ANSI; there is no UTF-16 support for now (see Future work)), with helper conversion functions (UTF-8 <-> UTF-16/32, whichever is the default for std::wstring and wchar_t)
fully standard-compliant C++ code (approved by Comeau strict mode); the library is multiplatform (see Reference for the list of platforms)
high flexibility. You can control many aspects of file parsing and DOM tree building via parsing options.

- -

Okay, you might ask - what's the catch? Everything is so cute - it's a small, fast, robust, clean solution for parsing XML. What is missing? Ok, we are fair developers - so here is a misfeature list:

- -

memory consumption. It beats every DOM-based parser that I know of - but against a SAX parser it has no chance. You can't process a 2 Gb XML file with less than 4 Gb of memory - and do it fast. Still, pugixml behaves better than all other DOM-based parsers, so if you're stuck with DOM, it's not a problem.
memory consumption. Ok, I'm repeating myself. Again. Where other parsers allow you to provide the XML file in constant storage (or even as a memory-mapped area), pugixml does not. So you'll have to copy the entire data into non-constant storage. Moreover, it must persist during the parser's lifetime (the reasons for that, and more about lifetimes, are explained below). Again, if you're ok with DOM, it should not be a problem, because the overall memory consumption is lower (though you'll need a contiguous chunk of memory, which can be a problem).
lack of validation, DTD processing, XML namespaces, and proper handling of encoding. If you need those, go take MSXML or XercesC or anything like that.
lack of UTF-16/32 parsing. This is not implemented for now, but it is a feature planned for the next release.
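
The requirement for mutable, persistent storage is typical of in-place parsing in general: the parser writes terminators into the caller's buffer and hands out pointers into it instead of copying strings. The following is a minimal standalone sketch of that idea under simplifying assumptions; it is not pugixml's code, and split_in_place is a hypothetical helper that only splits on a single delimiter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of pugixml): splits `buffer` on `delim` by
// overwriting each delimiter with '\0'. The returned pointers alias `buffer`,
// so the buffer must be writable and must outlive every returned pointer.
std::vector<const char*> split_in_place(char* buffer, char delim)
{
    assert(buffer != 0);

    std::vector<const char*> tokens;
    char* token_start = buffer;

    for (char* p = buffer; ; ++p)
    {
        if (*p == delim || *p == '\0')
        {
            bool at_end = (*p == '\0');

            *p = '\0';                     // destructive step: terminate the token in place
            tokens.push_back(token_start); // no copy of the token is made

            if (at_end) break;
            token_start = p + 1;
        }
    }

    return tokens;
}
```

Calling split_in_place on a string literal would be undefined behaviour, and freeing the buffer invalidates every token, which mirrors the two restrictions listed above.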

- -

¹ The tests were done on a 1 Mb XML file with a 4-level-deep tree and a small amount of text. The times are those of building the DOM tree. pugixml was run in the default parsing mode, so the differences in speed are even bigger with minimal settings.
² Obviously, you can't estimate the time of building a DOM tree for a SAX parser, so the times of reading the data into storage that closely represented the structure of the XML file were measured.

- - -

Quick start

- -

Here is a small collection of code snippets to help you begin using pugixml.

- -

For everything you can do with pugixml, you need a document. There are several ways to obtain it:

- -


-#include <cstring>
-#include <fstream>
-#include <iostream>
-
-#include "pugixml.hpp"
-
-using namespace std;
-using namespace pugi;
-
-int main()
-{
-    // Several ways to get XML document
-
-    {
-        // Load from string
-        xml_document doc;
-
-        cout << doc.load("<sample-xml>some text <b>in bold</b> here</sample-xml>") << endl;
-    }
-
-    {
-        // Load from file
-        xml_document doc;
-
-        cout << doc.load_file("sample.xml") << endl;
-    }
-
-    {
-        // Load from any input stream (STL)
-        xml_document doc;
-
-        std::ifstream in("sample.xml");
-        cout << doc.load(in) << endl;
-    }
-
-    {
-        // More advanced: parse the specified string without duplicating it
-        xml_document doc;
-
-        char* s = new char[100];
-        strcpy(s, "<sample-xml>some text <b>in bold</b> here</sample-xml>");
-        cout << doc.parse(transfer_ownership_tag(), s) << endl;
-    }
-
-    {
-        // Even more advanced: assume manual lifetime control
-        xml_document doc;
-
-        char* s = new char[100];
-        strcpy(s, "<sample-xml>some text <b>in bold</b> here</sample-xml>");
-        cout << doc.parse(s) << endl;
-
-        delete[] s; // <-- after this point, all string contents of the document are invalid!
-    }
-
-    {
-        // Or just create document from code?
-        xml_document doc;
-
-        // add nodes to document (see next samples)
-    }
-}
-


- -

This sample should print a row of 1s, meaning that all load/parse functions returned true (of course, if sample.xml does not exist or is malformed, there will be 0s).

- -

Once you have your document, there are several ways to extract data from it.

- -


-#include <iostream>
-
-#include "pugixml.hpp"
-
-using namespace std;
-using namespace pugi;
-
-struct bookstore_traverser: public xml_tree_walker
-{
-    virtual bool for_each(xml_node& n)
-    {
-        for (int i = 0; i < depth(); ++i) cout << "  "; // indentation
-
-        if (n.type() == node_element) cout << n.name() << endl;
-        else cout << n.value() << endl;
-
-        return true; // continue traversal
-    }
-};
-
-int main()
-{
-    xml_document doc;
-    doc.load("<bookstore><book title='ShaderX'><price>3</price></book><book title='GPU Gems'><price>4</price></book></bookstore>");
-
-    // If you want to iterate through nodes...
-
-    {
-        // Get a bookstore node
-        xml_node bookstore = doc.child("bookstore");
-
-        // Iterate through books
-        for (xml_node book = bookstore.child("book"); book; book = book.next_sibling("book"))
-        {
-            cout << "Book " << book.attribute("title").value() << ", price " << book.child("price").first_child().value() << endl;
-        }
-
-        // Output:
-        // Book ShaderX, price 3
-        // Book GPU Gems, price 4
-    }
-
-    {
-        // Alternative way to get a bookstore node (wildcards)
-        xml_node bookstore = doc.child_w("*[sS]tore"); // this will select bookstore, anyStore, Store, etc.
-
-        // Iterate through books with STL compatible iterators
-        for (xml_node::iterator it = bookstore.begin(); it != bookstore.end(); ++it)
-        {
-            // Note the use of helper function child_value()
-            cout << "Book " << it->attribute("title").value() << ", price " << it->child_value("price") << endl;
-        }
-        
-        // Output:
-        // Book ShaderX, price 3
-        // Book GPU Gems, price 4
-    }
-
-    {
-        // You can also traverse the whole tree (or a subtree)
-        bookstore_traverser t;
-
-        doc.traverse(t);
-        
-        // Output:
-        // bookstore
-        //   book
-        //     price
-        //       3
-        //   book
-        //     price
-        //       4
-
-        doc.first_child().traverse(t);
-
-        // Output:
-        // book
-        //   price
-        //     3
-        // book
-        //   price
-        //     4
-    }
-
-    // If you want a distinct node...
-
-    {
-        // You can specify the way to it through child() functions
-        cout << doc.child("bookstore").child("book").next_sibling().attribute("title").value() << endl;
-
-        // Output:
-        // GPU Gems
-    
-        // You can use a sometimes convenient path function
-        cout << doc.first_element_by_path("bookstore/book/price").child_value() << endl;
-        
-        // Output:
-        // 3
-
-        // And you can use powerful XPath expressions
-        cout << doc.select_single_node("/bookstore/book[@title = 'ShaderX']/price").node().child_value() << endl;
-        
-        // Output:
-        // 3
-
-        // Of course, XPath is much more powerful
-
-        // Compile a query that computes the total price of all Gems books in the store
-        xpath_query query("sum(/bookstore/book[contains(@title, 'Gems')]/price)");
-
-        cout << query.evaluate_number(doc) << endl;
-
-        // Output:
-        // 4
-
-        // You can apply the same XPath query to any document. For example, let's add another Gems
-        // book (more detail about modifying tree in next sample):
-        xml_node book = doc.child("bookstore").append_child();
-        book.set_name("book");
-        book.append_attribute("title") = "Game Programming Gems 2";
-        
-        xml_node price = book.append_child();
-        price.set_name("price");
-
-        xml_node price_text = price.append_child(node_pcdata);
-        price_text.set_value("5.3");
-    
-        // Now let's reevaluate query
-        cout << query.evaluate_number(doc) << endl;
-
-        // Output:
-        // 9.3
-    }
-}
-


- -
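
The sample above passes the pattern "*[sS]tore" to child_w. To make the pattern syntax concrete, here is a small standalone matcher for the same kind of pattern. It is only a sketch under assumptions: wildcard_match is a hypothetical function, it is not pugixml's wildcard code, and it supports just *, ? and simple character sets without ranges or negation.

```cpp
#include <cassert>

// Hypothetical glob matcher (not pugixml's implementation). Supports:
//   *      any run of characters (possibly empty)
//   ?      any single character
//   [abc]  any one character from the set (no ranges, no negation)
bool wildcard_match(const char* pattern, const char* text)
{
    assert(pattern && text);

    for (; *pattern; ++pattern, ++text)
    {
        switch (*pattern)
        {
        case '*':
            // Try every possible length for the run matched by '*'.
            for (const char* t = text; ; ++t)
            {
                if (wildcard_match(pattern + 1, t)) return true;
                if (!*t) return false;
            }

        case '?':
            if (!*text) return false;
            break;

        case '[':
        {
            if (!*text) return false;

            bool found = false;
            const char* p = pattern + 1;

            for (; *p && *p != ']'; ++p)
                if (*p == *text) found = true;

            if (!*p || !found) return false; // unterminated set or no match

            pattern = p; // the loop increment then steps past ']'
            break;
        }

        default:
            if (*pattern != *text) return false;
        }
    }

    return *text == '\0';
}
```

With this sketch, wildcard_match("*[sS]tore", name) accepts bookstore, anyStore and Store, and rejects a name such as bookshop.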

Finally, let's get into more details about tree modification and saving.

- -


-#include <iostream>
-
-#include "pugixml.hpp"
-
-using namespace std;
-using namespace pugi;
-
-int main()
-{
-    // For this example, we'll start with an empty document and create nodes in it from code
-    xml_document doc;
-
-    // Append several children and set values/names at once
-    doc.append_child(node_comment).set_value("This is a test comment");
-    doc.append_child().set_name("application");
-
-    // Let's add a few modules
-    xml_node application = doc.child("application");
-
-    // Save node wrapper for convenience
-    xml_node module_a = application.append_child();
-    module_a.set_name("module");
-    
-    // Add an attribute, immediately setting its value
-    module_a.append_attribute("name").set_value("A");
-
-    // You can use operator=
-    module_a.append_attribute("folder") = "/work/app/module_a";
-
-    // Or even assign numbers
-    module_a.append_attribute("status") = 85.4;
-
-    // Let's add another module
-    xml_node module_c = application.append_child();
-    module_c.set_name("module");
-    module_c.append_attribute("name") = "C";
-    module_c.append_attribute("folder") = "/work/app/module_c";
-
-    // Oh, we missed module B. Not a problem, let's insert it before module C
-    xml_node module_b = application.insert_child_before(node_element, module_c);
-    module_b.set_name("module");
-    module_b.append_attribute("folder") = "/work/app/module_b";
-
-    // We can do the same thing for attributes
-    module_b.insert_attribute_before("name", module_b.attribute("folder")) = "B";
-    
-    // Let's add some text in module A
-    module_a.append_child(node_pcdata).set_value("Module A description");
-
-    // Well, there's not much left to do here. Let's output our document to file using several formatting options
-
-    doc.save_file("sample_saved_1.xml");
-    
-    // Contents of file sample_saved_1.xml (tab size = 4):
-    // <?xml version="1.0"?>
-    // <!--This is a test comment-->
-    // <application>
-    //     <module name="A" folder="/work/app/module_a" status="85.4">Module A description</module>
-    //     <module name="B" folder="/work/app/module_b" />
-    //     <module name="C" folder="/work/app/module_c" />
-    // </application>
-
-    // Let's use two spaces for indentation instead of tab character
-    doc.save_file("sample_saved_2.xml", "  ");
-
-    // Contents of file sample_saved_2.xml:
-    // <?xml version="1.0"?>
-    // <!--This is a test comment-->
-    // <application>
-    //   <module name="A" folder="/work/app/module_a" status="85.4">Module A description</module>
-    //   <module name="B" folder="/work/app/module_b" />
-    //   <module name="C" folder="/work/app/module_c" />
-    // </application>
-    
-    // Let's save a raw XML file
-    doc.save_file("sample_saved_3.xml", "", format_raw);
-    
-    // Contents of file sample_saved_3.xml:
-    // <?xml version="1.0"?><!--This is a test comment--><application><module name="A" folder="/work/app/module_a" status="85.4">Module A description</module><module name="B" folder="/work/app/module_b" /><module name="C" folder="/work/app/module_c" /></application>
-
-    // Finally, you can print a subtree to any output stream (including cout)
-    xml_writer_stream writer(cout);
-    doc.child("application").child("module").print(writer);
-
-    // Output:
-    // <module name="A" folder="/work/app/module_a" status="85.4">Module A description</module>
-}
-


- -

Note that these examples do not cover the whole pugixml API. For further information, see the Reference section.

- -

- - -

Reference

- -

pugixml is a library for parsing XML files, which means that you give it XML data in some way, and it gives you a DOM tree along with ways to traverse it and extract useful information from it. The library source consists of two headers, pugixml.hpp and pugiconfig.hpp, and two source files, pugixml.cpp and pugixpath.cpp. You can either compile the cpp files in your project or build a static library. All library classes reside in namespace pugi, so you can either use fully qualified names (pugi::xml_node) or write a using directive or declaration (using namespace pugi;, using pugi::xml_node;) and use plain names. All classes have either an xml_ or an xpath_ prefix.

- -

By default it is assumed that you compile the source files with your project (add them to your project, add the relevant entries to your Makefile, or do whatever your build environment requires). The library is written in standard-conformant C++ and was tested on the following platforms:

- -

Windows 32-bit (MSVC³ 6.0, MSVC 7.0 (2002), MSVC 7.1 (2003), MSVC 8.0 (2005), MSVC 9.0 (2008), MSVC 10.0 (2010), ICC⁴ 8.0, ICC 8.1, GCC 3.4.2 (MinGW), GCC 4.4.0 (MinGW), BCC⁵ 5.82, DMC⁶ 8.50, Comeau C++ 4.3.3, PGI⁷ 6.2, CW⁸ 8.0)
Windows 64-bit (MSVC 9.0 (2008))
Linux 32-bit (GCC 3.2)
Sony Playstation Portable (GCC 3.4.2; in PUGIXML_NO_STL mode)
Sony Playstation 3 (GCC 4.0.2; in PUGIXML_NO_EXCEPTIONS mode (-fno-exceptions))
Microsoft Xbox (MSVC 7.1)
Microsoft Xbox 360 (MSVC 8.0)

- -

The documentation for pugixml classes, functions and constants is available here.

- -

³ MSVC is Microsoft Visual C++ Compiler
⁴ ICC is Intel C++ Compiler
⁵ BCC is Borland C++ Compiler
⁶ DMC is Digital Mars C++ Compiler
⁷ PGI is Portland Group C++ Compiler
⁸ CW is Metrowerks CodeWarrior

- - -

W3C compliance

- -

pugixml is not a compliant XML parser. The main reason is that it does not reject most malformed XML files. A more or less complete list of incompatibilities follows (I will be talking about the ones present when using parse_w3c mode):

The parser is completely DOCTYPE-ignorant; that is, it does not even skip all possible DOCTYPEs correctly, let alone use them for parsing
It accepts multiple attributes with the same name in one node
It is charset-ignorant
It accepts invalid attribute values (those with < in them) and does not reject invalid entity references or character references (in fact, it does not do DOCTYPE parsing, so it does not perform entity reference expansion)
It does not reject comments with -- inside
It does not reject PIs with names such as 'xml' and the like
And some other things that I forgot to mention

In short, it accepts some malformed XML files and does not do anything related to DOCTYPE. This is because the main goal was to develop a fast, easy-to-use and error-ignorant parser (so you can get something even from a malformed document); there are some good validating and conformant parsers already.

- -

- - -

Comparison with existing parsers

- -

This table summarizes the comparison in terms of time and memory consumption between pugixml and other parsers. For DOM parsers (all except Expat, irrXML and the SAX parser of XercesC), the process is as follows:

construct a DOM tree from a file that is preloaded in memory (all parsers take const char* and size as input). 'parse time' means the number of CPU clocks spent, 'parse allocs' - the number of allocations, 'parse memory' - peak memory consumption
traverse the DOM tree to fill information from it into some structure (which is the same for all parsers, of course). 'walk time' means the number of CPU clocks spent, 'walk allocs' - the number of allocations

For SAX parsers, the parse step is skipped (hence the N/A in the relevant table cells); the structure is filled during the 'walk' step.

For all parsers, the 'total time' column means the total time spent on the whole process, 'total allocs' - the total allocation count, 'total memory' - peak memory consumption for the whole process.

The tests were performed on a 1 Mb XML file with a small amount of text. They were compiled with the Microsoft Visual C++ 8.0 (2005) compiler in Release mode, with checked iterators/secure STL turned off. The test system is an AMD Sempron 2500+ with 512 Mb RAM.


parser	parse time	parse allocs	parse memory	walk time	walk allocs	total time	total allocs	total memory
irrXML	N/A	N/A	N/A	352 Mclocks	697 245	356 Mclocks	697 284	906 kb
Expat	N/A	N/A	N/A	97 Mclocks	19	97 Mclocks	23	1028 kb
TinyXML	168 Mclocks	50 163	5447 kb	37 Mclocks	0	242 Mclocks	50 163	5447 kb
PugXML	100 Mclocks	106 597	2747 kb	38 Mclocks	0	206 Mclocks	131 677	2855 kb
XercesC SAX	N/A	N/A	N/A	411 Mclocks	70 380	411 Mclocks	70 495	243 kb
XercesC DOM	300 Mclocks	30 491	9251 kb	65 Mclocks	1	367 Mclocks	30 492	9251 kb
pugixml	17 Mclocks	40	2154 kb	14 Mclocks	0	32 Mclocks	40	2154 kb
pugixml (test of non-destructive parsing)	12 Mclocks	51	1632 kb	21 Mclocks	0	34 Mclocks	51	1632 kb

- -

Note that the non-destructive parsing mode was only a test and is not yet part of pugixml.
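
The approximate ratios quoted in the Introduction (~6, ~10 and ~17.6 times for speed; ~1.3, ~2.5 and ~4.3 times for memory) appear to correspond to the 'parse time' and 'parse memory' columns of the table above. The sketch below only redoes that arithmetic; the struct and function names are made up for illustration, and the numbers are copied from the table.

```cpp
#include <cassert>

// One row of measurements, copied from the comparison table.
struct measurement
{
    const char* parser;
    double parse_mclocks; // 'parse time' column, in Mclocks
    double parse_kb;      // 'parse memory' column, in kb
};

const measurement pugixml_row = { "pugixml", 17, 2154 };

const measurement other_rows[] = {
    { "PugXML", 100, 2747 },
    { "TinyXML", 168, 5447 },
    { "XercesC DOM", 300, 9251 },
};

// How many times longer than pugixml the given parser took to build its DOM tree.
double speed_ratio(const measurement& other)
{
    assert(pugixml_row.parse_mclocks > 0);
    return other.parse_mclocks / pugixml_row.parse_mclocks;
}

// How many times more peak memory than pugixml the given parser needed while parsing.
double memory_ratio(const measurement& other)
{
    assert(pugixml_row.parse_kb > 0);
    return other.parse_kb / pugixml_row.parse_kb;
}
```

For example, speed_ratio(other_rows[1]) is 168 / 17, a little under 10, and memory_ratio(other_rows[2]) is 9251 / 2154, a little under 4.3.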

- -

- - -

FAQ

- -

Q: I do not have/want STL support. How can I compile pugixml without STL?

A: There is an undocumented define PUGIXML_NO_STL. If you uncomment the relevant line in the pugixml header file, it will compile without any STL classes. The reason it is undocumented is that it makes some documented functions unavailable (specifically, xml_document::load, which operates on std::istream, the xml_node::path function, XPath-related functions and classes, and the as_utf16/as_utf8 conversion functions). Otherwise, it will work fine.

- -

Q: Do paths that are accepted by first_element_by_path have to end with delimiter?

A: Either way will work; both /path/to/node/ and /path/to/node are fine.

- -

I'm always open to questions; feel free to send them to arseny.kapoulkine@gmail.com.

- -

- - -

Bugs

- -

I'm always open to bug reports; feel free to send them to arseny.kapoulkine@gmail.com. Please provide as much information as possible - the version of pugixml, the compiler and OS environment (compiler and its version, STL version, OS version, etc.), a description of the situation in which the bug arises, the code and data files that show the bug, etc. - the more, the better. Please do not send executable files, though.

- -

Note that you can also submit bug reports and suggestions at the project page.

- - -

Future work

- -

Here are some improvements that will be made in future versions (they are sorted by priority; the upper ones will get there sooner).

- -

Support for UTF-16 files (parsing the BOM to determine the file's type and converting a UTF-16 file to a UTF-8 buffer if necessary)
More intelligent parsing of DOCTYPE (it does not always skip DOCTYPE for now)
XML 1.1 changes (changed EOL handling, normalization issues, etc.)
Name your own?
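
The first item relies on recognizing a byte order mark at the start of the file. A rough standalone sketch of such a check is shown below, using the standard Unicode BOM byte sequences. The enum and the detect_bom function are hypothetical names and do not describe how pugixml detects encodings.

```cpp
#include <cassert>
#include <cstddef>

enum bom_encoding { bom_none, bom_utf8, bom_utf16_le, bom_utf16_be, bom_utf32_le, bom_utf32_be };

// Hypothetical helper: inspects at most the first four bytes of `data`
// and reports which byte order mark, if any, they form.
bom_encoding detect_bom(const unsigned char* data, std::size_t size)
{
    assert(data || size == 0);

    // UTF-32 is tested before UTF-16 because the UTF-32 LE mark (FF FE 00 00)
    // begins with the UTF-16 LE mark (FF FE).
    if (size >= 4 && data[0] == 0x00 && data[1] == 0x00 && data[2] == 0xFE && data[3] == 0xFF) return bom_utf32_be;
    if (size >= 4 && data[0] == 0xFF && data[1] == 0xFE && data[2] == 0x00 && data[3] == 0x00) return bom_utf32_le;
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF) return bom_utf8;
    if (size >= 2 && data[0] == 0xFE && data[1] == 0xFF) return bom_utf16_be;
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE) return bom_utf16_le;

    return bom_none;
}
```

A file without a mark yields bom_none, in which case a parser has to fall back to some default or to other heuristics; that fallback is outside the scope of this sketch.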

- -

- - -

Changelog

- -

15.07.2006 - v0.1

First private release for testing purposes

6.11.2006 - v0.2

First public release. Changes:

Introduced child_value(name) and child_value_w(name)
Fixed child_value() (for empty nodes)
Fixed xml_parser_impl warning at W4
parse_eol_pcdata and parse_eol_attribute flags + parse_minimal optimizations
Optimizations of strconv_t

- -

21.02.2007 - v0.3

Refactored, reworked and improved version. Changes:

Interface:
- Added XPath
- Added tree modification functions
- Added no STL compilation mode
- Added saving document to file
- Refactored parsing flags
- Removed xml_parser class in favor of xml_document
- Added transfer ownership parsing mode
- Modified the way xml_tree_walker works
- Iterators are now non-constant
Implementation:
- Support of several compilers and platforms
- Refactored and sped up parsing core
- Improved standard compliance
- Added XPath implementation
- Fixed several bugs

31.10.2007 - v0.34

Maintenance release. Changes:

Improved compatibility (supported Digital Mars C++, MSVC 6, CodeWarrior 8, PGI C++, Comeau, supported PS3 and XBox360)
Fixed bug with loading from text-mode iostreams
Fixed leak when transfer_ownership is true and parsing is failing
Fixed bug in saving (\r and \n are now escaped in attribute values)
PUGIXML_NO_EXCEPTIONS flag for platforms without exception handling
Renamed free() to destroy() - some macro conflicts were reported

18.01.2009 - v0.4

Changes:

Bugs:
- Documentation fix in samples for parse() with manual lifetime control
- Fixed document order sorting in XPath (it caused wrong order of nodes after xpath_node_set::sort and wrong results of some XPath queries)
Node printing changes:
- Single quotes are no longer escaped when printing nodes
- Symbols in second half of ASCII table are no longer escaped when printing nodes; because of this, format_utf8 flag is deleted as it's no longer needed and format_write_bom is renamed to format_write_bom_utf8.
- Reworked node printing - now it works via xml_writer interface; implementations for FILE* and std::ostream are available. As a side-effect, xml_document::save_file now works without STL.
New features:
- Added unsigned integer support for attributes (xml_attribute::as_uint, xml_attribute::operator=)
- Now document declaration (<?xml ...?>) is parsed as node with type node_declaration when parse_declaration flag is specified (access to encoding/version is performed as if they were attributes, i.e. doc.child("xml").attribute("version").as_float()); corresponding flags for node printing were also added
- Added support for custom memory management (see set_memory_management_functions for details)
- Implemented node/attribute copying (see xml_node::insert_copy_* and xml_node::append_copy for details)
- Added find_child_by_attribute and find_child_by_attribute_w to simplify parsing code in some cases (e.g. COLLADA files)
- Added file offset information querying for debugging purposes (now you're able to determine exact location of any xml_node in parsed file, see xml_node::offset_debug for details)
- Improved error handling for parsing - now load(), load_file() and parse() return xml_parse_result, which contains error code and last parsed offset; this does not break old interface as xml_parse_result can be implicitly converted to bool.
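
The last v0.4 item describes a result object that carries an error code and an offset yet still works in code that treats the return value as a boolean. The following standalone sketch shows the general pattern only. Every name in it (parse_status, parse_result, check_tags) is hypothetical, and the toy check it performs has nothing to do with pugixml's parser or with the real layout of xml_parse_result.

```cpp
#include <cassert>
#include <cstddef>

enum parse_status { status_ok, status_unclosed_tag }; // hypothetical error codes

// Hypothetical result type: carries details, but converts to bool so that
// callers written as `if (parse(...))` keep compiling and behaving as before.
struct parse_result
{
    parse_status status;
    std::size_t offset; // position where scanning stopped

    operator bool() const { return status == status_ok; }
};

// Toy check (not a real XML parser): every '<' must be followed by a later '>'.
parse_result check_tags(const char* text)
{
    assert(text != 0);

    std::size_t i = 0;

    for (; text[i]; ++i)
    {
        if (text[i] != '<') continue;

        std::size_t open = i;
        while (text[i] && text[i] != '>') ++i;

        if (!text[i])
        {
            parse_result failure = { status_unclosed_tag, open };
            return failure;
        }
    }

    parse_result success = { status_ok, i };
    return success;
}
```

Old-style callers can keep writing if (check_tags(text)) ..., while newer callers can store the result and inspect its status and offset fields.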

8.02.2009 - v0.41

Maintenance release. Changes:

Fixed bug with node printing (occasionally some content was not written to output stream)

17.09.2009 - v0.42

Maintenance release. Changes:

Fixed deallocation in case of custom allocation functions or if delete[] / free are incompatible
XPath parser fixed for incorrect queries (i.e. incorrect XPath queries should now always fail to compile)
Added PUGIXML_API/PUGIXML_CLASS/PUGIXML_FUNCTION configuration macros to control class/function attributes
Const-correctness fixes for find_child_by_attribute
Improved compatibility (miscellaneous warning fixes, fixed cstring include dependency for GCC)
Fixed iterator begin/end and print function to work correctly for empty nodes
Added xml_attribute::set_value overloads for different types

8.11.2009 - v0.5

Major bugfix release. Changes:

XPath bugfixes:
- Fixed translate(), lang() and concat() functions (infinite loops/crashes)
- Fixed compilation of queries with empty literal strings ("")
- Fixed axis tests: they never add empty nodes/attributes to the resulting node set now
- Fixed string-value evaluation for node-set (the result excluded some text descendants)
- Fixed self:: axis (it behaved like ancestor-or-self::)
- Fixed following:: and preceding:: axes (they included descendant and ancestor nodes, respectively)
- Minor fix for namespace-uri() function (namespace declaration scope includes the parent element of namespace declaration attribute)
- Some incorrect queries are no longer parsed now (e.g. foo: *)
- Fixed text()/etc. node test parsing bug (e.g. foo[text()] failed to compile)
- Fixed root step (/) - it now selects empty node set if query is evaluated on empty node
- Fixed string to number conversion ("123 " converted to NaN, "123 .456" converted to 123.456 - now the results are 123 and NaN, respectively)
- Node set copying now preserves sorted type; leads to better performance on some queries
Miscellaneous bugfixes:
- Fixed xml_node::offset_debug for PI nodes
- Added empty attribute checks to xml_node::remove_attribute
- Fixed node_pi and node_declaration copying
- Const-correctness fixes
Specification changes:
- xpath_node::select_nodes() and related functions now throw exception if expression return type is not node set (instead of assertion)
- xml_node::traverse() now sets depth to -1 for both begin() and end() callbacks (was 0 at begin() and -1 at end())
- In case of non-raw node printing a newline is output after PCDATA inside nodes if the PCDATA has siblings
- UTF8 -> wchar_t conversion now considers 5-byte UTF8-like sequences as invalid
New features:
- Added xpath_node_set::operator[] for index-based iteration
- Added xpath_query::return_type()
- Added getter accessors for memory-management functions
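
The string to number fix listed under the v0.5 XPath bugfixes matches the XPath 1.0 rule that a string converts to a number only if it consists of optional whitespace, an optional minus sign, a Number, and optional trailing whitespace; anything else is NaN. The sketch below is one possible standalone rendering of that rule, not pugixml's implementation; xpath_string_to_number and its helpers are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <string>

bool is_xml_space(char c) { return c == ' ' || c == '\t' || c == '\n' || c == '\r'; }
bool is_digit(char c) { return c >= '0' && c <= '9'; }

// Hypothetical sketch of the XPath 1.0 string -> number rule.
// Accepted form: whitespace? '-'? (Digits ('.' Digits?)? | '.' Digits) whitespace?
double xpath_string_to_number(const std::string& s)
{
    const double nan = std::numeric_limits<double>::quiet_NaN();

    // Trim leading and trailing whitespace.
    std::size_t begin = 0, end = s.size();
    while (begin < end && is_xml_space(s[begin])) ++begin;
    while (end > begin && is_xml_space(s[end - 1])) --end;

    std::size_t i = begin;
    if (i < end && s[i] == '-') ++i;

    std::size_t int_digits = 0, frac_digits = 0;
    while (i < end && is_digit(s[i])) { ++i; ++int_digits; }

    if (i < end && s[i] == '.')
    {
        ++i;
        while (i < end && is_digit(s[i])) { ++i; ++frac_digits; }
    }

    // Anything left over (inner spaces, exponents, letters) makes the string invalid,
    // as does a string with no digits at all.
    if (i != end || int_digits + frac_digits == 0) return nan;

    // The validated text is a plain decimal, so strtod parses all of it (assuming the "C" locale).
    return std::strtod(s.substr(begin, end - begin).c_str(), 0);
}
```

Under this sketch "123 " converts to 123, while "123 .456" is rejected because of the inner space and yields NaN, which is the behaviour the changelog entry describes.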

7.05.2010 - v0.6

Changes:

Bug fixes:
- Fixed document corruption on failed parsing bug
- XPath string <-> number conversion improvements (increased precision, fixed crash for huge numbers)
Major Unicode improvements:
- Introduced encoding support (automatic/manual encoding detection on load, manual encoding selection on save, conversion from/to UTF8, UTF16 LE/BE, UTF32 LE/BE)
- Introduced wchar_t mode (you can set PUGIXML_WCHAR_MODE define to switch pugixml internal encoding from UTF8 to wchar_t; all functions are switched to their Unicode variants)
- Load/save functions now support wide streams
Specification changes:
- parse() API changed to load_buffer/load_buffer_inplace/load_buffer_inplace_own; load_buffer APIs do not require zero-terminated strings.
- Renamed as_utf16 to as_wide
- Changed xml_node::offset_debug return type and xml_parse_result::offset type to ptrdiff_t
Miscellaneous:
- Optimized document parsing and saving
- All STL includes in pugixml.hpp are replaced with forward declarations
- Added contrib/ folder with Boost.Foreach compatibility helpers for iterators and header-only configuration support through special header

- - -

25.05.2010 - v0.7

Changes:

Compatibility:
- Added parse() and as_utf16 for compatibility (these functions are deprecated and will be removed in pugixml-1.0)
- Wildcard functions, document_order/precompute_document_order functions, format_write_bom_utf8 and parse_wnorm_attribute flags are deprecated and will be removed in version 1.0
Optimizations:
- Changed internal memory management: internal allocator is used for both metadata and name/value data; allocated pages are deleted if all allocations from them are deleted
- Optimized memory consumption: sizeof(xml_node_struct) reduced from 40 bytes to 32 bytes on x86
- Unicode conversion optimizations
- Optimized debug mode parsing/saving by an order of magnitude
Bug fixes / specification changes:
- Improved DOCTYPE parsing: now parser recognizes all well-formed DOCTYPE declarations
- Fixed as_uint() for large numbers (e.g. 2^32-1)
- Nodes/attributes with empty names are now printed as :anonymous

- - -

- -

- - -

Acknowledgements

- -

Kristen Wegner for pugxml parser
Neville Franks for contributions to pugxml parser

- -

- - -

License

- -

The pugixml parser is distributed under the MIT license:

- -

-Copyright (c) 2006-2010 Arseny Kapoulkine
-
-Permission is hereby granted, free of charge, to any person
-obtaining a copy of this software and associated documentation
-files (the "Software"), to deal in the Software without
-restriction, including without limitation the rights to use,
-copy, modify, merge, publish, distribute, sublicense, and/or sell
-copies of the Software, and to permit persons to whom the
-Software is furnished to do so, subject to the following
-conditions:
-
-The above copyright notice and this permission notice shall be
-included in all copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
-EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
-OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
-NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
-HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
-WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
-FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
-OTHER DEALINGS IN THE SOFTWARE.
-

- -

Revised 25 May, 2010
