From 0a97bad6608a2b1ea01ae6ce18bab63abf0c9210 Mon Sep 17 00:00:00 2001
From: "arseny.kapoulkine"
pugixml is just another XML parser. This is a successor to
pugxml (well, to be honest, the only part
that is left as is is wildcard matching code; the rest was either heavily refactored or rewritten
-from scratch). The main features (call it USP) are:Contents
pugixml is a DOM-based parser. This means, that the XML document is converted to a tree. -Each XML tag is converted to a node in DOM tree. If a tag is contained in some other tag, its node -is a child to the outer tag's one. Comments, CDATA sections and PIs (Processing Instructions) also are -transformed into tree nodes, as is the standalone text. Each node has its type.
+Here is a small collection of code snippets to help the reader begin using pugixml.
-Here is an example of an XML document: +
For everything you can do with pugixml, you need a document. There are several ways to obtain it:
--<?xml version="1.0"?> -<mesh name="mesh_root"> - <!-- here is a mesh node --> - some text - <![CDATA[[someothertext]]> - some more text - <node attr1="value1" /> - <node attr1="value2"> - <?TARGET somedata?> - <innernode/> - </node> -</mesh> -- -It gets converted to the following tree (note, that with some parsing options comments, PIs and CDATA -sections are not stored in the tree, and with some options there are also nodes with whitespaces -and the contents of PCDATA sections is a bit different (with trailing/leading whitespaces). So generally -the resulting DOM tree depends on the parsing options): - - - -
The parent-children relations are shown with lines. Some nodes have previous and next siblings -(for example, the next sibling for node_comment node is node_pcdata with value "some text", and the -previous sibling for node_element with name "mesh" is node_pi with target "xml" (target for PI nodes -is stored in the node name)).
-pugixml is a library for parsing XML files, which means that you give it XML data some way, -and it gives you the DOM tree and the ways to traverse it and to get some useful information from it. -The library source consist of two files, the header pugixml.hpp, and the source code pugixml.cpp. -You can either compile cpp file in your project, or build a static library (or perhaps even a DLL), -or make the whole code use inline linkage and make one big file (as it was done in pugxml). -All library classes reside in namespace pugi, so you can either use fully qualified -names (pugi::xml_node) or write a using declaration (using namespace pugi;, using -pugi::xml_node) and use plain names. All classes have the xml_ prefix.
- -By default it's supposed that you compile the source file with your project (add it into your -project, or add relevant entry in your Makefile, or do whatever you need to do with your compilation -environment). The library is written in standard-conformant C++ and was tested on win32 platform -(MSVC 7.1 (2003), MSVC 8.0 (2005)).
- - -xml_parser class is the core of parsing process; you initiate parsing with it, you get DOM -tree from it, the nodes and attributes are stored in it. You have two ways to load a file: either -provide a string with XML-data (it has to be null-terminated, and it will be modified during parsing -process, so it can not be a piece of read-only memory), or with an std::istream object (any input -stream, like std::ifstream, std::istringstream, etc.) - in this case the parser will allocate -the necessary amount of memory (equivalent to stream's size) and read everything from the stream.
- -The functions for parsing are: -
- void parse(std::istream& stream, unsigned int optmsk = parse_noset); |
stream
,
-read the chunk of data from the stream and parse it with provided options (optmsk
).
-The stream does not have to persist after the call to the function, the lifetime of internal buffer
-with stream's data is managed by pugixml.
-- char* parse(char* xmlstr, unsigned int optmsk = parse_noset); - |
- char* parse(const ownership_transfer_tag&, char* xmlstr, unsigned int optmsk = parse_noset); - |
- xml_parser(std::istream& stream, unsigned int optmsk = parse_default); |
- xml_parser(char* xmlstr, unsigned int optmsk = parse_default); |
- xml_parser(const ownership_transfer_tag&, char* xmlstr, unsigned int optmsk = parse_default); |
If you want to provide XML data after the creation of the parser, use the default ctor. Otherwise -you are free to use either parsing ctors or default ctor and later - parsing function.
- -After parsing an XML file, you'll get a DOM tree. To get access to it (or, more precisely, to its -root), call either document() function or cast xml_parser object to xml_node by -using the following functions:
- -- operator xml_node() const; - xml_node document() const; - |
Ok, easy part is behind - now let's dive into parsing options. There is a variety of them, and you -must choose them wisely to get the needed results and the best speed/least memory overhead. At first, -there are flags that determine which parts of the document will be put into DOM tree, and which will -be just skipped:
- -Then there are flags that determine how the processing of the retrieved data is done. There are -several reasons for these flags, mainly: -
Finally, there are two more flags, that indicate closing tag parsing. When pugixml meets a -close tags, there are three ways: -
Did I say finally? Ok, so finally there are some helper flags, or better groups of flags. -These are: -
A couple of words on flag usage. The parsing options are just a set of bits, with each bit corresponding -to one flag. You can turn the flag on by OR-ing the options value with this flag's constant: -
- parse_w3c | parse_wnorm_attribute --or turn the flag off by AND-ing the options value with the NEGation of this flag's constant: -
- parse_w3c & ~parse_comments --You can access the current options of parser by options() method: -
- unsigned int options() const; - unsigned int options(unsigned int optmsk); - |
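The OR/AND-NOT idiom described above can be sketched in isolation. The flag names and values below are hypothetical stand-ins chosen for illustration (the real constants are defined in pugixml.hpp and may differ); only the bit manipulation itself is the point.

```cpp
#include <cassert>

// Hypothetical flag values, for illustration only.
// The real parsing constants are defined in pugixml.hpp.
const unsigned int demo_parse_comments        = 0x01;
const unsigned int demo_parse_cdata           = 0x02;
const unsigned int demo_parse_wnorm_attribute = 0x04;

// Turn a flag on: OR the options value with the flag's constant.
inline unsigned int set_flag(unsigned int options, unsigned int flag)
{
    return options | flag;
}

// Turn a flag off: AND the options value with the negation of the flag's constant.
inline unsigned int clear_flag(unsigned int options, unsigned int flag)
{
    return options & ~flag;
}

// Check whether a flag is currently set.
inline bool has_flag(unsigned int options, unsigned int flag)
{
    return (options & flag) != 0;
}
```

Clearing a flag that is not set, or setting one that already is, leaves the options value unchanged, so both helpers are safe to call unconditionally.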
If xml_parser is a heart of constructing a DOM tree from file, xml_node is a heart -of processing the tree. This is a simple wrapper, so it's small (4/8 bytes, depending on the size of -pointer), you're free to copy it and it does not own anything. I'll continue with a list of methods -with their description, with one note in advance. Some functions, that do something according to a -string-like parameter, have a pair with a suffix _w. The _w suffix tells, that this -function is doing a wildcard matching, instead of simple string comparison. You're free to use wildcards -* (that is equal to any sequence of characters (possibly empty)), ? (that is equal to -any character) and character sets ([Abc] means 'any symbol of A, b and c', [A-Z4] means -'any symbol from A to Z, or 4', [!0-9] means 'any symbol, that is not a digit'). So the wildcard -?ell_[0-9][0-9]_* will match strings like 'cell_23_xref', 'hell_00_', but will not match the -strings like 'ell_23_xref', 'cell_0_x' or 'cell_0a_x'.
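To make the wildcard rules concrete, here is a minimal standalone matcher that follows the description above (*, ?, character sets with ranges and ! negation). It is a sketch written for this explanation, not the library's own matching code; the names wildcard_match and match_set are hypothetical.

```cpp
#include <cassert>

// Tests character c against a set whose text starts just after '['.
// On return, *end points just past the closing ']'.
static bool match_set(const char* set, char c, const char** end)
{
    bool negate = false;
    if (*set == '!') { negate = true; ++set; }

    bool found = false;
    while (*set && *set != ']')
    {
        if (set[1] == '-' && set[2] && set[2] != ']')
        {
            // Range such as A-Z or 0-9
            if (set[0] <= c && c <= set[2]) found = true;
            set += 3;
        }
        else
        {
            // Single character
            if (*set == c) found = true;
            ++set;
        }
    }

    if (*set == ']') ++set;
    *end = set;
    return found != negate;
}

// Returns true if the whole text matches the whole pattern.
bool wildcard_match(const char* pattern, const char* text)
{
    while (*pattern)
    {
        if (*pattern == '*')
        {
            // '*' matches any sequence, possibly empty: try every split point.
            ++pattern;
            for (const char* t = text; ; ++t)
            {
                if (wildcard_match(pattern, t)) return true;
                if (!*t) return false;
            }
        }

        if (!*text) return false; // pattern needs a character, text has none

        if (*pattern == '?')
        {
            ++pattern; // any single character
        }
        else if (*pattern == '[')
        {
            const char* next;
            if (!match_set(pattern + 1, *text, &next)) return false;
            pattern = next;
        }
        else
        {
            if (*pattern != *text) return false;
            ++pattern;
        }

        ++text;
    }

    return *text == 0;
}
```

With this sketch, wildcard_match("?ell_[0-9][0-9]_*", "cell_23_xref") is true, while "cell_0_x" is rejected because the second [0-9] meets an underscore.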
- -- /// Access iterators for this node's collection of child nodes. - iterator begin() const; - iterator end() const; - - /// Access iterators for this node's collection of child nodes (same as begin/end). - iterator children_begin() const; - iterator children_end() const; - - /// Access iterators for this node's collection of attributes. - attribute_iterator attributes_begin() const; - attribute_iterator attributes_end() const; - - /// Access iterators for this node's collection of siblings. - iterator siblings_begin() const; - iterator siblings_end() const; - |
Functions, returning the iterators to walk through children/siblings/attributes. More on that in -Iterators section.
- -- operator unspecified_bool_type() const; - |
This is a safe bool-like conversion operator. You can check node's validity (if (xml_node), - if (!xml_node), if (node1 && node2 && !node3 && cond1 && ...) - you get the idea) with -it. -
- -- bool operator==(const xml_node& r) const; - bool operator!=(const xml_node& r) const; - bool operator<(const xml_node& r) const; - bool operator>(const xml_node& r) const; - bool operator<=(const xml_node& r) const; - bool operator>=(const xml_node& r) const; - |
Comparison operators
- -- bool empty() const; - |
if (node.empty())
is equivalent to if (!node)
- xml_node_type type() const; - const char* name() const; - const char* value() const; - |
Access node's properties (type, name and value). If there is no name/value, the corresponding functions -return "" - they never return NULL.
- -- xml_node child(const char* name) const; - xml_node child_w(const char* name) const; - |
Get a child node with specified name, or xml_node() (this is an invalid node) if nothing is -found
- -- xml_attribute attribute(const char* name) const; - xml_attribute attribute_w(const char* name) const; - |
Get an attribute with specified name, or xml_attribute() (this is an invalid attribute) if -nothing is found
- -- xml_node sibling(const char* name) const; - xml_node sibling_w(const char* name) const; - |
Get a node's sibling with specified name, or xml_node() if nothing is found.
-node.sibling(name)
is equivalent to node.parent().child(name)
.
- xml_node next_sibling(const char* name) const; - xml_node next_sibling_w(const char* name) const; - xml_node next_sibling() const; - |
These functions get the next sibling, that is, one of the siblings of that node, that is to the
-right. next_sibling()
just returns the right brother of the node (or xml_node()),
-the two other functions are searching for the sibling with the given name
- xml_node previous_sibling(const char* name) const; - xml_node previous_sibling_w(const char* name) const; - xml_node previous_sibling() const; - |
These functions do exactly the same as next_sibling
ones, with the exception that they
-search for the left siblings.
- xml_node parent() const; - |
Get a parent node. The parent node for the root one (the document) is considered to be the document -itself.
- -- const char* child_value() const; - |
Look for the first node of type node_pcdata or node_cdata among the -children of the current node and return its contents (or "" if nothing is found)
- -- const char* child_value(const char* name) const; - |
This is the convenient way of looking into child's child value - that is, node.child_value(name) is equivalent to node.child(name).child_value().
+#include "pugixml.hpp" -- const char* child_value_w(const char* name) const; - |
This is the convenient way of looking into child's child value - that is, node.child_value_w(name) is equivalent to node.child_w(name).child_value().
+int main() +{ + // Several ways to get XML document -- xml_attribute first_attribute() const; - xml_attribute last_attribute() const; - |
These functions get the first and last attributes of the node (or xml_attribute() if the node -has no attributes).
+ cout << doc.load("<sample-xml>some text <b>in bold</b> here</sample-xml>") << endl; + } -- xml_node first_child() const; - xml_node last_child() const; - |
These functions get the first and last children of the node (or xml_node() if the node has -no children).
+ cout << doc.load_file("sample.xml") << endl; + } -- template <typename OutputIterator> void all_elements_by_name(const char* name, OutputIterator it) const; - template <typename OutputIterator> void all_elements_by_name_w(const char* name, OutputIterator it) const; - |
Get all elements with the specified name in the subtree (depth-first search) and return them with -the help of output iterator (i.e. std::back_inserter)
+ std::ifstream in("sample.xml"); + cout << doc.load(in) << endl; + } -- template <typename Predicate> xml_attribute find_attribute(Predicate pred) const; - template <typename Predicate> xml_node find_child(Predicate pred) const; - template <typename Predicate> xml_node find_element(Predicate pred) const; - |
Find attribute, child or a node in the subtree (find_element - depth-first search) with the help -of the given predicate. Predicate should behave like a function which accepts a xml_node or -xml_attribute (for find_attribute) parameter and returns bool. The first entity for which -the predicate returned true is returned. If predicate returned false for all entities, xml_node() -or xml_attribute() is returned.
+ char* s = new char[100]; + strcpy(s, "<sample-xml>some text <b>in bold</b> here</sample-xml>"); + cout << doc.parse(transfer_ownership_tag(), s) << endl; + } -- xml_node first_element(const char* name) const; - xml_node first_element_w(const char* name) const; + { + // Even more advanced: assume manual lifetime control + xml_document doc; - xml_node first_element_by_value(const char* name, const char* value) const; - xml_node first_element_by_value_w(const char* name, const char* value) const; + char* s = new char[100]; + strcpy(s, "<sample-xml>some text <b>in bold</b> here</sample-xml>"); + cout << doc.parse(transfer_ownership_tag(), s) << endl; - xml_node first_element_by_attribute(const char* name, const char* attr_name, const char* attr_value) const; - xml_node first_element_by_attribute_w(const char* name, const char* attr_name, const char* attr_value) const; + delete[] s; // <-- after this point, all string contents of document is invalid! + } - xml_node first_element_by_attribute(const char* attr_name, const char* attr_value) const; - xml_node first_element_by_attribute_w(const char* attr_name, const char* attr_value) const; - |
Find the first node (depth-first search), which corresponds to the given criteria (i.e. either has -a matching name, or a matching value, or has an attribute with given name/value, or has an attribute -and has a matching name). Note that _w versions treat all parameters as wildcards.
+ // add nodes to document (see next samples) + } +} +- xml_node first_node(xml_node_type type) const; - |
This sample should print a row of 1s, meaning that every load/parse call returned true (if sample.xml does not exist or is malformed, the corresponding calls print 0 instead).
-Return a first node (depth-first search) with a given type, or xml_node().
+Once you have your document, there are several ways to extract data from it.
- std::string path(char delimiter = '/') const; - |
Get a path of the node (i.e. the string of names of the nodes on the path from the DOM tree root -to the node, separated with delimiter (/ by default).
+#include "pugixml.hpp" -- xml_node first_element_by_path(const char* path, char delimiter = '/') const; - |
Get the first element that has the following path. The path can be absolute (beginning with delimiter) or -relative, '..' means 'up-level' (so if we are at the path mesh/fragment/geometry/stream, ../.. -will lead us to mesh/fragment, and /mesh will lead us to mesh).
+struct bookstore_traverser: public xml_tree_walker +{ + virtual bool for_each(xml_node& n) + { + for (int i = 0; i < depth(); ++i) cout << " "; // indentation -- bool traverse(xml_tree_walker& walker) const; - |
Traverse the subtree (beginning with current node) with the walker, return the result. See -Miscellaneous section for details.
+ return true; // continue traversal + } +}; - -Like xml_node, xml_attribute is a simple wrapper of the node's attribute.
+ // If you want to iterate through nodes... -- bool operator==(const xml_attribute& r) const; - bool operator!=(const xml_attribute& r) const; - bool operator<(const xml_attribute& r) const; - bool operator>(const xml_attribute& r) const; - bool operator<=(const xml_attribute& r) const; - bool operator>=(const xml_attribute& r) const; - |
Comparison operators.
+ // Iterate through books + for (xml_node book = bookstore.child("book"); book; book = book.next_sibling("book")) + { + cout << "Book " << book.attribute("title").value() << ", price " << book.child("price").first_child().value() << endl; + } -- operator unspecified_bool_type() const; - |
Safe bool conversion - like in xml_node, use this to check for validity.
+ { + // Alternative way to get a bookstore node (wildcards) + xml_node bookstore = doc.child_w("*[sS]tore"); // this will select bookstore, anyStore, Store, etc. -- bool empty() const; - |
Like with xml_node, if (attr.empty())
is equivalent to if (!attr)
.
-
- xml_attribute next_attribute() const; - xml_attribute previous_attribute() const; - |
Get the next/previous attribute of the node, that owns the current attribute. Return xml_attribute() -if no such attribute is found.
+ // If you want a distinct node... -- const char* name() const; - const char* value() const; - |
Get the name and value of the attribute. These methods never return NULL - they return "" instead.
+ // Output: + // GPU Gems + + // You can use a sometimes convenient path function + cout << doc.first_element_by_path("bookstore/book/price").child_value() << endl; + + // Output: + // 3 -- int as_int() const; - double as_double() const; - float as_float() const; - |
Convert the value of an attribute to the desired type. If the conversion is not successfull, return -default value (0 for int, 0.0 for double, 0.0f for float). These functions rely on CRT functions ato*.
+ // Of course, XPath is much more powerful -- bool as_bool() const; - |
Convert the value of an attribute to bool. This method returns true if the first character of the -value is '1', 't', 'T', 'y' or 'Y'. Otherwise it returns false.
+ cout << query.evaluate_number(doc) << endl; - -Sometimes you have to cycle through the children or the attributes of the node. You can do it either -by using next_sibling, previous_sibling, next_attribute and previous_attribute -(along with first_child, last_child, first_attribute and last_attribute), -or you can use an iterator-like interface. There are two iterator types, xml_node_iterator and -xml_attribute_iterator. They are bidirectional constant iterators, which means that you can -either increment or decrement them, and use dereferencing and member access operators to get constant -access to node/attribute (the constness of iterators may change with the introducing of mutable trees).
+ // You can apply the same XPath query to any document. For example, let's add another Gems + // book (more detail about modifying tree in next sample): + xml_node book = doc.child("bookstore").append_child(); + book.set_name("book"); + book.append_attribute("title") = "Game Programming Gems 2"; + + xml_node price = book.append_child(); + price.set_name("price"); -In order to get the iterators, use corresponding functions of xml_node. Note that _end() -functions return past-the-end iterator, that is, in order to get the last attribute, you'll have to -do something like: + xml_node price_text = price.append_child(node_pcdata); + price_text.set_value("5.3"); + + // Now let's reevaluate query + cout << query.evaluate_number(doc) << endl; -
- if (node.attributes_begin() != node.attributes_end()) // we have at least one attribute - { - xml_attribute last_attrib = *(--node.attributes_end()); - ... + // Output: + // 9.3 } - |
If you want to traverse a subtree, you can use traverse function. There is a class -xml_tree_walker, which has some functions that you can override in order to get custom traversing -(the default one just does nothing). - -
- virtual bool begin(const xml_node&); - virtual bool end(const xml_node&); - |
These functions are called when the processing of the node starts/ends. First begin() -is called, then all children of the node are processed recursively, then end() is called. If -any of these functions returns false, the traversing is stopped and the traverse() function -returns false.
- -- virtual void push(); - virtual void pop(); - |
These functions are called before and after the processing of node's children. If node has no children, -none of these is called. The default behavior is to increment/decrement current node depth.
+Finally, let's get into more details about tree modification and saving.
- virtual int depth() const; - |
Get the current depth. You can use this function to do your own indentation, for example.
- -Lets get to some minor notes. You can safely write something like: +#include <iostream> -
- bool value = node.child("stream").attribute("compress").as_bool(); - |
As parsing is done in-situ, the XML data is to persist during the lifetime of xml_parser. If -the parsing is called via a function of xml_parser, that accepts char*, you have to ensure -yourself, that the string will outlive the xml_parser object.
- -The memory for nodes and attributes is allocated in blocks of data (the blocks form a linked list; -the default size of the block is 32 kb, though you can change it via changing a memory_block_size -constant in pugixml.hpp file. Remember that the first block is allocated on stack (it resides -inside xml_parser object), and all subsequent blocks are allocated on heap, so expect a stack overflow -when setting too large memory block size), so the xml_parser object (which contains the blocks) -should outlive all xml_node and xml_attribute objects (as well as iterators), which belong -to the parser's tree. Again, you should ensure it yourself.
+#include "pugixml.hpp" -Ok, so you are not much of documentation reader, are you? So am I. Let's assume that you're going -to parse an xml file... something like this: + // Append several children and set values/names at once + doc.append_child(node_comment).set_value("This is a test comment"); + doc.append_child().set_name("application"); -
-<?xml version="1.0" encoding="UTF-8"?> -<mesh name="Cathedral"> - <fragment name="Cathedral"> - <geometry> - <stream usage="main" source="StAnna.dmesh" compress="true" /> - <stream usage="ao" source="StAnna.ao" /> - </geometry> - </fragment> - <fragment name="Cathedral"> - ... - </fragment> - ... -</mesh> -+ // Let's add a few modules + xml_node application = doc.child("application"); -
<mesh> is a root node, it has 0 or more <fragment>s, each of them has a <geometry> -node, and there are <stream> nodes with the shown attributes. We'd like to parse the file and... -well, and do something with it's contents. There are several methods of doing that; I'll show 2 of them -(the remaining one is using iterators).
+ // Save node wrapper for convenience + xml_node module_a = application.append_child(); + module_a.set_name("module"); + + // Add an attribute, immediately setting its value + module_a.append_attribute("name").set_value("A"); -Here we exploit the knowledge of the strict hierarchy of our XML document and read the nodes from -DOM tree accordingly. When we have an xml_node object, we can get the desired information from -it (name, value, attributes list, nearby nodes in a tree - siblings, parent and children).
+ // You can use operator= + module_a.append_attribute("folder") = "/work/app/module_a"; --#include <fstream> -#include <vector> -#include <algorithm> -#include <iterator> + // Or even assign numbers + module_a.append_attribute("status") = 85.4; -#include "pugixml.hpp" + // Let's add another module + xml_node module_c = application.append_child(); + module_c.set_name("module"); + module_c.append_attribute("name") = "C"; + module_c.append_attribute("folder") = "/work/app/module_c"; -using namespace pugi; + // Oh, we missed module B. Not a problem, let's insert it before module C + xml_node module_b = application.insert_child_before(node_element, module_c); + module_b.set_name("module"); + module_b.append_attribute("folder") = "/work/app/module_b"; -int main() -{ - std::ifstream in("mesh.xml"); - in.unsetf(std::ios::skipws); - - std::vector<char> buf; - std::copy(std::istream_iterator<char>(in), std::istream_iterator<char>(), std::back_inserter(buf)); - buf.push_back(0); // zero-terminate + // We can do the same thing for attributes + module_b.insert_attribute_before("name", module_b.attribute("folder")) = "B"; - xml_parser parser(&buf[0], pugi::parse_w3c); + // Let's add some text in module A + module_a.append_child(node_pcdata).set_value("Module A description"); - xml_node doc = parser.document(); - - if (xml_node mesh = doc.first_element("mesh")) - { - // store mesh.attribute("name").value() + // Well, there's not much left to do here. 
Let's output our document to file using several formatting options - for (xml_node fragment = mesh.first_element("fragment"); fragment; fragment = fragment.next_sibling()) - { - // store fragment.attribute("name").value() + doc.save_file("sample_saved_1.xml"); - if (xml_node geometry = fragment.first_element("geometry")) - for (xml_node stream = geometry.first_element("stream"); stream; stream = stream.next_sibling()) - { - // store stream.attribute("usage").value() - // store stream.attribute("source").value() - - if (stream.attribute("compress")) - // store stream.attribute("compress").as_bool() + // Contents of file sample_saved_1.xml (tab size = 4): + // <?xml version="1.0"?> + // <!--This is a test comment--> + // <application> + // <module name="A" folder="/work/app/module_a" status="85.4">Module A description</module> + // <module name="B" folder="/work/app/module_b" /> + // <module name="C" folder="/work/app/module_c" /> + // </application> + + // Let's use two spaces for indentation instead of tab character + doc.save_file("sample_saved_2.xml", " "); + + // Contents of file sample_saved_2.xml: + // <?xml version="1.0"?> + // <!--This is a test comment--> + // <application> + // <module name="A" folder="/work/app/module_a" status="85.4">Module A description</module> + // <module name="B" folder="/work/app/module_b" /> + // <module name="C" folder="/work/app/module_c" /> + // </application> - } - } - } -} - |
We can also write a class that will traverse the DOM tree and store the information from nodes based -on their names, depths, attributes, etc. This way is well known by the users of SAX parsers. To do that, -we have to write an implementation of xml_tree_walker interface
+ // Finally, you can print a subtree to any output stream (including cout) + doc.child("application").child("module").print(cout); --#include <fstream> -#include <vector> -#include <algorithm> -#include <iterator> + // Output: + // <module name="A" folder="/work/app/module_a" status="85.4">Module A description</module> +} + |
Note that these examples do not cover the whole pugixml API. For further information, see the reference section.
-using namespace pugi; +pugixml is a library for parsing XML files, which means that you give it XML data in some way, +and it gives you the DOM tree and the ways to traverse it and to get some useful information from it. +The library source consists of two headers, pugixml.hpp and pugiconfig.hpp, and two source +files, pugixml.cpp and pugixpath.cpp. +You can either compile the cpp files in your project, or build a static library. +All library classes reside in namespace pugi, so you can either use fully qualified +names (pugi::xml_node) or write a using directive or declaration (using namespace pugi;, using +pugi::xml_node) and use plain names. All classes have either the xml_ or the xpath_ prefix.
-int main() -{ - std::ifstream in("mesh.xml"); - in.unsetf(std::ios::skipws); - - std::vector<char> buf; - std::copy(std::istream_iterator<char>(in), std::istream_iterator<char>(), std::back_inserter(buf)); - buf.push_back(0); // zero-terminate - - xml_parser parser(&buf[0], pugi::parse_w3c); +By default it is assumed that you compile the source files with your project (add them to your +project, or add the relevant entries to your Makefile, or do whatever you need to do with your compilation +environment). The library is written in standard-conformant C++ and was tested on the following platforms:
- mesh_parser mp; ++
The documentation for pugixml classes, functions and constants is available here.
So, let's talk a bit about parsing process, and about the reason for providing XML data as a contiguous -writeable block of memory. Parsing is done in-situ. This means, that the strings, representing the -parts of DOM tree (node names, attribute names and values, CDATA content, etc.) are not separately -allocated on heap, but instead are parts of the original data. This is the keypoint to parsing speed, -because it helps achieve the minimal amount of memory allocations (more on that below) and minimal -amount of copying data.
- -In-situ parsing can be done in two ways, with zero-segmenting the string (that is, set the past-the-end -character for the part of XML string to 0, see -this image for further details), and storing pointer + size of the string instead of pointer to -the beginning of ASCIIZ string.
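The zero-segmenting variant can be illustrated on a much smaller problem than XML. The sketch below splits a "name=value" string in place: it overwrites the separator with 0 so that both halves become ordinary null-terminated strings inside the original buffer. The function name split_in_situ is hypothetical and this is not the parser's code; it only shows why the buffer has to be writable and why no allocation or copying is needed.

```cpp
#include <cassert>
#include <cstring>

// Splits buffer at the first occurrence of separator by writing a 0 over it.
// Afterwards buffer is the left part, and the returned pointer is the right part.
// Both point into the original memory; nothing is allocated or copied.
// Returns a null pointer if the separator is not found (buffer is left untouched).
char* split_in_situ(char* buffer, char separator)
{
    char* p = std::strchr(buffer, separator);
    if (!p) return 0;

    *p = 0;        // zero-segment: terminate the left part in place
    return p + 1;  // the right part starts right after the terminator
}
```

For example, with char buf[] = "usage=main", the call split_in_situ(buf, '=') leaves buf reading "usage" and returns a pointer to "main" six bytes into the same array. Passing a string literal instead would be an error, since the buffer is modified.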
- -Originally, pugxml had only the first way, but then authors added the second method, 'non-segmenting' -or non-destructive parsing. The advantages of this method are: you no longer need non-constant storage; -you can even read data from memory-mapped files directly. Well, there are disadvantages. -For one thing, you can not do any of the transformations in-situ. The transformations that are required -by XML standard are: -
In order to be able to modify the tree (change attribute/node names & values) with in-situ parsing, -one needs to implement two ways of storing data (both in-situ and not). The DOM tree is now mutable, -but it will change in the future releases (without introducing speed/memory overhead, except on clean- -up stage).
- -The parsing process itself is more or less straightforward, when you see it - but the impression -is fake, because the explicit jumps are made (i.e. we know, that if we come to a closing brace (>), -we should expect CDATA after it (or a new tag), so let's just jump to the corresponding code), and, -well, there can be bugs (see Bugs section).
- -And, to make things worse, memory allocation (which is done only for node and attribute structures) -is done in pools. The pools are single-linked lists with predefined block size (32 kb by default), and -well, it increases speed a lot (allocations are slow, and the memory gets fragmented when allocating -a bunch of 16-byte (attribute) or 40-byte (node) structures)
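The pooling idea can be sketched as a single-linked list of fixed-size blocks from which small structures are carved sequentially. Everything below (demo_block, demo_pool, the member names) is hypothetical and simplified: alignment is ignored, so it assumes request sizes are multiples of the required alignment, and unlike the description elsewhere in this document every block here lives on the heap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// One block in the single-linked list.
struct demo_block
{
    demo_block* next;  // previously filled block, or null
    std::size_t used;  // bytes already handed out from data
    char* data;        // storage of block_size bytes
};

class demo_pool
{
public:
    explicit demo_pool(std::size_t block_size): m_head(0), m_block_size(block_size), m_blocks(0)
    {
    }

    ~demo_pool()
    {
        // Everything is released at once; individual objects are never freed.
        while (m_head)
        {
            demo_block* next = m_head->next;
            std::free(m_head->data);
            std::free(m_head);
            m_head = next;
        }
    }

    // Hands out size bytes. The heap is touched only when the current block is full.
    void* allocate(std::size_t size)
    {
        if (size > m_block_size) return 0;

        if (!m_head || m_head->used + size > m_block_size)
        {
            demo_block* block = static_cast<demo_block*>(std::malloc(sizeof(demo_block)));
            if (!block) return 0;

            block->data = static_cast<char*>(std::malloc(m_block_size));
            if (!block->data) { std::free(block); return 0; }

            block->next = m_head;
            block->used = 0;
            m_head = block;
            ++m_blocks;
        }

        void* result = m_head->data + m_head->used;
        m_head->used += size;
        return result;
    }

    std::size_t block_count() const { return m_blocks; }

private:
    demo_pool(const demo_pool&);            // non-copyable
    demo_pool& operator=(const demo_pool&);

    demo_block* m_head;
    std::size_t m_block_size;
    std::size_t m_blocks;
};
```

With a 32 kb block size, 2048 requests of 16 bytes fit into a single block, so the heap is asked for block storage once instead of 2048 times.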
+3 MSVC is the Microsoft Visual C++ compiler
Q: I do not have/want STL support. How can I compile pugixml without STL?
A: There is an undocumented define PUGIXML_NO_STL. If you uncomment the relevant line in pugixml header file, it will compile without any STL classes. The reason it is undocumented -are that it will make some documented functions not available (specifically, xml_parser() ctor and -parse() function that operate on std::istream, xml_node::path function, utf16 and utf8 conversion -functions). Otherwise, it will work fine.
+is that it makes some documented functions unavailable (specifically, xml_document::load, which +operates on std::istream, the xml_node::path function, the saving functions (xml_node::print, xml_document::save), +XPath-related functions and classes, and the as_utf16 and as_utf8 conversion functions). Otherwise, it will +work fine.
Q: Do paths that are accepted by first_element_by_path have to end with a delimiter?
A: Either way works; both /path/to/node/ and /path/to/node are fine.
@@ -1048,16 +577,10 @@ do not send executable files. upper ones will get there sooner).The pugixml parser is distributed under the MIT license:
-Copyright (c) 2006 Arseny Kapoulkine +Copyright (c) 2006-2007 Arseny Kapoulkine Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation @@ -1125,7 +670,7 @@ OTHER DEALINGS IN THE SOFTWARE.
-Revised 8 December, 2006
-© Copyright Arseny Kapoulkine 2006. All Rights Reserved.
+Revised 21 February, 2007
+© Copyright Arseny Kapoulkine 2006-2007. All Rights Reserved.