From 1a06d7d3de3d2f30eaf3d56b7b2d0fa3446d46d8 Mon Sep 17 00:00:00 2001
From: Arseny Kapoulkine
If the task at hand is to select a subset of document nodes that match some
@@ -48,7 +48,7 @@
for these cases. pugixml implements an almost complete subset of XPath 1.0.
Because of differences in document object model and some performance implications,
there are minor violations of the official specifications, which can be found
- in Conformance to W3C specification. The rest of this section describes the interface for XPath
+ in Conformance to W3C specification. The rest of this section describes the interface for XPath
functionality. Please note that if you wish to learn to use XPath language,
you have to look for other tutorials or manuals; for example, you can read
W3Schools XPath tutorial,
@@ -58,12 +58,11 @@
- Each
- XPath expression can have one of the following types: boolean, number, string
- or node set. Boolean type corresponds to
+ Each XPath expression can have one of the following types: boolean, number,
+ string or node set. Boolean type corresponds to
- Because an XPath node can be either a node or an
- attribute, there is a special type,
+ Because an XPath node can be either a node or an attribute, there is a special
+ type,
- Like
- node and attribute handles, XPath node handles can be implicitly cast to
- boolean-like object to check if it is a null node, and also can be compared
+
+ Like node and attribute handles, XPath node handles can be implicitly cast
+ to boolean-like object to check if it is a null node, and also can be compared
for equality with each other.
- You can also create XPath nodes with one of
- the three constructors: the default constructor, the constructor that takes
- node argument, and the constructor that takes attribute and node arguments
- (in which case the attribute must belong to the attribute list of the node).
- The constructor from
- XPath expressions operate not on single nodes,
- but instead on node sets. A node set is a collection of nodes, which can
- be optionally ordered in either a forward document order or a reverse one.
- Document order is defined in XPath specification; an XPath node is before
- another node in document order if it appears before it in XML representation
- of the corresponding document.
-
- Node sets are represented by
+ You can also create XPath nodes with one of the three constructors: the default
+ constructor, the constructor that takes node argument, and the constructor
+ that takes attribute and node arguments (in which case the attribute must
+ belong to the attribute list of the node). The constructor from
+ XPath expressions operate not on single nodes, but instead on node sets.
+ A node set is a collection of nodes, which can be optionally ordered in either
+ a forward document order or a reverse one. Document order is defined in XPath
+ specification; an XPath node is before another node in document order if
+ it appears before it in XML representation of the corresponding document.
+
+ Node sets are represented by
- And it also can be iterated via indices, just
- like
+ And it also can be iterated via indices, just like
- The order of iteration depends on the order of
- nodes inside the set; the order can be queried via the following function:
+
+ The order of iteration depends on the order of nodes inside the set; the
+ order can be queried via the following function:
- Often the actual iteration is not needed;
- instead, only the first element in document order is required. For this,
- a special accessor is provided:
+
+ Often the actual iteration is not needed; instead, only the first element
+ in document order is required. For this, a special accessor is provided:
- While in the majority of cases the node
- set is returned by XPath functions, sometimes there is a need to manually
- construct a node set. For such cases, a constructor is provided which takes
- an iterator range (
+ While in the majority of cases the node set is returned by XPath functions,
+ sometimes there is a need to manually construct a node set. For such cases,
+ a constructor is provided which takes an iterator range (
- If
- you want to select nodes that match some XPath expression, you can do it
- with the following functions:
+
+ If you want to select nodes that match some XPath expression, you can do
+ it with the following functions:
If exception handling is not disabled, both functions throw xpath_exception
if the query can not be compiled or if it returns a value with type other
- than node set; see Error handling for details.
+ than node set; see Error handling for details.
- While
- compiling expressions is fast, the compilation time can introduce a significant
- overhead if the same expression is used many times on small subtrees. If
- you're doing many similar queries, consider compiling them into query objects
- (see Using query objects for further reference). Once you get a compiled query
- object, you can pass it to select functions instead of an expression string:
+
+ While compiling expressions is fast, the compilation time can introduce a
+ significant overhead if the same expression is used many times on small subtrees.
+ If you're doing many similar queries, consider compiling them into query
+ objects (see Using query objects for further reference). Once you get a compiled
+ query object, you can pass it to select functions instead of an expression
+ string:
@@ -257,6 +249,7 @@
This is an example of selecting nodes using XPath expressions (samples/xpath_select.cpp):
+
- When you call
+ When you call
- You can create a query object with the constructor
- that takes XPath expression as an argument:
+
+ You can create a query object with the constructor that takes XPath expression
+ as an argument:
- The expression is compiled and the
- compiled representation is stored in the new query object. If compilation
- fails, xpath_exception is thrown if
- exception handling is not disabled (see Error handling for details).
- After the query is created, you can query the type of the evaluation result
- using the following function:
+
+ The expression is compiled and the compiled representation is stored in the
+ new query object. If compilation fails, xpath_exception
+ is thrown if exception handling is not disabled (see Error handling for
+ details). After the query is created, you can query the type of the evaluation
+ result using the following function:
- You
- can evaluate the query using one of the following functions:
+
+ You can evaluate the query using one of the following functions:
All functions take the context node as an argument, compute the expression
@@ -339,8 +331,9 @@
value, but no type other than node set can be converted to node set. Because
of this,
Calling
- Note that
+ Note that
+
XPath queries may contain references to variables; this is useful if you
@@ -426,10 +420,10 @@
Variable references have the form
@@ -447,13 +441,12 @@
that the lifetime of the set exceeds that of query object.
- Variable sets correspond to
+ Variable sets correspond to
- You can add new variables with the
- following function:
+
+ You can add new variables with the following function:
-
-pugixml 1.4 manual |
+pugixml 1.5 manual |
Overview |
Installation |
Document:
@@ -28,15 +28,15 @@
bool
+bool
type, number type corresponds to double
type, string type corresponds to either std::string
or std::wstring
, depending on whether wide
@@ -73,11 +72,11 @@
xpath_type_number
, xpath_type_string
or xpath_type_node_set
,
accordingly.
xpath_node
,
- which is a discriminated union of these types. A value of this type contains
- two node handles, one of xml_node
+xpath_node
, which is
+ a discriminated union of these types. A value of this type contains two node
+ handles, one of xml_node
type, and another one of xml_attribute
type; at most one of them can be non-null. The accessors to get these handles
are available:
@@ -102,33 +101,30 @@
handle. For null nodes, parent
returns null handle.
xml_node
- is implicit, so you can usually pass xml_node
- to functions that expect xpath_node
.
- Apart from that you usually don't need to create your own XPath node objects,
- since they are returned to you via selection functions.
- xpath_node_set
+xml_node
is implicit, so you can usually
+ pass xml_node
to functions
+ that expect xpath_node
. Apart
+ from that you usually don't need to create your own XPath node objects, since
+ they are returned to you via selection functions.
+ xpath_node_set
object, which has an interface that resembles one of sequential random-access
containers. It has an iterator type along with usual begin/past-the-end iterator
accessors:
@@ -137,9 +133,8 @@
const_iterator xpath_node_set::begin() const;
const_iterator xpath_node_set::end() const;
-std::vector
:
+std::vector
:
const xpath_node& xpath_node_set::operator[](size_t index) const;
size_t xpath_node_set::size() const;
@@ -152,9 +147,9 @@
set size results in undefined behavior. You can use both iterator-based and
index-based access for iteration, however the iterator-based one can be faster.
-
enum xpath_node_set::type_t {type_unsorted, type_sorted, type_sorted_reverse};
type_t xpath_node_set::type() const;
@@ -178,10 +173,9 @@
will return
type_sorted
or
type_sorted_reverse
.
-xpath_node xpath_node_set::first() const;
@@ -193,11 +187,10 @@
the complexity does - if the set is sorted, the complexity is constant, otherwise
it is linear in the number of elements or worse.
-const_iterator
+const_iterator
is a typedef for const xpath_node*
), and an optional type:
xpath_node_set::xpath_node_set(const_iterator begin, const_iterator end, type_t type = type_unsorted);
@@ -212,41 +205,40 @@
xpath_node xml_node::select_single_node(const char_t* query, xpath_variable_set* variables = 0) const;
+
xpath_node xml_node::select_node(const char_t* query, xpath_variable_set* variables = 0) const;
xpath_node_set xml_node::select_nodes(const char_t* query, xpath_variable_set* variables = 0) const;
select_nodes
function compiles
the expression and then executes it with the node as a context node, and
- returns the resulting node set. select_single_node
+ returns the resulting node set. select_node
returns only the first node in document order from the result, and is equivalent
to calling select_nodes(query).first()
.
If the XPath expression does not match anything, or the node handle is null,
select_nodes
returns an empty
- set, and select_single_node
- returns null XPath node.
+ set, and select_node
returns
+ null XPath node.
xpath_node xml_node::select_single_node(const xpath_query& query) const;
+
xpath_node xml_node::select_node(const xpath_query& query) const;
xpath_node_set xml_node::select_nodes(const xpath_query& query) const;
pugi::xpath_node_set tools = doc.select_nodes("/Profile/Tools/Tool[@AllowRemote='true' and @DeriveCaptionFrom='lastparam']");
@@ -268,7 +261,7 @@
std::cout << node.node().attribute("Filename").value() << "\n";
}
-pugi::xpath_node build_tool = doc.select_single_node("//Tool[contains(Description, 'build system')]");
+pugi::xpath_node build_tool = doc.select_node("//Tool[contains(Description, 'build system')]");
if (build_tool)
std::cout << "Build tool: " << build_tool.node().attribute("Filename").value() << "\n";
@@ -278,10 +271,10 @@
select_nodes
+select_nodes
with an expression string as an argument, a query object is created behind
the scenes. A query object represents a compiled XPath expression. Query
objects can be needed in the following circumstances:
@@ -307,30 +300,29 @@
operator and store pointers to xpath_query
in the container.
explicit xpath_query::xpath_query(const char_t* query, xpath_variable_set* variables = 0);
-xpath_value_type xpath_query::return_type() const;
-bool xpath_query::evaluate_boolean(const xpath_node& n) const;
double xpath_query::evaluate_number(const xpath_node& n) const;
string_t xpath_query::evaluate_string(const xpath_node& n) const;
xpath_node_set xpath_query::evaluate_node_set(const xpath_node& n) const;
+xpath_node xpath_query::evaluate_node(const xpath_node& n) const;
evaluate_boolean
,
evaluate_number
and evaluate_string
always return a result,
- but evaluate_node_set
results
- in an error if the return type is not node set (see Error handling).
+ but evaluate_node_set
and
+ evaluate_node
result in an
+ error if the return type is not node set (see Error handling).
@@ -349,12 +342,12 @@
node.select_nodes("query")
- is equivalent to calling xpath_query("query").evaluate_node_set(node)
.
+ is equivalent to calling xpath_query("query").evaluate_node_set(node)
. Calling node.select_node("query")
is equivalent to calling xpath_query("query").evaluate_node(node)
.
evaluate_string
function returns the STL
- string; as such, it's not available in PUGIXML_NO_STL
+evaluate_string
+ function returns the STL string; as such, it's not available in PUGIXML_NO_STL
mode and also usually allocates memory. There is another string evaluation
function:
// Select nodes via compiled query
-pugi::xpath_query query_remote_tools("/Profile/Tools/Tool[@AllowRemote='true']");
+
// Select nodes via compiled query
+pugi::xpath_query query_remote_tools("/Profile/Tools/Tool[@AllowRemote='true']");
pugi::xpath_node_set tools = query_remote_tools.evaluate_node_set(doc);
std::cout << "Remote tool: ";
tools[2].node().print(std::cout);
-// Evaluate numbers via compiled query
-pugi::xpath_query query_timeouts("sum(//Tool/@Timeout)");
+// Evaluate numbers via compiled query
+pugi::xpath_query query_timeouts("sum(//Tool/@Timeout)");
std::cout << query_timeouts.evaluate_number(doc) << std::endl;
-// Evaluate strings via compiled query for different context nodes
-pugi::xpath_query query_name_valid("string-length(substring-before(@Filename, '_')) > 0 and @OutputFileMasks");
+// Evaluate strings via compiled query for different context nodes
+pugi::xpath_query query_name_valid("string-length(substring-before(@Filename, '_')) > 0 and @OutputFileMasks");
pugi::xpath_query query_name("concat(substring-before(@Filename, '_'), ' produces ', @OutputFileMasks)");
for (pugi::xml_node tool = doc.first_element_by_path("Profile/Tools/Tool"); tool; tool = tool.next_sibling())
@@ -414,7 +408,7 @@
$name
; in order to use them, you have to provide
a variable set, which includes all variables present in the query with correct
types. This set is passed to xpath_query
- constructor or to select_nodes
/select_single_node
functions:
+ constructor or to select_nodes
/select_node
functions:
explicit xpath_query::xpath_query(const char_t* query, xpath_variable_set* variables = 0);
-xpath_node xml_node::select_single_node(const char_t* query, xpath_variable_set* variables = 0) const;
+xpath_node xml_node::select_node(const char_t* query, xpath_variable_set* variables = 0) const;
xpath_node_set xml_node::select_nodes(const char_t* query, xpath_variable_set* variables = 0) const;
xpath_variable_set
type, which is essentially
- a variable container.
+xpath_variable_set
+ type, which is essentially a variable container.
xpath_variable* xpath_variable_set::add(const char_t* name, xpath_value_type type);
@@ -470,9 +463,8 @@
0
for numbers, false
for booleans, empty string for strings
and empty set for node sets.
- You can get the existing variables
- with the following functions:
+
+ You can get the existing variables with the following functions:
xpath_variable* xpath_variable_set::get(const char_t* name);
const xpath_variable* xpath_variable_set::get(const char_t* name) const;
@@ -481,14 +473,13 @@
The functions return the variable handle, or null pointer if the variable with the specified name is not found.
-
- Additionally, there are the helper
- functions for setting the variable value by name; they try to add the variable
- with the corresponding type, if it does not exist, and to set the value.
- If the variable with the same name but with different type is already present,
- they return
false
; they also
- return false
 on allocation failure.
- Note that these functions do not perform any type conversions.
+
+ Additionally, there are the helper functions for setting the variable value
+ by name; they try to add the variable with the corresponding type, if it
+ does not exist, and to set the value. If the variable with the same name
+ but with different type is already present, they return
false
;
+ they also return false
 on allocation
+ failure. Note that these functions do not perform any type conversions.
bool xpath_variable_set::set(const char_t* name, bool value);
bool xpath_variable_set::set(const char_t* name, double value);
@@ -499,15 +490,14 @@
The variable values are copied to the internal variable storage, so you can modify or destroy them after the functions return.
-
- If setting variables by name is not efficient
- enough, or if you have to inspect variable information or get variable values,
- you can use variable handles. A variable corresponds to the
xpath_variable
 type, and a variable handle
- is simply a pointer to xpath_variable
.
+
+ If setting variables by name is not efficient enough, or if you have to inspect
+ variable information or get variable values, you can use variable handles.
+ A variable corresponds to the
-xpath_variable
+ type, and a variable handle is simply a pointer to xpath_variable
.
- In
- order to get variable information, you can use one of the following functions:
+
+ In order to get variable information, you can use one of the following functions:
const char_t* xpath_variable::name() const;
xpath_value_type xpath_variable::type() const;
@@ -516,9 +506,8 @@
Note that each variable has a distinct type which is specified upon variable creation and can not be changed later.
-
- In
- order to get variable value, you should use one of the following functions,
+
+ In order to get variable value, you should use one of the following functions,
depending on the variable type:
bool xpath_variable::get_boolean() const;
@@ -531,9 +520,9 @@
are performed; if the type mismatch occurs, a dummy value is returned (false
 for booleans, NaN
 for numbers, empty string for strings and empty set for node sets).
-
- In order to set variable value, you should
- use one of the following functions, depending on the variable type:
+
+ In order to set variable value, you should use one of the following functions,
+ depending on the variable type:
bool xpath_variable::set(bool value);
bool xpath_variable::set(double value);
@@ -550,9 +539,10 @@
This is an example of using variables in XPath queries (samples/xpath_variables.cpp):
+
-// Select nodes via compiled query
-pugi::xpath_variable_set vars;
+// Select nodes via compiled query
+pugi::xpath_variable_set vars;
vars.add("remote", pugi::xpath_type_boolean);
pugi::xpath_query query_remote_tools("/Profile/Tools/Tool[@AllowRemote = string($remote)]", &vars);
@@ -569,8 +559,8 @@
std::cout << "Local tool: ";
tools_local[0].node().print(std::cout);
-// You can pass the context directly to select_nodes/select_single_node
-pugi::xpath_node_set tools_local_imm = doc.select_nodes("/Profile/Tools/Tool[@AllowRemote = string($remote)]", &vars);
+// You can pass the context directly to select_nodes/select_node
+pugi::xpath_node_set tools_local_imm = doc.select_nodes("/Profile/Tools/Tool[@AllowRemote = string($remote)]", &vars);
std::cout << "Local tool imm: ";
tools_local_imm[0].node().print(std::cout);
@@ -580,7 +570,7 @@
There are two different mechanisms for error handling in XPath implementation;
@@ -588,23 +578,21 @@
is controlled with PUGIXML_NO_EXCEPTIONS define).
-
- By default, XPath functions throw
xpath_exception
 object in case of errors;
- additionally, in the event any memory allocation fails, an std::bad_alloc
- exception is thrown. Also xpath_exception
- is thrown if the query is evaluated to a node set, but the return type is
- not node set. If the query constructor succeeds (i.e. no exception is thrown),
- the query object is valid. Otherwise you can get the error details via one
- of the following functions:
+
+ By default, XPath functions throw
xpath_exception
+ object in case of errors; additionally, in the event any memory allocation
+ fails, an std::bad_alloc
 exception is thrown. Also xpath_exception
 is thrown if the query
+ is evaluated to a node set, but the return type is not node set. If the query
+ constructor succeeds (i.e. no exception is thrown), the query object is valid.
+ Otherwise you can get the error details via one of the following functions:
virtual const char* xpath_exception::what() const throw();
const xpath_parse_result& xpath_exception::result() const;
-
- If
- exceptions are disabled, then in the event of parsing failure the query is
- initialized to invalid state; you can test if the query object is valid by
- using it in a boolean expression:
if
+
+ If exceptions are disabled, then in the event of parsing failure the query
+ is initialized to invalid state; you can test if the query object is valid
+ by using it in a boolean expression:
-if (query) { ... }
. Additionally, you can get parsing result via the result() accessor:
@@ -617,9 +605,8 @@
a query as a node set results in an empty node set if the return type is not node set.
- The information about parsing result is
- returned via
xpath_parse_result
+
+ The information about parsing result is returned via
@@ -632,39 +619,39 @@
const char* description() const;
};
-xpath_parse_result
 object. It contains parsing status and the offset of last successfully parsed character from the beginning of the source stream:
- Parsing result is represented as
- the error message; it is either a null pointer, in case there is no error,
- or the error message in the form of ASCII zero-terminated string.
+
+ Parsing result is represented as the error message; it is either a null pointer,
+ in case there is no error, or the error message in the form of ASCII zero-terminated
+ string.
--
description()
 member function can be used to get the
- error message; it never returns the null pointer, so you can safely use
++
-description()
+ member function can be used to get the error message; it never returns the
+ null pointer, so you can safely use description()
 even if query parsing succeeded. Note that description()
- even if query parsing succeeded. Note that description()
 returns a char
- string even in PUGIXML_WCHAR_MODE
;
- you'll have to call as_wide to get the wchar_t
 string.
+ returns a char
 string even in
+PUGIXML_WCHAR_MODE
; you'll
+ have to call as_wide to get the wchar_t
 string.
- In addition to the error message,
- parsing result has an
offset
+
+ In addition to the error message, parsing result has an
-offset
 member, which contains the offset of last successfully parsed character. This offset is in units of pugi::char_t (bytes for character mode, wide characters for wide character mode).
- Parsing result object can be implicitly
- converted to
bool
 like this:
-if (result) { ... }
+
+ Parsing result object can be implicitly converted to
bool
+ like this: if (result) { ... } else { ... }
.
This is an example of XPath error handling (samples/xpath_error.cpp):
+
-// Exception is thrown for incorrect query syntax
-try
+// Exception is thrown for incorrect query syntax
+try
{
    doc.select_nodes("//nodes[#true()]");
}
@@ -673,8 +660,8 @@
    std::cout << "Select failed: " << e.what() << std::endl;
}
-// Exception is thrown for incorrect query semantics
-try
+// Exception is thrown for incorrect query semantics
+try
{
    doc.select_nodes("(123)/next");
}
@@ -683,8 +670,8 @@
    std::cout << "Select failed: " << e.what() << std::endl;
}
-// Exception is thrown for query with incorrect return type
-try
+// Exception is thrown for query with incorrect return type
+try
{
    doc.select_nodes("123");
}
@@ -698,7 +685,7 @@
Because of the differences in document object models, performance considerations
@@ -745,7 +732,7 @@
-pugixml 1.4 manual |
+pugixml 1.5 manual |
Overview |
Installation |
Document:
--
cgit v1.2.3