From f9a2dec792d9a52e1b9004793cfca9b0a463049a Mon Sep 17 00:00:00 2001 From: "arseny.kapoulkine" Date: Sun, 11 Jul 2010 16:27:23 +0000 Subject: docs: Added generated HTML documentation git-svn-id: http://pugixml.googlecode.com/svn/trunk@596 99668b35-9821-0410-8761-19e4c4f06640 --- docs/manual/saving.html | 473 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 473 insertions(+) create mode 100644 docs/manual/saving.html (limited to 'docs/manual/saving.html') diff --git a/docs/manual/saving.html b/docs/manual/saving.html new file mode 100644 index 0000000..e12b31d --- /dev/null +++ b/docs/manual/saving.html @@ -0,0 +1,473 @@ + + + +Saving document + + + + + + + + + + + +
pugixml 0.9 manual | + Overview | + Installation | + Document: + Object model · Loading · Accessing · Modifying · Saving | + XPath | + API Reference | + Table of Contents +
+
+
+ + +

+ After creating a new document, or loading and processing an existing one, you often need to save the result back to a file. It is also occasionally useful to output the whole document or a subtree to a stream; use cases include debug printing and serialization over a network or another text-oriented medium. pugixml provides several functions that output any subtree of the document to a file, a stream, or another generic transport interface; these functions let you customize the output format (see Output options) and perform the necessary encoding conversions (see Encodings). This section documents the relevant functionality.

+

+ The node/attribute data is written to the destination properly formatted according + to the node type; all special XML symbols, such as < and &, are properly + escaped. In order to guard against forgotten node/attribute names, empty node/attribute + names are printed as ":anonymous". + For proper output, make sure all node and attribute names are set to meaningful + values. +

+
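To make the escaping rule concrete, here is a minimal sketch of what escaping special XML symbols means. The helper name `escape_xml` is hypothetical and is not part of pugixml; the library does this internally when saving, and its exact set of escaped characters may differ from the four handled here.

```cpp
#include <string>

// Hypothetical helper (not a pugixml function): replaces characters that
// cannot appear verbatim in XML text or attribute values with entities.
std::string escape_xml(const std::string& text)
{
    std::string out;
    out.reserve(text.size());

    for (std::string::size_type i = 0; i < text.size(); ++i)
    {
        switch (text[i])
        {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;"; break;
        case '>': out += "&gt;"; break;
        case '"': out += "&quot;"; break;
        default:  out += text[i]; break;
        }
    }

    return out;
}
```

You do not need to call anything like this yourself before setting node or attribute values; doing so would escape the data twice.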
+ + + + + +
[Caution]Caution

+ Currently the content of CDATA sections is not escaped, so CDATA sections whose values contain "]]>" will result in a malformed document. This will be fixed in version 1.0.

+
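Until that fix is available, one possible workaround is to pre-process CDATA text so that it never contains the terminator. The sketch below uses the common technique of splitting the data into two adjacent CDATA sections at each "]]>". The function name `split_cdata_terminators` is hypothetical, and the approach assumes the library writes CDATA content verbatim, as described above.

```cpp
#include <string>

// Hypothetical helper (not a pugixml function): rewrites every "]]>" so that
// the text can be wrapped in <![CDATA[ ... ]]> without ending the section
// early. Each terminator is split across two adjacent CDATA sections.
std::string split_cdata_terminators(const std::string& text)
{
    const std::string terminator = "]]>";
    const std::string replacement = "]]]]><![CDATA[>";

    std::string out;
    std::string::size_type pos = 0;

    for (;;)
    {
        std::string::size_type hit = text.find(terminator, pos);
        if (hit == std::string::npos) break;

        out.append(text, pos, hit - pos); // copy text before the terminator
        out += replacement;               // close and reopen the section
        pos = hit + terminator.size();
    }

    out.append(text, pos, std::string::npos); // copy the remaining tail
    return out;
}
```

A document saved this way reads back as two consecutive CDATA sections rather than one, and the workaround should be revisited once the library escapes CDATA content itself.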
+ +

+ If you want to save the whole document to a file, you can use the following + function: +

+
bool xml_document::save_file(const char* path, const char_t* indent = "\t", unsigned int flags = format_default, xml_encoding encoding = encoding_auto) const;
+
+

+ This function accepts the file path as its first argument, followed by three optional arguments that specify the indentation and other output options (see Output options) and the output data encoding (see Encodings). The path uses the target operating system's format: it can be relative or absolute, it should use the target system's delimiters, and it should match the exact case if the target file system is case-sensitive. The file path is passed to the system file opening function as is.

+

+ save_file opens the target + file for writing, outputs the requested header (by default a document declaration + is output, unless the document already has one), and then saves the document + contents. If the file could not be opened, the function returns false. Calling save_file + is equivalent to creating an xml_writer_file + object with FILE* + handle as the only constructor argument and then calling save; + see Saving document via writer interface for writer interface details. +

+
+ + + + + +
[Note]Note

+ As of version 0.9, there is no function for saving an XML document to a wide character path. Unfortunately, there is no portable way to do this; version 1.0 will provide such a function only for platforms with the corresponding functionality. As a workaround, you can use the stream-saving functions if your STL implementation can open file streams via wchar_t paths.

+

+ This is a simple example of saving an XML document to a file (samples/save_file.cpp):

+

+ +

+
// save document to file
+std::cout << "Saving result: " << doc.save_file("save_file_output.xml") << std::endl;
+
+

+

+
+
+ +

+ For additional interoperability, pugixml provides functions for saving a document to any object that implements the C++ std::ostream interface. This allows you to save documents to any standard C++ stream (e.g. a file stream) or any third-party compliant implementation (e.g. Boost Iostreams). Most notably, this allows for easy debug output, since you can use the std::cout stream as the saving target. There are two functions: one works with narrow character streams, the other handles wide character ones:

+
void xml_document::save(std::ostream& stream, const char_t* indent = "\t", unsigned int flags = format_default, xml_encoding encoding = encoding_auto) const;
+void xml_document::save(std::wostream& stream, const char_t* indent = "\t", unsigned int flags = format_default) const;
+
+

+ save with a std::ostream argument saves the document to the stream in the same way as save_file (i.e. with the requested header and with encoding conversions). In contrast, save with a std::wostream argument saves the document to the wide stream with encoding_wchar encoding. Because of this, using save with wide character streams requires careful (usually platform-specific) stream setup (e.g. using the imbue function). Use of wide streams is generally discouraged; however, it lets you save documents in non-Unicode encodings: for example, you can save Shift-JIS encoded data if you set the correct locale.

+

+ Calling save with stream + target is equivalent to creating an xml_writer_stream + object with stream as the only constructor argument and then calling save; see Saving document via writer interface for writer + interface details. +

+

+ This is a simple example of saving XML document to standard output (samples/save_stream.cpp): +

+

+ +

+
// save document to standard output
+std::cout << "Document:\n";
+doc.save(std::cout);
+
+

+

+
+
+ +

+ All of the above saving functions are implemented in terms of writer interface. + This is a simple interface with a single function, which is called several + times during output process with chunks of document data as input: +

+
class xml_writer
+{
+public:
+    virtual void write(const void* data, size_t size) = 0;
+};
+
+void xml_document::save(xml_writer& writer, const char_t* indent = "\t", unsigned int flags = format_default, xml_encoding encoding = encoding_auto) const;
+
+

+ To output the document via a custom transport, for example sockets, create an object that implements the xml_writer interface and pass it to the save function. The xml_writer::write function is called with a buffer as input, where data points to the buffer start and size is equal to the buffer size in bytes. The write implementation must write the buffer to the transport; it must not retain the passed buffer pointer, because the buffer contents will change after write returns. The buffer contains a chunk of document data in the desired encoding.

+

+ write function is called + with relatively large blocks (size is usually several kilobytes, except for + the first block with BOM, which is output only if format_write_bom + is set, and last block, which may be small), so there is often no need for + additional buffering in the implementation. +
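The contract described above (chunks arrive in order, and the buffer is only valid during the call) can be sketched with a writer that forwards each chunk to a callback. This is an illustration under stated assumptions: the `xml_writer` struct below is a stand-in that mirrors the interface shown earlier so that the snippet compiles without the library, and `callback_writer` is a hypothetical name. Real code would derive from `pugi::xml_writer` instead.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Stand-in for pugi::xml_writer so the sketch is self-contained;
// in real code, derive from pugi::xml_writer instead.
struct xml_writer
{
    virtual ~xml_writer() {}
    virtual void write(const void* data, std::size_t size) = 0;
};

// Hypothetical writer that hands every chunk to a user-supplied callback,
// e.g. one that sends the bytes over a socket.
struct callback_writer: xml_writer
{
    typedef std::function<void(const std::string&)> sink_type;

    sink_type sink;
    std::size_t total_bytes;

    explicit callback_writer(sink_type s): sink(s), total_bytes(0) {}

    virtual void write(const void* data, std::size_t size)
    {
        if (size == 0) return;
        assert(data != 0);

        // Copy the chunk instead of keeping the pointer: the buffer may be
        // reused by the caller as soon as this function returns.
        std::string chunk(static_cast<const char*>(data), size);

        total_bytes += size;
        sink(chunk);
    }
};
```

Copying each chunk costs one allocation per call; since chunks are usually several kilobytes, this is rarely significant, and a transport that sends synchronously can skip the copy and use the buffer directly inside write.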

+

+ This is a simple example of custom writer for saving document data to STL + string (samples/save_custom_writer.cpp); + read the sample code for more complex examples: +

+

+ +

+
struct xml_string_writer: pugi::xml_writer
+{
+    std::string result;
+
+    virtual void write(const void* data, size_t size)
+    {
+        result += std::string(static_cast<const char*>(data), size);
+    }
+};
+
+

+

+
+
+ +

+ While the previously described functions saved the whole document to the + destination, it is easy to save a single subtree. The following functions + are provided: +

+
void xml_node::print(std::ostream& os, const char_t* indent = "\t", unsigned int flags = format_default, xml_encoding encoding = encoding_auto, unsigned int depth = 0) const;
+void xml_node::print(std::wostream& os, const char_t* indent = "\t", unsigned int flags = format_default, unsigned int depth = 0) const;
+void xml_node::print(xml_writer& writer, const char_t* indent = "\t", unsigned int flags = format_default, xml_encoding encoding = encoding_auto, unsigned int depth = 0) const;
+
+

+ These functions have the same arguments with the same meaning as the corresponding + xml_document::save functions, and allow you to save the + subtree to either a C++ IOstream or to any object that implements xml_writer interface. +

+

+ Saving a subtree differs from saving the whole document: the process behaves as if format_write_bom is off and format_no_declaration is on, regardless of the actual flag values. This means that the BOM is not written to the destination, and a document declaration is only written if it is the node itself or one of the node's children. Note that this also holds when print is called on the document itself; this example (samples/save_subtree.cpp) illustrates the difference:

+

+ +

+
// get a test document
+pugi::xml_document doc;
+doc.load("<foo bar='baz'><call>hey</call></foo>");
+
+// print document to standard output (prints <?xml version="1.0"?><foo bar="baz"><call>hey</call></foo>)
+doc.save(std::cout, "", pugi::format_raw);
+std::cout << std::endl;
+
+// print document to standard output as a regular node (prints <foo bar="baz"><call>hey</call></foo>)
+doc.print(std::cout, "", pugi::format_raw);
+std::cout << std::endl;
+
+// print a subtree to standard output (prints <call>hey</call>)
+doc.child("foo").child("call").print(std::cout, "", pugi::format_raw);
+std::cout << std::endl;
+
+

+

+
+
+ +

+ All saving functions accept the optional parameter flags. + This is a bitmask that customizes the output format; you can select the way + the document nodes are printed and select the needed additional information + that is output before the document contents. +

+
+ + + + + +
[Note]Note

+ Use the usual bitwise operations to manipulate the bitmask: to enable a flag, use mask | flag; to disable a flag, use mask & ~flag.

+
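The two bitmask idioms from the note can be spelled out as small helpers. The constants below are stand-in values chosen for illustration so that the snippet compiles on its own; real code should use the pugi::format_* constants from the library header and should not rely on these numbers. The helper names are hypothetical.

```cpp
// Stand-in flag values for illustration only; use the pugi::format_*
// constants in real code instead of these numbers.
const unsigned int format_indent         = 0x01;
const unsigned int format_write_bom      = 0x02;
const unsigned int format_raw            = 0x04;
const unsigned int format_no_declaration = 0x08;
const unsigned int format_default        = format_indent;

// Enable a flag: set its bit and leave all other bits untouched
unsigned int enable_flag(unsigned int mask, unsigned int flag)
{
    return mask | flag;
}

// Disable a flag: clear its bit and leave all other bits untouched
unsigned int disable_flag(unsigned int mask, unsigned int flag)
{
    return mask & ~flag;
}

// Check whether a flag is set in the mask
bool has_flag(unsigned int mask, unsigned int flag)
{
    return (mask & flag) != 0;
}
```

In practice the expressions are usually written inline at the call site, for example passing format_default | format_write_bom as the flags argument.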

+ These flags control the resulting tree contents: +

+
    +
  • + format_indent determines if all nodes should be indented with the indentation string (this is an additional parameter for all saving functions, and is "\t" by default). If this flag is on, before every node the indentation string is output several times, where the amount of indentation depends on the node's depth relative to the output subtree. This flag has no effect if format_raw is enabled. This flag is on by default.
  • + format_raw switches between formatted and raw output. If this flag is on, the nodes are not indented in any way, and also no newlines that are not part of document text are printed. Raw mode can be used for serialization where the result is not intended to be read by humans; also it can be useful if the document was parsed with parse_ws_pcdata flag, to preserve the original document formatting as much as possible. This flag is off by default.
+
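How the indentation string, the node depth and the two flags interact can be sketched as follows. This is a simplified model rather than the library's implementation: the function name and the boolean parameters are hypothetical, and it assumes one copy of the indentation string per nesting level, with the root of the output subtree at depth 0.

```cpp
#include <string>

// Simplified model (not pugixml code): returns the text that precedes a node
// at the given depth. Raw mode overrides indentation entirely.
std::string indentation_prefix(const std::string& indent, unsigned int depth,
                               bool indent_enabled, bool raw)
{
    std::string prefix;

    if (!indent_enabled || raw) return prefix; // no indentation is written

    for (unsigned int i = 0; i < depth; ++i)
        prefix += indent; // one copy of the indentation string per level

    return prefix;
}
```

Under these assumptions an empty indentation string yields the same prefix as disabling indentation.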

+ These flags control the additional output information: +

+
    +
  • + format_no_declaration disables the default declaration output. By default, if the document is saved via the save or save_file function and it does not have any document declaration, a default declaration is output before the document contents. Enabling this flag disables this declaration. This flag has no effect in xml_node::print functions: they never output the default declaration. This flag is off by default.
  • + format_write_bom enables Byte Order Mark (BOM) output. By default, no BOM is output, so for non-UTF-8 encodings the resulting document's encoding may not be recognized by parsers and text editors that do not implement sophisticated encoding detection. Enabling this flag adds an encoding-specific BOM to the output. This flag has no effect in xml_node::print functions: they never output the BOM. This flag is off by default.
+
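For reference, the byte sequences that make up an encoding-specific BOM are fixed by the Unicode standard and can be tabulated as below. The enum and function names are hypothetical stand-ins used only to keep the snippet self-contained; they are not the library's xml_encoding values.

```cpp
#include <string>

// Hypothetical stand-in for the encoding choice (not pugi::xml_encoding)
enum encoding_kind { utf8, utf16_le, utf16_be, utf32_le, utf32_be };

// Returns the Byte Order Mark for the given encoding as raw bytes.
// Explicit lengths are required because some marks contain zero bytes.
std::string bom_bytes(encoding_kind encoding)
{
    switch (encoding)
    {
    case utf8:     return std::string("\xEF\xBB\xBF", 3);
    case utf16_le: return std::string("\xFF\xFE", 2);
    case utf16_be: return std::string("\xFE\xFF", 2);
    case utf32_le: return std::string("\xFF\xFE\x00\x00", 4);
    case utf32_be: return std::string("\x00\x00\xFE\xFF", 4);
    }

    return std::string();
}
```

A reader that sees one of these sequences at the start of a file can determine the encoding without inspecting the XML declaration, which is why the flag is most useful for UTF-16 and UTF-32 output.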

+ Additionally, there is one predefined option mask: +

+
  • + format_default is the default set of + flags, i.e. it has all options set to their default values. It sets formatted + output with indentation, without BOM and with default node declaration, + if necessary. +
+

+ This is an example that shows the outputs of different output options (samples/save_options.cpp): +

+

+ +

+
// get a test document
+pugi::xml_document doc;
+doc.load("<foo bar='baz'><call>hey</call></foo>");
+
+// default options; prints
+// <?xml version="1.0"?>
+// <foo bar="baz">
+//         <call>hey</call>
+// </foo>
+doc.save(std::cout);
+std::cout << std::endl;
+
+// default options with custom indentation string; prints
+// <?xml version="1.0"?>
+// <foo bar="baz">
+// --<call>hey</call>
+// </foo>
+doc.save(std::cout, "--");
+std::cout << std::endl;
+
+// default options without indentation; prints
+// <?xml version="1.0"?>
+// <foo bar="baz">
+// <call>hey</call>
+// </foo>
+doc.save(std::cout, "\t", pugi::format_default & ~pugi::format_indent); // can also pass "" instead of indentation string for the same effect
+std::cout << std::endl;
+
+// raw output; prints
+// <?xml version="1.0"?><foo bar="baz"><call>hey</call></foo>
+doc.save(std::cout, "\t", pugi::format_raw);
+std::cout << std::endl << std::endl;
+
+// raw output without declaration; prints
+// <foo bar="baz"><call>hey</call></foo>
+doc.save(std::cout, "\t", pugi::format_raw | pugi::format_no_declaration);
+std::cout << std::endl;
+
+

+

+
+
+ +

+ pugixml supports all popular Unicode encodings (UTF-8, UTF-16 (big and little endian), UTF-32 (big and little endian); UCS-2 is naturally supported since it's a strict subset of UTF-16) and handles all encoding conversions during output. The output encoding is set via the encoding parameter of the saving functions, which is of type xml_encoding. The possible values for the encoding are documented in Encodings; the only value that has a different meaning here is encoding_auto.

+

+ While all other values set the exact encoding, encoding_auto is meant for automatic encoding detection. Automatic detection does not make sense for output, since there is usually nothing to infer the actual encoding from, so here encoding_auto means UTF-8, which is the most popular encoding for XML data storage. This is also the default output encoding; specify another value if you do not want UTF-8 encoded output.

+

+ Also note that wide stream saving functions do not have encoding + argument and always assume encoding_wchar + encoding. +

+
+ + + + + +
[Note]Note

+ The current behavior for Unicode conversion is to skip all invalid UTF + sequences during conversion. This behavior should not be relied upon; if + your node/attribute names do not contain any valid UTF sequences, they + may be output as if they are empty, which will result in malformed XML + document. +

+
+
+ + + +
+
+ + + +
+ + -- cgit v1.2.3