Age | Commit message (Collapse) | Author |
|
|
|
This adds about 40 cycles for parsing a <?xml version='1.0'?> declaration and
about 70 cycles for parsing <?xml version='1.0' encoding='utf-8'?>, as
measured on a Core i7, which should be negligible for all documents.
Fixes #16.
|
|
It is probably redundant given that we have -Wold-style-cast, but it's better
to warn about casts like this in case we ever need to remove the latter flag.
|
|
Fixes #99.
|
|
|
|
|
|
Put CMakeLists.txt in the project root.
|
|
Add vs2013 projects
|
|
|
|
Previously the page size was defining the data size, and due to additional
headers (+ recently removed allocation padding) the actual allocation was a bit
bigger.
The problem is that some allocators round 2^N+k allocations to 2^N+M, which can
result in noticeable waste of space. Specifically, on 64-bit OSX allocating the
previous page size (32k+40) resulted in 32k+512 allocation, thereby wasting 472
bytes, or 1.4%.
Now we have the allocation size specified exactly and just recompute the available
data size, which can result in small space savings depending on the allocator.
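
A minimal sketch of the two sizing schemes described above. The `page_header` struct and both helper functions are hypothetical illustrations; the actual header layout and constants are not part of this message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical page header; the real layout is an assumption here.
struct page_header
{
    void* allocator;
    page_header* prev;
    page_header* next;
    std::size_t busy_size;
    std::size_t freed_size;
};

// Old scheme: the page size fixes the data size, so the allocation request
// ends up slightly larger than a power of two (2^N + header).
inline std::size_t old_allocation_size(std::size_t data_size)
{
    return sizeof(page_header) + data_size;
}

// New scheme: the page size fixes the allocation size exactly, and the
// usable data size is whatever is left after the header.
inline std::size_t new_data_size(std::size_t allocation_size)
{
    assert(allocation_size > sizeof(page_header));
    return allocation_size - sizeof(page_header);
}
```

With a 32k page, the old scheme requests 32k plus the header size, while the new scheme requests exactly 32k and hands out slightly less data space per page.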
|
|
When using format_raw the space in the empty tag (<node />) is the only
character that does not have to be there; so format_raw almost results in
a minimal XML but not quite.
It's pretty unlikely that this is crucial for any users - the formatting
change should be benign, and it's better to improve format_raw than to add
yet another flag.
Fixes #87.
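
As an illustration only, the difference for an empty element can be sketched with a toy writer. `write_empty_element` and its `raw` parameter are hypothetical and stand in for the real writer and its format flags.

```cpp
#include <cassert>
#include <string>

// Toy writer for an empty element. When raw is true the optional space
// before "/>" is dropped, which gives the shortest well-formed empty tag.
inline std::string write_empty_element(const std::string& name, bool raw)
{
    return "<" + name + (raw ? "/>" : " />");
}
```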
|
|
Also rename auto_deleter_fclose to close_file.
|
|
Do not assume that fclose can be converted to int(*)(FILE*)
|
|
|
|
compilers use a special calling convention for stdlib functions like fclose
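
A sketch of the idea, assuming a `std::unique_ptr` based handle in place of the library's own deleter helper. `file_handle` and `open_file` are hypothetical names; `close_file` follows the rename mentioned in this log.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>

// Plain wrapper with a known signature and the default calling convention.
// Taking the address of this function is always well defined, whereas the
// exact type of &fclose may differ between compilers.
inline int close_file(std::FILE* file)
{
    return std::fclose(file);
}

// The deleter type refers to our wrapper's signature, not to fclose itself.
typedef std::unique_ptr<std::FILE, int (*)(std::FILE*)> file_handle;

// Opens a file; the returned handle is empty if fopen fails, and the
// deleter is not invoked for an empty handle.
inline file_handle open_file(const char* path, const char* mode)
{
    return file_handle(std::fopen(path, mode), close_file);
}
```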
|
|
|
|
Having CMakeLists.txt in the project root makes it so much easier to use pugixml
as an external dependency in another CMake project.
|
|
|
|
Also remove top-level LICENSE file since .podspec already has it.
|
|
scripts: Add CocoaPods package
|
|
errors otherwise. Get sources from zeux github
|
|
|
|
Unify the implementations by automatically deducing the unsigned type from its
signed counterpart. That allows us to use a templated function instead of
duplicating code.
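
A minimal sketch of this approach, assuming C++11 `std::make_unsigned` for the type deduction (the library may use its own trait instead). `integer_to_string` is a hypothetical helper written for this illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Converts any signed or unsigned integer type (except bool) to decimal text.
template <typename T>
std::string integer_to_string(T value)
{
    static_assert(std::is_integral<T>::value, "integral types only");

    // Deduce the unsigned counterpart instead of passing it explicitly.
    typedef typename std::make_unsigned<T>::type U;

    bool negative = value < 0;

    // Negate in unsigned arithmetic so the minimum value does not overflow.
    U rest = negative ? U(0) - static_cast<U>(value) : static_cast<U>(value);

    char buf[32]; // enough for 64-bit values: 20 digits plus a sign
    char* end = buf + sizeof(buf);
    char* cur = end;

    // Emit digits from least significant to most significant.
    do
    {
        *--cur = static_cast<char>('0' + rest % 10);
        rest /= 10;
    }
    while (rest != 0);

    if (negative) *--cur = '-';

    return std::string(cur, end);
}
```

The same template body serves `int`, `unsigned int`, `long long` and so on, so separate signed and unsigned implementations are not needed.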
|
|
|
|
|
|
This makes the coverage for basic numeric types complete (sans long double).
Fixes #78.
|
|
That way the defaults in the Makefile only matter for local runs.
|
|
Add cxxstd Makefile argument for testing C++ standards
|
|
|
|
This determines which C++ standard is used.
If you do not want to use a specific C++ standard, use cxxstd=any.
The default is set to c++11.
The PUGIXML_NO_CXX11 define is removed from the Makefile
since it is not used in the code anyway.
|
|
This allows running C++11-based tests on Linux.
|
|
This is necessary in order to comply with the C++03 standard.
|
|
Fix whitespace issues
|
|
Git warns when it finds "whitespace errors". This commit gets
rid of these whitespace errors for code and adoc files.
|
|
This utilizes the fact that pages are of limited size so we can store offset
from the object to the page in a few bits - we currently use 24 although that's
excessive given that pages are limited to ~512k.
This has several benefits:
- Pages do not have to be 64b aligned any more - this simplifies allocation flow
and frees up 40-50 bytes from xml_document::_memory.
- Header now has 8 bits available for metadata for both compact and default mode
which makes it possible to store type as-is (allowing easy type extension and
removing one add/sub operation from type checks).
- One extra bit is easily available for future metadata extension (in addition
to the bit for type encoding that could be reclaimed if necessary).
- Allocators that return 4b-aligned memory on 64-bit platforms work fine if
misaligned reads are supported.
The downside is that there are one or two extra instructions on the allocation
path. This does not seem to hurt parsing performance.
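
The encoding can be sketched as follows. This assumes a 32-bit header with the offset in the upper 24 bits and metadata in the lower 8 bits; the exact bit layout and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout: upper 24 bits hold the byte offset from the page start to
// the object, lower 8 bits are available for metadata such as the node type.
const unsigned offset_shift = 8;
const std::uint32_t metadata_mask = 0xff;

// Builds a header for an object that lives inside the given page.
inline std::uint32_t make_header(const void* page, const void* object, unsigned metadata)
{
    std::ptrdiff_t offset = static_cast<const char*>(object) - static_cast<const char*>(page);

    assert(offset >= 0 && offset < (1 << 24)); // must fit in 24 bits
    assert(metadata <= metadata_mask);         // must fit in 8 bits

    return (static_cast<std::uint32_t>(offset) << offset_shift) | metadata;
}

// Recovers the page by subtracting the stored offset from the object address,
// so the page itself needs no particular alignment.
inline void* page_from_header(void* object, std::uint32_t header)
{
    return static_cast<char*>(object) - (header >> offset_shift);
}

// Extracts the metadata bits as-is, with no extra arithmetic.
inline unsigned metadata_from_header(std::uint32_t header)
{
    return static_cast<unsigned>(header & metadata_mask);
}
```

Recovering the page costs one shift and one subtraction, which is consistent with the "one or two extra instructions" mentioned above.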
|
|
Also remove the description of behavior for trailing non-numeric characters.
It's likely this will become a parse error in the future so better leave it
as unspecified for now.
Fixes #80.
|
|
Add parse_embed_pcdata flag
This flag determines if plain character data is stored in the parent element's
value. This significantly changes the structure of the document; this flag is
only recommended for parsing documents with a lot of PCDATA nodes in a very
memory-constrained environment.
Most high-level APIs continue to work; code that inspects DOM using
first_child()/value() will have to be adapted.
|
|
The performance cost is probably negligible and this means we treat embedded
value as the first child consistently.
|
|
|
|
Since round-tripping should not be a problem any more don't mention it.
|
|
|
|
This change fixes an important ordering issue - if an element node has a PCDATA
child *after* other elements, it's impossible to tell which order the children
were in.
Since the goal of PCDATA embedding is to save memory when it's the only child,
only apply the optimization to the first child. This seems to fix all
roundtripping issues so the only caveat is that the DOM structure is different.
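
The rule can be illustrated with a toy model. `toy_element`, `toy_child`, `append_element` and `append_pcdata` are hypothetical and do not reflect the real node structures or parser code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy child record: name is empty for PCDATA children.
struct toy_child
{
    std::string name;
    std::string value;
};

// Toy element: value holds embedded PCDATA, if any.
struct toy_element
{
    std::string name;
    std::string value;
    std::vector<toy_child> children;
};

inline void append_element(toy_element& parent, const std::string& name)
{
    toy_child child;
    child.name = name;
    parent.children.push_back(child);
}

// Embeds the text into the parent only when it would be the first child;
// otherwise a separate PCDATA child is appended so the order is preserved.
inline void append_pcdata(toy_element& parent, const std::string& text, bool embed_pcdata)
{
    if (embed_pcdata && parent.children.empty() && parent.value.empty())
    {
        parent.value = text;
        return;
    }

    toy_child pcdata;
    pcdata.value = text;
    parent.children.push_back(pcdata);
}
```

In this model, text that follows an element child always becomes a regular child, so the relative order of children can be reconstructed when writing the document back out.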
|
|
This is a bit awkward since preserving correct indentation structure requires
a bit of extra work, and the closing tag has to be written by the _start
function to correctly process the rest of the tree.
|
|
|
|
|
|
When this flag is true, PCDATA value is saved to the parent element instead of
allocating a new node.
This prevents some documents from round-tripping since it loses information,
but can provide a significant memory reduction and parsing speedup for some
documents.
|
|
|
|
|
|
|
|
Also refactor to use the same case and run after common options.
|