Thanks to @jangernert for the upgrades to `Document` serialization!
- `Document::to_string_with_options` for customizing document serialization
- `Document::SaveOptions` containing the currently supported serialization options, as provided internally by libxml
- The `Document::to_string()` serialization method is now implemented through `fmt::Display` and no longer takes an optional boolean flag. The default behavior is now unformatted serialization (previously `to_string(false)`), while the behavior of `to_string(true)` can be obtained via `.to_string_with_options(SaveOptions { format: true, ..SaveOptions::default() })`
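
The `to_string()` migration can be sketched with the standard library alone. The `Document` and `SaveOptions` types below are hypothetical stand-ins, not the crate's real definitions (which wrap libxml2), and the `no_declaration` field is an illustrative assumption. The sketch only shows how a `fmt::Display` implementation supplies `to_string()` and how struct update syntax switches on a single option.

```rust
use std::fmt;

/// Hypothetical stand-in for the crate's `SaveOptions`. Only `format` is
/// named in the changelog; `no_declaration` is an assumption for this sketch.
#[derive(Clone, Copy, Default)]
struct SaveOptions {
    format: bool,
    no_declaration: bool,
}

/// Hypothetical stand-in for `Document`: a root element with empty children.
struct Document {
    root: String,
    children: Vec<String>,
}

impl Document {
    /// Serializes the toy document according to `options`.
    fn to_string_with_options(&self, options: SaveOptions) -> String {
        let (indent, newline) = if options.format { ("  ", "\n") } else { ("", "") };
        let mut out = String::new();
        if !options.no_declaration {
            out.push_str("<?xml version=\"1.0\"?>\n");
        }
        out.push_str(&format!("<{}>{}", self.root, newline));
        for child in &self.children {
            out.push_str(&format!("{}<{}/>{}", indent, child, newline));
        }
        out.push_str(&format!("</{}>", self.root));
        out
    }
}

/// Implementing `Display` yields `to_string()` through the blanket
/// `ToString` impl; it delegates to the default (unformatted) options.
impl fmt::Display for Document {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.to_string_with_options(SaveOptions::default()))
    }
}

/// Builds a small document used by the demonstration below.
fn sample_document() -> Document {
    Document {
        root: "root".to_string(),
        children: vec!["a".to_string(), "b".to_string()],
    }
}

fn main() {
    let doc = sample_document();
    // Unformatted output, the new default.
    println!("{}", doc);
    // Formatted output, the replacement for the old `to_string(true)`.
    let pretty = doc.to_string_with_options(SaveOptions {
        format: true,
        ..SaveOptions::default()
    });
    println!("{}", pretty);
}
```

Callers that relied on the old boolean flag only need to change the call site; everything that formats a document with `{}` picks up the unformatted default.
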
Thanks to @Alexhuszagh for contributing all enhancements for the `0.2.12` release!
- BOM-aware Unicode support
- New `Parser` methods that accept an explicit encoding: `parse_file_with_encoding`, `parse_string_with_encoding`, `is_well_formed_html_with_encoding`
- Default encodings in `Parser` are now left for libxml to guess internally, rather than defaulting to `utf-8`.
- `RoNode::to_hashable` and `RoNode::null` for parity with existing `Node`-leveraging applications
- `RoNode` primitive for simple and efficient read-only parallel processing
  - Benchmarking a 120 MB XML document shows a twenty-five-fold speedup when comparing `Node` to parallel rayon processing over `RoNode` on a desktop with 32 logical cores
  - While `RoNode` is added as an experiment for high performance read-only scans, any mutability requires using `Node` and incurring a bookkeeping cost of safety at runtime.
- Introduced benchmarking via `criterion`, only installed during development.
- `benches/parsing_benchmarks` contains examples of parallel scanning via `rayon` iterators.
- added `Document::get_root_readonly` method for obtaining a `RoNode` root.
- added `Context::node_evaluate_readonly` method for searching over a `RoNode`
- added `Context::get_readonly_nodes_as_vec` method for collecting xpath results as `RoNode`
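
The idea behind read-only parallel scans can be illustrated without the crate. The sketch below is not the `RoNode` API and does not use `rayon`: `RoTree` is a hypothetical immutable tree, and standard-library scoped threads stand in for rayon iterators. It shows only why a node type without mutation bookkeeping can be shared across threads freely.

```rust
use std::thread;

/// Hypothetical read-only node. It has no interior mutability,
/// so a shared reference to it can be sent to other threads.
struct RoTree {
    name: String,
    children: Vec<RoTree>,
}

/// Sequentially counts the nodes named `name` in the subtree at `node`.
fn count_named(node: &RoTree, name: &str) -> usize {
    let own = usize::from(node.name == name);
    own + node
        .children
        .iter()
        .map(|child| count_named(child, name))
        .sum::<usize>()
}

/// Scans each top-level subtree on its own scoped thread and sums the counts.
fn parallel_count(root: &RoTree, name: &str) -> usize {
    let own = usize::from(root.name == name);
    let from_children = thread::scope(|scope| {
        let handles: Vec<_> = root
            .children
            .iter()
            .map(|child| scope.spawn(move || count_named(child, name)))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .sum::<usize>()
    });
    own + from_children
}

/// Convenience constructor for the demonstration tree.
fn node(name: &str, children: Vec<RoTree>) -> RoTree {
    RoTree { name: name.to_string(), children }
}

/// Builds a `doc` root holding two `section` subtrees and one direct `p`.
fn sample_tree() -> RoTree {
    node(
        "doc",
        vec![
            node("section", vec![node("p", vec![]), node("p", vec![])]),
            node("section", vec![node("p", vec![])]),
            node("p", vec![]),
        ],
    )
}

fn main() {
    let tree = sample_tree();
    println!("p elements: {}", parallel_count(&tree, "p"));
}
```
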
- Squash memory leak in creating new `Node`s from the Rust API
- Safely unlink `Node`s obtained via XPath searches
Minor internal changes to make the crate compile more reliably under macOS, and other platforms which enable the `LIBXML_THREAD_ENABLED` compile-time flag. Thank you @caldwell!
- implement and test `replace_child_node` for element nodes
- Internal update to Rust 2018 Edition
- fix deallocation bugs with `.import_node()` and `.get_namespaces()`
- `Node::null` placeholder that avoids the tricky memory management of `Node::mock` that can lead to memory leaks. Really a poor substitute for the better `Option<Node>` type with a `None` value, which is recommended instead.
- `Context::from_node` method for convenient XPath context initialization via a `Node` object. Possible as nodes keep a reference to their owner `Document` object.
- Ensured memory safety of cloning xpath `Context` objects
- Switched to using `Weak` references to the owner document, in `Node`, `Context` and `Object`, to prevent memory leaks in multi-document pipelines.
- Speedup to XPath node retrieval
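
The effect of holding the owner document through a `Weak` reference can be sketched with plain `Rc` and `Weak`. The `Document` and `Node` types below are hypothetical simplifications, not the crate's internals. The point is that the back-pointer adds no strong count, so dropping the last document handle releases the document even while node handles are still alive.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical document that keeps strong references to its nodes.
struct Document {
    name: String,
    nodes: RefCell<Vec<Rc<Node>>>,
}

/// Hypothetical node holding only a weak back-reference to its owner,
/// which avoids a strong reference cycle between document and node.
struct Node {
    tag: String,
    owner: Weak<Document>,
}

impl Node {
    /// Returns the owner's name, or `None` once the document is gone.
    fn owner_name(&self) -> Option<String> {
        self.owner.upgrade().map(|doc| doc.name.clone())
    }
}

/// Creates a node registered with `doc` and returns a handle to it.
fn new_node(doc: &Rc<Document>, tag: &str) -> Rc<Node> {
    let node = Rc::new(Node {
        tag: tag.to_string(),
        owner: Rc::downgrade(doc),
    });
    doc.nodes.borrow_mut().push(Rc::clone(&node));
    node
}

fn main() {
    let doc = Rc::new(Document {
        name: "first.xml".to_string(),
        nodes: RefCell::new(Vec::new()),
    });
    let node = new_node(&doc, "root");

    // The weak back-reference does not keep the document alive.
    println!("strong handles to doc: {}", Rc::strong_count(&doc));
    println!("<{}> belongs to {:?}", node.tag, node.owner_name());

    // Dropping the last strong handle frees the document immediately.
    drop(doc);
    println!("owner after drop: {:?}", node.owner_name());
}
```

With a strong back-reference instead, document and nodes would keep each other alive, and every document processed in a long-running pipeline would leak.
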
- `Node::findnodes` method for direct XPath search, without first explicitly instantiating a `Context`. Reusing a `Context` remains more efficient.
- Expose the underlying `libxml2` data structures in the public crate interface, to enable a first libxslt crate proof of concept.
- `Node::set_node_rc_guard` which allows customizing the reference-count mutability threshold for Nodes.
- serialization tests for `Document`
- (crate internal) full set of libxml2 bindings as produced via `bindgen` (see #39)
- (crate internal) using libxml2's type language in the wrapper Rust modules
- (crate internal) setup bindings for reuse in higher-level crates, such as libxslt
- `NodeType::from_c_int` renamed to `NodeType::from_int`, now accepting a `u32` argument
- Removed dependence on custom C code; also removed gcc from build dependencies
This release adds fundamental breaking changes to the API. The API continues to be considered unstable until the `1.0.0` release.
- `dup` and `dup_from` methods for deeply duplicating a libxml2 document
- `is_unlinked` for a quick check of whether a `Node` has been unlinked from a parent
- safe API for `Node`s and `Document`s, with automatic pointer bookkeeping and memory deallocation, by @triptec
  - `Node`s are now bookkept by their owning document
  - libxml2 low-level memory deallocation is postponed until the `Document` is dropped, with the exception of unlinked nodes, which are deallocated on drop.
  - `Document::get_root_element` now has an option type, and returns `None` for an empty Document
  - `Node::mock` now takes owner `Document` as argument
  - proofed tests with `valgrind` and removed all obvious memory leaks
- All node operations that modify a `Node` now both require a `&mut Node` argument and return a `Result` type.
  - Full list of changed signatures in Node: `remove_attribute`, `remove_property`, `set_name`, `set_content`, `set_property`, `set_property_ns`, `set_attribute`, `set_attribute_ns`, `set_namespace`, `recursively_remove_namespaces`, `append_text`
- Tree transforming operations now operate on `&mut self`, and no longer return a Node if the return value is identical to the argument.
  - Changed signatures: `add_child`, `add_prev_sibling`, `add_next_sibling`
- `Result` types should always be checked for errors, as mutability conflicts are reported during runtime.
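
To illustrate why these `Result` values matter, here is a minimal sketch of a mutation that is checked at runtime. It does not reproduce the crate's bookkeeping: the `Node` type is a hypothetical shared handle built on `Rc<RefCell<_>>`, and the conflict is detected with `try_borrow_mut`, which is an assumption chosen for brevity.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical shared node handle; clones refer to the same underlying name.
#[derive(Clone)]
struct Node(Rc<RefCell<String>>);

impl Node {
    fn new(name: &str) -> Self {
        Node(Rc::new(RefCell::new(name.to_string())))
    }

    /// Mutating operation: takes `&mut self` and reports conflicts as `Err`
    /// instead of panicking when the node is borrowed elsewhere.
    fn set_name(&mut self, name: &str) -> Result<(), String> {
        match self.0.try_borrow_mut() {
            Ok(mut inner) => {
                *inner = name.to_string();
                Ok(())
            }
            Err(_) => Err(format!(
                "cannot rename node to `{}`: it is borrowed elsewhere",
                name
            )),
        }
    }

    /// Read-only accessor returning a copy of the current name.
    fn get_name(&self) -> String {
        self.0.borrow().clone()
    }
}

fn main() {
    let mut node = Node::new("a");
    let mut alias = node.clone();

    node.set_name("b").expect("no outstanding borrows");

    {
        // While a reader holds a borrow, mutation through an alias fails.
        let _reader = node.0.borrow();
        if let Err(message) = alias.set_name("c") {
            println!("rejected: {}", message);
        }
    }

    // Once the reader is gone, the same call succeeds.
    alias.set_name("c").expect("reader was dropped");
    println!("final name: {}", node.get_name());
}
```

Ignoring the returned `Result` in a situation like the inner block would silently leave the node unchanged, which is why the errors should be handled or propagated.
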
- Removed the `global` module, which attempted to manage global libxml state for threaded workflows. May be re-added after the API stabilizes.
- We welcome Andreas (@triptec) to the core developer team!
- Workaround `.free` method for freeing nodes, until the `Rc<RefCell<_Node>>` free-on-drop solution by Andreas is introduced in 0.2
- `get_first_element_child` - similar to `get_first_child` but only returns XML Elements
- `is_element_node` - check if a given `Node` is an XML Element
- Requiring owned `Node` function arguments only when consumed - `add_*` methods largely take `&Node` now.
Pushing up release to a 0.1, as contributor interest is starting to pick up, and the 0. versions were getting a bit silly/wrong.
- Node methods: `unbind_node`, `recursively_remove_namespaces`, `set_name`
- Document methods: `import_node`
- Updated gcc build to newer incantation, upped dependency version.
- Node methods: `get_namespace_declarations`, `get_property_ns` (alias: `get_attribute_ns`), `remove_property` (alias: `remove_attribute`), `get_attribute_node`, `get_namespace`, `lookup_namespace_prefix`, `lookup_namespace_uri`
- XPath methods: `findvalue` and `findnodes`, with optional node-bound evaluation.
- The Node setter for a namespaced attribute is now `set_property_ns` (alias: `set_attribute_ns`)
- Node set_* methods are now consistently defined on `&mut self`
- Refactored wrongly used `url` to `href` for namespace-related Node ops.
- Fixed bug with Node's `get_content` method always returning empty
- More stable `append_text` for node, added tests
- Namespace::new only requires a borrowed &Node now
- Fixed bug with wrongly discarded namespace prefixes on Namespace::new
- Namespace methods: `get_prefix`, `get_url`
- Document method: `as_node`
- Node methods: `get_last_child`, `get_child_nodes`, `get_child_elements`, `get_properties`, `get_attributes`
- Namespace::new takes Node argument last
- Node namespace accessors - `set_namespace`, `get_namespaces`, `set_ns_attribute`, `set_ns_property`
- Namespace registration for XPath
- stricter dependency spec in Cargo.toml
- cargo clippy compliant
- Document's `get_root_element` returns the document pointer as a Node for empty documents, type change from `Option<Node>` to simple `<Node>`
- Node accessors: `set_attribute`, `get_attribute`, `set_property` (the `attribute` callers are simple aliases for `property`)
- Node `to_hashable` for simple hashing of nodes
- Node `mock` for simple mock nodes in testing
Thanks to @grray for most of these improvements!
- Switched to using the more permissive MIT license, consistent with libxml2 licensing
- Fixed segfault issues with xpath contexts
- Can now evaluate `string(/foo//@bar)` type XPath expressions, and use their result via `.to_string()`
- The `Node.add_child` method now adds a Node, while the old behavior of creating a new node with a given namespace and name is now `Node.new_child`
- Can add following siblings via `Node.add_next_sibling`
- Can now add text nodes via `Node.new_text`