Super-trait for all document types, promising an element type for the document element.
Super-trait for all element query API traits, promising a self type.
Simplicity and consistency of the entire query API are two important design considerations. For example, the query API methods themselves use no generics.
Super-trait for all element query API traits that know about the node super-type.
Backing document API, representing a document that contains a BackingNodes.Elem root element.
Shorthand for IndexedScopedElemApi with HasParentApi. In other words, this is an ancestry-aware "scoped element" query API.
Efficient implementations are possible for indexed elements and Saxon NodeInfo objects (backed by native tiny trees). Saxon-backed elements are not offered by core yaidom, however. Saxon tiny trees are attractive for their low memory footprint.
It is possible to offer implementations by combining the partial implementation traits (XXXLike), or by entirely custom and efficient "backend-aware" implementations.
Shorthand for ElemApi with IsNavigableApi with HasENameApi with HasTextApi. In other words, the minimal element query API corresponding to James Clark's "labelled element tree" abstraction, which is implemented as yaidom "resolved" elements.
If a yaidom element implementation (whether in yaidom itself or a "yaidom extension") does not mix in the ClarkElemApi trait, it is probably not to be considered "XML". Indeed, in yaidom only the ElemBuilder class does not mix in this trait, and indeed it is not "XML" (lacking any knowledge about expanded names etc.), only a builder of "XML". Hence this trait is very important in yaidom, as the "minimal XML element query API".
Generic code abstracting over yaidom element implementations should either use this trait, or sub-trait ScopedElemApi, depending on the abstraction level.
Partial implementation of ClarkElemApi.
Minimal API for Documents, having a type parameter for the element type.
This is a purely abstract API trait. It can be useful in generic code abstracting over multiple element implementations.
This is the foundation of the yaidom uniform query API. Many DOM-like element implementations in yaidom mix in this trait (indirectly, because some implementing sub-trait is mixed in), thus sharing this query API.
This trait typically does not show up in application code using yaidom, yet its (uniform) API does. Hence, it makes sense to read the documentation of this trait, knowing that the API is offered by multiple element implementations.
This trait is purely abstract. The most common implementation of this trait is eu.cdevreeze.yaidom.queryapi.ElemLike. That trait only knows about elements (and not about other nodes), and only knows that elements can have child elements (again not knowing about other child nodes). Using this minimal knowledge alone, it offers methods to query for descendant elements, descendant-or-self elements, or sub-collections thereof. It is this minimal knowledge that makes this API uniform.
This query API leverages the Scala Collections API. Query results can be manipulated using the Collections API, and the query API implementation (in ElemLike) uses the Collections API internally.
To illustrate usage of this API, consider the following example. Let's say we want to determine if some XML has its namespace declarations (if any) only at the root element level. We show the query code for several yaidom DOM-like element implementations.
Note that it depends on the DOM-like element implementation how to query for namespace declarations, but the code to query for descendant or descendant-or-self elements remains the same. The method to retrieve all descendant elements is called findAllElems, and the method to retrieve all descendant-or-self elements is called findAllElemsOrSelf. The corresponding "filtering" methods are called filterElems and filterElemsOrSelf, respectively. Knowing this, it is easy to guess the other API method names.
Let's start with a yaidom DOM wrapper, named rootElem, of type DomElem, and query for the "offending" descendant elements:
rootElem filterElems (elem => !convert.DomConversions.extractNamespaceDeclarations(elem.wrappedNode.getAttributes).isEmpty)
This returns all offending elements, that is, all descendant elements of the root element (excluding the root element itself) that have at least one namespace declaration.
Now let's use an eu.cdevreeze.yaidom.simple.ElemBuilder, again named rootElem:
rootElem filterElems (elem => !elem.namespaces.isEmpty)
The query is the same as the preceding one, except for the retrieval of namespace declarations of an element. (It should be noted that class ElemBuilder already has a method allDeclarationsAreAtTopLevel.)
Finally, let's use a rootElem of type eu.cdevreeze.yaidom.indexed.Elem, which is immutable, but knows its ancestry:
rootElem filterElems (elem => !elem.namespaces.isEmpty)
This is exactly the same code as for ElemBuilder, because namespace declarations happen to be retrieved in the same way.
If we want to query for all elements with namespace declarations, including the root element itself, we could write:
rootElem filterElemsOrSelf (elem => !elem.namespaces.isEmpty)
In summary, the extremely simple ElemApi query API is indeed a uniform query API, offered by many different yaidom DOM-like element implementations.
In order to get started using the API, this more formal section can safely be skipped. On the other hand, this section may provide a deeper understanding of the API.
The ElemApi trait can be understood in a precise mathematical sense, as shown below.
The most fundamental method of this trait is findAllChildElems. The semantics of the other methods can be defined directly or indirectly in terms of this method.
The basic operations definable in terms of that method are \ (alias for filterChildElems), \\ (alias for filterElemsOrSelf) and \\! (alias for findTopmostElemsOrSelf). Their semantics must be as if they had been defined as follows:
def filterChildElems(p: ThisElem => Boolean): immutable.IndexedSeq[ThisElem] =
  this.findAllChildElems.filter(p)

def filterElemsOrSelf(p: ThisElem => Boolean): immutable.IndexedSeq[ThisElem] =
  Vector(this).filter(p) ++ (this.findAllChildElems flatMap (_.filterElemsOrSelf(p)))

def findTopmostElemsOrSelf(p: ThisElem => Boolean): immutable.IndexedSeq[ThisElem] =
  if (p(this)) Vector(this)
  else (this.findAllChildElems flatMap (_.findTopmostElemsOrSelf(p)))
Moreover, we could have defined:
def filterElems(p: ThisElem => Boolean): immutable.IndexedSeq[ThisElem] =
  this.findAllChildElems flatMap (_.filterElemsOrSelf(p))

def findTopmostElems(p: ThisElem => Boolean): immutable.IndexedSeq[ThisElem] =
  this.findAllChildElems flatMap (_.findTopmostElemsOrSelf(p))
and:
def findAllElemsOrSelf: immutable.IndexedSeq[ThisElem] = filterElemsOrSelf(e => true)

def findAllElems: immutable.IndexedSeq[ThisElem] = filterElems(e => true)
The following properties must hold (in the absence of side-effects), and can indeed be proven (given the documented "definitions" of these operations):
// Filtering

elem.filterElems(p) == elem.findAllElems.filter(p)

elem.filterElemsOrSelf(p) == elem.findAllElemsOrSelf.filter(p)

// Finding topmost

elem.findTopmostElems(p) == {
  elem.filterElems(p) filter { e =>
    val hasNoMatchingAncestor = elem.filterElems(p) forall { _.findElem(_ == e).isEmpty }
    hasNoMatchingAncestor
  }
}

elem.findTopmostElemsOrSelf(p) == {
  elem.filterElemsOrSelf(p) filter { e =>
    val hasNoMatchingAncestor = elem.filterElemsOrSelf(p) forall { _.findElem(_ == e).isEmpty }
    hasNoMatchingAncestor
  }
}

(elem.findTopmostElems(p) flatMap (_.filterElemsOrSelf(p))) == (elem.filterElems(p))

(elem.findTopmostElemsOrSelf(p) flatMap (_.filterElemsOrSelf(p))) == (elem.filterElemsOrSelf(p))
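The definitions and properties above can be tried out on a small tree. The following is only a sketch: the Elem class is a hypothetical stand-in (a name plus child elements), not a yaidom element implementation, and it uses Vector where the trait uses immutable.IndexedSeq.

```scala
object ElemApiSketch {
  // Hypothetical stand-in element: it only knows its name and its child elements.
  final case class Elem(name: String, children: Vector[Elem]) {
    def findAllChildElems: Vector[Elem] = children

    // The three methods below follow the "as if defined" semantics quoted above.
    def filterElemsOrSelf(p: Elem => Boolean): Vector[Elem] =
      Vector(this).filter(p) ++ findAllChildElems.flatMap(_.filterElemsOrSelf(p))

    def filterElems(p: Elem => Boolean): Vector[Elem] =
      findAllChildElems.flatMap(_.filterElemsOrSelf(p))

    def findTopmostElemsOrSelf(p: Elem => Boolean): Vector[Elem] =
      if (p(this)) Vector(this)
      else findAllChildElems.flatMap(_.findTopmostElemsOrSelf(p))

    def findTopmostElems(p: Elem => Boolean): Vector[Elem] =
      findAllChildElems.flatMap(_.findTopmostElemsOrSelf(p))

    def findAllElemsOrSelf: Vector[Elem] = filterElemsOrSelf(_ => true)

    def findAllElems: Vector[Elem] = filterElems(_ => true)
  }

  // a contains b (which contains c and a nested b) and d (which contains b).
  val root: Elem =
    Elem("a", Vector(
      Elem("b", Vector(Elem("c", Vector.empty), Elem("b", Vector.empty))),
      Elem("d", Vector(Elem("b", Vector.empty)))))

  val isB: Elem => Boolean = _.name == "b"

  // Two of the properties listed above, evaluated on the sample tree.
  val filteringLawHolds: Boolean =
    root.filterElems(isB) == root.findAllElems.filter(isB)

  val topmostLawHolds: Boolean =
    root.findTopmostElems(isB).flatMap(_.filterElemsOrSelf(isB)) == root.filterElems(isB)
}
```

On this tree the nested b is found by filterElems but not by findTopmostElems, because its parent already matches the predicate.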
This is the generic element creation API. It fits in the overall philosophy of yaidom in that it is based on ENames, not on QNames.
API and implementation trait for elements as containers of elements, as element nodes in a node tree. This trait knows very little about elements. It does not know about names, attributes, etc. All it knows about elements is that elements can have element children (other node types are entirely out of scope in this trait).
The purely abstract API offered by this trait is eu.cdevreeze.yaidom.queryapi.ElemApi. See the documentation of that trait for examples of usage, and for a more formal treatment. Below follows an even more formal treatment, with proofs by induction of important properties obeyed by methods of this API. It shows the mathematical rigor of the yaidom query API. API users that are only interested in how to use the API can safely skip that formal treatment.
In order to get started using the API, this more formal section can safely be skipped. On the other hand, this section may provide a deeper understanding of the API.
The only abstract method is findAllChildElems. Based on this method alone, this trait offers a rich API for querying elements. This is entirely consistent with the semantics defined in the ElemApi trait. Indeed, the implementation of the methods follows the semantics defined there.
In the ElemApi trait, some (simple) provable laws were mentioned. Some proofs follow below.
Below follows a proof by structural induction of one of the laws mentioned in the documentation of trait ElemApi.
First we make a few assumptions, for this proof, and (implicitly) for the other proofs:
Based on these assumptions, we prove by induction that:
elm.filterElemsOrSelf(p) == elm.findAllElemsOrSelf.filter(p)
Base case
If elm has no child elements, then the LHS can be rewritten as follows:
elm.filterElemsOrSelf(p)
immutable.IndexedSeq(elm).filter(p) ++ (elm.findAllChildElems flatMap (_.filterElemsOrSelf(p))) // definition of filterElemsOrSelf
immutable.IndexedSeq(elm).filter(p) ++ (Seq() flatMap (_.filterElemsOrSelf(p))) // there are no child elements
immutable.IndexedSeq(elm).filter(p) ++ Seq() // flatMap on empty sequence returns empty sequence
immutable.IndexedSeq(elm).filter(p) // property of concatenation: xs ++ Seq() == xs
(immutable.IndexedSeq(elm) ++ Seq()).filter(p) // property of concatenation: xs ++ Seq() == xs
(immutable.IndexedSeq(elm) ++ (elm.findAllChildElems flatMap (_ filterElemsOrSelf (e => true)))) filter p // flatMap on empty sequence (of child elements) returns empty sequence
(immutable.IndexedSeq(elm).filter(e => true) ++ (elm.findAllChildElems flatMap (_ filterElemsOrSelf (e => true)))) filter p // filtering with predicate that is always true
elm.filterElemsOrSelf(e => true) filter p // definition of filterElemsOrSelf
elm.findAllElemsOrSelf filter p // definition of findAllElemsOrSelf
which is the RHS.
Inductive step
For the inductive step, we use the following (general) properties:
(xs.filter(p) ++ ys.filter(p)) == ((xs ++ ys) filter p) // referred to below as property (a)
and:
(xs flatMap (x => f(x) filter p)) == ((xs flatMap f) filter p) // referred to below as property (b)
If elm does have child elements, the LHS can be rewritten as:
elm.filterElemsOrSelf(p)
immutable.IndexedSeq(elm).filter(p) ++ (elm.findAllChildElems flatMap (_.filterElemsOrSelf(p))) // definition of filterElemsOrSelf
immutable.IndexedSeq(elm).filter(p) ++ (elm.findAllChildElems flatMap (ch => ch.findAllElemsOrSelf filter p)) // induction hypothesis
immutable.IndexedSeq(elm).filter(p) ++ ((elm.findAllChildElems.flatMap(ch => ch.findAllElemsOrSelf)) filter p) // property (b)
(immutable.IndexedSeq(elm) ++ (elm.findAllChildElems flatMap (_.findAllElemsOrSelf))) filter p // property (a)
(immutable.IndexedSeq(elm) ++ (elm.findAllChildElems flatMap (_ filterElemsOrSelf (e => true)))) filter p // definition of findAllElemsOrSelf
(immutable.IndexedSeq(elm).filter(e => true) ++ (elm.findAllChildElems flatMap (_ filterElemsOrSelf (e => true)))) filter p // filtering with predicate that is always true
elm.filterElemsOrSelf(e => true) filter p // definition of filterElemsOrSelf
elm.findAllElemsOrSelf filter p // definition of findAllElemsOrSelf
which is the RHS.
This completes the proof. Other above-mentioned properties can be proven by induction in a similar way.
From the preceding proven property it easily follows (without using a proof by induction) that:
elm.filterElems(p) == elm.findAllElems.filter(p)
After all, the LHS can be rewritten as follows:
elm.filterElems(p)
(elm.findAllChildElems flatMap (_.filterElemsOrSelf(p))) // definition of filterElems
(elm.findAllChildElems flatMap (e => e.findAllElemsOrSelf.filter(p))) // using the property proven above
(elm.findAllChildElems flatMap (_.findAllElemsOrSelf)) filter p // using property (b) above
(elm.findAllChildElems flatMap (_ filterElemsOrSelf (e => true))) filter p // definition of findAllElemsOrSelf
elm.filterElems(e => true) filter p // definition of filterElems
elm.findAllElems filter p // definition of findAllElems
which is the RHS.
Given the above-mentioned assumptions, we prove by structural induction that:
(elm.findTopmostElemsOrSelf(p) flatMap (_.filterElemsOrSelf(p))) == (elm.filterElemsOrSelf(p))
Base case
If elm has no child elements, and p(elm) holds, then LHS and RHS evaluate to immutable.IndexedSeq(elm).
If elm has no child elements, and p(elm) does not hold, then LHS and RHS evaluate to immutable.IndexedSeq().
Inductive step
For the inductive step, we introduce the following additional (general) property, if f and g have the same types:
((xs flatMap f) flatMap g) == (xs flatMap (x => f(x) flatMap g)) // referred to below as property (c)
This is also known as the "associativity law for monads". (Monadic types obey 3 laws: associativity, left unit and right unit.)
If elm does have child elements, and p(elm) holds, the LHS can be rewritten as:
(elm.findTopmostElemsOrSelf(p) flatMap (_.filterElemsOrSelf(p)))
immutable.IndexedSeq(elm) flatMap (_.filterElemsOrSelf(p)) // definition of findTopmostElemsOrSelf, knowing that p(elm) holds
elm.filterElemsOrSelf(p) // definition of flatMap, applied to singleton sequence
which is the RHS. In this case, we did not even need the induction hypothesis.
If elm does have child elements, and p(elm) does not hold, the LHS can be rewritten as:
(elm.findTopmostElemsOrSelf(p) flatMap (_.filterElemsOrSelf(p)))
(elm.findAllChildElems flatMap (_.findTopmostElemsOrSelf(p))) flatMap (_.filterElemsOrSelf(p)) // definition of findTopmostElemsOrSelf, knowing that p(elm) does not hold
elm.findAllChildElems flatMap (ch => ch.findTopmostElemsOrSelf(p) flatMap (_.filterElemsOrSelf(p))) // property (c)
elm.findAllChildElems flatMap (_.filterElemsOrSelf(p)) // induction hypothesis
immutable.IndexedSeq() ++ (elm.findAllChildElems flatMap (_.filterElemsOrSelf(p))) // definition of concatenation
immutable.IndexedSeq(elm).filter(p) ++ (elm.findAllChildElems flatMap (_.filterElemsOrSelf(p))) // definition of filter, knowing that p(elm) does not hold
elm.filterElemsOrSelf(p) // definition of filterElemsOrSelf
which is the RHS.
From the preceding proven property it easily follows (without using a proof by induction) that:
(elm.findTopmostElems(p) flatMap (_.filterElemsOrSelf(p))) == (elm.filterElems(p))
After all, the LHS can be rewritten to:
(elm.findTopmostElems(p) flatMap (_.filterElemsOrSelf(p)))
(elm.findAllChildElems flatMap (_.findTopmostElemsOrSelf(p))) flatMap (_.filterElemsOrSelf(p)) // definition of findTopmostElems
elm.findAllChildElems flatMap (ch => ch.findTopmostElemsOrSelf(p) flatMap (_.filterElemsOrSelf(p))) // property (c)
elm.findAllChildElems flatMap (_.filterElemsOrSelf(p)) // using the property proven above
elm.filterElems(p) // definition of filterElems
which is the RHS.
There are several (unproven) properties that were used in the proofs above:
(xs.filter(p) ++ ys.filter(p)) == ((xs ++ ys) filter p) // property (a); filter distributes over concatenation

(xs flatMap (x => f(x) filter p)) == ((xs flatMap f) filter p) // property (b)

// Associativity law for monads
((xs flatMap f) flatMap g) == (xs flatMap (x => f(x) flatMap g)) // property (c)
Property (a) is obvious, and stated without proof. Property (c) is known as the "associativity law for monads". Property (b) is proven below.
To prove property (b), we use property (c), as well as the following property (d):
(xs filter p) == (xs flatMap (y => if (p(y)) List(y) else Nil)) // property (d)
Then property (b) can be proven as follows:
xs flatMap (x => f(x) filter p)
xs flatMap (x => f(x) flatMap (y => if (p(y)) List(y) else Nil)) // property (d)
(xs flatMap f) flatMap (y => if (p(y)) List(y) else Nil) // property (c)
(xs flatMap f) filter p // property (d)
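Properties (a) to (d) are general facts about Scala collections, so they can be spot-checked on concrete lists. The sketch below is a sanity check on sample data, not a proof; the values xs, ys, p, f and g are arbitrary choices made for illustration.

```scala
object CollectionLawsSketch {
  val xs: List[Int] = List(1, 2, 3, 4)
  val ys: List[Int] = List(5, 6)

  val p: Int => Boolean = _ % 2 == 0
  val f: Int => List[Int] = x => List(x, x * 10)
  val g: Int => List[Int] = x => List(x + 1)

  // Property (a): filter distributes over concatenation.
  val propertyA: Boolean =
    (xs.filter(p) ++ ys.filter(p)) == (xs ++ ys).filter(p)

  // Property (b): filtering inside or outside flatMap gives the same result.
  val propertyB: Boolean =
    xs.flatMap(x => f(x).filter(p)) == xs.flatMap(f).filter(p)

  // Property (c): associativity law for monads.
  val propertyC: Boolean =
    xs.flatMap(f).flatMap(g) == xs.flatMap(x => f(x).flatMap(g))

  // Property (d): filter expressed in terms of flatMap.
  val propertyD: Boolean =
    xs.filter(p) == xs.flatMap(y => if (p(y)) List(y) else Nil)
}
```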
Methods findAllElemsOrSelf, filterElemsOrSelf, findTopmostElemsOrSelf and findElemOrSelf use recursion in their implementations, but not tail-recursion. The lack of tail-recursion should not be a problem, due to limited XML tree depths in practice. It is comparable to an "idiomatic" Scala quicksort implementation in its lack of tail-recursion. Also in the case of quicksort, the lack of tail-recursion is acceptable due to limited recursion depths. If we want tail-recursive implementations of the above-mentioned methods (in particular the first 3 ones), we either lose the ordering of result elements in document order (depth-first), or we lose performance and/or clarity. That just is not worth it.
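To make the trade-off concrete, here is a sketch that contrasts the plain recursive traversal with one possible tail-recursive variant. The Elem class and both functions are hypothetical stand-ins written for this comparison, not yaidom code. The tail-recursive variant keeps document order only by managing an explicit list of pending elements, which is arguably less clear than the recursive one-liner.

```scala
import scala.annotation.tailrec

object TraversalSketch {
  // Hypothetical stand-in element.
  final case class Elem(name: String, children: Vector[Elem])

  // Plain (non-tail) recursion: document order falls out of the definition.
  def findAllElemsOrSelf(elem: Elem): Vector[Elem] =
    elem +: elem.children.flatMap(ch => findAllElemsOrSelf(ch))

  // Tail-recursive variant with an explicit list of elements still to visit.
  @tailrec
  def findAllElemsOrSelfIter(pending: List[Elem], acc: Vector[Elem]): Vector[Elem] =
    pending match {
      case Nil => acc
      case head :: tail =>
        // Children are placed in front of the remaining siblings,
        // which preserves depth-first document order.
        findAllElemsOrSelfIter(head.children.toList ::: tail, acc :+ head)
    }

  val root: Elem =
    Elem("a", Vector(
      Elem("b", Vector(Elem("c", Vector.empty))),
      Elem("d", Vector.empty)))

  val recursiveResult: Vector[String] = findAllElemsOrSelf(root).map(_.name)
  val iterativeResult: Vector[String] = findAllElemsOrSelfIter(List(root), Vector.empty).map(_.name)
}
```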
This is the element transformation API, as function API instead of OO API. That is, this is the function API corresponding to trait eu.cdevreeze.yaidom.queryapi.TransformableElemApi.
See trait TransformableElemApi for more info about element transformations in yaidom, and their properties.
This functional API is more widely applicable than trait TransformableElemApi. First, it can be implemented for arbitrary element types, even non-yaidom ones. Second, implementations can easily carry state that is shared by update functions, such as a Saxon Processor in the case of a Saxon implementation of this API.
When using this API for elements that carry context such as "ancestry state", be careful when writing transformation functions that are passed to the functions of this API. For example, if the element type is BackingElemApi or a sub-type, such sensitive state includes the base URI, document URI, the Path relative to the root element, and most important of all, the root element itself. It is up to the user of the API to keep such state consistent during transformations, and to be careful when depending on state that is volatile during transformations.
Also note that for BackingElemApi elements, if a transformation function alters "ancestry state" such as (base and document) URIs, paths etc., these altered values may be ignored, depending on the API calls made.
In order to get started using the API, this more formal section can safely be skipped. On the other hand, this section may provide a deeper understanding of the API.
Some provable properties hold about this ElemTransformationApi API in terms of the more low level ElemUpdateApi API.
Let's first try to define the methods of ElemTransformationApi in terms of the ElemUpdateApi API. Below their equivalence will be proven. We define the following, assuming type ElemType to be a yaidom "indexed element" type:
def addPathParameter[A](f: ElemType => A): ((ElemType, Path) => A) = {
  { (elm, path) => f(elm) } // Unused path
}

def addPathEntryParameter[A](f: ElemType => A): ((ElemType, Path.Entry) => A) = {
  { (elm, pathEntry) => f(elm) } // Unused path entry
}

def findAllChildPathEntries(elem: ElemType): Set[Path.Entry] = {
  elem.findAllChildElems.map(_.path.lastEntry).toSet
}

def findAllRelativeElemOrSelfPaths(elem: ElemType): Set[Path] = {
  elem.findAllElemsOrSelf.map(_.path.skippingPath(elem.path)).toSet
}

def findAllRelativeElemPaths(elem: ElemType): Set[Path] = {
  elem.findAllElems.map(_.path.skippingPath(elem.path)).toSet
}

// The transformation functions, defined in terms of the ElemUpdateApi

def transformChildElems2(elem: ElemType, f: ElemType => ElemType): ElemType = {
  updateChildElems(elem, findAllChildPathEntries(elem))(addPathEntryParameter(f))
}

def transformElemsOrSelf2(elem: ElemType, f: ElemType => ElemType): ElemType = {
  updateElemsOrSelf(elem, findAllRelativeElemOrSelfPaths(elem))(addPathParameter(f))
}

def transformElems2(elem: ElemType, f: ElemType => ElemType): ElemType = {
  updateElems(elem, findAllRelativeElemPaths(elem))(addPathParameter(f))
}
The following property must hold, for all elements and (pure) element transformation functions:
transformChildElems(elem, f) == transformChildElems2(elem, f)
No proof is provided, but this property must obviously hold, since transformChildElems replaces child element nodes by applying the given function, and leaves the other child nodes alone, and method transformChildElems2 does the same. The latter function does it via child path entries (translated to child node indexes), iterating over child nodes in reverse order (in order not to invalidate the next processed path entry), but the net effect is the same.
The following property holds, for all elements and (pure) element transformation functions:
transformElemsOrSelf(elem, f) == transformElemsOrSelf2(elem, f)
Below follows a proof of this property by structural induction.
Base case
If elem has no child elements, then the LHS can be rewritten as follows:
transformElemsOrSelf(elem, f)
f(transformChildElems(elem, e => transformElemsOrSelf(e, f))) // definition of transformElemsOrSelf
f(elem) // there are no child element nodes, so transformChildElems is an identity function in this case
updateElemsOrSelf(elem, Set(Path.Empty))(addPathParameter(f)) // only updates elem
transformElemsOrSelf2(elem, f) // definition of transformElemsOrSelf2, and absence of descendant paths
which is the RHS.
Inductive step
If elem does have child elements, the LHS can be rewritten as:
transformElemsOrSelf(elem, f)
f(transformChildElems(elem, e => transformElemsOrSelf(e, f))) // definition of transformElemsOrSelf
f(transformChildElems(elem, e => transformElemsOrSelf2(e, f))) // induction hypothesis
f(transformChildElems2(elem, e => transformElemsOrSelf2(e, f))) // property above
f(transformChildElems2(elem, e => updateElemsOrSelf(e, findAllRelativeElemOrSelfPaths(e))(addPathParameter(f)))) // definition of transformElemsOrSelf2
f(updateChildElems(elem, findAllChildPathEntries(elem))(addPathEntryParameter(
  e => updateElemsOrSelf(e, findAllRelativeElemOrSelfPaths(e))(addPathParameter(f))))
) // definition of transformChildElems2
f(updateElems(elem, findAllRelativeElemOrSelfPaths(elem))(addPathParameter(f))) // property about updateElems, and knowing that the added path and path entry parameters do nothing here
updateElemsOrSelf(elem, findAllRelativeElemOrSelfPaths(elem))(addPathParameter(f)) // (indirect) definition of updateElemsOrSelf
transformElemsOrSelf2(elem, f) // definition of transformElemsOrSelf2
which is the RHS.
This completes the proof. For the other ElemTransformationApi methods, analogous provable properties hold.
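The recursive definition of transformElemsOrSelf used in the proof can be illustrated with a small sketch. The Elem class below is a hypothetical stand-in, and the definition of transformElems is an assumption made by analogy (the source only spells out transformElemsOrSelf); neither is taken from the yaidom implementation.

```scala
object TransformSketch {
  // Hypothetical stand-in element with element children only.
  final case class Elem(name: String, children: Vector[Elem])

  // Applies f to each child element, leaving the element itself alone.
  def transformChildElems(elem: Elem, f: Elem => Elem): Elem =
    elem.copy(children = elem.children.map(f))

  // Bottom-up: first transform the descendants, then the element itself.
  // This mirrors the definition used in the proof above.
  def transformElemsOrSelf(elem: Elem, f: Elem => Elem): Elem =
    f(transformChildElems(elem, e => transformElemsOrSelf(e, f)))

  // Assumed analogous definition: transform all descendants, but not the element itself.
  def transformElems(elem: Elem, f: Elem => Elem): Elem =
    transformChildElems(elem, e => transformElemsOrSelf(e, f))

  val root: Elem = Elem("a", Vector(Elem("b", Vector(Elem("c", Vector.empty)))))

  val upperCaseName: Elem => Elem = e => e.copy(name = e.name.toUpperCase)

  val allUpper: Elem = transformElemsOrSelf(root, upperCaseName)
  val descendantsUpper: Elem = transformElems(root, upperCaseName)
}
```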
This is the partially implemented element transformation API, as function API instead of OO API. That is, this is the function API corresponding to trait eu.cdevreeze.yaidom.queryapi.TransformableElemLike.
In other words, this trait has abstract methods transformChildElems and transformChildElemsToNodeSeq. Based on these abstract methods, this trait offers a rich API for transforming descendant elements or descendant-or-self elements.
This is the element (functional) update API, as function API instead of OO API. That is, this is the function API corresponding to trait eu.cdevreeze.yaidom.queryapi.UpdatableElemApi. A few methods, like updateTopmostElemsOrSelf, are missing, though.
See trait UpdatableElemApi for more info about (functional) element updates in yaidom, and their properties.
This functional API is more widely applicable than trait UpdatableElemApi. First, it can be implemented for arbitrary element types, even non-yaidom ones. Second, implementations can easily carry state that is shared by update functions, such as a Saxon Processor in the case of a Saxon implementation of this API.
Below, for most functions that take Paths or that take functions that take Paths, the Paths are relative to the first argument element, so they must not be interpreted as the Paths of the elements themselves (relative to their root elements).
This is the partially implemented (functional) element update API, as function API instead of OO API. That is, this is the function API corresponding to trait eu.cdevreeze.yaidom.queryapi.UpdatableElemLike.
Pair of an element and a Path. These pairs themselves offer the ElemApi query API, so they can be seen as "element implementations" themselves. They are like very light-weight "indexed" elements.
These "elements" are used in the implementation of bulk update methods in trait UpdatableElemLike, but they can also be used in application code.
Note that this class renders a separate query API for element-path pairs obsolete. It takes an IsNavigableApi, using its findAllChildElemsWithPathEntries method, and offers the equivalent of an ElemApi for element-path pairs.
The underlying (root) element type
API trait for elements that can be asked for their child nodes, of any node kind.
Trait partly implementing the contract for elements that have an EName, as well as attributes with EName keys.
Using this trait (possibly in combination with other "element traits") we can abstract over several element implementations.
Trait defining the contract for elements that have an EName, as well as attributes with EName keys.
Using this trait (possibly in combination with other "element traits") we can abstract over several element implementations.
Implementation trait for elements that can be asked for the ancestor elements, if any.
This trait only knows about elements, not about documents as root element parents.
Based on abstract method parentOption alone, this trait offers a rich API for querying the element ancestry of an element.
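As a rough sketch of how an ancestry API can be derived from parentOption alone, consider the following stand-in class. The Elem class is hypothetical, and the derived method names (ancestors, ancestorsOrSelf, findAncestor) are chosen for illustration and should not be read as the exact signatures of the yaidom trait.

```scala
object AncestrySketch {
  // Hypothetical element that only knows its name and its optional parent.
  final class Elem(val name: String, val parentOption: Option[Elem]) {
    // Everything below is derived from parentOption.
    def ancestorsOrSelf: List[Elem] = this :: ancestors

    // Ancestors, starting with the parent and ending with the root element.
    def ancestors: List[Elem] = parentOption.toList.flatMap(_.ancestorsOrSelf)

    def findAncestor(p: Elem => Boolean): Option[Elem] = ancestors.find(p)
  }

  val root = new Elem("root", None)
  val child = new Elem("child", Some(root))
  val grandChild = new Elem("grandChild", Some(child))

  val ancestorNames: List[String] = grandChild.ancestors.map(_.name)
}
```

The root element has no parent, so its list of ancestors is empty; the document as parent of the root element is out of scope here, as the text above notes.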
API trait for elements that can be asked for the ancestor elements, if any.
This trait only knows about elements, not about documents as root element parents.
Trait defining the contract for elements that have a QName, as well as attributes with QName keys.
Using this trait (possibly in combination with other "element traits") we can abstract over several element implementations.
Trait defining the contract for elements that have a stored Scope.
Using this trait (possibly in combination with other "element traits") we can abstract over several element implementations.
Trait partly implementing the contract for elements as text containers. Typical element types are both an eu.cdevreeze.yaidom.queryapi.ElemLike and an eu.cdevreeze.yaidom.queryapi.HasText.
Trait defining the contract for elements as text containers. Typical element types are both an eu.cdevreeze.yaidom.queryapi.ElemLike and an eu.cdevreeze.yaidom.queryapi.HasText.
Abstract API for "indexed elements".
Note how this API removes the need for an API which is like the ElemApi API, but taking and returning pairs of elements and paths.
Abstract API for "indexed Scoped elements".
API and implementation trait for elements that can be navigated using paths.
More precisely, this trait has only the following abstract methods: findChildElemByPathEntry and findAllChildElemsWithPathEntries.
The purely abstract API offered by this trait is eu.cdevreeze.yaidom.queryapi.IsNavigableApi. See the documentation of that trait for more information.
This trait offers Path-based navigation support.
This trait typically does not show up in application code using yaidom, yet its (uniform) API does. Hence, it makes sense to read the documentation of this trait, knowing that the API is offered by multiple element implementations.
This trait is purely abstract. The most common implementation of this trait is eu.cdevreeze.yaidom.queryapi.IsNavigable.
Some properties are expected to hold for "navigable elements":
getElemOrSelfByPath(Path.Empty) == self
findElemOrSelfByPath(path1).flatMap(e => e.findElemOrSelfByPath(path2)) == findElemOrSelfByPath(path1.append(path2))
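The two navigation properties can be checked on a simplified model. In this sketch a "path" is just a list of child indexes, which is a deliberate simplification of yaidom's Path (whose entries carry element names as well); the Elem class is a hypothetical stand-in.

```scala
object NavigationSketch {
  // Hypothetical stand-in element; a path is modelled as a list of child indexes.
  final case class Elem(name: String, children: Vector[Elem]) {
    def findElemOrSelfByPath(path: List[Int]): Option[Elem] = path match {
      case Nil => Some(this) // the empty path points at the element itself
      case idx :: rest => children.lift(idx).flatMap(_.findElemOrSelfByPath(rest))
    }
  }

  val root: Elem =
    Elem("a", Vector(
      Elem("b", Vector(Elem("c", Vector.empty), Elem("d", Vector.empty)))))

  val path1: List[Int] = List(0)
  val path2: List[Int] = List(1)

  // Navigating with the empty path returns the element itself.
  val emptyPathLaw: Boolean = root.findElemOrSelfByPath(Nil) == Some(root)

  // Navigating in two steps equals navigating along the appended path.
  val appendLaw: Boolean =
    root.findElemOrSelfByPath(path1).flatMap(e => e.findElemOrSelfByPath(path2)) ==
      root.findElemOrSelfByPath(path1 ++ path2)
}
```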
Shorthand for ClarkElemApi[E] with HasQNameApi with HasScopeApi with some additional methods that use the scope for resolving QName-valued text and attribute values. In other words, an element query API typically supported by element implementations, because most element implementations know about scopes, QNames, ENames and text content, as well as offering the ElemApi query API.
Generic code abstracting over yaidom element implementations should either use this trait, or super-trait ClarkElemApi, depending on the abstraction level.
Scopes resolve QNames as ENames, so some properties are expected to hold for the element "name":
this.scope.resolveQNameOption(this.qname).contains(this.resolvedName)

// Therefore:

this.resolvedName.localPart == this.qname.localPart

this.resolvedName.namespaceUriOption == this.scope.prefixNamespaceMap.get(this.qname.prefixOption.getOrElse(""))
For the attribute "name" properties, first define:
val attributeScope = this.scope.withoutDefaultNamespace

val resolvedAttrs = this.attributes map {
  case (attrQName, attrValue) =>
    val resolvedAttrName = attributeScope.resolveQNameOption(attrQName).get
    (resolvedAttrName -> attrValue)
}
Then the following must hold:
resolvedAttrs.toMap == this.resolvedAttributes.toMap
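The difference between resolving element names and attribute names can be sketched with simplified stand-ins. The QName, EName and Scope classes below are hypothetical miniatures written for this example (the empty string is used as the key of the default namespace, as in the property above); they are not the yaidom classes, although the method names follow the ones used in the text.

```scala
object ScopeSketch {
  // Simplified stand-ins for yaidom's QName, EName and Scope.
  final case class QName(prefixOption: Option[String], localPart: String)
  final case class EName(namespaceUriOption: Option[String], localPart: String)

  final case class Scope(prefixNamespaceMap: Map[String, String]) {
    // The default namespace is stored under the empty prefix.
    def withoutDefaultNamespace: Scope = Scope(prefixNamespaceMap - "")

    def resolveQNameOption(qname: QName): Option[EName] = qname.prefixOption match {
      // Unprefixed names take the default namespace, if any.
      case None => Some(EName(prefixNamespaceMap.get(""), qname.localPart))
      // Prefixed names resolve only if the prefix is bound in this scope.
      case Some(prefix) =>
        prefixNamespaceMap.get(prefix).map(ns => EName(Some(ns), qname.localPart))
    }
  }

  val scope: Scope =
    Scope(Map("" -> "http://bookstore/book", "auth" -> "http://bookstore/author"))

  // Element names are resolved against the full scope.
  val elemName: Option[EName] = scope.resolveQNameOption(QName(None, "Book"))

  // Attribute names are resolved against the scope without default namespace.
  val attrName: Option[EName] =
    scope.withoutDefaultNamespace.resolveQNameOption(QName(None, "ISBN"))

  // A prefix that is not bound cannot be resolved.
  val unbound: Option[EName] = scope.resolveQNameOption(QName(Some("x"), "Foo"))
}
```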
Partial implementation of ScopedElemApi.
Extension to ElemApi that makes querying for sub-types of the element type easy.
For example, XML Schema can be modeled with an object hierarchy, starting with some XsdElem super-type which mixes in trait SubtypeAwareElemApi, among other query traits. The object hierarchy could contain sub-classes of XsdElem such as XsdRootElem, GlobalElementDeclaration, etc. Then the SubtypeAwareElemApi trait makes it easy to query for all or some global element declarations, etc.
There is no magic in these traits: it is just ElemApi and ElemLike underneath. It is only the syntactic convenience that makes the difference.
The query methods of this trait take a sub-type as first value parameter. It is intentional that this is a value parameter, and not a second type parameter, since it is conceptually the most important parameter of these query methods. (If it were a second type parameter instead, the article http://hacking-scala.org/post/73854628325/advanced-type-constraints-with-type-classes would show how to make that solution robust, using some @NotNothing annotation.)
The sub-type parameter could have been a java.lang.Class object, except that type erasure would make it less attractive (when doing pattern matching against that type). Hence the use of a ClassTag parameter, which undoes type erasure for non-generic types, if available implicitly. So ClassTag is used as a better java.lang.Class, yet without polluting the public API with an implicit ClassTag parameter. (Instead, the ClassTag is made implicit inside the method implementations.)
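The ClassTag technique described above can be sketched without yaidom. Everything in this example is hypothetical and only mirrors the idea: the tiny XsdElem hierarchy, the object SubtypeAwareSketch and the exact method signatures are assumptions made for illustration. The sub-type is passed as an explicit ClassTag value and only made implicit inside the method body, so that the pattern match on B is not defeated by type erasure.

```scala
import scala.reflect.{ClassTag, classTag}

// Hypothetical mini object hierarchy, loosely following the XML Schema example above.
sealed trait XsdElem { def children: Vector[XsdElem] }
final case class XsdRootElem(children: Vector[XsdElem]) extends XsdElem
final case class GlobalElementDeclaration(name: String) extends XsdElem {
  def children: Vector[XsdElem] = Vector.empty
}
final case class Annotation(text: String) extends XsdElem {
  def children: Vector[XsdElem] = Vector.empty
}

object SubtypeAwareSketch {
  // The sub-type is an explicit value parameter, not an implicit parameter of the public method.
  def filterChildElemsOfType[B <: XsdElem](elem: XsdElem, subType: ClassTag[B])(p: B => Boolean): Vector[B] = {
    // Making the ClassTag implicit here lets the compiler use it for the type test below.
    implicit val tag: ClassTag[B] = subType
    elem.children.collect { case e: B if p(e) => e }
  }

  def findAllChildElemsOfType[B <: XsdElem](elem: XsdElem, subType: ClassTag[B]): Vector[B] =
    filterChildElemsOfType(elem, subType)(_ => true)

  val root: XsdElem =
    XsdRootElem(Vector(GlobalElementDeclaration("a"), Annotation("doc"), GlobalElementDeclaration("b")))

  // Usage: the sub-type is the first (value) argument.
  val declarations: Vector[GlobalElementDeclaration] =
    findAllChildElemsOfType(root, classTag[GlobalElementDeclaration])
}
```

The call site reads almost like passing a java.lang.Class object, while the result is statically typed as a collection of the requested sub-type.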
Default implementation of SubtypeAwareElemApi.
This is the element transformation part of the yaidom query and update API. Only a few DOM-like element implementations in yaidom mix in this trait (indirectly, because some implementing sub-trait is mixed in), thus sharing this API.
This trait typically does not show up in application code using yaidom, yet its (uniform) API does. Hence, it makes sense to read the documentation of this trait, knowing that the API is offered by multiple element implementations.
This trait is purely abstract. The most common implementation of this trait is eu.cdevreeze.yaidom.queryapi.TransformableElemLike.
That trait only knows how to transform child elements. Using this minimal knowledge, the trait offers methods to transform descendant elements and descendant-or-self elements. Indeed, the trait is similar to ElemLike, except that it transforms elements instead of querying for elements.
The big conceptual difference with "updatable" elements (in trait UpdatableElemLike[N, E]) is that "transformations" are about applying some transforming function to an element tree, while "(functional) updates" are about "updates" at given paths.
To illustrate the use of this API, consider the following example XML:
<book:Bookstore xmlns:book="http://bookstore/book" xmlns:auth="http://bookstore/author">
  <book:Book ISBN="978-0321356680" Price="35" Edition="2">
    <book:Title>Effective Java (2nd Edition)</book:Title>
    <book:Authors>
      <auth:Author>
        <auth:First_Name>Joshua</auth:First_Name>
        <auth:Last_Name>Bloch</auth:Last_Name>
      </auth:Author>
    </book:Authors>
  </book:Book>
  <book:Book ISBN="978-0981531649" Price="35" Edition="2">
    <book:Title>Programming in Scala: A Comprehensive Step-by-Step Guide, 2nd Edition</book:Title>
    <book:Authors>
      <auth:Author>
        <auth:First_Name>Martin</auth:First_Name>
        <auth:Last_Name>Odersky</auth:Last_Name>
      </auth:Author>
      <auth:Author>
        <auth:First_Name>Lex</auth:First_Name>
        <auth:Last_Name>Spoon</auth:Last_Name>
      </auth:Author>
      <auth:Author>
        <auth:First_Name>Bill</auth:First_Name>
        <auth:Last_Name>Venners</auth:Last_Name>
      </auth:Author>
    </book:Authors>
  </book:Book>
</book:Bookstore>
Suppose this XML has been parsed into an eu.cdevreeze.yaidom.simple.Elem variable named bookstoreElem. Then we can combine author first and last names as follows:
val authorNamespace = "http://bookstore/author"

bookstoreElem = bookstoreElem transformElems {
  case elem: Elem if elem.resolvedName == EName(authorNamespace, "Author") =>
    val firstName = (elem \ (_.localName == "First_Name")).headOption.map(_.text).getOrElse("")
    val lastName = (elem \ (_.localName == "Last_Name")).headOption.map(_.text).getOrElse("")
    val name = (firstName + " " + lastName).trim
    Node.textElem(QName("auth:Author"), elem.scope ++ Scope.from("auth" -> authorNamespace), name)
  case elem: Elem => elem
}
bookstoreElem = bookstoreElem.prettify(2)
When using the TransformableElemApi API, keep the following in mind:
The transformElems and transformElemsOrSelf methods (and their Node sequence producing counterparts) may produce a lot of "garbage". If only a small portion of an element tree needs to be updated, the "update" methods in trait UpdatableElemApi may be a better fit.
Top-down transformations are still possible, by combining recursion with method transformChildElems (or transformChildElemsToNodeSeq). For example:
def removePrefixedNamespaceUndeclarations(elem: Elem): Elem = {
  elem transformChildElems { e =>
    val newE = e.copy(scope = elem.scope.withoutDefaultNamespace ++ e.scope)
    removePrefixedNamespaceUndeclarations(newE)
  }
}
API and implementation trait for transformable elements.
More precisely, this trait has abstract methods transformChildElems and transformChildElemsToNodeSeq. Based on these abstract methods, this trait offers a rich API for transforming descendant elements or descendant-or-self elements.
The purely abstract API offered by this trait is eu.cdevreeze.yaidom.queryapi.TransformableElemApi. See the documentation of that trait for examples of usage.
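How descendant and descendant-or-self transformations can be built on nothing more than a child element transformation can be sketched as follows. This is a minimal model, not the yaidom implementation: the Elem case class is a hypothetical stand-in with only a name and child elements, and the method bodies are one plausible way to derive the richer methods.

```scala
// Hypothetical minimal element type: a name and child elements only.
final case class Elem(name: String, children: Vector[Elem] = Vector.empty) {

  // The single primitive: transform the direct child elements.
  def transformChildElems(f: Elem => Elem): Elem = copy(children = children.map(f))

  // Descendants are transformed first, and only then this element itself (bottom-up).
  def transformElemsOrSelf(f: Elem => Elem): Elem =
    f(transformChildElems(_.transformElemsOrSelf(f)))

  // Same as transformElemsOrSelf, except that this element itself is left alone.
  def transformElems(f: Elem => Elem): Elem =
    transformChildElems(_.transformElemsOrSelf(f))
}

object TransformSketch {
  val tree = Elem("a", Vector(Elem("b", Vector(Elem("c"))), Elem("d")))

  // Upper-cases all descendant names, but not the root name.
  val upperCased = tree.transformElems(e => e.copy(name = e.name.toUpperCase))
}
```

Because the function is applied to an element only after its descendants have been transformed, the function never sees the (transformed) parent, which is the reason top-down work needs explicit recursion via transformChildElems.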
This is the functional update part of the yaidom uniform query API. It is a sub-trait of trait eu.cdevreeze.yaidom.queryapi.IsNavigableApi. Only a few DOM-like element implementations in yaidom mix in this trait (indirectly, because some implementing sub-trait is mixed in), thus sharing this query API.
This trait typically does not show up in application code using yaidom, yet its (uniform) API does. Hence, it makes sense to read the documentation of this trait, knowing that the API is offered by multiple element implementations.
This trait is purely abstract. The most common implementation of this trait is eu.cdevreeze.yaidom.queryapi.UpdatableElemLike. The trait has all the knowledge of its super-trait, but in addition to that knows the following:
Obviously methods children, withChildren and collectChildNodeIndexes must be consistent with methods such as findAllChildElems.
Using this minimal knowledge alone, trait UpdatableElemLike not only offers the methods of its parent trait, but also:
For the conceptual difference with "transformable" elements, see trait eu.cdevreeze.yaidom.queryapi.TransformableElemApi.
This query API leverages the Scala Collections API. Query results can be manipulated using the Collections API, and the query API implementation (in UpdatableElemLike) uses the Collections API internally.
To illustrate the use of this API, consider the following example XML:
<book:Bookstore xmlns:book="http://bookstore/book" xmlns:auth="http://bookstore/author">
  <book:Book ISBN="978-0321356680" Price="35" Edition="2">
    <book:Title>Effective Java (2nd Edition)</book:Title>
    <book:Authors>
      <auth:Author>
        <auth:First_Name>Joshua</auth:First_Name>
        <auth:Last_Name>Bloch</auth:Last_Name>
      </auth:Author>
    </book:Authors>
  </book:Book>
  <book:Book ISBN="978-0981531649" Price="35" Edition="2">
    <book:Title>Programming in Scala: A Comprehensive Step-by-Step Guide, 2nd Edition</book:Title>
    <book:Authors>
      <auth:Author>
        <auth:First_Name>Martin</auth:First_Name>
        <auth:Last_Name>Odersky</auth:Last_Name>
      </auth:Author>
      <auth:Author>
        <auth:First_Name>Lex</auth:First_Name>
        <auth:Last_Name>Spoon</auth:Last_Name>
      </auth:Author>
      <auth:Author>
        <auth:First_Name>Bill</auth:First_Name>
        <auth:Last_Name>Venners</auth:Last_Name>
      </auth:Author>
    </book:Authors>
  </book:Book>
</book:Bookstore>
Suppose this XML has been parsed into an eu.cdevreeze.yaidom.simple.Elem variable named bookstoreElem. Then we can add a book as follows, where we "forget" the 2nd author for the moment:
import convert.ScalaXmlConversions._

val bookstoreNamespace = "http://bookstore/book"
val authorNamespace = "http://bookstore/author"

val fpBookXml =
  <book:Book xmlns:book="http://bookstore/book" xmlns:auth="http://bookstore/author" ISBN="978-1617290657" Price="33">
    <book:Title>Functional Programming in Scala</book:Title>
    <book:Authors>
      <auth:Author>
        <auth:First_Name>Paul</auth:First_Name>
        <auth:Last_Name>Chiusano</auth:Last_Name>
      </auth:Author>
    </book:Authors>
  </book:Book>
val fpBookElem = convertToElem(fpBookXml)

bookstoreElem = bookstoreElem.plusChild(fpBookElem)
Note that the namespace declarations for prefixes book and auth had to be repeated in the Scala XML literal for the added book, because otherwise the convertToElem method would throw an exception (since Elem instances cannot be created unless all element and attribute QNames can be resolved as ENames).
The resulting bookstore seems ok, but if we print convertElem(bookstoreElem), the result does not look pretty. This can be fixed if the last assignment is replaced by:
bookstoreElem = bookstoreElem.plusChild(fpBookElem).prettify(2)
knowing that an indentation of 2 spaces has been used throughout the original XML. Method prettify is expensive, so it is best not to invoke it within a tight loop. As an alternative, formatting can be left to the DocumentPrinter, of course.
The assignment above is the same as the following one:
bookstoreElem = bookstoreElem.withChildren(bookstoreElem.children :+ fpBookElem).prettify(2)
There are several methods to functionally update the children of an element. For example, method plusChild is overloaded, and the other variant can insert a child at a given 0-based position. Other "children update" methods are minusChild, withPatchedChildren and withUpdatedChildren.
Let's now turn to functional update methods that take Path instances or collections thereof. In the example above the second author of the added book is missing. Let's fix that:
val secondAuthorXml =
  <auth:Author xmlns:auth="http://bookstore/author">
    <auth:First_Name>Runar</auth:First_Name>
    <auth:Last_Name>Bjarnason</auth:Last_Name>
  </auth:Author>
val secondAuthorElem = convertToElem(secondAuthorXml)

val fpBookAuthorsPaths =
  for {
    authorsPath <- indexed.Elem(bookstoreElem) filterElems { e => e.resolvedName == EName(bookstoreNamespace, "Authors") } map (_.path)
    if authorsPath.findAncestorPath(path => path.endsWithName(EName(bookstoreNamespace, "Book")) &&
      bookstoreElem.getElemOrSelfByPath(path).attribute(EName("ISBN")) == "978-1617290657").isDefined
  } yield authorsPath

require(fpBookAuthorsPaths.size == 1)
val fpBookAuthorsPath = fpBookAuthorsPaths.head

bookstoreElem = bookstoreElem.updateElemOrSelf(fpBookAuthorsPath) { elem =>
  require(elem.resolvedName == EName(bookstoreNamespace, "Authors"))
  val rawResult = elem.plusChild(secondAuthorElem)
  rawResult transformElemsOrSelf (e => e.copy(scope = elem.scope.withoutDefaultNamespace ++ e.scope))
}
bookstoreElem = bookstoreElem.prettify(2)
Clearly the resulting bookstore element is nicely formatted, but there was another possible issue that was taken into account. See the line of code transforming the "raw result". That line was added in order to prevent namespace undeclarations, which for XML version 1.0 are not allowed (with the exception of the default namespace). After all, the XML for the second author was created with only the auth namespace declared. Without the above-mentioned line of code, a namespace undeclaration for prefix book would have occurred in the resulting XML, thus leading to an invalid XML 1.0 element tree.
To illustrate functional update methods taking collections of paths, let's remove the added book from the book store. Here is one (somewhat inefficient) way to do that:
val bookPaths =
  indexed.Elem(bookstoreElem) filterElems (_.resolvedName == EName(bookstoreNamespace, "Book")) map (_.path)

bookstoreElem = bookstoreElem.updateElemsWithNodeSeq(bookPaths.toSet) { (elem, path) =>
  if ((elem \@ EName("ISBN")).contains("978-1617290657")) Vector() else Vector(elem)
}
bookstoreElem = bookstoreElem.prettify(2)
There are many ways to write this functional update, using different functional update methods in trait UpdatableElemApi, or even only using transformation methods in trait TransformableElemApi (thus not using paths).
The example code above is enough to get started using the UpdatableElemApi methods, but it makes sense to study the entire API, and practice with it. Always keep in mind that functional updates typically mess up formatting and/or namespace (un)declarations, unless these aspects are taken into account.
API and implementation trait for functionally updatable elements. This trait extends trait eu.cdevreeze.yaidom.queryapi.IsNavigable, adding knowledge about child nodes in general, and about the correspondence between child path entries and child indexes.
More precisely, this trait adds the following abstract methods to the abstract methods required by its super-trait: children, withChildren and collectChildNodeIndexes. Based on these abstract methods (and the super-trait), this trait offers a rich API for functionally updating elements.
The purely abstract API offered by this trait is eu.cdevreeze.yaidom.queryapi.UpdatableElemApi. See the documentation of that trait for examples of usage, and for a more formal treatment.
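The idea that a handful of "children" primitives suffices for functional updates can be sketched on a tiny model. This is an assumption-laden illustration, not the yaidom implementation: the Elem case class is hypothetical, it has element children only, and a path is simplified to a list of 0-based child indexes (yaidom Path objects and collectChildNodeIndexes are not modelled).

```scala
// Hypothetical minimal element type; a "path" is simplified to a list of child indexes.
final case class Elem(name: String, children: Vector[Elem] = Vector.empty) {

  // The primitive: replace all children, returning a new element.
  def withChildren(newChildren: Vector[Elem]): Elem = copy(children = newChildren)

  // Appends a child, or inserts it at a given 0-based position.
  def plusChild(child: Elem): Elem = withChildren(children :+ child)
  def plusChild(index: Int, child: Elem): Elem =
    withChildren(children.patch(index, Vector(child), 0))

  // Removes the child at the given 0-based position.
  def minusChild(index: Int): Elem =
    withChildren(children.patch(index, Vector.empty[Elem], 1))

  // Rebuilds only the elements along the path; all other subtrees are shared.
  def updateElemOrSelf(path: List[Int])(f: Elem => Elem): Elem = path match {
    case Nil => f(this)
    case index :: rest =>
      withChildren(children.updated(index, children(index).updateElemOrSelf(rest)(f)))
  }
}

object UpdateSketch {
  val store = Elem("Bookstore", Vector(Elem("Book", Vector(Elem("Authors")))))

  // Adds an author below Bookstore/Book/Authors, leaving the original tree untouched.
  val updated = store.updateElemOrSelf(List(0, 0))(_.plusChild(Elem("Author")))
}
```

Only the elements on the path from the root to the updated element are recreated, which is why path-based updates tend to produce less garbage than whole-tree transformations when a small part of the tree changes.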
Core API for element nodes that offer the central BackingElemApi with HasChildNodesApi query API. Each element implementation that knows about expanded names as well as qualified names, and that also knows about ancestor elements, should directly or indirectly implement this API.
This API is directly implemented by elements that are used as backing elements in "yaidom dialects". The yaidom dialects use this abstract backing element API, thus allowing for multiple backing element implementations behind a yaidom XML dialect.
Efficient implementations are possible for indexed elements and Saxon NodeInfo objects (backed by Saxon native tiny trees). Saxon-backed elements are not offered by core yaidom, however. Saxon tiny trees are attractive for their low memory footprint.
Core API for element nodes that offer the central ClarkElemApi with HasChildNodesApi query API. Each element implementation should directly or indirectly implement this API.
This API is directly implemented by elements that know about expanded names but not about qualified names.
This companion object offers some convenience factory methods for "element predicates", that can be used in yaidom queries. These factory methods turn ENames and local names into "element predicates".
For example:
elem \\ (_.ename == EName(xsNamespace, "element"))
can also be written as:
elem \\ withEName(xsNamespace, "element")
(thus avoiding EName instance construction, whether or not this makes any difference in practice).
If the namespace is "obvious", and more friendly local-name-based querying is desired, the following could be written:
elem \\ withLocalName("element")
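What such predicate factories amount to can be sketched as follows. The EName and Elem classes and the object ElemPredicates are hypothetical stand-ins for this illustration; only the names withEName and withLocalName are taken from the text above, and their bodies here are an assumption.

```scala
// Hypothetical stand-ins for illustration; not the yaidom classes.
final case class EName(namespaceUriOption: Option[String], localPart: String)

final case class Elem(resolvedName: EName, children: Vector[Elem] = Vector.empty) {
  def findAllElemsOrSelf: Vector[Elem] = this +: children.flatMap(_.findAllElemsOrSelf)

  // Descendant-or-self query taking an "element predicate".
  def \\(p: Elem => Boolean): Vector[Elem] = findAllElemsOrSelf.filter(p)
}

object ElemPredicates {
  // Compares namespace and local part directly, without building an EName for the comparison.
  def withEName(namespace: String, localPart: String): Elem => Boolean = { e =>
    e.resolvedName.namespaceUriOption.contains(namespace) && e.resolvedName.localPart == localPart
  }

  // Ignores the namespace altogether.
  def withLocalName(localPart: String): Elem => Boolean =
    _.resolvedName.localPart == localPart
}

object PredicateSketch {
  import ElemPredicates._

  val xsNamespace = "http://www.w3.org/2001/XMLSchema"

  val schema = Elem(
    EName(Some(xsNamespace), "schema"),
    Vector(Elem(EName(Some(xsNamespace), "element")), Elem(EName(None, "element"))))

  val byEName = schema \\ withEName(xsNamespace, "element")
  val byLocalName = schema \\ withLocalName("element")
}
```

The local-name variant is more lenient: in this model it also matches an element named "element" that is in no namespace, so it is best reserved for cases where the namespace really is obvious.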
Abstract node (marker) trait hierarchy. It offers a common minimal API for different kinds of nodes. It also shows what yaidom typically considers to be nodes, and what it does not consider to be nodes. For example, documents are not nodes in yaidom, which prevents documents from being created as element children. Moreover, attributes are typically not nodes in yaidom, although custom element implementations may think otherwise.
The down-side is that we have to consider mixing in (some or all of) these traits everywhere we create a node/element implementation.
Core API for element nodes that offer the central ScopedElemApi with HasChildNodesApi query API. Each element implementation that knows about expanded names as well as qualified names should directly or indirectly implement this API.
This API is directly implemented by elements that know about expanded names as well as qualified names, but that do not know about their ancestor elements.
XML Base support, for elements implementing the ClarkElemApi query API.
XML Base is very simple in its algorithm, given an optional start "document URI". Base URI computation for an element then starts with the optional document URI, and processes all XML Base attributes in the reverse ancestry-or-self of the element, resolving each XML Base attribute against the base URI computed so far. According to the XML Base specification, same-document references do not alter this algorithm.
What is sensitive in XML Base processing is the resolution of a URI against an optional base URI. For example, resolving an empty URI using the java.net.URI.resolve method does not conform to RFC 3986 (see e.g. http://stackoverflow.com/questions/22203111/is-javas-uri-resolve-incompatible-with-rfc-3986-when-the-relative-uri-contains).
This is why the user of this XML Base support must supply a strategy for resolving URIs against optional base URIs.
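The algorithm and the role of the resolution strategy can be sketched as below. This is not the yaidom XML Base API: the object XmlBaseSketch, the UriResolver type alias, the function baseUri and the sample resolver are all hypothetical names chosen for this illustration. The XML Base attributes are assumed to be given in reverse ancestry-or-self order (root first), with None for elements that have no xml:base attribute.

```scala
import java.net.URI

object XmlBaseSketch {

  // Strategy for resolving a URI against an optional base URI (hypothetical type alias).
  type UriResolver = (URI, Option[URI]) => URI

  // A sample strategy: delegate to java.net.URI.resolve, except for the empty URI,
  // which is taken to mean "the base URI itself".
  val emptyAwareResolver: UriResolver = { (uri, baseUriOption) =>
    baseUriOption match {
      case None => uri
      case Some(base) if uri.toString.isEmpty => base
      case Some(base) => base.resolve(uri)
    }
  }

  // Starts with the optional document URI and resolves each xml:base attribute
  // (root first, element last) against the base URI computed so far.
  def baseUri(docUriOption: Option[URI], xmlBaseAttrs: Seq[Option[URI]])(resolve: UriResolver): Option[URI] =
    xmlBaseAttrs.foldLeft(docUriOption) {
      case (baseUriOption, None) => baseUriOption
      case (baseUriOption, Some(xmlBase)) => Some(resolve(xmlBase, baseUriOption))
    }

  val docUriOption = Some(URI.create("http://example.com/a/doc.xml"))

  // Root has xml:base="b/", its child has none, the grandchild has xml:base="c.xml".
  val result = baseUri(docUriOption, Vector(Some(URI.create("b/")), None, Some(URI.create("c.xml"))))(emptyAwareResolver)
}
```

Passing the resolver as a parameter keeps the fold itself trivial and leaves the delicate part, URI resolution, to the caller, which is the design choice the text above argues for.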
Default attributes and entity resolution are out of scope for this XML Base support.
This package contains the (renewed) query API traits. It contains both the purely abstract API traits as well as the partial implementation traits.
Generic code abstracting over yaidom element implementations should either use trait ClarkNodes.Elem or sub-trait ScopedNodes.Elem, or even BackingNodes.Elem, depending on the abstraction level.
These traits are combinations of several small query API traits. Most of these API traits are orthogonal.
Simplicity and consistency of the entire query API are 2 important design considerations. For example, the query API methods themselves use no parameterized types. Note how the resulting API with type members is essentially the same as the "old" yaidom query API using type parameters, except that the purely abstract traits are less constrained in the type members.
This package depends only on the core package in yaidom, but many other packages do depend on this one.
Note: whereas the old query API used F-bounded polymorphism with type parameters extensively, this new query API essentially just uses type member(s) ThisElem (and ThisNode), defined in a common super-trait. The old query API may be somewhat easier to develop (that is, convincing the compiler), but the new query API is easier to use as generic "backend" element query API. As an example, common "bridge" element query APIs come to mind, used within type-safe XML dialect DOM tree implementations. The reason this is easier with the new API is intuitively that fewer type constraints leak to the query API client code.