Packages

  • package yaidom

    Yaidom is yet another Scala immutable DOM-like XML API. The best known Scala immutable DOM-like API is the standard scala.xml API. It:

    • attempts to offer an XPath-like querying experience, thus somewhat blurring the distinction between nodes and node collections
    • lacks first-class support for XML namespaces
    • has limited (functional) update support

    Yaidom takes a different approach, avoiding XPath-like query support in its query API, and offering good namespace support and decent (functional) update support. Yaidom is also characterized by almost mathematical precision and clarity. Still, the API remains practical and pragmatic. In particular, the API user has extensive control over parsing and serialization, because yaidom exposes the underlying JAXP parsers and serializers, which the library user can configure.

    Yaidom chooses its battles. For example, given that DTDs do not know about namespaces, yaidom offers good namespace support, but ignores DTDs entirely. Of course the underlying XML parser may still validate XML against a DTD, if so desired. As another example, yaidom tries to leave the handling of the gory details of XML processing (such as whitespace handling) as much as possible to JAXP (and JAXP parser/serializer configuration). As yet another example, yaidom knows nothing about (XML Schema) types of elements and attributes.
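
    As a rough illustration of the kind of JAXP configuration that is left to the library user, the sketch below configures a plain JAXP DocumentBuilderFactory and parses a small document with it. It uses only JDK classes. The names JaxpConfigSketch, newConfiguredFactory and rootNamespace are hypothetical, and how such a factory is handed to a yaidom parser is not shown here (see the parse package documentation for that).

    ```scala
    import java.io.ByteArrayInputStream
    import java.nio.charset.StandardCharsets
    import javax.xml.XMLConstants
    import javax.xml.parsers.DocumentBuilderFactory

    // Hypothetical helper object; nothing in here is part of yaidom.
    object JaxpConfigSketch {

      // Plain JAXP configuration, chosen by the library user.
      def newConfiguredFactory(): DocumentBuilderFactory = {
        val dbf = DocumentBuilderFactory.newInstance()
        dbf.setNamespaceAware(true) // required for namespace-aware processing
        dbf.setIgnoringComments(true) // drop comment nodes while parsing
        dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true)
        dbf
      }

      // Parses an XML string and returns the namespace URI of the root element.
      def rootNamespace(xml: String): String = {
        val is = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))
        val doc = newConfiguredFactory().newDocumentBuilder().parse(is)
        doc.getDocumentElement.getNamespaceURI
      }
    }
    ```

    Whitespace handling, DTD validation and similar details would be configured on the same JAXP objects, which is why yaidom itself can stay silent about them.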

    As mentioned above, yaidom tries to treat basic XML processing with almost mathematical precision, even if this is "incorrect". At the same time, yaidom tries to be useful in practice. For example, yaidom compromises "correctness" in the following ways:

    • Yaidom does not generally consider documents to be nodes (called "document information items" in the XML Infoset), thus introducing fewer constraints on DOM-like node construction
    • Yaidom does not consider attributes to be (non-child) nodes (called "attribute information items" in the XML Infoset), thus introducing fewer constraints on DOM-like node construction
    • Yaidom does not consider namespace declarations to be attributes, thus facilitating a clear theory of namespaces
    • Yaidom tries to keep the order of the attributes (for better round-tripping), although attribute order is irrelevant according to the XML Infoset
    • Very importantly, yaidom clearly distinguishes between qualified names (QNames) and expanded names (ENames), which is essential in facilitating a clear theory of namespaces

    Yaidom, and in particular the eu.cdevreeze.yaidom.core, eu.cdevreeze.yaidom.queryapi, eu.cdevreeze.yaidom.resolved and eu.cdevreeze.yaidom.simple sub-packages, contains the following layers:

    • basic concepts, such as (qualified and expanded) names of elements and attributes (in the core package)
    • the uniform query API traits, to query elements for child, descendant and descendant-or-self elements (in the queryapi package)
    • some of the specific element implementations, mixing in those uniform query API traits (e.g. in the resolved and simple packages)

    Reading this documentation helps in getting up to speed with yaidom.

    Basic concepts

    In real world XML, elements (and sometimes attributes) tend to have names within a certain namespace. There are 2 kinds of names at play here:

    • qualified names: prefixed names, such as book:Title, and unprefixed names, such as Edition
    • expanded names: having a namespace, such as {http://bookstore/book}Title (in James Clark notation), and not having a namespace, such as Edition

    They are represented by immutable classes eu.cdevreeze.yaidom.core.QName and eu.cdevreeze.yaidom.core.EName, respectively.

    Qualified names occur in XML, whereas expanded names do not. Yet qualified names have no meaning on their own. They need to be resolved to expanded names, via the in-scope namespaces. Note that the term "qualified name" is often used for what yaidom (and the Namespaces specification) calls "expanded name", and that most XML APIs do not distinguish between the 2 kinds of names. Yaidom has to clearly make this distinction, in order to model namespaces correctly.

    To resolve qualified names to expanded names, yaidom distinguishes between:

    • namespace declarations
    • in-scope namespaces

    They are represented by immutable classes eu.cdevreeze.yaidom.core.Declarations and eu.cdevreeze.yaidom.core.Scope, respectively.

    Namespace declarations occur in XML, whereas in-scope namespaces do not. The latter are the accumulated effect of the namespace declarations of the element itself, if any, and those in ancestor elements.

    Note: in the code examples below, we assume the following import:

    import eu.cdevreeze.yaidom.core._

    To see the resolution of qualified names in action, consider the following sample XML:

    <book:Bookstore xmlns:book="http://bookstore/book" xmlns:auth="http://bookstore/author">
      <book:Book ISBN="978-0321356680" Price="35" Edition="2">
        <book:Title>Effective Java (2nd Edition)</book:Title>
        <book:Authors>
          <auth:Author>
            <auth:First_Name>Joshua</auth:First_Name>
            <auth:Last_Name>Bloch</auth:Last_Name>
          </auth:Author>
        </book:Authors>
      </book:Book>
      <book:Book ISBN="978-0981531649" Price="35" Edition="2">
        <book:Title>Programming in Scala: A Comprehensive Step-by-Step Guide, 2nd Edition</book:Title>
        <book:Authors>
          <auth:Author>
            <auth:First_Name>Martin</auth:First_Name>
            <auth:Last_Name>Odersky</auth:Last_Name>
          </auth:Author>
          <auth:Author>
            <auth:First_Name>Lex</auth:First_Name>
            <auth:Last_Name>Spoon</auth:Last_Name>
          </auth:Author>
          <auth:Author>
            <auth:First_Name>Bill</auth:First_Name>
            <auth:Last_Name>Venners</auth:Last_Name>
          </auth:Author>
        </book:Authors>
      </book:Book>
    </book:Bookstore>

    Consider the last element with qualified name QName("book:Book"). To resolve this qualified name to an expanded name, we need to know the namespaces in scope at that element. To compute the in-scope namespaces, we need to accumulate the namespace declarations of the last book:Book element and of its ancestor element(s), starting with the root element.

    The start Scope is "parent scope" Scope.Empty. Then, in the root element we find namespace declarations:

    Declarations.from("book" -> "http://bookstore/book", "auth" -> "http://bookstore/author")

    This leads to the following namespaces in scope at the root element:

    Scope.Empty.resolve(Declarations.from("book" -> "http://bookstore/book", "auth" -> "http://bookstore/author"))

    which is equal to:

    Scope.from("book" -> "http://bookstore/book", "auth" -> "http://bookstore/author")

    We find no other namespace declarations in the last book:Book element or its ancestor(s), so the computed scope is also the scope of the last book:Book element.

    Then QName("book:Book") is resolved as follows:

    Scope.from("book" -> "http://bookstore/book", "auth" -> "http://bookstore/author").resolveQNameOption(QName("book:Book"))

    which is equal to:

    Some(EName("{http://bookstore/book}Book"))

    This namespace support in yaidom has mathematical rigor. The immutable classes QName, EName, Declarations and Scope have precise definitions, reflected in their implementations, and they obey some interesting properties. For example, if we correctly define Scope operation relativize (along with resolve), we get:

    scope1.resolve(scope1.relativize(scope2)) == scope2

    This may not sound like much, but by getting the basics right, yaidom succeeds in offering first-class support for XML namespaces, without the magic and namespace-related bugs often found in other XML libraries.
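
    To make the resolve/relativize property more tangible, here is a minimal sketch that models scopes and declarations as plain Scala maps. It is a simplified stand-in under stated assumptions, not yaidom's implementation: the names ScopeModel, MiniScope and MiniDeclarations are hypothetical, an empty namespace URI in the declarations is assumed to mean "undeclare this prefix", and unprefixed names are resolved as element names (so the default namespace applies).

    ```scala
    // Hypothetical model of scopes and declarations, using only the standard library.
    object ScopeModel {

      // Prefix -> namespace URI. The empty prefix stands for the default namespace.
      type MiniScope = Map[String, String]
      // Like MiniScope, but an empty URI means that the prefix is undeclared.
      type MiniDeclarations = Map[String, String]

      // Applies declarations to a parent scope, giving the scope of the declaring element.
      def resolve(scope: MiniScope, decls: MiniDeclarations): MiniScope = {
        val (undeclared, declared) = decls.partition { case (_, uri) => uri.isEmpty }
        (scope -- undeclared.keySet) ++ declared
      }

      // Computes the minimal declarations that turn scope1 into scope2.
      def relativize(scope1: MiniScope, scope2: MiniScope): MiniDeclarations = {
        val newOrChanged = scope2.filter { case (pref, uri) => !scope1.get(pref).contains(uri) }
        val removed = (scope1.keySet -- scope2.keySet).map(pref => pref -> "").toMap
        newOrChanged ++ removed
      }

      // Resolves a lexical QName such as "book:Title" to Clark notation, if its prefix is in scope.
      def resolveQNameOption(scope: MiniScope, qname: String): Option[String] =
        qname.split(':') match {
          case Array(prefix, local) => scope.get(prefix).map(uri => s"{$uri}$local")
          case Array(local) => Some(scope.get("").map(uri => s"{$uri}$local").getOrElse(local))
          case _ => None
        }
    }
    ```

    In this model, ScopeModel.resolve(s1, ScopeModel.relativize(s1, s2)) == s2 holds for any two scopes s1 and s2 without empty URIs, mirroring the property stated above.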

    There are 2 other basic concepts in this package, representing paths to elements:

    • path builders
    • paths

    They are represented by immutable classes eu.cdevreeze.yaidom.core.PathBuilder and eu.cdevreeze.yaidom.core.Path, respectively.

    Path builders are like canonical XPath expressions, yet they do not contain the root element itself, and indexing starts with 0 instead of 1.

    For example, the last name of the first author of the last book element has path:

    Path.from(
      EName("{http://bookstore/book}Book") -> 1,
      EName("{http://bookstore/book}Authors") -> 0,
      EName("{http://bookstore/author}Author") -> 0,
      EName("{http://bookstore/author}Last_Name") -> 0
    )

    This path could be written as a path builder as follows:

    PathBuilder.from(QName("book:Book") -> 1, QName("book:Authors") -> 0, QName("auth:Author") -> 0, QName("auth:Last_Name") -> 0)

    Using the Scope mentioned earlier, the latter path builder resolves to the path given before that, by invoking method PathBuilder.build(scope). In order for this to work, the Scope must be invertible. That is, there must be a one-to-one correspondence between prefixes ("" for the default namespace) and namespace URIs, because otherwise the index numbers may differ. Also note that the prefixes book and auth in the path builder are arbitrary, and need not match the prefixes used in the XML tree itself.
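
    The way a path is followed through an element tree can be sketched with a tiny stand-in tree. This is a simplified model, not yaidom's Path class: PathModel, MiniElem, MiniPath and findByPath are hypothetical names, and element names are plain strings instead of expanded names. It does show the two conventions mentioned above, namely that the root element is not part of the path, and that indexes are 0-based and count only same-named sibling elements.

    ```scala
    // Hypothetical model of path navigation, using only the standard library.
    object PathModel {

      final case class MiniElem(name: String, children: Vector[MiniElem] = Vector.empty)

      // A path entry is an element name plus a 0-based index among same-named siblings.
      type MiniPath = List[(String, Int)]

      // Follows the path from the root. The empty path denotes the root element itself.
      def findByPath(root: MiniElem, path: MiniPath): Option[MiniElem] = path match {
        case Nil => Some(root)
        case (name, idx) :: tail =>
          root.children.filter(_.name == name).lift(idx).flatMap(child => findByPath(child, tail))
      }
    }
    ```

    For a bookstore tree shaped like the sample XML, the path List("Book" -> 1, "Authors" -> 0, "Author" -> 0, "Last_Name" -> 0) would lead to the last name of the first author of the second book, whereas a path entry such as "Book" -> 2 would yield None.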

    Uniform query API traits

    Yaidom provides a relatively small query API, to query an individual element for collections of child elements, descendant elements or descendant-or-self elements. The resulting collections are immutable Scala collections, which can be manipulated further using the Scala Collections API.

    This query API is uniform, in that different element implementations share (most of) the same query API. It is also element-centric (unlike standard Scala XML).

    For example, consider the XML example given earlier, as a Scala XML literal named bookstore. We can wrap this Scala XML Elem into a yaidom wrapper of type ScalaXmlElem, named bookstoreElem. Then we can query for all books, that is, all descendant-or-self elements with resolved (or expanded) name EName("{http://bookstore/book}Book"), as follows:

    bookstoreElem filterElemsOrSelf (elem => elem.resolvedName == EName("{http://bookstore/book}Book"))

    The result would be an immutable IndexedSeq of ScalaXmlElem instances, holding 2 book elements.

    We could instead have written:

    bookstoreElem.filterElemsOrSelf(EName("{http://bookstore/book}Book"))

    with the same result, due to an implicit conversion from expanded names to element predicates.

    Instead of searching for appropriate descendant-or-self elements, we could have searched for descendant elements only, without altering the result in this case:

    bookstoreElem filterElems (elem => elem.resolvedName == EName("{http://bookstore/book}Book"))

    or:

    bookstoreElem.filterElems(EName("{http://bookstore/book}Book"))

    We could even have searched for appropriate child elements only, without altering the result in this case:

    bookstoreElem filterChildElems (elem => elem.resolvedName == EName("{http://bookstore/book}Book"))

    or:

    bookstoreElem.filterChildElems(EName("{http://bookstore/book}Book"))

    or, knowing that all child elements are books:

    bookstoreElem.findAllChildElems

    We could find all authors of the Scala book as follows:

    for {
      bookElem <- bookstoreElem filterChildElems (elem => elem.resolvedName == EName("{http://bookstore/book}Book"))
      if bookElem.attributeOption(EName("ISBN")).contains("978-0981531649")
      authorElem <- bookElem filterElems (elem => elem.resolvedName == EName("{http://bookstore/author}Author"))
    } yield authorElem

    or:

    for {
      bookElem <- bookstoreElem.filterChildElems(EName("{http://bookstore/book}Book"))
      if bookElem.attributeOption(EName("ISBN")).contains("978-0981531649")
      authorElem <- bookElem.filterElems(EName("{http://bookstore/author}Author"))
    } yield authorElem

    We could even use operator notation, as follows:

    for {
      bookElem <- bookstoreElem \ (elem => elem.resolvedName == EName("{http://bookstore/book}Book"))
      if (bookElem \@ EName("ISBN")).contains("978-0981531649")
      authorElem <- bookElem \\ (elem => elem.resolvedName == EName("{http://bookstore/author}Author"))
    } yield authorElem

    or:

    for {
      bookElem <- bookstoreElem \ EName("{http://bookstore/book}Book")
      if (bookElem \@ EName("ISBN")).contains("978-0981531649")
      authorElem <- bookElem \\ EName("{http://bookstore/author}Author")
    } yield authorElem

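    where \ stands for filterChildElems, \@ for attributeOption, and \\ for filterElemsOrSelf (rather than filterElems, which makes no difference for this query).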

    There is no explicit support for filtering on the "self" element itself. In the example above, we might want to check if the root element has the expected EName, for instance. That is easy to express using a simple idiom, however. The last example then becomes:

    for {
      bookstoreElem <- Vector(bookstoreElem)
      if bookstoreElem.resolvedName == EName("{http://bookstore/book}Bookstore")
      bookElem <- bookstoreElem \ EName("{http://bookstore/book}Book")
      if (bookElem \@ EName("ISBN")).contains("978-0981531649")
      authorElem <- bookElem \\ EName("{http://bookstore/author}Author")
    } yield authorElem

    Now suppose the same XML is stored in a (org.w3c.dom) DOM tree, wrapped in a DomElem bookstoreElem. Then the same queries would use exactly the same code as above! The result would be a collection of DomElem instances instead of ScalaXmlElem instances, however. There are many more element implementations in yaidom, and they share (most of) the same query API. Therefore this query API is called a uniform query API.

    The last example, using operator notation, looks a bit more "XPath-like". It is more verbose than queries in Scala XML, however, partly because in yaidom these operators cannot be chained. Yet this is with good reason. Yaidom does not blur the distinction between elements and element collections, and therefore does not offer any XPath experience. The small price paid in verbosity is made up for by precision. The yaidom query API traits have very precise definitions of their operations, as can be seen in the corresponding documentation.

    The uniform query API traits turn minimal APIs into richer APIs, where each richer API is defined very precisely in terms of the minimal API. The most important (partly concrete) query API trait is eu.cdevreeze.yaidom.queryapi.ElemLike. It needs to be given a method implementation to query for child elements (not child nodes in general, but just child elements!), and it offers methods to query for some or all child elements, descendant elements, and descendant-or-self elements. That is, the minimal API consists of abstract method findAllChildElems, and it offers methods such as filterChildElems, filterElems and filterElemsOrSelf. This trait has no knowledge about elements at all, other than the fact that elements can have child elements.
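
    The idea of deriving a richer API from one abstract method can be sketched as follows. This is a miniature under assumptions, not the actual ElemLike trait: MiniElemLike and NamedElem are hypothetical names, Vector is used instead of IndexedSeq, and only a handful of the derived methods are shown.

    ```scala
    // Hypothetical miniature of an ElemLike-style trait, using only the standard library.
    object ElemLikeSketch {

      trait MiniElemLike[E <: MiniElemLike[E]] { self: E =>

        // The minimal API: the only method an implementation must provide.
        def findAllChildElems: Vector[E]

        def filterChildElems(p: E => Boolean): Vector[E] = findAllChildElems.filter(p)

        // Descendant-or-self elements in document order: self first, then each child's subtree.
        def findAllElemsOrSelf: Vector[E] = self +: findAllChildElems.flatMap(_.findAllElemsOrSelf)

        def filterElemsOrSelf(p: E => Boolean): Vector[E] = findAllElemsOrSelf.filter(p)

        // Descendant elements: the descendant-or-self elements of all child elements.
        def findAllElems: Vector[E] = findAllChildElems.flatMap(_.findAllElemsOrSelf)

        def filterElems(p: E => Boolean): Vector[E] = findAllElems.filter(p)
      }

      // A trivial implementation that only has to say what its child elements are.
      final case class NamedElem(name: String, children: Vector[NamedElem] = Vector.empty)
          extends MiniElemLike[NamedElem] {
        def findAllChildElems: Vector[NamedElem] = children
      }
    }
    ```

    Because every derived method is expressed in terms of findAllChildElems, any element type that can list its child elements gets the same query methods with the same semantics.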

    Trait eu.cdevreeze.yaidom.queryapi.HasEName needs minimal knowledge about elements themselves, viz. that elements have a "resolved" (or expanded) name, and "resolved" attributes (mapping attribute expanded names to attribute values). That is, it needs to be given implementations of abstract methods resolvedName and resolvedAttributes, and then offers methods to query for individual attributes or the local name of the element.
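
    A "capability" trait of this kind might look roughly like the sketch below. It is a simplified model rather than the real HasEName trait: MiniEName, MiniHasEName and BookElem are hypothetical names, and attributes are kept as a Vector of name-value pairs.

    ```scala
    // Hypothetical miniature of a HasEName-style capability, using only the standard library.
    object HasENameSketch {

      // Simplified expanded name: optional namespace URI plus local part.
      final case class MiniEName(namespaceUriOption: Option[String], localPart: String)

      trait MiniHasEName {
        // The two abstract members that an implementation must provide.
        def resolvedName: MiniEName
        def resolvedAttributes: Vector[(MiniEName, String)]

        // Derived: the local part of the element name.
        def localName: String = resolvedName.localPart

        // Derived: the value of the attribute with the given expanded name, if any.
        def attributeOption(name: MiniEName): Option[String] =
          resolvedAttributes.collectFirst { case (n, v) if n == name => v }
      }

      final case class BookElem(resolvedName: MiniEName, resolvedAttributes: Vector[(MiniEName, String)])
          extends MiniHasEName
    }
    ```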

    It is important to note that yaidom does not consider namespace declarations to be attributes themselves. Otherwise, there would have been circular dependencies between both concepts, because attributes with namespaces require in-scope namespaces and therefore namespace declarations for resolving the names of these attributes.

    Many traits, such as eu.cdevreeze.yaidom.queryapi.HasEName, are just "capabilities", and need to be combined with trait eu.cdevreeze.yaidom.queryapi.ElemLike in order to offer a useful element querying API.

    Note that trait eu.cdevreeze.yaidom.queryapi.ElemLike only knows about elements, not about other kinds of nodes. Of course the actual element implementations mixing in this query API know about other node types, but that knowledge is outside the uniform query API. Note that the example queries above only use the minimal element knowledge that traits ElemLike and HasEName together have about elements. Therefore the query code can be used unchanged for different element implementations.

    Trait eu.cdevreeze.yaidom.queryapi.IsNavigable is used to navigate to an element given a Path.

    Trait eu.cdevreeze.yaidom.queryapi.UpdatableElemLike (which extends trait IsNavigable) offers functional updates at given paths. Whereas the traits mentioned above know only about elements, this trait knows that elements have some node super-type.

    Instead of functional updates at given paths, elements can also be "transformed" functionally without specifying any paths. This is offered by trait eu.cdevreeze.yaidom.queryapi.TransformableElemLike. The Scala XML and DOM wrappers above do not mix in this trait.
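
    The difference between a functional update at a path and a path-free transformation can be illustrated with a small immutable tree. This is only a sketch: UpdateSketch and MiniElem are hypothetical names, and for brevity the path is a list of 0-based child indexes rather than a yaidom Path.

    ```scala
    // Hypothetical model of functional updates and transformations, standard library only.
    object UpdateSketch {

      final case class MiniElem(name: String, text: String = "", children: Vector[MiniElem] = Vector.empty) {

        // Functional update at a path of child indexes. Returns a new tree and leaves
        // this tree untouched. An index out of range throws IndexOutOfBoundsException.
        def updated(path: List[Int])(f: MiniElem => MiniElem): MiniElem = path match {
          case Nil => f(this)
          case idx :: tail => copy(children = children.updated(idx, children(idx).updated(tail)(f)))
        }

        // Path-free transformation: applies f bottom-up to every descendant-or-self element.
        def transformElemsOrSelf(f: MiniElem => MiniElem): MiniElem =
          f(copy(children = children.map(_.transformElemsOrSelf(f))))
      }
    }
    ```

    An update touches exactly one element (and rebuilds its ancestors), whereas a transformation visits the whole tree and lets the function decide which elements to change.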

    Three uniform query API levels

    Above, several individual query API traits were mentioned. There are, however, 3 query API levels which are interesting for those who extend yaidom with new element implementations, but also for most users of the yaidom query API. These levels are represented by "combination traits" that combine several of the query API traits mentioned (or not mentioned) above.

    The most basic level is eu.cdevreeze.yaidom.queryapi.ClarkNodes.Elem. It combines traits such as eu.cdevreeze.yaidom.queryapi.ElemApi and eu.cdevreeze.yaidom.queryapi.HasENameApi. Object eu.cdevreeze.yaidom.queryapi.ClarkNodes also contains types for non-element nodes. All element implementations that extend trait ClarkNodes.Elem should have a node hierarchy with all its kinds of nodes extending the appropriate ClarkNodes member type.

    All element implementations directly or indirectly implement the ClarkNodes.Elem trait. The part of the yaidom query API that knows about ElemApi querying and about ENames is the ClarkNodes query API level. It does not know about QNames, in-scope namespaces, ancestor elements, base URIs, etc.

    The next level is eu.cdevreeze.yaidom.queryapi.ScopedNodes.Elem. It extends the ClarkNodes.Elem trait, but offers knowledge about QNames and in-scope namespaces as well. Many element implementations offer at least this query API level. The remarks about non-element nodes above also apply here, and apply below.

    The third level is eu.cdevreeze.yaidom.queryapi.BackingNodes.Elem. It extends the ScopedNodes.Elem trait, but offers knowledge about ancestor elements and document/base URIs as well. This is the level typically used for "backing elements" in "yaidom dialects", thus allowing for multiple "XML backends" to be used behind "yaidom dialects". Yaidom dialects are specific "XML dialect" type-safe yaidom query APIs, mixing in and leveraging trait eu.cdevreeze.yaidom.queryapi.SubtypeAwareElemApi (often in combination with eu.cdevreeze.yaidom.queryapi.ScopedNodes.Elem).

    To get to know the yaidom query API and its 3 levels, it pays off to study the API documentation of traits eu.cdevreeze.yaidom.queryapi.ClarkNodes.Elem, eu.cdevreeze.yaidom.queryapi.ScopedNodes.Elem and eu.cdevreeze.yaidom.queryapi.BackingNodes.Elem.

    Some element implementations

    In package simple there are 2 immutable element implementations, eu.cdevreeze.yaidom.simple.ElemBuilder and eu.cdevreeze.yaidom.simple.Elem. Arguably, ElemBuilder is not an element implementation. Indeed, it does not even offer the ClarkNodes.Elem query API.

    Class eu.cdevreeze.yaidom.simple.Elem is the default element implementation of yaidom. It extends class eu.cdevreeze.yaidom.simple.Node. The latter also has sub-classes for text nodes, comments, entity references and processing instructions. Class eu.cdevreeze.yaidom.simple.Document contains a document Elem, but is not a Node sub-class itself. This node hierarchy offers the ScopedNodes query API, so simple elements offer the ScopedNodes.Elem query API.

    The eu.cdevreeze.yaidom.simple.Elem class has the following characteristics:

    • It is immutable, and thread-safe
    • These elements therefore cannot be queried for their parent elements
    • It mixes in query API traits eu.cdevreeze.yaidom.queryapi.ScopedNodes.Elem, eu.cdevreeze.yaidom.queryapi.UpdatableElemApi and eu.cdevreeze.yaidom.queryapi.TransformableElemApi
    • Besides the element name, attributes and child nodes, it keeps a Scope, but no Declarations
    • This makes it easy to compose these elements, as long as scopes are passed explicitly throughout the element tree
    • Equality is reference equality, because it is hard to come up with a sensible equality for this element class
    • Roundtripping cannot be entirely lossless, but this class does try to retain the attribute order (although irrelevant according to XML Infoset)
    • Packages parse and print offer DocumentParser and DocumentPrinter classes for parsing/serializing these default Elem (and Document) instances

    Creating such Elem trees by hand is a bit cumbersome, partly because scopes have to be passed to each Elem in the tree. The latter is not needed if we use class eu.cdevreeze.yaidom.simple.ElemBuilder to create element trees by hand. When the tree has been fully created as ElemBuilder, invoke method ElemBuilder.build(parentScope) to turn it into an Elem.

    Like their super-classes Node and NodeBuilder, classes Elem and ElemBuilder have very much in common. Both are immutable and easy to compose (ElemBuilder instances even more so), both use reference equality, etc. The most important differences are as follows:

    • Instead of a Scope, an ElemBuilder contains a Declarations
    • This makes an ElemBuilder easier to compose than an Elem, because no Scope needs to be passed around throughout the tree
    • Class ElemBuilder uses a minimal query API, mixing in almost only traits ElemLike and TransformableElemLike
    • After all, an ElemBuilder neither keeps nor knows about Scopes, so does not know about resolved element/attribute names

    The Effective Java book element in the XML example above could have been written as ElemBuilder (without the inter-element whitespace) as follows:

    import eu.cdevreeze.yaidom.simple.NodeBuilder._
    
    elem(
      qname = QName("book:Book"),
      attributes = Vector(QName("ISBN") -> "978-0321356680", QName("Price") -> "35", QName("Edition") -> "2"),
      children = Vector(
        elem(
          qname = QName("book:Title"),
          children = Vector(
            text("Effective Java (2nd Edition)")
          )
        ),
        elem(
          qname = QName("book:Authors"),
          children = Vector(
            elem(
              qname = QName("auth:Author"),
              children = Vector(
                elem(
                  qname = QName("auth:First_Name"),
                  children = Vector(
                    text("Joshua")
                  )
                ),
                elem(
                  qname = QName("auth:Last_Name"),
                  children = Vector(
                    text("Bloch")
                  )
                )
              )
            )
          )
        )
      )
    )

    This ElemBuilder (say, eb) lacks namespace declarations for prefixes book and auth. So, the following returns false:

    eb.canBuild(Scope.Empty)

    while the following returns true:

    eb.canBuild(Scope.from("book" -> "http://bookstore/book", "auth" -> "http://bookstore/author"))

    Indeed,

    eb.build(Scope.from("book" -> "http://bookstore/book", "auth" -> "http://bookstore/author"))

    returns the element tree as Elem.

    Note that the distinction between ElemBuilder and Elem "solves" the mismatch that immutable ("functional") element trees are constructed in a bottom-up manner, while namespace scoping works in a top-down manner. (See also Anti-XML issue 78, in https://github.com/djspiewak/anti-xml/issues/78).
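
    The interplay between bottom-up construction and top-down scoping can be sketched with two small case classes. This is a simplified model, not the real ElemBuilder or Elem: BuilderSketch, MiniBuilder and MiniElem are hypothetical names, attributes and namespace undeclarations are left out, and names are plain strings.

    ```scala
    // Hypothetical model of the builder/element split, using only the standard library.
    object BuilderSketch {

      type MiniScope = Map[String, String] // prefix -> namespace URI

      // The "built" element: carries its full in-scope namespaces.
      final case class MiniElem(qname: String, scope: MiniScope, children: Vector[MiniElem])

      // The builder: composed bottom-up, carrying only its own declarations.
      final case class MiniBuilder(
          qname: String,
          declarations: Map[String, String] = Map.empty,
          children: Vector[MiniBuilder] = Vector.empty) {

        private def prefixOption: Option[String] =
          if (qname.contains(":")) Some(qname.takeWhile(_ != ':')) else None

        // True if every prefix used in this tree is resolvable, given the parent scope.
        def canBuild(parentScope: MiniScope): Boolean = {
          val scope = parentScope ++ declarations
          prefixOption.forall(scope.contains) && children.forall(_.canBuild(scope))
        }

        // Pushes scopes down the tree, turning declarations into in-scope namespaces.
        def build(parentScope: MiniScope): MiniElem = {
          require(canBuild(parentScope), s"Unresolvable prefix in or under $qname")
          val scope = parentScope ++ declarations
          MiniElem(qname, scope, children.map(_.build(scope)))
        }
      }
    }
    ```

    Builders can be nested freely without knowing any scope. Only the final build call needs a parent scope, at which point the scopes are computed from the root downwards.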

    There are many more element implementations in yaidom, most of them in sub-packages of this package. Yaidom is extensible in that new element implementations can be invented, for example elements that are better "roundtrippable" (at the expense of "composability"), or yaidom wrappers around other DOM-like APIs (such as XOM or JDOM2). The current element implementations in yaidom include, for example:

    • Immutable class eu.cdevreeze.yaidom.simple.Elem, the default (immutable) element implementation. See above.
    • Immutable class eu.cdevreeze.yaidom.simple.ElemBuilder for creating an Elem by hand. See above.
    • Immutable class eu.cdevreeze.yaidom.resolved.Elem, which takes namespace prefixes out of the equation, and therefore makes useful (namespace-aware) equality comparisons feasible. It offers the ClarkNodes.Elem query API (as well as update/transformation support).
    • Immutable class eu.cdevreeze.yaidom.indexed.Elem, which offers views on default Elems that know the ancestry of each element. It offers the BackingNodes.Elem query API, so knows its ancestry, despite being immutable! This element implementation is handy for querying XML schemas, for example, because in schemas the ancestry of queried elements typically matters.
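
    Why taking prefixes out of the equation makes equality comparisons feasible can be shown with a small model. It is a sketch only: ResolvedEqualitySketch, PrefixedElem and ResolvedElem are hypothetical names, attributes and text are omitted, and every prefix is assumed to be in scope.

    ```scala
    // Hypothetical model of "resolved" elements, using only the standard library.
    object ResolvedEqualitySketch {

      type MiniScope = Map[String, String] // prefix -> namespace URI

      // Prefix-carrying element, close to what is written in the XML.
      final case class PrefixedElem(qname: String, scope: MiniScope, children: Vector[PrefixedElem] = Vector.empty)

      // Prefix-free element: only the expanded name (in Clark notation) and the children remain.
      final case class ResolvedElem(ename: String, children: Vector[ResolvedElem])

      // Replaces each qualified name by its expanded name, dropping prefixes and scopes.
      def resolve(elem: PrefixedElem): ResolvedElem = {
        val (prefix, local) = elem.qname.split(':') match {
          case Array(p, l) => (p, l)
          case _ => ("", elem.qname)
        }
        val ename = elem.scope.get(prefix).map(uri => s"{$uri}$local").getOrElse(local)
        ResolvedElem(ename, elem.children.map(resolve))
      }
    }
    ```

    Two elements that differ only in the prefix chosen for the same namespace are unequal as prefixed elements, but equal after resolution.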

    One yaidom wrapper that is very useful is a Saxon tiny tree yaidom wrapper, namely SaxonElem (JVM-only). Like "indexed elements", it offers all of the BackingNodes.Elem query API. This element implementation is very efficient, especially in memory footprint (when using the default tree model, namely tiny trees). It is therefore the most attractive element implementation to use in "enterprise" production code, but only on the JVM. In combination with Saxon-EE (instead of Saxon-HE) the underlying Saxon NodeInfo objects can even carry interesting type information.

    For ad-hoc element creation, consider using "resolved" elements. They are easy to create, because there is no need to worry about namespace prefixes. Once created, they can be converted to "simple" elements, given an appropriate Scope (without default namespace).

    Packages and dependencies

    Yaidom has the following packages, and layering between packages (mentioning the lowest layers first):

    • Package eu.cdevreeze.yaidom.core, with the core concepts described above. It depends on no other yaidom packages.
    • Package eu.cdevreeze.yaidom.queryapi, with the query API traits described above. It only depends on the core package.
    • Package eu.cdevreeze.yaidom.resolved, with a minimal "James Clark" element implementation. It only depends on the core and queryapi packages.
    • Package eu.cdevreeze.yaidom.simple, with the default element implementation described above. It only depends on the core and queryapi packages.
    • Package eu.cdevreeze.yaidom.indexed, supporting "indexed" elements. It only depends on the core, queryapi and simple packages.
    • Package convert. It contains conversions between default yaidom nodes on the one hand and DOM, Scala XML, etc. on the other hand. The convert package depends on the yaidom core, queryapi, resolved and simple packages.
    • Package eu.cdevreeze.yaidom.saxon, with the Saxon wrapper element implementation described above. It only depends on the core, queryapi and convert packages.
    • Packages eu.cdevreeze.yaidom.parse and eu.cdevreeze.yaidom.print, for parsing/printing Elems. They depend on the packages mentioned above, except for indexed and saxon.
    • The other packages (except utils), such as dom and scalaxml. They depend on (some of) the packages mentioned above, but not on each other.
    • Package eu.cdevreeze.yaidom.utils, which depends on all the packages above.

    Indeed, all yaidom package dependencies are uni-directional.

    Notes on performance

    Yaidom can be quite memory-hungry. One particular cause of that is the possible creation of very many duplicate EName and QName instances. This can be the case while parsing XML into yaidom documents, or while querying yaidom element trees.

    The user of the library can reduce memory consumption to a large extent, and yaidom facilitates that.

    As for querying, prefer:

    import eu.cdevreeze.yaidom.queryapi.HasENameApi._
    
    bookstoreElem filterElemsOrSelf withEName("http://bookstore/book", "Book")

    to:

    bookstoreElem.filterElemsOrSelf(EName("http://bookstore/book", "Book"))

    to avoid unnecessary (large scale) EName object creation.

    To reduce the memory footprint of parsed XML trees, see eu.cdevreeze.yaidom.core.ENameProvider and eu.cdevreeze.yaidom.core.QNameProvider.

    For example, during the startup phase of an application, we could set the global ENameProvider as follows:

    ENameProvider.globalENameProvider.become(new ENameProvider.ENameProviderUsingImmutableCache(knownENames))

    Note that the global ENameProvider or QNameProvider can typically be configured rather late during development, but the memory cost savings can be substantial once configured. Also note that the global ENameProvider or QNameProvider can be used implicitly in application code, by writing:

    bookstoreElem filterElemsOrSelf getEName("http://bookstore/book", "Book")

    using an implicit ENameProvider, whose members are in scope. For querying, the first alternative using withEName is still the better one. There are likely many other scenarios in yaidom client code, however, where an implicit ENameProvider or QNameProvider makes sense.
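
    The memory saving behind a caching name provider comes from sharing one instance per distinct name. The sketch below illustrates that idea with a plain cache. It is not yaidom's ENameProvider: NameCacheSketch, MiniEName and CachingNameProvider are hypothetical names, and the cache here grows on demand instead of being filled with known names up front.

    ```scala
    import scala.collection.concurrent.TrieMap

    // Hypothetical model of a caching name provider, using only the standard library.
    object NameCacheSketch {

      final case class MiniEName(namespaceUri: String, localPart: String)

      // Hands out one shared instance per distinct name, instead of a fresh object per call.
      final class CachingNameProvider {
        private val cache = TrieMap.empty[(String, String), MiniEName]

        def getEName(namespaceUri: String, localPart: String): MiniEName =
          cache.getOrElseUpdate((namespaceUri, localPart), MiniEName(namespaceUri, localPart))

        // Number of distinct names handed out so far.
        def size: Int = cache.size
      }
    }
    ```

    When a parsed tree contains many elements with the same name, all of them can then refer to the same name object, rather than each holding an equal but separate copy.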

    The bottom line is that yaidom can be configured to be far less memory-hungry, and that yaidom client code can also take some responsibility in reducing memory usage. Again, the Saxon wrapper implementation is an excellent and efficient choice (but only on the JVM).

  • package convert

    Support for conversions from/to yaidom. This package mostly contains conversions between yaidom objects and JAXP DOM or StAX objects, in both directions. This package does not support conversions between different yaidom element implementations. It is too low level a package for that.

    This conversion support is used by the Document parsers and printers in the parse and print packages, respectively. This package can also be used directly by consumers of the yaidom API.

    These JAXP-object conversions suggest that yaidom is optimistic about the available (heap) memory.

    This package depends on the eu.cdevreeze.yaidom.core, eu.cdevreeze.yaidom.queryapi and eu.cdevreeze.yaidom.simple packages, and not the other way around. The eu.cdevreeze.yaidom.parse and eu.cdevreeze.yaidom.print packages depend on this package.

  • package core

    This package contains the core concepts, such as qualified names, expanded names, namespace declarations, in-scope namespaces, paths and path builders.

    This package depends on no other packages in yaidom, but almost all other packages do depend on this one.

  • package dom

    Wrapper around class org.w3c.dom.Element, adapting it to the eu.cdevreeze.yaidom.queryapi.ElemLike API.

    This wrapper is not thread-safe, and should only be used if the immutable element classes such as eu.cdevreeze.yaidom.simple.Elem are not the best fit.

    Such scenarios could be as follows:

    • Conversions from DOM to eu.cdevreeze.yaidom.simple.Elem (and back) have more runtime costs than needed or wanted.
    • Round-tripping from XML string to "tree", and back to XML string should keep the resulting XML string as much as possible the same.
    • In-place updates (instead of "functional updates") of DOM trees are desired.
    • The DOM elements are desired for their PSVI information.

    Yet be aware that the advantages of immutability and thread-safety (offered by immutable Elem classes) are lost when using this wrapper API. Mutable DOM trees are also very easy to break, even via the ElemLike API, if element predicates with side-effects are used.

    To explain the "round-tripping" item above, note that class eu.cdevreeze.yaidom.simple.Elem considers attributes in an element unordered, let alone namespace declarations. That is consistent with the XML Infoset specification, but can sometimes be impractical. Using org.w3c.dom.Element instances, parsed from XML input sources, chances are that this order is retained.

    There are of course limitations to what formatting data is retained in a DOM tree. A good example is the short versus long form of an empty element. Typically parsers do not pass any information about this distinction, so it is unknown whether the XML input source used the long or short form for an empty element.

    It should also be noted that the configuration of XML parsers and serializers can substantially influence the extent to which "round-tripping" keeps the XML string the same. Whitespace handling is one such area in which different configurations can lead to quite different "round-tripping" results.

    Note that in one way these wrappers are somewhat unnatural: the ElemLike API uses immutable Scala collections everywhere, whereas the elements of those collections are mutable (!) DOM node wrappers. The wrappers are idiomatic Scala in their use of the Scala Collections API, whereas the wrapped DOM nodes come from a distant past, when imperative programming and "mutability everywhere" ruled.

    In comparison to XPath against DOM trees, the ElemLike API may be more verbose, but it requires neither setup nor "result set handling" boilerplate.

    Definition Classes
    yaidom
  • package indexed

    This package contains element representations that contain the "context" of the element. That is, the elements in this package are pairs of a root element and a path (to the actual element itself). The "context" of an element also contains an optional document URI.

    An example of where such a representation can be useful is XML Schema. After all, to interpret an element definition in an XML schema, we need context of the element definition to determine the target namespace, or to determine whether the element definition is top level, etc.
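
    The idea of an element as a pair of a root element and a path can be sketched with a small self-contained model. This is only an illustration: the Elem and IndexedElem classes below are hypothetical toy types, and the path is simplified to a sequence of child indexes, which may differ from yaidom's own path representation. It shows how the "context" answers questions such as "is this element definition top level?".

```scala
// Toy model, not yaidom code: all names here are assumptions made for illustration.
object IndexedSketch {

  // A plain element that knows nothing about its ancestry.
  final case class Elem(name: String, children: Vector[Elem] = Vector.empty)

  // An "indexed" element: the root element plus a path to the element itself.
  // Simplification: the path is a sequence of child indexes, starting at the root.
  final case class IndexedElem(root: Elem, path: Vector[Int]) {

    // The underlying element, found by following the path from the root.
    def elem: Elem = path.foldLeft(root)((e, idx) => e.children(idx))

    // The parent is the same root with a path that is one step shorter.
    def parentOption: Option[IndexedElem] =
      if (path.isEmpty) None else Some(IndexedElem(root, path.init))

    def findAllChildElems: Vector[IndexedElem] =
      elem.children.indices.toVector.map(idx => IndexedElem(root, path :+ idx))

    // A direct child of the root element.
    def isTopLevel: Boolean = path.size == 1
  }

  val schema: Elem =
    Elem("schema", Vector(
      Elem("element"),
      Elem("complexType", Vector(Elem("element")))))

  val topLevelDef: IndexedElem = IndexedElem(schema, Vector(0))
  val nestedDef: IndexedElem = IndexedElem(schema, Vector(1, 0))
}
```

    Both topLevelDef and nestedDef wrap an underlying element named "element", yet only the stored path distinguishes the top-level definition from the nested one.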

    Below follows a simple example query, using the uniform query API:

    // Note the import of package indexed, and not of its members. That is indeed a best practice!
    import eu.cdevreeze.yaidom.indexed
    
    val indexedBookstoreElem = indexed.Elem(bookstoreElem)
    
    val scalaBookAuthors =
      for {
        bookElem <- indexedBookstoreElem \ EName("{http://bookstore/book}Book")
        if (bookElem \@ EName("ISBN")).contains("978-0981531649")
        authorElem <- bookElem \\ EName("{http://bookstore/author}Author")
      } yield authorElem

    The query for Scala book authors would have been exactly the same if normal Elems had been used instead of indexed.Elems (replacing indexedBookstoreElem by bookstoreElem)!

    There is no explicit functional update support for the indexed elements in this package. Of course the underlying elements can be functionally updated (for element implementations that offer such update support), and indexed elements can be created from the update results, but this is hardly efficient functional update support.

    One problem with efficient functional updates for indexed elements is that updating just one child element means that all subsequent child elements may have to be updated as well, adapting the stored paths. In comparison, simple elements do not have this restriction, and can be updated in isolation. Hence the functional update support for simple elements but not for the different indexed element implementations.

    Definition Classes
    yaidom
  • package java8

    The streaming element query API that can be used in Java 8, in this package and its sub-packages. This package itself contains some common data structures shared by the API.

    This API is experimental!

    Definition Classes
    yaidom
  • package parse

    Support for parsing XML into yaidom Documents and Elems. This package offers the eu.cdevreeze.yaidom.parse.DocumentParser trait, as well as several implementations. Those implementations use JAXP (SAX, DOM or StAX), and most of them use the convert package to convert JAXP artifacts to yaidom Documents.

    For example:

    import eu.cdevreeze.yaidom.parse.DocumentParserUsingSax
    import eu.cdevreeze.yaidom.simple.Document

    val docParser = DocumentParserUsingSax.newInstance()

    val doc: Document = docParser.parse(docUri)

    This example chose a SAX-based implementation, and used the default configuration of that document parser.

    Having several different fully configurable JAXP-based implementations shows that yaidom is pessimistic about the transparency of parsing and printing XML. It also shows that yaidom is optimistic about the available (heap) memory and processing power, because of the two separate steps of JAXP parsing/printing and (in-memory) convert conversions. Using JAXP means that escaping of characters is something that JAXP deals with, and that's definitely better than trying to do it yourself.

    One DocumentParser implementation does not use any convert conversion. That is DocumentParserUsingSax. It is likely the fastest of the DocumentParser implementations.

    The preferred DocumentParser for XML (not HTML) parsing is DocumentParserUsingDomLS, if memory usage is not an issue. This DocumentParser implementation is best integrated with DOM, and is highly configurable, although DOM LS configuration is somewhat involved.

    This package depends on the eu.cdevreeze.yaidom.core, eu.cdevreeze.yaidom.queryapi, eu.cdevreeze.yaidom.simple and convert packages, and not the other way around.

    Definition Classes
    yaidom
  • package print

    Support for "printing" yaidom Documents and Elems. This package offers the eu.cdevreeze.yaidom.print.DocumentPrinter trait, as well as several implementations. Most of those implementations use the convert package to convert yaidom Documents to JAXP artifacts, and all use JAXP (DOM, SAX or StAX).

    For example:

    import eu.cdevreeze.yaidom.print.DocumentPrinterUsingSax

    val docPrinter = DocumentPrinterUsingSax.newInstance()

    docPrinter.print(doc, "UTF-8", System.out)

    This example chose a SAX-based implementation, and used the default configuration of that document printer.

    Having several different fully configurable JAXP-based implementations shows that yaidom is pessimistic about the transparency of parsing and printing XML. It also shows that yaidom is optimistic about the available (heap) memory and processing power, because of the two separate steps of JAXP parsing/printing and (in-memory) convert conversions. Using JAXP means that escaping of characters is something that JAXP deals with, and that's definitely better than trying to do it yourself.

    One DocumentPrinter implementation does not use any convert conversion. That is DocumentPrinterUsingSax. It is likely the fastest of the DocumentPrinter implementations, as well as the one using the least memory.

    The preferred DocumentPrinter for XML (not HTML) printing is DocumentPrinterUsingDomLS, if memory usage is not an issue. This DocumentPrinter implementation is best integrated with DOM, and is highly configurable, although DOM LS configuration is somewhat involved.

    This package depends on the eu.cdevreeze.yaidom.core, eu.cdevreeze.yaidom.queryapi, eu.cdevreeze.yaidom.simple and convert packages, and not the other way around.

    Definition Classes
    yaidom
  • package queryapi

    This package contains the (renewed) query API traits. It contains both the purely abstract API traits as well as the partial implementation traits.

    Generic code abstracting over yaidom element implementations should either use trait ClarkNodes.Elem or sub-trait ScopedNodes.Elem, or even BackingNodes.Elem, depending on the abstraction level.

    These traits are combinations of several small query API traits. Most of these API traits are orthogonal.

    Simplicity and consistency of the entire query API are 2 important design considerations. For example, the query API methods themselves use no parameterized types. Note how the resulting API with type members is essentially the same as the "old" yaidom query API using type parameters, except that the purely abstract traits are less constrained in the type members.

    This package depends only on the core package in yaidom, but many other packages do depend on this one.

    Note: whereas the old query API used F-bounded polymorphism with type parameters extensively, this new query API essentially just uses type member(s) ThisElem (and ThisNode), defined in a common super-trait. The old query API may be somewhat easier to develop (that is, convincing the compiler), but the new query API is easier to use as generic "backend" element query API. As an example, common "bridge" element query APIs come to mind, used within type-safe XML dialect DOM tree implementations. The reason this is easier with the new API is intuitively that fewer type constraints leak to the query API client code.
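
    The difference between the two styles can be sketched as follows. This is a minimal toy model with hypothetical trait names (OldElemApi, AnyElem, NewElemApi, Node); it is not the actual yaidom source, and it only illustrates why fewer type constraints leak into client code when a type member is used.

```scala
// Toy sketch with hypothetical names; these are not yaidom's actual traits.
object TypeMemberSketch {

  // "Old" style: F-bounded polymorphism, the element type is a type parameter.
  trait OldElemApi[E <: OldElemApi[E]] { self: E =>
    def findAllChildElems: Vector[E]
  }

  // "New" style: the element type is a type member declared in a common super-trait.
  trait AnyElem {
    type ThisElem <: AnyElem
  }

  trait NewElemApi extends AnyElem {
    type ThisElem <: NewElemApi

    def findAllChildElems: Vector[ThisElem]

    // The method signature itself uses no type parameters.
    def filterChildElems(p: ThisElem => Boolean): Vector[ThisElem] =
      findAllChildElems.filter(p)
  }

  // A concrete element implementation fixes the type member.
  final case class Node(name: String, children: Vector[Node] = Vector.empty)
      extends NewElemApi {
    type ThisElem = Node
    def findAllChildElems: Vector[Node] = children
  }

  // Old-style client code must repeat the F-bound in its own signature.
  def oldChildCount[E <: OldElemApi[E]](elem: E): Int = elem.findAllChildElems.size

  // New-style client code can simply accept the trait.
  def newChildCount(elem: NewElemApi): Int = elem.findAllChildElems.size
}
```

    In this sketch, newChildCount needs no type parameter at all, whereas oldChildCount has to carry the recursive bound along.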

    Definition Classes
    yaidom
  • AnyDocumentApi
  • AnyElemApi
  • AnyElemNodeApi
  • BackingDocumentApi
  • BackingElemApi
  • BackingNodes
  • ClarkElemApi
  • ClarkElemLike
  • ClarkNodes
  • DocumentApi
  • ElemApi
  • ElemCreationApi
  • ElemLike
  • ElemTransformationApi
  • ElemTransformationLike
  • ElemUpdateApi
  • ElemUpdateLike
  • ElemWithPath
  • HasChildNodesApi
  • HasEName
  • HasENameApi
  • HasParent
  • HasParentApi
  • HasQNameApi
  • HasScopeApi
  • HasText
  • HasTextApi
  • IndexedClarkElemApi
  • IndexedScopedElemApi
  • IsNavigable
  • IsNavigableApi
  • Nodes
  • ScopedElemApi
  • ScopedElemLike
  • ScopedNodes
  • SubtypeAwareElemApi
  • SubtypeAwareElemLike
  • TransformableElemApi
  • TransformableElemLike
  • UpdatableElemApi
  • UpdatableElemLike
  • XmlBaseSupport
  • package resolved

    This package contains element representations that can be compared for (some notion of "value") equality, unlike normal yaidom nodes. That notion of equality is simple to understand, but "naive". The user of the API must take control over what is compared for equality.

    See eu.cdevreeze.yaidom.resolved.Node for why this package is named resolved.

    The most important difference with normal Elems is that qualified names do not occur, but only expanded (element and attribute) names. This is reminiscent of James Clark notation for XML trees and expanded names, where qualified names are absent.

    Moreover, the only nodes in this package are element and text nodes.
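
    Why the absence of qualified names helps equality comparisons can be sketched with a toy model. All types below (EName, SimpleElem, ResolvedElem) and the resolve function are hypothetical simplifications written for this illustration; they are not the yaidom classes of the same or similar names, and the sketch assumes that every prefix used is present in the scope.

```scala
// Toy model, not yaidom code: names and structure are assumptions for illustration.
object ResolvedSketch {

  // Expanded name: optional namespace URI plus local part.
  final case class EName(namespaceUriOption: Option[String], localPart: String)

  // "Normal" element: qualified name (prefix + local part) and in-scope namespaces.
  // The empty prefix stands for the default namespace.
  final case class SimpleElem(
      prefix: String,
      localPart: String,
      scope: Map[String, String],
      children: Vector[SimpleElem] = Vector.empty)

  // "Resolved" element: only expanded names, no prefixes and no scopes.
  final case class ResolvedElem(name: EName, children: Vector[ResolvedElem])

  // Resolve each qualified name against the scope of its element.
  def resolve(elem: SimpleElem): ResolvedElem =
    ResolvedElem(
      EName(elem.scope.get(elem.prefix), elem.localPart),
      elem.children.map(resolve))

  val ns = "http://bookstore/book"

  // The same XML content, written with two different prefixes.
  val elem1: SimpleElem = SimpleElem("book", "Book", Map("book" -> ns))
  val elem2: SimpleElem = SimpleElem("bk", "Book", Map("bk" -> ns))
}
```

    Here elem1 and elem2 differ as "normal" elements because their prefixes differ, while resolve(elem1) and resolve(elem2) compare equal.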

    Below follows a simple example query, using the uniform query API:

    // Note the import of package resolved, and not of its members. That is indeed a best practice!
    import eu.cdevreeze.yaidom.resolved
    
    val resolvedBookstoreElem = resolved.Elem.from(bookstoreElem)
    
    val scalaBookAuthors =
      for {
        bookElem <- resolvedBookstoreElem \ EName("{http://bookstore/book}Book")
        if (bookElem \@ EName("ISBN")).contains("978-0981531649")
        authorElem <- bookElem \\ EName("{http://bookstore/author}Author")
      } yield authorElem

    The query for Scala book authors would have been exactly the same if normal Elems had been used instead of resolved.Elems (replacing resolvedBookstoreElem by bookstoreElem)!

    Definition Classes
    yaidom
  • package saxon

    Saxon-based BackingNodes.Elem implementation that can be used as underlying element implementation in any "yaidom dialect". If Saxon tiny trees are used under the hood, this implementation is very efficient, in particular in memory footprint.

    This package depends on the latest major Saxon version that is currently considered stable, such as 9.8. If other major Saxon versions must be supported, consider copying and adapting this code. Alternatively, yaidom itself could provide a separate source tree (with copied and adapted code) from which artifacts targeting those other Saxon major versions are created.

    The dependency is on Saxon-HE, so features of Saxon-EE are not used here. That does not mean that they are not accessible, of course.

    Definition Classes
    yaidom
  • package scalaxml

    Wrapper around class scala.xml.Elem, adapting it to the eu.cdevreeze.yaidom.queryapi.ElemLike API.

    This wrapper brings the uniform yaidom query API to Scala XML literals (and Scala XML Elems in general).

    For some namespace-related pitfalls and peculiarities, see eu.cdevreeze.yaidom.scalaxml.ScalaXmlElem.

    Definition Classes
    yaidom
  • package simple

    This package contains the default element implementation.

    This package depends only on the core and queryapi packages in yaidom, but many other packages do depend on this one.

    Definition Classes
    yaidom
  • package utils

    Several utilities, such as NamespaceUtils. They are utilities "on top of yaidom", so the rest of yaidom has no dependencies on this package, but this package does depend on the rest of yaidom.

    Definition Classes
    yaidom
  • package xpath

    XPath evaluation abstraction. It is for yaidom what the standard Java XPath API is for Java. Like the standard Java XPath API, this yaidom XPath API is abstract in the sense that it allows for many different implementations. This yaidom XPath API even has an implementation targeting Scala-JS, so it has no dependencies on JAXP.

    This API should be useful for any XPath version, even as old as 1.0. Indeed, there is no XDM (XPath Data Model) abstraction in this API.

    Preferably implementations use (the same) yaidom element types for context items and returned nodes, or use types for them that have yaidom wrappers. This would make it easy to mix XPath evaluations with yaidom queries, at reasonably low runtime costs.

    Currently there are no classes in this XPath API for functions, function resolvers and variable resolvers.

    The remainder of yaidom has no dependency on this xpath package and its sub-packages.

    Implementing XPath support is error-prone. This is yet another reason why it is important that the remainder of yaidom does not depend on its XPath support!

    Definition Classes
    yaidom

package queryapi

Linear Supertypes
AnyRef, Any

Type Members

  1. trait AnyDocumentApi extends AnyRef

    Super-trait for all document types, promising an element type for the document element.

  2. trait AnyElemApi extends AnyRef

    Super-trait for all element query API traits, promising a self type.

    Simplicity and consistency of the entire query API are 2 important design considerations. For example, the query API methods themselves use no generics.

  3. trait AnyElemNodeApi extends AnyElemApi

    Super-trait for all element query API traits that know about the node super-type.

  4. trait BackingDocumentApi extends DocumentApi

    Backing document API, representing a document that contains a BackingNodes.Elem root element.

  5. trait BackingElemApi extends IndexedScopedElemApi with HasParentApi

    Shorthand for IndexedScopedElemApi with HasParentApi. In other words, this is an ancestry-aware "scoped element" query API.

    Efficient implementations are possible for indexed elements and Saxon NodeInfo objects (backed by native tiny trees). Saxon-backed elements are not offered by core yaidom, however. Saxon tiny trees are attractive for their low memory footprint.

    It is possible to offer implementations by combining the partial implementation traits (XXXLike), or by entirely custom and efficient "backend-aware" implementations.

  6. trait ClarkElemApi extends ElemApi with IsNavigableApi with HasENameApi with HasTextApi

    Shorthand for ElemApi with IsNavigableApi with HasENameApi with HasTextApi. In other words, the minimal element query API corresponding to James Clark's "labelled element tree" abstraction, which is implemented as yaidom "resolved" elements.

    If a yaidom element implementation (whether in yaidom itself or a "yaidom extension") does not mix in the ClarkElemApi trait, it is probably not to be considered "XML". In yaidom, only the ElemBuilder class does not mix in this trait, and indeed it is not "XML" (lacking any knowledge about expanded names etc.), only a builder of "XML". Hence this trait is very important in yaidom, as the "minimal XML element query API".

    Generic code abstracting over yaidom element implementations should either use this trait, or sub-trait ScopedElemApi, depending on the abstraction level.

  7. trait ClarkElemLike extends ClarkElemApi with ElemLike with IsNavigable with HasEName with HasText

    Partial implementation of ClarkElemApi.

  8. trait DocumentApi extends AnyDocumentApi

    Minimal API for Documents, having a type parameter for the element type.

    This is a purely abstract API trait. It can be useful in generic code abstracting over multiple element implementations.

  9. trait ElemApi extends AnyElemApi

    This is the foundation of the yaidom uniform query API. Many DOM-like element implementations in yaidom mix in this trait (indirectly, because some implementing sub-trait is mixed in), thus sharing this query API.

    This trait typically does not show up in application code using yaidom, yet its (uniform) API does. Hence, it makes sense to read the documentation of this trait, knowing that the API is offered by multiple element implementations.

    This trait is purely abstract. The most common implementation of this trait is eu.cdevreeze.yaidom.queryapi.ElemLike. That trait only knows about elements (and not about other nodes), and only knows that elements can have child elements (again not knowing about other child nodes). Using this minimal knowledge alone, it offers methods to query for descendant elements, descendant-or-self methods, or sub-collections thereof. It is this minimal knowledge that makes this API uniform.

    This query API leverages the Scala Collections API. Query results can be manipulated using the Collections API, and the query API implementation (in ElemLike) uses the Collections API internally.

    ElemApi examples

    To illustrate usage of this API, consider the following example. Let's say we want to determine if some XML has its namespace declarations (if any) only at the root element level. We show the query code for several yaidom DOM-like element implementations.

    Note that it depends on the DOM-like element implementation how to query for namespace declarations, but the code to query for descendant or descendant-or-self elements remains the same. The method to retrieve all descendant elements is called findAllElems, and the method to retrieve all descendant-or-self elements is called findAllElemsOrSelf. The corresponding "filtering" methods are called filterElems and filterElemsOrSelf, respectively. Knowing this, it is easy to guess the other API method names.

    Let's start with a yaidom DOM wrapper, named rootElem, of type DomElem, and query for the "offending" descendant elements:

    rootElem filterElems (elem => !convert.DomConversions.extractNamespaceDeclarations(elem.wrappedNode.getAttributes).isEmpty)

    This returns all offending elements, that is, all descendant elements of the root element (excluding the root element itself) that have at least one namespace declaration.

    Now let's use an eu.cdevreeze.yaidom.simple.ElemBuilder, again named rootElem:

    rootElem filterElems (elem => !elem.namespaces.isEmpty)

    The query is the same as the preceding one, except for the retrieval of namespace declarations of an element. (It should be noted that class ElemBuilder already has a method allDeclarationsAreAtTopLevel.)

    Finally, let's use a rootElem of type eu.cdevreeze.yaidom.indexed.Elem, which is immutable, but knows its ancestry:

    rootElem filterElems (elem => !elem.namespaces.isEmpty)

    This is exactly the same code as for ElemBuilder, because namespace declarations happen to be retrieved in the same way.

    If we want to query for all elements with namespace declarations, including the root element itself, we could write:

    rootElem filterElemsOrSelf (elem => !elem.namespaces.isEmpty)

    In summary, the extremely simple ElemApi query API is indeed a uniform query API, offered by many different yaidom DOM-like element implementations.

    ElemApi more formally

    In order to get started using the API, this more formal section can safely be skipped. On the other hand, this section may provide a deeper understanding of the API.

    The ElemApi trait can be understood in a precise mathematical sense, as shown below.

    The most fundamental method of this trait is findAllChildElems. The semantics of the other methods can be defined directly or indirectly in terms of this method.

    The basic operations definable in terms of that method are \ (alias for filterChildElems), \\ (alias for filterElemsOrSelf) and \\! (alias for findTopmostElemsOrSelf). Their semantics must be as if they had been defined as follows:

    def filterChildElems(p: ThisElem => Boolean): immutable.IndexedSeq[ThisElem] =
      this.findAllChildElems.filter(p)
    
    def filterElemsOrSelf(p: ThisElem => Boolean): immutable.IndexedSeq[ThisElem] =
      Vector(this).filter(p) ++ (this.findAllChildElems flatMap (_.filterElemsOrSelf(p)))
    
    def findTopmostElemsOrSelf(p: ThisElem => Boolean): immutable.IndexedSeq[ThisElem] =
      if (p(this)) Vector(this)
      else (this.findAllChildElems flatMap (_.findTopmostElemsOrSelf(p)))

    Moreover, we could have defined:

    def filterElems(p: ThisElem => Boolean): immutable.IndexedSeq[ThisElem] =
      this.findAllChildElems flatMap (_.filterElemsOrSelf(p))
    
    def findTopmostElems(p: ThisElem => Boolean): immutable.IndexedSeq[ThisElem] =
      this.findAllChildElems flatMap (_.findTopmostElemsOrSelf(p))

    and:

    def findAllElemsOrSelf: immutable.IndexedSeq[ThisElem] = filterElemsOrSelf(e => true)
    
    def findAllElems: immutable.IndexedSeq[ThisElem] = filterElems(e => true)

    The following properties must hold (in the absence of side-effects), and can indeed be proven (given the documented "definitions" of these operations):

    // Filtering
    
    elem.filterElems(p) == elem.findAllElems.filter(p)
    
    elem.filterElemsOrSelf(p) == elem.findAllElemsOrSelf.filter(p)
    
    // Finding topmost
    
    elem.findTopmostElems(p) == {
      elem.filterElems(p) filter { e =>
        val hasNoMatchingAncestor = elem.filterElems(p) forall { _.findElem(_ == e).isEmpty }
        hasNoMatchingAncestor
      }
    }
    
    elem.findTopmostElemsOrSelf(p) == {
      elem.filterElemsOrSelf(p) filter { e =>
        val hasNoMatchingAncestor = elem.filterElemsOrSelf(p) forall { _.findElem(_ == e).isEmpty }
        hasNoMatchingAncestor
      }
    }
    
    (elem.findTopmostElems(p) flatMap (_.filterElemsOrSelf(p))) == (elem.filterElems(p))
    
    (elem.findTopmostElemsOrSelf(p) flatMap (_.filterElemsOrSelf(p))) == (elem.filterElemsOrSelf(p))
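
    The definitions and properties above can be spot-checked on a small example. The sketch below is a self-contained toy model (the Elem case class is hypothetical and independent of yaidom); it copies the documented "definitions" and evaluates some of the properties for one tree and one predicate. Checking one example is of course no proof.

```scala
// Toy model, not yaidom code: Elem is a hypothetical minimal element type.
object ElemApiLaws {

  final case class Elem(name: String, children: Vector[Elem] = Vector.empty) {

    // The one fundamental method; everything else is defined in terms of it.
    def findAllChildElems: Vector[Elem] = children

    def filterElemsOrSelf(p: Elem => Boolean): Vector[Elem] =
      Vector(this).filter(p) ++ findAllChildElems.flatMap(_.filterElemsOrSelf(p))

    def filterElems(p: Elem => Boolean): Vector[Elem] =
      findAllChildElems.flatMap(_.filterElemsOrSelf(p))

    def findTopmostElemsOrSelf(p: Elem => Boolean): Vector[Elem] =
      if (p(this)) Vector(this)
      else findAllChildElems.flatMap(_.findTopmostElemsOrSelf(p))

    def findTopmostElems(p: Elem => Boolean): Vector[Elem] =
      findAllChildElems.flatMap(_.findTopmostElemsOrSelf(p))

    def findAllElemsOrSelf: Vector[Elem] = filterElemsOrSelf(_ => true)

    def findAllElems: Vector[Elem] = filterElems(_ => true)
  }

  val tree: Elem =
    Elem("a", Vector(
      Elem("b", Vector(Elem("b"), Elem("c"))),
      Elem("c", Vector(Elem("b")))))

  // A side-effect free element predicate.
  val p: Elem => Boolean = _.name == "b"

  // The filtering properties and the two flatMap-based "topmost" properties.
  def lawsHold: Boolean =
    tree.filterElems(p) == tree.findAllElems.filter(p) &&
      tree.filterElemsOrSelf(p) == tree.findAllElemsOrSelf.filter(p) &&
      tree.findTopmostElems(p).flatMap(_.filterElemsOrSelf(p)) == tree.filterElems(p) &&
      tree.findTopmostElemsOrSelf(p).flatMap(_.filterElemsOrSelf(p)) == tree.filterElemsOrSelf(p)
}
```

    In this example tree, three descendant elements match the predicate, but only two of them are topmost matches, because one matching element is nested inside another.
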
  10. trait ElemCreationApi extends AnyRef

    This is the generic element creation API. It fits in the overall philosophy of yaidom in that it is based on ENames, not on QNames.

  11. trait ElemLike extends ElemApi

    API and implementation trait for elements as containers of elements, as element nodes in a node tree. This trait knows very little about elements. It does not know about names, attributes, etc. All it knows about elements is that elements can have element children (other node types are entirely out of scope in this trait).

    The purely abstract API offered by this trait is eu.cdevreeze.yaidom.queryapi.ElemApi. See the documentation of that trait for examples of usage, and for a more formal treatment. Below follows an even more formal treatment, with proofs by induction of important properties obeyed by methods of this API. It shows the mathematical rigor of the yaidom query API. API users that are only interested in how to use the API can safely skip that formal treatment.

    ElemLike more formally

    In order to get started using the API, this more formal section can safely be skipped. On the other hand, this section may provide a deeper understanding of the API.

    The only abstract method is findAllChildElems. Based on this method alone, this trait offers a rich API for querying elements. This is entirely consistent with the semantics defined in the ElemApi trait. Indeed, the implementation of the methods follows the semantics defined there.

    In the ElemApi trait, some (simple) provable laws were mentioned. Some proofs follow below.

    1. Proving property about filterElemsOrSelf

    Below follows a proof by structural induction of one of the laws mentioned in the documentation of trait ElemApi.

    First we make a few assumptions, for this proof, and (implicitly) for the other proofs:

    • The function literals used in the properties ("element predicates" in this case) have no side-effects
    • These function literals terminate normally, without throwing any exception
    • These function literals are "closed terms", so the function values that are instances of these function literals are not "true closures"
    • These function literals use "fresh" variables, thus avoiding shadowing of variables defined in the context of the function literal
    • Equality on the element type is an equivalence relation (reflexive, symmetric, transitive)

    Based on these assumptions, we prove by induction that:

    elm.filterElemsOrSelf(p) == elm.findAllElemsOrSelf.filter(p)

    Base case

    If elm has no child elements, then the LHS can be rewritten as follows:

    elm.filterElemsOrSelf(p)
    immutable.IndexedSeq(elm).filter(p) ++ (elm.findAllChildElems flatMap (_.filterElemsOrSelf(p))) // definition of filterElemsOrSelf
    immutable.IndexedSeq(elm).filter(p) ++ (Seq() flatMap (_.filterElemsOrSelf(p))) // there are no child elements
    immutable.IndexedSeq(elm).filter(p) ++ Seq() // flatMap on empty sequence returns empty sequence
    immutable.IndexedSeq(elm).filter(p) // property of concatenation: xs ++ Seq() == xs
    (immutable.IndexedSeq(elm) ++ Seq()).filter(p) // property of concatenation: xs ++ Seq() == xs
    (immutable.IndexedSeq(elm) ++ (elm.findAllChildElems flatMap (_ filterElemsOrSelf (e => true)))) filter p
      // flatMap on empty sequence (of child elements) returns empty sequence
    (immutable.IndexedSeq(elm).filter(e => true) ++ (elm.findAllChildElems flatMap (_ filterElemsOrSelf (e => true)))) filter p
      // filtering with predicate that is always true
    elm.filterElemsOrSelf(e => true) filter p // definition of filterElemsOrSelf
    elm.findAllElemsOrSelf filter p // definition of findAllElemsOrSelf

    which is the RHS.

    Inductive step

    For the inductive step, we use the following (general) properties:

    (xs.filter(p) ++ ys.filter(p)) == ((xs ++ ys) filter p) // referred to below as property (a)

    and:

    (xs flatMap (x => f(x) filter p)) == ((xs flatMap f) filter p) // referred to below as property (b)

    If elm does have child elements, the LHS can be rewritten as:

    elm.filterElemsOrSelf(p)
    immutable.IndexedSeq(elm).filter(p) ++ (elm.findAllChildElems flatMap (_.filterElemsOrSelf(p))) // definition of filterElemsOrSelf
    immutable.IndexedSeq(elm).filter(p) ++ (elm.findAllChildElems flatMap (ch => ch.findAllElemsOrSelf filter p)) // induction hypothesis
    immutable.IndexedSeq(elm).filter(p) ++ ((elm.findAllChildElems.flatMap(ch => ch.findAllElemsOrSelf)) filter p) // property (b)
    (immutable.IndexedSeq(elm) ++ (elm.findAllChildElems flatMap (_.findAllElemsOrSelf))) filter p // property (a)
    (immutable.IndexedSeq(elm) ++ (elm.findAllChildElems flatMap (_ filterElemsOrSelf (e => true)))) filter p // definition of findAllElemsOrSelf
    (immutable.IndexedSeq(elm).filter(e => true) ++ (elm.findAllChildElems flatMap (_ filterElemsOrSelf (e => true)))) filter p
      // filtering with predicate that is always true
    elm.filterElemsOrSelf(e => true) filter p // definition of filterElemsOrSelf
    elm.findAllElemsOrSelf filter p // definition of findAllElemsOrSelf

    which is the RHS.

    This completes the proof. Other above-mentioned properties can be proven by induction in a similar way.
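
    The general properties (a) and (b) used in the inductive step are properties of ordinary Scala collections, so they can be tried out without any element type. The following sketch uses arbitrarily chosen example values (xs, ys, p and f are assumptions made for this illustration) and merely checks one instance of each property.

```scala
// Spot check of properties (a) and (b) on plain Scala Vectors.
object FilterProperties {

  val xs: Vector[Int] = Vector(1, 2, 3, 4)
  val ys: Vector[Int] = Vector(5, 6, 7)

  // A side-effect free predicate.
  val p: Int => Boolean = _ % 2 == 0

  // A side-effect free function returning a collection.
  val f: Int => Vector[Int] = x => Vector(x, x + 1)

  // Property (a): filtering distributes over concatenation.
  val propertyA: Boolean =
    (xs.filter(p) ++ ys.filter(p)) == ((xs ++ ys) filter p)

  // Property (b): filtering inside flatMap equals filtering after flatMap.
  val propertyB: Boolean =
    (xs flatMap (x => f(x) filter p)) == ((xs flatMap f) filter p)
}
```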

    2. Proving property about filterElems

    From the preceding proven property it easily follows (without using a proof by induction) that:

    elm.filterElems(p) == elm.findAllElems.filter(p)

    After all, the LHS can be rewritten as follows:

    elm.filterElems(p)
    (elm.findAllChildElems flatMap (_.filterElemsOrSelf(p))) // definition of filterElems
    (elm.findAllChildElems flatMap (e => e.findAllElemsOrSelf.filter(p))) // using the property proven above
    (elm.findAllChildElems flatMap (_.findAllElemsOrSelf)) filter p // using property (b) above
    (elm.findAllChildElems flatMap (_ filterElemsOrSelf (e => true))) filter p // definition of findAllElemsOrSelf
    elm.filterElems(e => true) filter p // definition of filterElems
    elm.findAllElems filter p // definition of findAllElems

    which is the RHS.
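
    The two properties proven so far can be spot-checked on a small tree. The following is only a sketch: the Elem case class is a hypothetical toy type, not a yaidom element implementation, and its methods simply copy the recursive definitions quoted in the proofs above.

```scala
object FilterElemsDemo {
  // Toy element type (hypothetical), standing in for a yaidom element.
  final case class Elem(name: String, children: Vector[Elem] = Vector.empty) {
    def findAllChildElems: Vector[Elem] = children

    // Same shape as the definition of filterElemsOrSelf used in the proof.
    def filterElemsOrSelf(p: Elem => Boolean): Vector[Elem] =
      Vector(this).filter(p) ++ findAllChildElems.flatMap(_.filterElemsOrSelf(p))

    def findAllElemsOrSelf: Vector[Elem] = filterElemsOrSelf(_ => true)

    // Same shape as the definition of filterElems used in the proof.
    def filterElems(p: Elem => Boolean): Vector[Elem] =
      findAllChildElems.flatMap(_.filterElemsOrSelf(p))

    def findAllElems: Vector[Elem] = filterElems(_ => true)
  }

  val root: Elem =
    Elem("Bookstore", Vector(
      Elem("Book", Vector(Elem("Title"), Elem("Authors", Vector(Elem("Author"))))),
      Elem("Book", Vector(Elem("Title")))))

  // Sample predicate, chosen only for illustration.
  val p: Elem => Boolean = e => e.name == "Title" || e.name == "Book"

  // Property 1: filterElemsOrSelf(p) == findAllElemsOrSelf.filter(p)
  val orSelfLhs: Vector[Elem] = root.filterElemsOrSelf(p)
  val orSelfRhs: Vector[Elem] = root.findAllElemsOrSelf.filter(p)

  // Property 2: filterElems(p) == findAllElems.filter(p)
  val descendantsLhs: Vector[Elem] = root.filterElems(p)
  val descendantsRhs: Vector[Elem] = root.findAllElems.filter(p)

  def main(args: Array[String]): Unit = {
    println(orSelfLhs == orSelfRhs)
    println(descendantsLhs == descendantsRhs)
  }
}
```

    A check on one tree and one predicate is of course no substitute for the proofs. It only makes the statements concrete.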

    3. Proving property about findTopmostElemsOrSelf

    Given the above-mentioned assumptions, we prove by structural induction that:

    (elm.findTopmostElemsOrSelf(p) flatMap (_.filterElemsOrSelf(p))) == (elm.filterElemsOrSelf(p))

    Base case

    If elm has no child elements, and p(elm) holds, then LHS and RHS evaluate to immutable.IndexedSeq(elm).

    If elm has no child elements, and p(elm) does not hold, then LHS and RHS evaluate to immutable.IndexedSeq().

    Inductive step

    For the inductive step, we introduce the following additional (general) property, if f and g have the same types:

    ((xs flatMap f) flatMap g) == (xs flatMap (x => f(x) flatMap g)) // referred to below as property (c)

    This is also known as the "associativity law for monads". (Monadic types obey 3 laws: associativity, left unit and right unit.)

    If elm does have child elements, and p(elm) holds, the LHS can be rewritten as:

    (elm.findTopmostElemsOrSelf(p) flatMap (_.filterElemsOrSelf(p)))
    immutable.IndexedSeq(elm) flatMap (_.filterElemsOrSelf(p)) // definition of findTopmostElemsOrSelf, knowing that p(elm) holds
    elm.filterElemsOrSelf(p) // definition of flatMap, applied to singleton sequence

    which is the RHS. In this case, we did not even need the induction hypothesis.

    If elm does have child elements, and p(elm) does not hold, the LHS can be rewritten as:

    (elm.findTopmostElemsOrSelf(p) flatMap (_.filterElemsOrSelf(p)))
    (elm.findAllChildElems flatMap (_.findTopmostElemsOrSelf(p))) flatMap (_.filterElemsOrSelf(p))
      // definition of findTopmostElemsOrSelf, knowing that p(elm) does not hold
    elm.findAllChildElems flatMap (ch => ch.findTopmostElemsOrSelf(p) flatMap (_.filterElemsOrSelf(p))) // property (c)
    elm.findAllChildElems flatMap (_.filterElemsOrSelf(p)) // induction hypothesis
    immutable.IndexedSeq() ++ (elm.findAllChildElems flatMap (_.filterElemsOrSelf(p))) // definition of concatenation
    immutable.IndexedSeq(elm).filter(p) ++ (elm.findAllChildElems flatMap (_.filterElemsOrSelf(p)))
      // definition of filter, knowing that p(elm) does not hold
    elm.filterElemsOrSelf(p) // definition of filterElemsOrSelf

    which is the RHS.

    4. Proving property about findTopmostElems

    From the preceding proven property it easily follows (without using a proof by induction) that:

    (elm.findTopmostElems(p) flatMap (_.filterElemsOrSelf(p))) == (elm.filterElems(p))

    After all, the LHS can be rewritten to:

    (elm.findTopmostElems(p) flatMap (_.filterElemsOrSelf(p)))
    (elm.findAllChildElems flatMap (_.findTopmostElemsOrSelf(p))) flatMap (_.filterElemsOrSelf(p)) // definition of findTopmostElems
    elm.findAllChildElems flatMap (ch => ch.findTopmostElemsOrSelf(p) flatMap (_.filterElemsOrSelf(p))) // property (c)
    elm.findAllChildElems flatMap (_.filterElemsOrSelf(p)) // using the property proven above
    elm.filterElems(p) // definition of filterElems

    which is the RHS.
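
    As a sanity check, the two "topmost" properties can be tried out on a tree with nested matches. This is a minimal sketch under the assumption that a toy Elem case class (hypothetical, not part of yaidom) follows the same recursive definitions as used in the proofs.

```scala
object TopmostElemsDemo {
  // Toy element type (hypothetical), standing in for a yaidom element.
  final case class Elem(name: String, children: Vector[Elem] = Vector.empty) {
    def findAllChildElems: Vector[Elem] = children

    def filterElemsOrSelf(p: Elem => Boolean): Vector[Elem] =
      Vector(this).filter(p) ++ findAllChildElems.flatMap(_.filterElemsOrSelf(p))

    def filterElems(p: Elem => Boolean): Vector[Elem] =
      findAllChildElems.flatMap(_.filterElemsOrSelf(p))

    // Stops descending as soon as the predicate holds, as in the case analysis of the proof.
    def findTopmostElemsOrSelf(p: Elem => Boolean): Vector[Elem] =
      if (p(this)) Vector(this)
      else findAllChildElems.flatMap(_.findTopmostElemsOrSelf(p))

    def findTopmostElems(p: Elem => Boolean): Vector[Elem] =
      findAllChildElems.flatMap(_.findTopmostElemsOrSelf(p))
  }

  // Nested "section" elements, so that topmost matches and all matches differ.
  val root: Elem =
    Elem("doc", Vector(
      Elem("section", Vector(Elem("para"), Elem("section", Vector(Elem("para"))))),
      Elem("para"),
      Elem("section")))

  val isSection: Elem => Boolean = _.name == "section"

  val topmostCount: Int = root.findTopmostElemsOrSelf(isSection).size
  val allCount: Int = root.filterElemsOrSelf(isSection).size

  // Property 3
  val orSelfHolds: Boolean =
    root.findTopmostElemsOrSelf(isSection).flatMap(_.filterElemsOrSelf(isSection)) ==
      root.filterElemsOrSelf(isSection)

  // Property 4
  val descendantsHold: Boolean =
    root.findTopmostElems(isSection).flatMap(_.filterElemsOrSelf(isSection)) ==
      root.filterElems(isSection)

  def main(args: Array[String]): Unit =
    println((topmostCount, allCount, orSelfHolds, descendantsHold))
}
```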

    5. Properties used in the proofs above

    There are several (unproven) properties that were used in the proofs above:

    (xs.filter(p) ++ ys.filter(p)) == ((xs ++ ys) filter p) // property (a); filter distributes over concatenation
    
    (xs flatMap (x => f(x) filter p)) == ((xs flatMap f) filter p) // property (b)
    
    // Associativity law for monads
    ((xs flatMap f) flatMap g) == (xs flatMap (x => f(x) flatMap g)) // property (c)

    Property (a) is obvious, and stated without proof. Property (c) is known as the "associativity law for monads". Property (b) is proven below.

    To prove property (b), we use property (c), as well as the following property (d):

    (xs filter p) == (xs flatMap (y => if (p(y)) List(y) else Nil)) // property (d)

    Then property (b) can be proven as follows:

    xs flatMap (x => f(x) filter p)
    xs flatMap (x => f(x) flatMap (y => if (p(y)) List(y) else Nil))
    (xs flatMap f) flatMap (y => if (p(y)) List(y) else Nil) // property (c)
    (xs flatMap f) filter p
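
    Properties (a) to (d) are general facts about Scala collections, so they can be tried out without any XML at all. The sketch below uses plain Vectors of integers. The sample functions p, f and g are arbitrary and chosen only for illustration.

```scala
object CollectionLawsDemo {
  val xs: Vector[Int] = Vector(1, 2, 3, 4)
  val ys: Vector[Int] = Vector(5, 6)

  // Hypothetical sample functions, chosen only for illustration.
  val p: Int => Boolean = _ % 2 == 0
  val f: Int => Vector[Int] = x => Vector(x, x * 10)
  val g: Int => Vector[Int] = x => Vector(x + 1)

  // Property (a): filter distributes over concatenation.
  val propertyA: Boolean =
    (xs.filter(p) ++ ys.filter(p)) == (xs ++ ys).filter(p)

  // Property (b): filtering inside the flatMap or after it gives the same result.
  val propertyB: Boolean =
    xs.flatMap(x => f(x).filter(p)) == xs.flatMap(f).filter(p)

  // Property (c): associativity of flatMap.
  val propertyC: Boolean =
    xs.flatMap(f).flatMap(g) == xs.flatMap(x => f(x).flatMap(g))

  // Property (d): filter expressed in terms of flatMap.
  val propertyD: Boolean =
    xs.filter(p) == xs.flatMap(y => if (p(y)) List(y) else Nil)

  def main(args: Array[String]): Unit =
    println(Seq(propertyA, propertyB, propertyC, propertyD))
}
```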

    Implementation notes

    Methods findAllElemsOrSelf, filterElemsOrSelf, findTopmostElemsOrSelf and findElemOrSelf use recursion in their implementations, but not tail-recursion. The lack of tail-recursion should not be a problem, because XML trees have limited depth in practice. This is comparable to an "idiomatic" Scala quicksort implementation, where the lack of tail-recursion is likewise acceptable due to limited recursion depths. If we wanted tail-recursive implementations of the above-mentioned methods (in particular the first three), we would either lose the document order (depth-first) of the result elements, or lose performance and/or clarity. That is not worth it.
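
    To make the trade-off concrete, here is a sketch that contrasts plain recursion with one possible tail-recursive variant that keeps document order by means of an explicit stack. Both functions work on a hypothetical toy Elem type and are not taken from the yaidom code base. The point is only to show the extra bookkeeping that a tail-recursive version needs.

```scala
import scala.annotation.tailrec

object TraversalDemo {
  // Toy element type (hypothetical), standing in for a yaidom element.
  final case class Elem(name: String, children: Vector[Elem] = Vector.empty)

  // Plain (non-tail) recursion: short, and naturally in document order.
  def findAllElemsOrSelf(elem: Elem): Vector[Elem] =
    elem +: elem.children.flatMap(child => findAllElemsOrSelf(child))

  // Tail-recursive variant with an explicit stack of pending elements.
  def findAllElemsOrSelfTailRec(elem: Elem): Vector[Elem] = {
    @tailrec
    def loop(pending: List[Elem], acc: Vector[Elem]): Vector[Elem] = pending match {
      case Nil => acc
      case head :: tail =>
        // Children are pushed in front of the remaining siblings to keep depth-first order.
        loop(head.children.toList ::: tail, acc :+ head)
    }
    loop(List(elem), Vector.empty)
  }

  val root: Elem = Elem("a", Vector(Elem("b", Vector(Elem("c"))), Elem("d")))

  val recursiveNames: Vector[String] = findAllElemsOrSelf(root).map(_.name)
  val tailRecNames: Vector[String] = findAllElemsOrSelfTailRec(root).map(_.name)

  def main(args: Array[String]): Unit = {
    println(recursiveNames)
    println(tailRecNames)
  }
}
```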

  12. trait ElemTransformationApi extends AnyRef

    This is the element transformation API, as function API instead of OO API. That is, this is the function API corresponding to trait eu.cdevreeze.yaidom.queryapi.TransformableElemApi.

    See trait TransformableElemApi for more info about element transformations in yaidom, and their properties.

    This functional API is more widely applicable than trait TransformableElemApi. First, it can be implemented for arbitrary element types, even non-yaidom ones. Second, implementations can easily carry state that is shared by update functions, such as a Saxon Processor in the case of a Saxon implementation of this API.

    When using this API for elements that carry context such as "ancestry state", be careful when writing transformation functions that are passed to the functions of this API. For example, if the element type is BackingElemApi or a sub-type, such sensitive state includes the base URI, document URI, the Path relative to the root element, and most important of all, the root element itself. It is up to the user of the API to keep such state consistent during transformations, and to be careful when depending on state that is volatile during transformations.

    Also note that for BackingElemApi elements, if a transformation function alters "ancestry state" such as (base and document) URIs, paths etc., these altered values may be ignored, depending on the API calls made.

    ElemTransformationApi more formally

    In order to get started using the API, this more formal section can safely be skipped. On the other hand, this section may provide a deeper understanding of the API.

    Some provable properties hold for this ElemTransformationApi API in terms of the lower-level ElemUpdateApi API.

    Let's first try to define the methods of ElemTransformationApi in terms of the ElemUpdateApi API. Below their equivalence will be proven. We define the following, assuming type ElemType to be a yaidom "indexed element" type:

    def addPathParameter[A](f: ElemType => A): ((ElemType, Path) => A) = {
      { (elm, path) => f(elm) } // Unused path
    }
    
    def addPathEntryParameter[A](f: ElemType => A): ((ElemType, Path.Entry) => A) = {
      { (elm, pathEntry) => f(elm) } // Unused path entry
    }
    
    def findAllChildPathEntries(elem: ElemType): Set[Path.Entry] = {
      elem.findAllChildElems.map(_.path.lastEntry).toSet
    }
    
    def findAllRelativeElemOrSelfPaths(elem: ElemType): Set[Path] = {
      elem.findAllElemsOrSelf.map(_.path.skippingPath(elem.path)).toSet
    }
    
    def findAllRelativeElemPaths(elem: ElemType): Set[Path] = {
      elem.findAllElems.map(_.path.skippingPath(elem.path)).toSet
    }
    
    // The transformation functions, defined in terms of the ElemUpdateApi
    
    def transformChildElems2(elem: ElemType, f: ElemType => ElemType): ElemType = {
      updateChildElems(elem, findAllChildPathEntries(elem))(addPathEntryParameter(f))
    }
    
    def transformElemsOrSelf2(elem: ElemType, f: ElemType => ElemType): ElemType = {
      updateElemsOrSelf(elem, findAllRelativeElemOrSelfPaths(elem))(addPathParameter(f))
    }
    
    def transformElems2(elem: ElemType, f: ElemType => ElemType): ElemType = {
      updateElems(elem, findAllRelativeElemPaths(elem))(addPathParameter(f))
    }

    1. Property about transformChildElems in terms of transformChildElems2

    The following property must hold, for all elements and (pure) element transformation functions:

    transformChildElems(elem, f) == transformChildElems2(elem, f)

    No proof is provided, but this property must obviously hold, since transformChildElems replaces child element nodes by applying the given function, and leaves the other child nodes alone, and method transformChildElems2 does the same. The latter function does it via child path entries (translated to child node indexes), iterating over child nodes in reverse order (in order not to invalidate the next processed path entry), but the net effect is the same.

    2. Property about transformElemsOrSelf in terms of transformElemsOrSelf2

    The following property holds, for all elements and (pure) element transformation functions:

    transformElemsOrSelf(elem, f) == transformElemsOrSelf2(elem, f)

    Below follows a proof of this property by structural induction.

    Base case

    If elem has no child elements, then the LHS can be rewritten as follows:

    transformElemsOrSelf(elem, f)
    f(transformChildElems(elem, e => transformElemsOrSelf(e, f))) // definition of transformElemsOrSelf
    f(elem) // there are no child element nodes, so transformChildElems is an identity function in this case
    updateElemsOrSelf(elem, Set(Path.Empty))(addPathParameter(f)) // only updates elem
    transformElemsOrSelf2(elem, f) // definition of transformElemsOrSelf2, and absence of descendant paths

    which is the RHS.

    Inductive step

    If elem does have child elements, the LHS can be rewritten as:

    transformElemsOrSelf(elem, f)
    f(transformChildElems(elem, e => transformElemsOrSelf(e, f))) // definition of transformElemsOrSelf
    f(transformChildElems(elem, e => transformElemsOrSelf2(e, f))) // induction hypothesis
    f(transformChildElems2(elem, e => transformElemsOrSelf2(e, f))) // property above
    f(transformChildElems2(elem, e => updateElemsOrSelf(e, findAllRelativeElemOrSelfPaths(e))(addPathParameter(f))))
      // definition of transformElemsOrSelf2
    
    f(updateChildElems(elem, findAllChildPathEntries(elem))(addPathEntryParameter(
      e => updateElemsOrSelf(e, findAllRelativeElemOrSelfPaths(e))(addPathParameter(f))))
    ) // definition of transformChildElems2
    
    f(updateElems(elem, findAllRelativeElemOrSelfPaths(elem))(addPathParameter(f)))
      // property about updateElems, and knowing that the added path and path entry parameters do nothing here
    
    updateElemsOrSelf(elem, findAllRelativeElemOrSelfPaths(elem))(addPathParameter(f))
      // (indirect) definition of updateElemsOrSelf
    transformElemsOrSelf2(elem, f) // definition of transformElemsOrSelf2

    which is the RHS.

    This completes the proof. For the other ElemTransformationApi methods, analogous provable properties hold.
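
    The recursive definition of transformElemsOrSelf used in the proof implies that the transformation function is applied bottom-up. The following sketch shows this on a hypothetical toy Elem type. The two functions mimic the shape of the definitions above, but they are not the yaidom implementations.

```scala
object TransformDemo {
  // Toy element type (hypothetical), standing in for a yaidom element.
  final case class Elem(name: String, children: Vector[Elem] = Vector.empty)

  // Replaces each child element by the result of applying f to it.
  def transformChildElems(elem: Elem, f: Elem => Elem): Elem =
    elem.copy(children = elem.children.map(f))

  // Same recursive shape as the definition of transformElemsOrSelf used in the proof.
  def transformElemsOrSelf(elem: Elem, f: Elem => Elem): Elem =
    f(transformChildElems(elem, e => transformElemsOrSelf(e, f)))

  val root: Elem = Elem("a", Vector(Elem("b", Vector(Elem("c"))), Elem("d")))

  // Records the order in which the transformation function sees the elements.
  def visitOrder(elem: Elem): Vector[String] = {
    val visited = Vector.newBuilder[String]
    transformElemsOrSelf(elem, e => { visited += e.name; e })
    visited.result()
  }

  val order: Vector[String] = visitOrder(root)

  val upperCased: Elem =
    transformElemsOrSelf(root, e => e.copy(name = e.name.toUpperCase))

  def main(args: Array[String]): Unit = {
    println(order)
    println(upperCased)
  }
}
```

    In this sketch descendants are visited before their ancestors, so the function passed for an element already sees the transformed children of that element.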

  13. trait ElemTransformationLike extends ElemTransformationApi

    This is the partially implemented element transformation API, as function API instead of OO API. That is, this is the function API corresponding to trait eu.cdevreeze.yaidom.queryapi.TransformableElemLike.

    In other words, this trait has abstract methods transformChildElems and transformChildElemsToNodeSeq. Based on these abstract methods, this trait offers a rich API for transforming descendant elements or descendant-or-self elements.

  14. trait ElemUpdateApi extends AnyRef

    This is the element (functional) update API, as function API instead of OO API. That is, this is the function API corresponding to trait eu.cdevreeze.yaidom.queryapi.UpdatableElemApi. A few methods, like updateTopmostElemsOrSelf, are missing, though.

    See trait UpdatableElemApi for more info about (functional) element updates in yaidom, and their properties.

    This functional API is more widely applicable than trait UpdatableElemApi. First, it can be implemented for arbitrary element types, even non-yaidom ones. Second, implementations can easily carry state that is shared by update functions, such as a Saxon Processor in the case of a Saxon implementation of this API.

    Below, for most functions that take Paths, or that take functions taking Paths, these Paths are relative to the first argument element. They must not be interpreted as the Paths of the elements themselves (relative to their root elements).

  15. trait ElemUpdateLike extends ElemUpdateApi

    This is the partially implemented (functional) element update API, as function API instead of OO API. That is, this is the function API corresponding to trait eu.cdevreeze.yaidom.queryapi.UpdatableElemLike.

  16. final class ElemWithPath[E <: Aux[E]] extends ElemLike

    Pair of an element and a Path. These pairs themselves offer the ElemApi query API, so they can be seen as "element implementations" themselves. They are like very light-weight "indexed" elements.

    These "elements" are used in the implementation of bulk update methods in trait UpdatableElemLike, but they can also be used in application code.

    Note that this class renders a separate query API for element-path pairs obsolete. It takes an IsNavigableApi, using its findAllChildElemsWithPathEntries method, and offers the equivalent of an ElemApi for element-path pairs.

    Type parameter E: the underlying (root) element type

  17. trait HasChildNodesApi extends AnyElemNodeApi

    API trait for elements that can be asked for their child nodes, of any node kind.

  18. trait HasEName extends HasENameApi

    Trait partly implementing the contract for elements that have an EName, as well as attributes with EName keys.

    Using this trait (possibly in combination with other "element traits") we can abstract over several element implementations.

  19. trait HasENameApi extends AnyRef

    Trait defining the contract for elements that have an EName, as well as attributes with EName keys.

    Using this trait (possibly in combination with other "element traits") we can abstract over several element implementations.

  20. trait HasParent extends HasParentApi

    Implementation trait for elements that can be asked for the ancestor elements, if any.

    This trait only knows about elements, not about documents as root element parents.

    Based on abstract method parentOption alone, this trait offers a rich API for querying the element ancestry of an element.

  21. trait HasParentApi extends AnyElemApi

    API trait for elements that can be asked for the ancestor elements, if any.

    This trait only knows about elements, not about documents as root element parents.

  22. trait HasQNameApi extends AnyRef

    Trait defining the contract for elements that have a QName, as well as attributes with QName keys.

    Using this trait (possibly in combination with other "element traits") we can abstract over several element implementations.

  23. trait HasScopeApi extends AnyRef

    Trait defining the contract for elements that have a stored Scope.

    Using this trait (possibly in combination with other "element traits") we can abstract over several element implementations.

  24. trait HasText extends HasTextApi

    Trait partly implementing the contract for elements as text containers. Typical element types are both an eu.cdevreeze.yaidom.queryapi.ElemLike and an eu.cdevreeze.yaidom.queryapi.HasText.

  25. trait HasTextApi extends AnyRef

    Trait defining the contract for elements as text containers. Typical element types are both an eu.cdevreeze.yaidom.queryapi.ElemLike and an eu.cdevreeze.yaidom.queryapi.HasText.

  26. trait IndexedClarkElemApi extends ClarkElemApi

    Abstract API for "indexed elements".

    Note how this API removes the need for an API which is like the ElemApi API, but taking and returning pairs of elements and paths.

  27. trait IndexedScopedElemApi extends IndexedClarkElemApi with ScopedElemApi

    Abstract API for "indexed Scoped elements".

  28. trait IsNavigable extends IsNavigableApi

    API and implementation trait for elements that can be navigated using paths.

    More precisely, this trait has only the following abstract methods: findChildElemByPathEntry and findAllChildElemsWithPathEntries.

    The purely abstract API offered by this trait is eu.cdevreeze.yaidom.queryapi.IsNavigableApi. See the documentation of that trait for more information.

  29. trait IsNavigableApi extends AnyElemApi

    This trait offers Path-based navigation support.

    This trait typically does not show up in application code using yaidom, yet its (uniform) API does. Hence, it makes sense to read the documentation of this trait, knowing that the API is offered by multiple element implementations.

    This trait is purely abstract. The most common implementation of this trait is eu.cdevreeze.yaidom.queryapi.IsNavigable.

    IsNavigableApi more formally

    Some properties are expected to hold for "navigable elements":

    getElemOrSelfByPath(Path.Empty) == self
    
    findElemOrSelfByPath(path1).flatMap(e => e.findElemOrSelfByPath(path2)) == findElemOrSelfByPath(path1.append(path2))
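
    These two navigation properties can be illustrated with a much simplified model. In the sketch below a path is just a Vector of 0-based child indexes, which is an assumption made for brevity (yaidom Path entries also carry element names), and the Elem type is a hypothetical toy type.

```scala
object NavigationDemo {
  // Toy element type (hypothetical). A toy path is a sequence of 0-based child indexes.
  final case class Elem(name: String, children: Vector[Elem] = Vector.empty) {
    // Follows the path one child index at a time, returning None if a step is missing.
    def findElemOrSelfByPath(path: Vector[Int]): Option[Elem] =
      path.foldLeft(Option(this)) { (elemOption, index) =>
        elemOption.flatMap(_.children.lift(index))
      }

    def getElemOrSelfByPath(path: Vector[Int]): Elem =
      findElemOrSelfByPath(path).getOrElse(sys.error(s"Missing path $path"))
  }

  val root: Elem =
    Elem("a", Vector(Elem("b", Vector(Elem("c"), Elem("d"))), Elem("e")))

  val path1: Vector[Int] = Vector(0)
  val path2: Vector[Int] = Vector(1)

  // The empty path navigates to the element itself.
  val emptyPathHolds: Boolean = root.getElemOrSelfByPath(Vector.empty) == root

  // Navigating in two steps equals navigating along the appended path.
  val appendHolds: Boolean =
    root.findElemOrSelfByPath(path1).flatMap(_.findElemOrSelfByPath(path2)) ==
      root.findElemOrSelfByPath(path1 ++ path2)

  // The same holds when the first step does not exist: both sides are None.
  val missingHolds: Boolean =
    root.findElemOrSelfByPath(Vector(5)).flatMap(_.findElemOrSelfByPath(path1)) ==
      root.findElemOrSelfByPath(Vector(5) ++ path1)

  val found: Option[String] = root.findElemOrSelfByPath(path1 ++ path2).map(_.name)

  def main(args: Array[String]): Unit =
    println((emptyPathHolds, appendHolds, missingHolds, found))
}
```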

  30. trait ScopedElemApi extends ClarkElemApi with HasQNameApi with HasScopeApi

    Shorthand for ClarkElemApi[E] with HasQNameApi with HasScopeApi with some additional methods that use the scope for resolving QName-valued text and attribute values. In other words, an element query API typically supported by element implementations, because most element implementations know about scopes, QNames, ENames and text content, as well as offering the ElemApi query API.

    Generic code abstracting over yaidom element implementations should either use this trait, or super-trait ClarkElemApi, depending on the abstraction level.

    ScopedElemApi more formally

    Scopes resolve QNames as ENames, so some properties are expected to hold for the element "name":

    this.scope.resolveQNameOption(this.qname).contains(this.resolvedName)
    
    // Therefore:
    this.resolvedName.localPart == this.qname.localPart
    
    this.resolvedName.namespaceUriOption ==
      this.scope.prefixNamespaceMap.get(this.qname.prefixOption.getOrElse(""))

    For the attribute "name" properties, first define:

    val attributeScope = this.scope.withoutDefaultNamespace
    
    val resolvedAttrs = this.attributes map {
      case (attrQName, attrValue) =>
        val resolvedAttrName = attributeScope.resolveQNameOption(attrQName).get
        (resolvedAttrName -> attrValue)
    }

    Then the following must hold:

    resolvedAttrs.toMap == this.resolvedAttributes.toMap
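
    The role of the default namespace in these properties may be easier to see in a stripped-down model. The QName, EName and Scope case classes below are hypothetical stand-ins that only borrow the names and the prefixNamespaceMap lookup from the properties above. They are not the yaidom classes, which do more (for example validation).

```scala
object ScopeResolutionDemo {
  // Toy stand-ins (hypothetical), loosely mirroring the names used in the properties above.
  final case class QName(prefixOption: Option[String], localPart: String)
  final case class EName(namespaceUriOption: Option[String], localPart: String)

  final case class Scope(prefixNamespaceMap: Map[String, String]) {
    // The default namespace is stored under the empty prefix.
    def withoutDefaultNamespace: Scope = Scope(prefixNamespaceMap - "")

    // An unprefixed name gets the default namespace, if any.
    // A prefixed name with an unknown prefix cannot be resolved.
    def resolveQNameOption(qname: QName): Option[EName] = qname.prefixOption match {
      case None =>
        Some(EName(prefixNamespaceMap.get(""), qname.localPart))
      case Some(prefix) =>
        prefixNamespaceMap.get(prefix).map(ns => EName(Some(ns), qname.localPart))
    }
  }

  val scope: Scope =
    Scope(Map("" -> "http://bookstore/book", "auth" -> "http://bookstore/author"))

  // Element names are resolved against the full scope.
  val elemName: Option[EName] = scope.resolveQNameOption(QName(None, "Book"))

  // Attribute names are resolved against the scope without default namespace.
  val attrName: Option[EName] =
    scope.withoutDefaultNamespace.resolveQNameOption(QName(None, "ISBN"))

  val authorName: Option[EName] = scope.resolveQNameOption(QName(Some("auth"), "Author"))
  val unresolved: Option[EName] = scope.resolveQNameOption(QName(Some("x"), "Y"))

  def main(args: Array[String]): Unit =
    println(Seq(elemName, attrName, authorName, unresolved))
}
```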

  31. trait ScopedElemLike extends ScopedElemApi with ClarkElemLike

    Partial implementation of ScopedElemApi.

  32. trait SubtypeAwareElemApi extends ElemApi

    Extension to ElemApi that makes querying for sub-types of the element type easy.

    For example, XML Schema can be modeled with an object hierarchy, starting with some XsdElem super-type which mixes in trait SubtypeAwareElemApi, among other query traits. The object hierarchy could contain sub-classes of XsdElem such as XsdRootElem, GlobalElementDeclaration, etc. Then the SubtypeAwareElemApi trait makes it easy to query for all or some global element declarations, etc.

    There is no magic in these traits: it is just ElemApi and ElemLike underneath. It is only the syntactic convenience that makes the difference.

    The query methods of this trait take a sub-type as first value parameter. It is intentional that this is a value parameter, and not a second type parameter, since it is conceptually the most important parameter of these query methods. (If it were a second type parameter instead, the article http://hacking-scala.org/post/73854628325/advanced-type-constraints-with-type-classes would show how to make that solution robust, using some @NotNothing annotation.)

    The sub-type parameter could have been a java.lang.Class object, except that type erasure would make it less attractive (when doing pattern matching against that type). Hence the use of a ClassTag parameter, which undoes type erasure for non-generic types, if available implicitly. So ClassTag is used as a better java.lang.Class, yet without polluting the public API with an implicit ClassTag parameter. (Instead, the ClassTag is made implicit inside the method implementations.)
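
    The ClassTag technique described here can be sketched in isolation. The element hierarchy and the query method below are hypothetical and defined only for this example. The sketch merely shows how a ClassTag passed as a value parameter can be made implicit inside the method body, so that a type pattern against the sub-type works in spite of erasure.

```scala
import scala.reflect.{ClassTag, classTag}

object SubtypeQueryDemo {
  // Toy element hierarchy (hypothetical), loosely following the XML Schema example.
  sealed trait XsdElem {
    def name: String
    def children: Vector[XsdElem]

    def findAllElemsOrSelf: Vector[XsdElem] =
      this +: children.flatMap(_.findAllElemsOrSelf)

    // The sub-type is an explicit value parameter. It is made implicit only here,
    // so that the type pattern below can be checked at runtime.
    def findAllElemsOrSelfOfType[B <: XsdElem](subType: ClassTag[B]): Vector[B] = {
      implicit val tag: ClassTag[B] = subType
      findAllElemsOrSelf.collect { case e: B => e }
    }
  }

  final case class XsdRootElem(name: String, children: Vector[XsdElem]) extends XsdElem

  final case class GlobalElementDeclaration(
    name: String,
    children: Vector[XsdElem] = Vector.empty) extends XsdElem

  final case class Annotation(
    name: String,
    children: Vector[XsdElem] = Vector.empty) extends XsdElem

  val schema: XsdElem =
    XsdRootElem("schema", Vector(
      GlobalElementDeclaration("a", Vector(Annotation("doc"))),
      Annotation("top"),
      GlobalElementDeclaration("b")))

  // The static result type is Vector[GlobalElementDeclaration], without any casts.
  val declarations: Vector[GlobalElementDeclaration] =
    schema.findAllElemsOrSelfOfType(classTag[GlobalElementDeclaration])

  val annotations: Vector[Annotation] =
    schema.findAllElemsOrSelfOfType(classTag[Annotation])

  def main(args: Array[String]): Unit = {
    println(declarations.map(_.name))
    println(annotations.map(_.name))
  }
}
```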

  33. trait SubtypeAwareElemLike extends ElemLike with SubtypeAwareElemApi

    Default implementation of SubtypeAwareElemApi.

  34. trait TransformableElemApi extends AnyElemNodeApi

    This is the element transformation part of the yaidom query and update API. Only a few DOM-like element implementations in yaidom mix in this trait (indirectly, because some implementing sub-trait is mixed in), thus sharing this API.

    This trait typically does not show up in application code using yaidom, yet its (uniform) API does. Hence, it makes sense to read the documentation of this trait, knowing that the API is offered by multiple element implementations.

    This trait is purely abstract. The most common implementation of this trait is eu.cdevreeze.yaidom.queryapi.TransformableElemLike. That trait only knows how to transform child elements. Using this minimal knowledge, the trait offers methods to transform descendant elements and descendant-or-self elements. Indeed, the trait is similar to ElemLike, except that it transforms elements instead of querying for elements.

    The big conceptual difference with "updatable" elements (in trait UpdatableElemLike[N, E]) is that "transformations" are about applying some transforming function to an element tree, while "(functional) updates" are about "updates" at given paths.

    TransformableElemApi examples

    To illustrate the use of this API, consider the following example XML:

    <book:Bookstore xmlns:book="http://bookstore/book" xmlns:auth="http://bookstore/author">
      <book:Book ISBN="978-0321356680" Price="35" Edition="2">
        <book:Title>Effective Java (2nd Edition)</book:Title>
        <book:Authors>
          <auth:Author>
            <auth:First_Name>Joshua</auth:First_Name>
            <auth:Last_Name>Bloch</auth:Last_Name>
          </auth:Author>
        </book:Authors>
      </book:Book>
      <book:Book ISBN="978-0981531649" Price="35" Edition="2">
        <book:Title>Programming in Scala: A Comprehensive Step-by-Step Guide, 2nd Edition</book:Title>
        <book:Authors>
          <auth:Author>
            <auth:First_Name>Martin</auth:First_Name>
            <auth:Last_Name>Odersky</auth:Last_Name>
          </auth:Author>
          <auth:Author>
            <auth:First_Name>Lex</auth:First_Name>
            <auth:Last_Name>Spoon</auth:Last_Name>
          </auth:Author>
          <auth:Author>
            <auth:First_Name>Bill</auth:First_Name>
            <auth:Last_Name>Venners</auth:Last_Name>
          </auth:Author>
        </book:Authors>
      </book:Book>
    </book:Bookstore>

    Suppose this XML has been parsed into an eu.cdevreeze.yaidom.simple.Elem variable named bookstoreElem. Then we can combine author first and last names as follows:

    val authorNamespace = "http://bookstore/author"
    
    bookstoreElem = bookstoreElem transformElems {
      case elem: Elem if elem.resolvedName == EName(authorNamespace, "Author") =>
        val firstName = (elem \ (_.localName == "First_Name")).headOption.map(_.text).getOrElse("")
        val lastName = (elem \ (_.localName == "Last_Name")).headOption.map(_.text).getOrElse("")
        val name = (firstName + " " + lastName).trim
        Node.textElem(QName("auth:Author"), elem.scope ++ Scope.from("auth" -> authorNamespace), name)
      case elem: Elem => elem
    }
    bookstoreElem = bookstoreElem.prettify(2)

    When using the TransformableElemApi API, keep the following in mind:

    • The transformElems and transformElemsOrSelf methods (and their Node sequence producing counterparts) may produce a lot of "garbage". If only a small portion of an element tree needs to be updated, the "update" methods in trait UpdatableElemApi may be a better fit.
    • Transformations operate in a bottom-up manner. This implies that parent scopes cannot be used for transforming child elements. Hence, namespace undeclarations may result, which are not allowed in XML 1.0 (except for the default namespace).

    Top-down transformations are still possible, by combining recursion with method transformChildElems (or transformChildElemsToNodeSeq). For example:

    def removePrefixedNamespaceUndeclarations(elem: Elem): Elem = {
      elem transformChildElems { e =>
        val newE = e.copy(scope = elem.scope.withoutDefaultNamespace ++ e.scope)
        removePrefixedNamespaceUndeclarations(newE)
      }
    }

  35. trait TransformableElemLike extends TransformableElemApi

    API and implementation trait for transformable elements.

    More precisely, this trait has abstract methods transformChildElems and transformChildElemsToNodeSeq. Based on these abstract methods, this trait offers a rich API for transforming descendant elements or descendant-or-self elements.

    The purely abstract API offered by this trait is eu.cdevreeze.yaidom.queryapi.TransformableElemApi. See the documentation of that trait for examples of usage.

  36. trait UpdatableElemApi extends AnyElemNodeApi with IsNavigableApi

    This is the functional update part of the yaidom uniform query API. It is a sub-trait of trait eu.cdevreeze.yaidom.queryapi.IsNavigableApi. Only a few DOM-like element implementations in yaidom mix in this trait (indirectly, because some implementing sub-trait is mixed in), thus sharing this query API.

    This trait typically does not show up in application code using yaidom, yet its (uniform) API does. Hence, it makes sense to read the documentation of this trait, knowing that the API is offered by multiple element implementations.

    This trait is purely abstract. The most common implementation of this trait is eu.cdevreeze.yaidom.queryapi.UpdatableElemLike. The trait has all the knowledge of its super-trait, but in addition to that knows the following:

    • An element has child nodes, which may or may not be elements. Hence the extra type parameter for nodes.
    • An element knows the child node indexes of the path entries of the child elements.

    Obviously methods children, withChildren and collectChildNodeIndexes must be consistent with methods such as findAllChildElems.

    Using this minimal knowledge alone, trait UpdatableElemLike not only offers the methods of its parent trait, but also:

    • methods to functionally update an element by replacing, adding or deleting child nodes
    • methods to functionally update an element by replacing descendant-or-self elements at specified paths

    For the conceptual difference with "transformable" elements, see trait eu.cdevreeze.yaidom.queryapi.TransformableElemApi.

    This query API leverages the Scala Collections API. Query results can be manipulated using the Collections API, and the query API implementation (in UpdatableElemLike) uses the Collections API internally.

    UpdatableElemApi examples

    To illustrate the use of this API, consider the following example XML:

    <book:Bookstore xmlns:book="http://bookstore/book" xmlns:auth="http://bookstore/author">
      <book:Book ISBN="978-0321356680" Price="35" Edition="2">
        <book:Title>Effective Java (2nd Edition)</book:Title>
        <book:Authors>
          <auth:Author>
            <auth:First_Name>Joshua</auth:First_Name>
            <auth:Last_Name>Bloch</auth:Last_Name>
          </auth:Author>
        </book:Authors>
      </book:Book>
      <book:Book ISBN="978-0981531649" Price="35" Edition="2">
        <book:Title>Programming in Scala: A Comprehensive Step-by-Step Guide, 2nd Edition</book:Title>
        <book:Authors>
          <auth:Author>
            <auth:First_Name>Martin</auth:First_Name>
            <auth:Last_Name>Odersky</auth:Last_Name>
          </auth:Author>
          <auth:Author>
            <auth:First_Name>Lex</auth:First_Name>
            <auth:Last_Name>Spoon</auth:Last_Name>
          </auth:Author>
          <auth:Author>
            <auth:First_Name>Bill</auth:First_Name>
            <auth:Last_Name>Venners</auth:Last_Name>
          </auth:Author>
        </book:Authors>
      </book:Book>
    </book:Bookstore>

    Suppose this XML has been parsed into an eu.cdevreeze.yaidom.simple.Elem variable named bookstoreElem. Then we can add a book as follows, where we "forget" the 2nd author for the moment:

    import convert.ScalaXmlConversions._
    
    val bookstoreNamespace = "http://bookstore/book"
    val authorNamespace = "http://bookstore/author"
    
    val fpBookXml =
      <book:Book xmlns:book="http://bookstore/book" xmlns:auth="http://bookstore/author" ISBN="978-1617290657" Price="33">
        <book:Title>Functional Programming in Scala</book:Title>
        <book:Authors>
          <auth:Author>
            <auth:First_Name>Paul</auth:First_Name>
            <auth:Last_Name>Chiusano</auth:Last_Name>
          </auth:Author>
        </book:Authors>
      </book:Book>
    val fpBookElem = convertToElem(fpBookXml)
    
    bookstoreElem = bookstoreElem.plusChild(fpBookElem)

    Note that the namespace declarations for prefixes book and auth had to be repeated in the Scala XML literal for the added book, because otherwise the convertToElem method would throw an exception (since Elem instances cannot be created unless all element and attribute QNames can be resolved as ENames).

    The resulting bookstore seems ok, but if we print convertElem(bookstoreElem), the result does not look pretty. This can be fixed if the last assignment is replaced by:

    bookstoreElem = bookstoreElem.plusChild(fpBookElem).prettify(2)

    knowing that an indentation of 2 spaces has been used throughout the original XML. Method prettify is expensive, so it is best not to invoke it within a tight loop. As an alternative, formatting can be left to the DocumentPrinter, of course.

    The assignment above is the same as the following one:

    bookstoreElem = bookstoreElem.withChildren(bookstoreElem.children :+ fpBookElem).prettify(2)

    There are several methods to functionally update the children of an element. For example, method plusChild is overloaded, and the other variant can insert a child at a given 0-based position. Other "children update" methods are minusChild, withPatchedChildren and withUpdatedChildren.
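
    The idea behind these "children update" methods can be illustrated with a toy model that does not depend on yaidom. This is only a sketch: the Node class below is hypothetical, and it merely mimics the method names mentioned above. Every update returns a new element and leaves the original untouched.

```scala
// Toy model, NOT yaidom: an immutable element with a name and child elements.
object ChildUpdateSketch {
  final case class Node(name: String, children: Vector[Node] = Vector.empty) {
    // Replaces all children, returning a new element.
    def withChildren(newChildren: Vector[Node]): Node = copy(children = newChildren)

    // Appends a child at the end.
    def plusChild(child: Node): Node = withChildren(children :+ child)

    // Inserts a child at the given 0-based position.
    def plusChild(index: Int, child: Node): Node = {
      require(index >= 0 && index <= children.size, s"index $index out of range")
      withChildren(children.patch(index, Vector(child), 0))
    }

    // Removes the child at the given 0-based position.
    def minusChild(index: Int): Node = {
      require(index >= 0 && index < children.size, s"index $index out of range")
      withChildren(children.patch(index, Vector.empty, 1))
    }
  }
}
```

    In this sketch all variants are expressed in terms of withChildren, which mirrors the equivalence between plusChild and withChildren shown above.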

    Let's now turn to functional update methods that take Path instances or collections thereof. In the example above the second author of the added book is missing. Let's fix that:

    val secondAuthorXml =
      <auth:Author xmlns:auth="http://bookstore/author">
        <auth:First_Name>Runar</auth:First_Name>
        <auth:Last_Name>Bjarnason</auth:Last_Name>
      </auth:Author>
    val secondAuthorElem = convertToElem(secondAuthorXml)
    
    val fpBookAuthorsPaths =
      for {
        authorsPath <- indexed.Elem(bookstoreElem) filterElems { e => e.resolvedName == EName(bookstoreNamespace, "Authors") } map (_.path)
        if authorsPath.findAncestorPath(path => path.endsWithName(EName(bookstoreNamespace, "Book")) &&
          bookstoreElem.getElemOrSelfByPath(path).attribute(EName("ISBN")) == "978-1617290657").isDefined
      } yield authorsPath
    
    require(fpBookAuthorsPaths.size == 1)
    val fpBookAuthorsPath = fpBookAuthorsPaths.head
    
    bookstoreElem = bookstoreElem.updateElemOrSelf(fpBookAuthorsPath) { elem =>
      require(elem.resolvedName == EName(bookstoreNamespace, "Authors"))
      val rawResult = elem.plusChild(secondAuthorElem)
      rawResult transformElemsOrSelf (e => e.copy(scope = elem.scope.withoutDefaultNamespace ++ e.scope))
    }
    bookstoreElem = bookstoreElem.prettify(2)

    The resulting bookstore element is nicely formatted, but another potential issue was also taken into account: see the line of code that transforms the "raw result". That line was added to prevent namespace undeclarations, which XML 1.0 does not allow (except for the default namespace). After all, the XML for the second author was created with only the auth namespace declared. Without that line of code, a namespace undeclaration for prefix book would have occurred in the resulting XML, leading to an invalid XML 1.0 element tree.
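
    The scope arithmetic behind that fix can be sketched without yaidom. In the toy model below (all names are hypothetical), a scope is simply a map from prefix to namespace URI, with the empty string standing for the default namespace. A prefix that is in scope for the parent but missing from the child would have to be undeclared during serialization; merging the parent's prefixed declarations into the child avoids that.

```scala
// Toy model, NOT yaidom: scopes as plain maps from prefix to namespace URI.
object ScopeSketch {
  type Scope = Map[String, String]

  final case class Node(name: String, scope: Scope, children: Vector[Node] = Vector.empty)

  // Prefixes in scope for the parent but absent in the child. Serializing the child
  // as-is would require undeclaring them. The default namespace ("") is ignored,
  // because XML 1.0 does allow undeclaring it.
  def undeclaredPrefixes(parent: Scope, child: Scope): Set[String] =
    (parent.keySet -- child.keySet) - ""

  // Pushes the parent's prefixed declarations down the tree. An element's own
  // declarations win over inherited ones, since they come last in the merge.
  def inheritScope(parentScope: Scope, elem: Node): Node = {
    val merged = (parentScope - "") ++ elem.scope
    elem.copy(scope = merged, children = elem.children.map(c => inheritScope(merged, c)))
  }
}
```

    For example, with a parent scope declaring both book and auth, an author element declaring only auth reports book as an undeclared prefix before the merge, and no undeclared prefixes afterwards.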

    To illustrate functional update methods taking collections of paths, let's remove the added book from the book store. Here is one (somewhat inefficient) way to do that:

    val bookPaths = indexed.Elem(bookstoreElem) filterElems (_.resolvedName == EName(bookstoreNamespace, "Book")) map (_.path)
    
    bookstoreElem = bookstoreElem.updateElemsWithNodeSeq(bookPaths.toSet) { (elem, path) =>
      if ((elem \@ EName("ISBN")).contains("978-1617290657")) Vector() else Vector(elem)
    }
    bookstoreElem = bookstoreElem.prettify(2)

    There are many ways to write this functional update, using different functional update methods in trait UpdatableElemApi, or even using only the transformation methods in trait TransformableElemApi (thus not using paths).

    The example code above is enough to get started using the UpdatableElemApi methods, but it makes sense to study the entire API, and practice with it. Always keep in mind that functional updates typically mess up formatting and/or namespace (un)declarations, unless these aspects are taken into account.
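
    To make the notion of a path-based functional update concrete, here is a minimal sketch that is independent of yaidom. It assumes a simplified path made of 0-based child indexes only (yaidom's Path entries combine an element name with an index); the names Node and updateAt are hypothetical.

```scala
// Toy model, NOT yaidom: functional update of the element at a given path.
object PathUpdateSketch {
  final case class Node(name: String, children: Vector[Node] = Vector.empty)

  // Applies f to the element found by following the child indexes in `path`,
  // rebuilding only the elements along that path. Throws if an index is out of range.
  def updateAt(root: Node, path: List[Int])(f: Node => Node): Node = path match {
    case Nil => f(root)
    case idx :: rest =>
      val updatedChild = updateAt(root.children(idx), rest)(f)
      root.copy(children = root.children.updated(idx, updatedChild))
  }
}
```

    Only the ancestors of the updated element are copied; all other subtrees are shared between the old and the new tree.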

  37. trait UpdatableElemLike extends IsNavigable with UpdatableElemApi

    API and implementation trait for functionally updatable elements. This trait extends trait eu.cdevreeze.yaidom.queryapi.IsNavigable, adding knowledge about child nodes in general, and about the correspondence between child path entries and child indexes.

    More precisely, this trait adds the following abstract methods to the abstract methods required by its super-trait: children, withChildren and collectChildNodeIndexes. Based on these abstract methods (and the super-trait), this trait offers a rich API for functionally updating elements.

    The purely abstract API offered by this trait is eu.cdevreeze.yaidom.queryapi.UpdatableElemApi. See the documentation of that trait for examples of usage, and for a more formal treatment.

Value Members

  1. object BackingDocumentApi
  2. object BackingElemApi
  3. object BackingNodes

    Core API for element nodes that offer the central BackingElemApi with HasChildNodesApi query API. Each element implementation that knows about expanded names as well as qualified names, and that also knows about ancestor elements, should directly or indirectly implement this API.

    This API is directly implemented by elements that are used as backing elements in "yaidom dialects". The yaidom dialects use this abstract backing element API, thus allowing for multiple backing element implementations behind a yaidom XML dialect.

    Efficient implementations are possible for indexed elements and Saxon NodeInfo objects (backed by Saxon native tiny trees). Saxon-backed elements are not offered by core yaidom, however. Saxon tiny trees are attractive for their low memory footprint.

  4. object ClarkElemApi
  5. object ClarkElemLike
  6. object ClarkNodes

    Core API for element nodes that offer the central ClarkElemApi with HasChildNodesApi query API. Each element implementation should directly or indirectly implement this API.

    This API is directly implemented by elements that know about expanded names but not about qualified names.

  7. object DocumentApi
  8. object ElemApi
  9. object ElemCreationApi
  10. object ElemLike
  11. object ElemTransformationApi
  12. object ElemTransformationLike
  13. object ElemUpdateApi
  14. object ElemUpdateLike
  15. object ElemWithPath
  16. object HasChildNodesApi
  17. object HasENameApi

    This companion object offers some convenience factory methods for "element predicates", which can be used in yaidom queries. These factory methods turn ENames and local names into "element predicates".

    For example:

    elem \\ (_.ename == EName(xsNamespace, "element"))

    can also be written as:

    elem \\ withEName(xsNamespace, "element")

    (thus avoiding EName instance construction, whether or not this makes any difference in practice).

    If the namespace is "obvious", and more friendly local-name-based querying is desired, the following could be written:

    elem \\ withLocalName("element")
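
    Conceptually, such an "element predicate" is just a function from an element to a Boolean. The following sketch shows how factory methods of this kind could be written against a toy element model. It is not yaidom's implementation; the EName and Node classes here are simplified stand-ins introduced for illustration.

```scala
// Toy model, NOT yaidom: element predicates as plain functions.
object PredicateSketch {
  final case class EName(namespaceUriOption: Option[String], localPart: String)

  final case class Node(resolvedName: EName, children: Vector[Node] = Vector.empty) {
    // Descendant-or-self elements matching the predicate, in document order.
    def filterElemsOrSelf(p: Node => Boolean): Vector[Node] =
      (if (p(this)) Vector(this) else Vector.empty) ++ children.flatMap(_.filterElemsOrSelf(p))
  }

  // Predicate matching on namespace URI and local part, without building an EName per call.
  def withEName(namespaceUri: String, localPart: String): Node => Boolean =
    e => e.resolvedName.namespaceUriOption.contains(namespaceUri) && e.resolvedName.localPart == localPart

  // Predicate matching on the local part only, regardless of namespace.
  def withLocalName(localPart: String): Node => Boolean =
    e => e.resolvedName.localPart == localPart
}
```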

  18. object HasParent
  19. object HasParentApi
  20. object IndexedClarkElemApi
  21. object IndexedScopedElemApi
  22. object IsNavigable
  23. object IsNavigableApi
  24. object Nodes

    Abstract node (marker) trait hierarchy. It offers a common minimal API for different kinds of nodes. It also shows what yaidom typically considers to be nodes, and what it does not consider to be nodes. For example, documents are not nodes in yaidom, which prevents documents from being created as element children. Moreover, attributes are typically not nodes in yaidom, although custom element implementations may think otherwise.

    The downside is that we have to consider mixing in (some or all of) these traits wherever we create a node/element implementation.

  25. object ScopedElemApi
  26. object ScopedElemLike
  27. object ScopedNodes

    Core API for element nodes that offer the central ScopedElemApi with HasChildNodesApi query API. Each element implementation that knows about expanded names as well as qualified names should directly or indirectly implement this API.

    This API is directly implemented by elements that know about both expanded names and qualified names, but that do not know about their ancestor elements.

  28. object SubtypeAwareElemApi
  29. object SubtypeAwareElemLike
  30. object TransformableElemApi
  31. object TransformableElemLike
  32. object UpdatableElemApi
  33. object UpdatableElemLike
  34. object XmlBaseSupport

    XML Base support, for elements implementing the ClarkElemApi query API.

    XML Base is very simple in its algorithm, given an optional start "document URI". Base URI computation for an element then starts with the optional document URI, and processes all XML Base attributes in the reverse ancestry-or-self of the element, resolving each XML Base attribute against the base URI computed so far. According to the XML Base specification, same-document references do not alter this algorithm.

    What is sensitive in XML Base processing is the resolution of a URI against an optional base URI. For example, resolving an empty URI using the java.net.URI.resolve method does not conform to RFC 3986 (see e.g. http://stackoverflow.com/questions/22203111/is-javas-uri-resolve-incompatible-with-rfc-3986-when-the-relative-uri-contains).

    This is why the user of this XML Base support must supply a strategy for resolving URIs against optional base URIs.

    Default attributes and entity resolution are out of scope for this XML Base support.
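
    The algorithm described above can be sketched in a few lines of plain Scala. This is not the API of this object: the names Resolver, simpleResolver and baseUriOption are hypothetical, and the XML Base attributes are passed in as a plain sequence (ordered from the root element down to the element itself) rather than being read from elements. The resolver shown treats an empty URI as "keep the base URI", which is an assumption of this sketch.

```scala
import java.net.URI

// Sketch only, NOT yaidom's XmlBaseSupport: base URI computation with a pluggable resolver.
object XmlBaseSketch {
  // Strategy for resolving a URI against an optional base URI.
  type Resolver = (URI, Option[URI]) => URI

  // One possible strategy: delegate to java.net.URI.resolve, except for the empty URI,
  // which is taken to mean "keep the base URI" (an assumption of this sketch).
  val simpleResolver: Resolver = { (uri, baseOption) =>
    baseOption match {
      case None => uri
      case Some(base) if uri.toString.isEmpty => base
      case Some(base) => base.resolve(uri)
    }
  }

  // Starts with the optional document URI and resolves each xml:base attribute
  // (if present) against the base URI computed so far, from the root element down.
  def baseUriOption(
      docUriOption: Option[URI],
      xmlBaseAttrs: Seq[Option[URI]],
      resolve: Resolver): Option[URI] =
    xmlBaseAttrs.foldLeft(docUriOption) {
      case (baseOption, Some(xmlBase)) => Some(resolve(xmlBase, baseOption))
      case (baseOption, None) => baseOption
    }
}
```

    Passing the resolver as a parameter keeps the traversal logic independent of the URI resolution rules, which is the reason a strategy has to be supplied.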
