Class RrbTree<E>

  • All Implemented Interfaces:
    Iterable<E>, Collection<E>, List<E>, BaseList<E>, Sized, UnmodCollection<E>, UnmodIterable<E>, UnmodList<E>, UnmodSortedCollection<E>, UnmodSortedIterable<E>, Indented, Transformable<E>
    Direct Known Subclasses:
    RrbTree.ImRrbt, RrbTree.MutRrbt

    public abstract class RrbTree<E>
    extends Object
    implements BaseList<E>, Indented

    An RRB Tree is an immutable List (like Clojure's PersistentVector) that also supports random inserts and deletes, and can be split and joined back together, all in logarithmic time. It is based on the paper "RRB-Trees: Efficient Immutable Vectors" by Phil Bagwell and Tiark Rompf, with the following differences:

    • The Relaxed nodes can be sized between n/3 and 2n/3 (Bagwell/Rompf specify n and n-1).
    • The Join operation sticks the shorter tree unaltered into the larger tree (except for very small trees which just get concatenated).

    Details were filled in from the Cormen, Leiserson, Rivest & Stein Algorithms book entry on B-Trees, with an awareness of the Clojure PersistentVector by Rich Hickey. All errors are by Glen Peterson.
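
Because relaxed nodes can hold children of differing sizes, locating the child that contains a given index cannot rely on fixed-width arithmetic alone. RRB-Trees typically keep a table of cumulative sizes in each relaxed node for this purpose. The following is a minimal sketch of that lookup, not the code used by this class; the class name RelaxedNodeSketch and its methods are hypothetical.

```java
/**
 * Hypothetical sketch of index lookup inside one relaxed node.
 * This is not the library's internal Node class; all names are illustrative.
 */
final class RelaxedNodeSketch {
    // cumulativeSizes[i] holds the total item count of children 0..i.
    private final int[] cumulativeSizes;

    RelaxedNodeSketch(int... childSizes) {
        cumulativeSizes = new int[childSizes.length];
        int total = 0;
        for (int i = 0; i < childSizes.length; i++) {
            total += childSizes[i];
            cumulativeSizes[i] = total;
        }
    }

    /** Total number of items under this node. */
    int size() {
        int n = cumulativeSizes.length;
        return n == 0 ? 0 : cumulativeSizes[n - 1];
    }

    /** Returns which child holds the item at the given index. */
    int childFor(int index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        // Binary search for the first cumulative size greater than index.
        int lo = 0;
        int hi = cumulativeSizes.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cumulativeSizes[mid] > index) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo;
    }

    /** Translates an index in this node into an index within the chosen child. */
    int indexWithinChild(int index) {
        int child = childFor(index);
        return child == 0 ? index : index - cumulativeSizes[child - 1];
    }

    public static void main(String[] args) {
        // Three children holding 3, 5, and 4 items respectively.
        RelaxedNodeSketch node = new RelaxedNodeSketch(3, 5, 4);
        System.out.println(node.childFor(7) + " " + node.indexWithinChild(7));
    }
}
```

The binary search keeps each per-node step logarithmic in the node width, which is one reason lookups in a tree built from random inserts can be slower than in a strictly radix-indexed vector.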

    History (what little I know):

    1972: B-Tree: Rudolf Bayer and Ed McCreight
    1998: Purely Functional Data Structures: Chris Okasaki
    2007: Clojure's Persistent Vector (and HashMap) implementations: Rich Hickey
    2012: RRB-Tree: Phil Bagwell and Tiark Rompf

    Compared to other collections (timings summary from 2017-06-11):

    • append() - RrbTree.ImRrbt varies between 90% and 100% of the speed of PersistentVector (biggest difference above 100K). RrbTree.MutRrbt varies between 45% and 80% of the speed of PersistentVector.MutVector (biggest difference from 100 to 1M).
    • get() - varies between 50% and 150% of the speed of PersistentVector (PV wins above 1K) if you build the RRB tree using append(). If you build it using random inserts (worst case), it goes from 90% of the speed of the PV at 10 items down to 15% at 1M items.
    • iterate() - is about the same speed as PersistentVector
    • insert(0, item) - beats ArrayList above 1K items (worse than ArrayList below 100 items).
    • insert(random, item) - beats ArrayList above 10K items (worse than ArrayList until then).
    • O(log n) split(), join(), and remove() (not timed yet).

    Latest detailed timing results are here.

    • Constructor Detail

      • RrbTree

        public RrbTree()
    • Method Detail

      • empty

        @NotNull
        public static <T> @NotNull RrbTree.ImRrbt<T> empty()
        Returns the empty, immutable RRB-Tree (there is only one).
      • emptyMutable

        @NotNull
        public static <T> @NotNull RrbTree.MutRrbt<T> emptyMutable()
        Returns the empty, mutable RRB-Tree (there is only one).
      • append

        @NotNull
        public abstract @NotNull RrbTree<E> append​(E t)
        Returns a new BaseList with the additional item at the end.
        Specified by:
        append in interface BaseList<E>
        Parameters:
        t - the value to append
      • appendSome

        @NotNull
        public abstract @NotNull RrbTree<E> appendSome​(@NotNull
                                                       @NotNull Fn0<? extends @NotNull Option<E>> supplier)
        If supplier returns Some, return a new BaseList with the additional item at the end. If None, just return this BaseList unmodified.
        Specified by:
        appendSome in interface BaseList<E>
        Parameters:
        supplier - return Option.Some to append, None for a no-op.
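
The Some/None contract described above can be illustrated with standard-library stand-ins. This is a sketch only: java.util.Optional stands in for Option, java.util.function.Supplier stands in for Fn0, and the helper class AppendSomeSketch is hypothetical rather than part of this API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical stand-in that mimics the appendSome contract on plain lists. */
final class AppendSomeSketch {
    static <E> List<E> appendSome(List<E> list, Supplier<Optional<E>> supplier) {
        Optional<E> maybe = supplier.get();
        if (!maybe.isPresent()) {
            // "None": hand back the very same list, unmodified.
            return list;
        }
        // "Some": build a new list with the item at the end (an O(n) copy here).
        List<E> copy = new ArrayList<>(list.size() + 1);
        copy.addAll(list);
        copy.add(maybe.get());
        return Collections.unmodifiableList(copy);
    }

    public static void main(String[] args) {
        List<Integer> base = Arrays.asList(1, 2);
        System.out.println(appendSome(base, () -> Optional.of(3)));
        System.out.println(appendSome(base, Optional::empty) == base);
    }
}
```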
      • get

        public abstract E get​(int i)
        Specified by:
        get in interface List<E>
      • insert

        @NotNull
        public abstract @NotNull RrbTree<E> insert​(int idx,
                                                   E element)
        Inserts an item in the RRB tree pushing the current element at that index and all subsequent elements to the right.
        Parameters:
        idx - the insertion point
        element - the item to insert
        Returns:
        a new RRB-Tree with the item inserted.
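
To make the shifting behavior concrete, here is a minimal sketch of the same contract on a plain java.util.List. The class InsertSketch is hypothetical, its range check is a choice made for the sketch, and it copies the whole list in O(n), whereas the RRB tree is described above as supporting inserts in logarithmic time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in showing the result of insert(idx, element). */
final class InsertSketch {
    /** Returns a new list with element at idx; the input list is left untouched. */
    static <E> List<E> insert(List<E> list, int idx, E element) {
        if (idx < 0 || idx > list.size()) {
            throw new IndexOutOfBoundsException("Index: " + idx);
        }
        List<E> copy = new ArrayList<>(list.size() + 1);
        copy.addAll(list.subList(0, idx));           // items before the insertion point
        copy.add(element);                           // the new item
        copy.addAll(list.subList(idx, list.size())); // old item at idx and everything after
        return Collections.unmodifiableList(copy);
    }

    public static void main(String[] args) {
        List<String> base = Arrays.asList("a", "b", "c");
        System.out.println(insert(base, 1, "x"));
        System.out.println(base);
    }
}
```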
      • join

        @NotNull
        public @NotNull RrbTree<E> join​(@NotNull
                                        @NotNull RrbTree<E> that)
        Joins the given tree to the right side of this tree (or this to the left side of that one) in approximately O(log n) time.
      • makeNew

        @NotNull
        protected abstract @NotNull RrbTree<E> makeNew​(E @NotNull [] f,
                                                       int fi,
                                                       int fl,
                                                       @NotNull
                                                       @NotNull org.organicdesign.fp.collections.RrbTree.Node<E> r,
                                                       int s)
        Allows removing duplicated code by letting super-class produce new members of subclass types.
      • mt

        @NotNull
        protected abstract @NotNull RrbTree<E> mt()
        Creates a new empty ("M-T") tree of the appropriate (mutable/immutable) type.
      • precat

        @NotNull
        public abstract @NotNull RrbTree<E> precat​(@Nullable
                                                   @Nullable Iterable<? extends E> es)
        Add items to the beginning of this Transformable ("precat" is a PREpending version of conCAT). Precat is implemented here because it is a very cheap operation on an RRB-Tree. It's not implemented on PersistentVector because it is very expensive there.
        Specified by:
        precat in interface Transformable<E>
        Specified by:
        precat in interface UnmodIterable<E>
        Parameters:
        es - the items to add
        Returns:
        a new Transformable with the items added.
      • replace

        @NotNull
        public abstract @NotNull RrbTree<E> replace​(int index,
                                                    E item)
        Replace the item at the given index. Note: i.replace(i.size(), o) used to be equivalent to i.concat(o), but it probably won't be for the RRB tree implementation, so this will change too.
        Specified by:
        replace in interface BaseList<E>
        Parameters:
        index - the index where the value should be stored.
        item - the value to store
        Returns:
        a new ImList with the replaced item
      • size

        public abstract int size()
        Returns the number of items in this collection or iterable.
        Specified by:
        size in interface Collection<E>
        Specified by:
        size in interface List<E>
        Specified by:
        size in interface Sized
      • split

        @NotNull
        public @NotNull Tuple2<? extends RrbTree<E>, ? extends RrbTree<E>> split(int splitIndex)
        Divides this RRB-Tree such that every index less than the given index ends up in the left-hand tree, and the indexed item and all subsequent ones end up in the right-hand tree.
        Parameters:
        splitIndex - the split point (excluded from the left tree, included in the right one)
        Returns:
        two new sub-trees as determined by the split point. If the point is 0 or this.size() one tree will be empty (but never null).
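
The split point semantics, and the way join reverses a split, can be sketched on plain lists. The class SplitJoinSketch is hypothetical and uses Map.Entry in place of Tuple2; it copies elements in O(n) and only illustrates the resulting contents, not how the tree performs these operations.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in showing the contents produced by split and join. */
final class SplitJoinSketch {
    /** Left gets indices below splitIndex; right gets splitIndex and above. */
    static <E> Map.Entry<List<E>, List<E>> split(List<E> list, int splitIndex) {
        // subList throws IndexOutOfBoundsException for an out-of-range split point;
        // that behavior is a property of this sketch only.
        List<E> left = new ArrayList<>(list.subList(0, splitIndex));
        List<E> right = new ArrayList<>(list.subList(splitIndex, list.size()));
        return new SimpleImmutableEntry<>(left, right);
    }

    /** Puts the right list after the left list in a new list. */
    static <E> List<E> join(List<E> left, List<E> right) {
        List<E> joined = new ArrayList<>(left.size() + right.size());
        joined.addAll(left);
        joined.addAll(right);
        return joined;
    }

    public static void main(String[] args) {
        List<Integer> base = Arrays.asList(1, 2, 3, 4, 5);
        Map.Entry<List<Integer>, List<Integer>> parts = split(base, 2);
        System.out.println(parts.getKey() + " " + parts.getValue());
        System.out.println(join(parts.getKey(), parts.getValue()).equals(base));
    }
}
```

Splitting at 0 or at the list size leaves one side empty but never null, matching the return description above.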
      • without

        @NotNull
        public @NotNull RrbTree<E> without​(int index)
        Returns a new RrbTree minus the item at the given index (all items to the right are shifted left one). This is O(log n).
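
In terms of resulting contents, removing an index is the same as keeping everything before it and everything after it. The sketch below shows that on a plain list; the class WithoutSketch is hypothetical, its range check is a choice made for the sketch, and it runs in O(n) rather than the O(log n) stated above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in showing the result of without(index). */
final class WithoutSketch {
    /** Returns a new list lacking the item at index; the input is left untouched. */
    static <E> List<E> without(List<E> list, int index) {
        if (index < 0 || index >= list.size()) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        List<E> result = new ArrayList<>(list.size() - 1);
        result.addAll(list.subList(0, index));               // items left of the removed one
        result.addAll(list.subList(index + 1, list.size())); // items right of it, shifted left
        return Collections.unmodifiableList(result);
    }

    public static void main(String[] args) {
        List<String> base = Arrays.asList("a", "b", "c", "d");
        System.out.println(without(base, 1));
        System.out.println(base);
    }
}
```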
      • hashCode

        public int hashCode()
        This implementation is correct and compatible with java.util.AbstractList, but O(n).
        Specified by:
        hashCode in interface Collection<E>
        Specified by:
        hashCode in interface List<E>
        Overrides:
        hashCode in class Object
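
Compatibility with java.util.AbstractList means following the hash formula in the java.util.List contract, which must visit every element and is therefore O(n). A minimal sketch of that formula follows; the class ListHashSketch is hypothetical and is not this class's actual implementation.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that computes a hash per the java.util.List contract. */
final class ListHashSketch {
    static int listHash(Iterable<?> items) {
        int hash = 1;
        for (Object item : items) {
            // Same recurrence as List.hashCode(): null elements contribute 0.
            hash = 31 * hash + (item == null ? 0 : item.hashCode());
        }
        return hash;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("a", "b", null);
        // Compares the sketch against the JDK's own list hash.
        System.out.println(listHash(sample) == sample.hashCode());
    }
}
```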
      • indentedStr

        @NotNull
        public abstract @NotNull String indentedStr​(int indent)
        Returns a string where line breaks extend the given amount of indentation.
        Specified by:
        indentedStr in interface Indented
        Parameters:
        indent - the amount of indent to start at. Pretty-printed subsequent lines may have additional indent.
        Returns:
        a string with the given starting offset (in spaces) for every line.