package algebird
Package Members
Type Members
- abstract class AbstractApplicative[M[_]] extends Applicative[M]
For use from Java/minimizing code bloat in scala
- trait AbstractEventuallyAggregator[A, E, O, C] extends Aggregator[A, Either[E, O], C]
- abstract class AbstractFunctor[M[_]] extends Functor[M]
For use from Java/minimizing code bloat in scala
- abstract class AbstractGroup[T] extends Group[T]
- abstract class AbstractMonad[M[_]] extends Monad[M]
For use from Java/minimizing code bloat in scala
- abstract class AbstractMonoid[T] extends Monoid[T]
- abstract class AbstractRing[T] extends Ring[T]
- abstract class AbstractSemigroup[T] extends Semigroup[T]
- class AdaptiveCache[K, V] extends StatefulSummer[Map[K, V]]
This is a wrapper around SummingCache that attempts to grow the capacity by up to some maximum, as long as there's enough RAM. It determines that there's enough RAM to grow by maintaining a SentinelCache which keeps caching and summing the evicted values. Once the SentinelCache has grown to the same size as the current cache, plus some margin, without running out of RAM, then this indicates that we have enough headroom to double the capacity.
- sealed trait AdaptiveVector[V] extends IndexedSeq[V]
An IndexedSeq that automatically switches representation between dense and sparse depending on sparsity. It should be an efficient representation for all sizes, so it should not be necessary to special-case immutable algebras based on the sparsity of the vectors.
- case class AdjoinedUnit[T](ones: BigInt, get: T) extends Product with Serializable
This is for the case where your Ring[T] is a Rng (i.e. there is no unit).
- See also
http://en.wikipedia.org/wiki/Pseudo-ring#Adjoining_an_identity_element
- class AdjoinedUnitRing[T] extends Ring[AdjoinedUnit[T]]
- case class AffineFunction[R](slope: R, intercept: R) extends Serializable with Product
Represents functions of the kind: f(x) = slope * x + intercept
- class AffineFunctionMonoid[R] extends Monoid[AffineFunction[R]]
This feeds the value in on the LEFT! This may seem counterintuitive, but with this approach, summing a stream/iterator of functions gives the same output as applying the functions one at a time, in order, to the input. If we did the "lexicographically correct" thing, which might be (f+g)(x) = f(g(x)), then we would wind up reversing the list in the sum. Instead, (f1 + f2)(x) = f2(f1(x)), so that: listOfFn.foldLeft(x) { (v, fn) => fn(v) } == (Monoid.sum(listOfFn))(x)
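The left-to-right composition rule above can be checked with a small plain-Scala sketch. Affine and AffineOps are illustrative names introduced here, not Algebird's classes, and Double stands in for the generic R.

```scala
// Plain-Scala sketch; Affine and AffineOps are hypothetical names, not Algebird's API.
final case class Affine(slope: Double, intercept: Double) {
  def apply(x: Double): Double = slope * x + intercept
}

object AffineOps {
  // Identity function f(x) = 1 * x + 0, used as the zero of the sum.
  val zero: Affine = Affine(1.0, 0.0)

  // (f1 + f2)(x) = f2(f1(x)): the left operand is applied first.
  def plus(f1: Affine, f2: Affine): Affine =
    Affine(f1.slope * f2.slope, f2.slope * f1.intercept + f2.intercept)

  // Sums a sequence of functions from left to right.
  def sum(fs: Seq[Affine]): Affine = fs.foldLeft(zero)(plus)
}
```

With this ordering, AffineOps.sum(fns)(x) agrees with fns.foldLeft(x) { (v, fn) => fn(v) }.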
- trait Aggregator[-A, B, +C] extends Serializable
This is a type that models map/reduce(map). First each item is mapped, then we reduce with a semigroup, then finally we present the results.
Unlike Fold, Aggregator keeps its middle aggregation type externally visible. This is because Aggregators are useful in parallel map/reduce systems where there may be some additional types needed to cross the map/reduce boundary (such as serialization and intermediate storage). If you don't care about the middle type, an _ may be used and the main utility of the instance is still preserved (e.g. def operate[T](ag: Aggregator[T, _, Int]): Int).
Note, join is very useful to combine multiple aggregations with one pass. Also GeneratedTupleAggregator.fromN((agg1, agg2, ... aggN)) can glue these together well.
This type is the Fold.M from Haskell's folds package: https://hackage.haskell.org/package/folds-0.6.2/docs/Data-Fold-M.html
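The map, reduce, present pipeline described for Aggregator can be sketched in plain Scala as follows. SimpleAggregator and MeanLength are hypothetical names used only for illustration. The real trait has more methods and a contravariant/covariant signature.

```scala
// Illustrative trait; SimpleAggregator is a hypothetical name, not Algebird's Aggregator.
trait SimpleAggregator[A, B, C] {
  def prepare(a: A): B      // map each input into the middle type
  def reduce(l: B, r: B): B // associative combine (the semigroup step)
  def present(b: B): C      // turn the reduced middle value into the result

  // Runs all three phases over a non-empty input.
  def apply(inputs: Iterable[A]): C =
    present(inputs.iterator.map(a => prepare(a)).reduce((l, r) => reduce(l, r)))
}

// Mean string length; the middle type (count, total length) is what would cross
// a map/reduce boundary in a distributed setting.
object MeanLength extends SimpleAggregator[String, (Int, Long), Double] {
  def prepare(s: String): (Int, Long) = (1, s.length.toLong)
  def reduce(l: (Int, Long), r: (Int, Long)): (Int, Long) = (l._1 + r._1, l._2 + r._2)
  def present(b: (Int, Long)): Double = b._2.toDouble / b._1
}
```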
- class AggregatorApplicative[I] extends Applicative[[O]Aggregator[I, _, O]]
Aggregators are Applicatives, but this hides the middle type. If you need a join that does not hide the middle type use join on the trait, or GeneratedTupleAggregator.fromN
- final case class AndVal(get: Boolean) extends AnyVal with Product with Serializable
- trait Applicative[M[_]] extends Functor[M]
Simple implementation of an Applicative type-class. There are many choices for the canonical second operation (join, sequence, joinWith, ap), all equivalent. For a Functor modeling concurrent computations with failure, like Future, combining results with join can save a lot of time over combining with flatMap. (Given two operations, if the second fails before the first completes, one can fail the entire computation right then. With flatMap, one would have to wait for the first operation to complete before failing it.)
Laws Applicatives must follow:
- map(apply(x))(f) == apply(f(x))
- join(apply(x), apply(y)) == apply((x, y))
(sequence and joinWith specialize join; they should behave appropriately.)
- Annotations
- @implicitNotFound()
- class ApplicativeGroup[T, M[_]] extends ApplicativeMonoid[T, M] with Group[M[T]]
Group and Ring ARE NOT AUTOMATIC. You have to check that the laws hold for your Applicative. If your M[_] is a wrapper type (Option[_], Some[_], Try[_], Future[_], etc...) this generally works.
- class ApplicativeMonoid[T, M[_]] extends ApplicativeSemigroup[T, M] with Monoid[M[T]]
This is a Monoid, for all Applicatives.
- class ApplicativeOperators[A, M[_]] extends FunctorOperators[A, M]
This enrichment allows us to use our Applicative instances in for expressions, provided that import Applicative._ has been done.
- class ApplicativeRing[T, M[_]] extends ApplicativeGroup[T, M] with Ring[M[T]]
Group and Ring ARE NOT AUTOMATIC. You have to check that the laws hold for your Applicative. If your M[_] is a wrapper type (Option[_], Some[_], Try[_], Future[_], etc...) this generally works.
- class ApplicativeSemigroup[T, M[_]] extends Semigroup[M[T]]
This is a Semigroup, for all Applicatives.
- case class Approximate[N](min: N, estimate: N, max: N, probWithinBounds: Double)(implicit numeric: Numeric[N]) extends ApproximateSet[N] with Product with Serializable
- case class ApproximateBoolean(isTrue: Boolean, withProb: Double) extends ApproximateSet[Boolean] with Product with Serializable
- abstract class ArrayBufferedOperation[I, O] extends Buffered[I, O]
- class ArrayGroup[T] extends ArrayMonoid[T] with Group[Array[T]]
Extends the pair-wise sum Array monoid into a Group. negate is defined as the negation of each element of the array.
- class ArrayMonoid[T] extends Monoid[Array[T]]
Pair-wise sum Array monoid.
plus returns left[i] + right[i] for all array elements. The resulting array will be as long as the longest array (with the unmatched elements of the longer array copied over). zero is an empty array.
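A minimal sketch of the pair-wise sum, specialized to Array[Int] for brevity. PairwiseArraySum is a hypothetical name, and the generic version would take a Semigroup for the element type instead of using Int addition.

```scala
// Illustrative sketch specialized to Int; not Algebird's ArrayMonoid.
object PairwiseArraySum {
  // The empty array is the identity: adding it leaves the other array unchanged.
  val zero: Array[Int] = Array.empty[Int]

  // Adds element by element; the unmatched tail of the longer array is copied as-is.
  def plus(left: Array[Int], right: Array[Int]): Array[Int] = {
    val (longer, shorter) =
      if (left.length >= right.length) (left, right) else (right, left)
    val out = longer.clone()
    for (i <- shorter.indices) out(i) = out(i) + shorter(i)
    out
  }
}
```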
- case class AveragedValue(count: Long, value: Double) extends Product with Serializable
Tracks the count and mean value of Doubles in a data stream.
Adding two instances of AveragedValue with + is equivalent to taking an average of the two streams, with each stream weighted by its count.
The mean calculation uses a numerically stable online algorithm suitable for large numbers of records, similar to Chan et al.'s parallel variance algorithm on Wikipedia. As long as your count doesn't overflow a Long, the mean calculation won't overflow.
- count
the number of aggregated items
- value
the average value of all aggregated items
- See also
Moments.getCombinedMeanDouble for implementation of +
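The count-weighted merge described for AveragedValue can be sketched as below. Avg is a hypothetical stand-in, and the update shown is a simplified form of the idea. It is not claimed to match the exact formula in Moments.getCombinedMeanDouble.

```scala
// Illustrative sketch; Avg is a hypothetical name, not Algebird's AveragedValue.
final case class Avg(count: Long, value: Double) {
  // Merges two (count, mean) summaries, weighting each mean by its count.
  def +(that: Avg): Avg = {
    val n = count + that.count
    if (n == 0L) Avg(0L, 0.0)
    else {
      // Shift this mean toward the other mean in proportion to the other stream's share,
      // which avoids forming the potentially huge sum count * value.
      val mean = value + (that.value - value) * (that.count.toDouble / n)
      Avg(n, mean)
    }
  }
}
```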
- sealed abstract class BF[A] extends Serializable
Bloom Filter data structure
- case class BFHash[A](numHashes: Int, width: Int)(implicit hash: Hash128[A]) extends Product with Serializable
- case class BFInstance[A](hashes: BFHash[A], bits: BitSet, width: Int) extends BF[A] with Product with Serializable
- case class BFItem[A](item: A, hashes: BFHash[A], width: Int) extends BF[A] with Product with Serializable
Bloom Filter with 1 value.
- case class BFSparse[A](hashes: BFHash[A], bits: EWAHCompressedBitmap, width: Int) extends BF[A] with Product with Serializable
- case class BFZero[A](hashes: BFHash[A], width: Int) extends BF[A] with Product with Serializable
Empty bloom filter.
- sealed abstract class Batched[T] extends Serializable
Batched: the free semigroup.
For any type T, Batched[T] represents a way to lazily combine T values as a semigroup would (i.e. associatively). A Semigroup[T] instance can be used to recover a T value from a Batched[T].
Like other free structures, Batched trades space for time. A sum of batched values defers the underlying semigroup action, instead storing all values in memory (in a tree structure). If an underlying semigroup is available, Batched.semigroup and Batch.monoid can be configured to periodically sum the tree to keep the overall size below batchSize.
Batched[T] values are guaranteed not to be empty; that is, they will contain at least one T value.
- class BatchedMonoid[T] extends BatchedSemigroup[T] with Monoid[Batched[T]]
Compacting monoid for batched values.
This monoid ensures that the batch's tree structure has fewer than batchSize values in it. When more values are added, the tree is compacted using m.
- class BatchedSemigroup[T] extends Semigroup[Batched[T]]
Compacting semigroup for batched values.
This semigroup ensures that the batch's tree structure has fewer than batchSize values in it. When more values are added, the tree is compacted using s.
- case class BloomFilterAggregator[A](bfMonoid: BloomFilterMonoid[A]) extends MonoidAggregator[A, BF[A], BF[A]] with Product with Serializable
- case class BloomFilterMonoid[A](numHashes: Int, width: Int)(implicit hash: Hash128[A]) extends Monoid[BF[A]] with BoundedSemilattice[BF[A]] with Product with Serializable
Bloom Filter - a probabilistic data structure to test presence of an element.
Operations:
1) insert: hash the value k times, updating the bitfield at the index equal to each hashed value
2) query: hash the value k times. If there are k collisions, then return true; otherwise false.
http://en.wikipedia.org/wiki/Bloom_filter
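The insert and query operations can be illustrated with a small self-contained Bloom filter. TinyBloom is a hypothetical name, and deriving the k hashes by re-seeding MurmurHash3 is an assumption made for this sketch. Algebird's BF uses its own Hash128-based hashing.

```scala
import scala.collection.immutable.BitSet
import scala.util.hashing.MurmurHash3

// Illustrative Bloom filter over Strings; TinyBloom is a hypothetical name, not Algebird's BF.
final case class TinyBloom(numHashes: Int, width: Int, bits: BitSet = BitSet.empty) {
  // One bit index per hash function, derived by seeding MurmurHash3 differently.
  private def indices(item: String): Seq[Int] =
    (0 until numHashes).map { seed =>
      java.lang.Math.floorMod(MurmurHash3.stringHash(item, seed), width)
    }

  // insert: set the bit at each hashed index.
  def insert(item: String): TinyBloom =
    copy(bits = indices(item).foldLeft(bits)((acc, i) => acc + i))

  // query: true only if every hashed index is set.
  // No false negatives are possible; false positives are.
  def mightContain(item: String): Boolean = indices(item).forall(i => bits(i))

  // Filters with equal parameters merge by bitwise OR, which is what makes them a monoid.
  def union(that: TinyBloom): TinyBloom = {
    require(numHashes == that.numHashes && width == that.width, "parameters must match")
    copy(bits = bits | that.bits)
  }
}
```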
- trait Buffered[I, O] extends Serializable
Represents something that consumes I and may emit O. Has some internal state that may be used to improve performance. Generally used to model folds or reduces (see BufferedReduce)
- trait BufferedReduce[V] extends Buffered[V, V]
This never emits on put; you must call flush. It is designed to be used in the stackable pattern with ArrayBufferedOperation.
- class BufferedSumAll[V] extends ArrayBufferedOperation[V, V] with StatefulSummer[V] with BufferedReduce[V]
- final case class Bytes(array: Array[Byte]) extends Serializable with Product
A wrapper for Array[Byte] that provides sane implementations of hashCode, equals, and toString. The wrapped array of bytes is assumed to be never modified.
Note: Unfortunately we cannot make Bytes a value class because a value class may not override the hashCode and equals methods (cf. SIP-15, criterion 4).
Alternatives
Instead of wrapping an Array[Byte] with this class you can also convert an Array[Byte] to a Seq[Byte] via Scala's toSeq method:
val arrayByte: Array[Byte] = Array(1.toByte)
val seqByte: Seq[Byte] = arrayByte.toSeq
Like Bytes, a Seq[Byte] has sane hashCode, equals, and toString implementations.
Performance-wise we found that a Seq[Byte] is comparable to Bytes. For example, a CMS[Seq[Byte]] was measured to be only slightly slower than CMS[Bytes] (think: single-digit percentages).
- array
the wrapped array of bytes
- See also
- sealed abstract class CMS[K] extends Serializable with CMSCounting[K, CMS]
A Count-Min sketch data structure that allows for counting and frequency estimation of elements in a data stream.
Tip: If you also need to track heavy hitters ("Top N" problems), take a look at TopCMS.
Usage
This example demonstrates how to count Long elements with CMS, i.e. K=Long.
Note that the actual counting is always performed with a Long, regardless of your choice of K. That is, the counting table behind the scenes is backed by Long values (at least in the current implementation), and thus the returned frequency estimates are always instances of Approximate[Long].
- K
The type used to identify the elements to be counted.
Example:
// Creates a monoid for a CMS that can count `Long` elements.
val cmsMonoid: CMSMonoid[Long] = {
  val eps = 0.001
  val delta = 1E-10
  val seed = 1
  CMS.monoid[Long](eps, delta, seed)
}
// Creates a CMS instance that has counted the element `1L`.
val cms: CMS[Long] = cmsMonoid.create(1L)
// Estimates the frequency of `1L`
val estimate: Approximate[Long] = cms.frequency(1L)
- case class CMSAggregator[K](cmsMonoid: CMSMonoid[K]) extends MonoidAggregator[K, CMS[K], CMS[K]] with Product with Serializable
An Aggregator for CMS. Can be created using CMS.aggregator.
- trait CMSCounting[K, C[_]] extends AnyRef
A trait for CMS implementations that can count elements in a data stream and that can answer point queries (i.e. frequency estimates) for these elements.
- K
The type used to identify the elements to be counted.
- C
The type of the actual CMS that implements this trait.
- case class CMSHash[K](a: Int, b: Int, width: Int)(implicit evidence$30: CMSHasher[K]) extends Serializable with Product
- trait CMSHasher[K] extends Serializable
The Count-Min sketch uses d (aka depth) pair-wise independent hash functions drawn from a universal hashing family of the form:
h(x) = [a * x + b (mod p)] (mod m)
As a requirement for using CMS you must provide an implicit CMSHasher[K] for the type K of the items you want to count. Algebird ships with several such implicits for commonly used types K such as Long and BigInt.
If your type K is not supported out of the box, you have two options:
1) You provide a "translation" function to convert items of your (unsupported) type K to a supported type such as Double, and then use the contramap function of CMSHasher to create the required CMSHasher[K] for your type (see the documentation of contramap for an example);
2) You implement a CMSHasher[K] from scratch, using the existing CMSHasher implementations as a starting point.
- trait CMSHeavyHitters[K] extends AnyRef
A trait for CMS implementations that can track heavy hitters in a data stream.
It is up to the implementation how the semantics of tracking heavy hitters are defined. For instance, one implementation could track the "top %" heavy hitters whereas another implementation could track the "top N" heavy hitters.
Known implementations: TopCMS.
- K
The type used to identify the elements to be counted.
- case class CMSInstance[K](countsTable: CountsTable[K], totalCount: Long, params: CMSParams[K]) extends CMS[K] with Product with Serializable
The general Count-Min sketch structure, used for holding any number of elements.
- case class CMSItem[K](item: K, totalCount: Long, params: CMSParams[K]) extends CMS[K] with Product with Serializable
Used for holding a single element, to avoid repeatedly adding elements from sparse counts tables.
- class CMSMonoid[K] extends Monoid[CMS[K]] with CommutativeMonoid[CMS[K]]
Monoid for adding CMS sketches.
Usage
eps and delta are parameters that bound the error of each query estimate. For example, errors in answering point queries (e.g., how often has element x appeared in the stream described by the sketch?) are often of the form: "with probability p >= 1 - delta, the estimate is close to the truth by some factor depending on eps."
The type K is the type of items you want to count. You must provide an implicit CMSHasher[K] for K, and Algebird ships with several such implicits for commonly used types such as Long and BigInt.
If your type K is not supported out of the box, you have two options:
1) You provide a "translation" function to convert items of your (unsupported) type K to a supported type such as Double, and then use the contramap function of CMSHasher to create the required CMSHasher[K] for your type (see the documentation of CMSHasher for an example);
2) You implement a CMSHasher[K] from scratch, using the existing CMSHasher implementations as a starting point.
Note: Because Arrays in Scala/Java do not have sane equals and hashCode implementations, you cannot safely use types such as Array[Byte]. Extra work is required for Arrays. For example, you may opt to convert Array[T] to a Seq[T] via toSeq, or you can provide appropriate wrapper classes. Algebird provides one such wrapper class, Bytes, to safely wrap an Array[Byte] for use with CMS.
- K
The type used to identify the elements to be counted. For example, if you want to count the occurrence of user names, you could map each username to a unique numeric ID expressed as a Long, and then count the occurrences of those Longs with a CMS of type K=Long. Note that this mapping between the elements of your problem domain and their identifiers used for counting via CMS should be bijective. We require a CMSHasher context bound for K; see CMSHasherImplicits for available implicits that can be imported. Which type K should you pick in practice? For domains that have fewer than 2^64 unique elements, you'd typically use Long. For larger domains you can try BigInt, for example. Other possibilities include Spire's SafeLong and Numerical data types (https://github.com/non/spire), though Algebird does not include the required implicits for CMS-hashing (cf. CMSHasherImplicits).
- case class CMSParams[K](hashes: Seq[CMSHash[K]], eps: Double, delta: Double, maxExactCountOpt: Option[Int] = None) extends Product with Serializable
Configuration parameters for CMS.
- K
The type used to identify the elements to be counted.
- hashes
Pair-wise independent hash functions. We need N=depth such functions (depth can be derived from delta).
- eps
One-sided error bound on the error of each point query, i.e. frequency estimate.
- delta
A bound on the probability that a query estimate does not lie within some small interval (an interval that depends on eps) around the truth.
- maxExactCountOpt
An Option parameter about how many exact counts a sparse CMS wants to keep.
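To make the relationship between eps, delta and the table dimensions concrete, the sketch below uses the sizing formulas from the Count-Min sketch literature: width = ceil(e / eps) and depth = ceil(ln(1 / delta)). CmsSizing is a hypothetical helper, and it is an assumption that Algebird derives its dimensions in exactly this way.

```scala
// Illustrative helper; CmsSizing is a hypothetical name. The formulas are the standard
// Count-Min sketch sizing rules, assumed (not confirmed) to match Algebird's internals.
object CmsSizing {
  // Number of counters per row: ceil(e / eps).
  def width(eps: Double): Int = math.ceil(math.E / eps).toInt

  // Number of rows, i.e. hash functions: ceil(ln(1 / delta)).
  def depth(delta: Double): Int = math.ceil(math.log(1.0 / delta)).toInt
}
```

Under these formulas, halving eps doubles the width, while depth grows only logarithmically as delta shrinks.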
- class CMSSummation[K] extends AnyRef
This mutable builder can be used when speed is essential and you can be sure the scope of the mutability cannot escape in an unsafe way. The intended use is to allocate and call result in one method without letting a reference to the instance escape into a closure.
- case class CMSZero[K](params: CMSParams[K]) extends CMS[K] with Product with Serializable
Zero element. Used for initialization.
- class CassandraMurmurHash extends AnyRef
This is a very fast, non-cryptographic hash suitable for general hash-based lookup. See http://murmurhash.googlepages.com/ for more details.
hash32() and hash64() are MurmurHash 2.0. hash3_x64_128() is MurmurHash 3.0.
The C version of MurmurHash 2.0 found at that site was ported to Java by Andrzej Bialecki (ab at getopt org).
- class ConstantGroup[T] extends Group[T]
- class ConstantRing[T] extends ConstantGroup[T] with Ring[T]
- case class Correlation(c2: Double, m2x: Double, m2y: Double, m1x: Double, m1y: Double, m0: Double) extends Product with Serializable
A class to calculate covariance and the first two central moments of a sequence of pairs of Doubles, from which the Pearson correlation coefficient can be calculated.
m{i}x denotes the ith central moment of the first projection of the pair. m{i}y denotes the ith central moment of the second projection of the pair. c2 is the covariance equivalent of the second central moment, i.e. c2 = Sum_(x,y) (x - m1x)*(y - m1y).
- case class DecayedValue(value: Double, scaledTime: Double) extends Ordered[DecayedValue] with Product with Serializable
- case class DecayedValueMonoid(eps: Double) extends Monoid[DecayedValue] with Product with Serializable
- case class DecayedVector[C[_]](vector: C[Double], scaledTime: Double) extends Product with Serializable
- final class DecayingCMS[K] extends Serializable
DecayingCMS is a module to build count-min sketch instances whose counts decay exponentially.
Similar to a Map[K, com.twitter.algebird.DecayedValue], each key is associated with a single count value that decays over time. Unlike a map, the decaying CMS is an approximate count: in exchange for the possibility of over-counting, we can bound its size in memory.
The intended use case is for metrics or machine learning where exact values aren't needed.
You can expect the keys with the biggest values to be fairly accurate but the very small values (rare keys or very old keys) to be lost in the noise. For both metrics and ML this should be fine: you can't learn too much from very rare values.
We recommend depth of at least 5, and width of at least 100, but you should do some experiments to determine the smallest parameters that will work for your use case.
- case class DenseHLL(bits: Int, v: Bytes) extends HLL with Product with Serializable
These are the individual instances which the Monoid knows how to add
- case class DenseVector[V](iseq: Vector[V], sparseValue: V, denseCount: Int) extends AdaptiveVector[V] with Product with Serializable
- class EitherMonoid[L, R] extends EitherSemigroup[L, R] with Monoid[Either[L, R]]
- class EitherSemigroup[L, R] extends Semigroup[Either[L, R]]
Either semigroup is useful for error handling. If everything is correct, use Right (it's right, get it?); if something goes wrong, use Left. plus does the normal thing for plus(Right, Right), or plus(Left, Left), but if exactly one is Left, we return that value (to keep the error condition). Typically, the left value will be a string representing the errors.
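The plus rule for Either can be sketched for the concrete case Either[String, Int]. EitherPlus is a hypothetical name, and string concatenation and integer addition stand in for the Left and Right semigroups.

```scala
// Illustrative sketch for Either[String, Int]; not Algebird's EitherSemigroup.
object EitherPlus {
  def plus(a: Either[String, Int], b: Either[String, Int]): Either[String, Int] =
    (a, b) match {
      case (Right(x), Right(y)) => Right(x + y) // both fine: add the values
      case (Left(x), Left(y))   => Left(x + y)  // both failed: combine the errors
      case (Left(x), _)         => Left(x)      // exactly one Left: keep the error
      case (_, Left(y))         => Left(y)
    }
}
```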
- case class Empty[T]() extends Interval[T] with Product with Serializable
- trait EventuallyAggregator[A, E, O, C] extends AbstractEventuallyAggregator[A, E, O, C]
- class EventuallyGroup[E, O] extends EventuallyMonoid[E, O] with Group[Either[E, O]]
- See also
EventuallySemigroup
- class EventuallyMonoid[E, O] extends EventuallySemigroup[E, O] with Monoid[Either[E, O]]
- See also
EventuallySemigroup
- trait EventuallyMonoidAggregator[A, E, O, C] extends AbstractEventuallyAggregator[A, E, O, C] with MonoidAggregator[A, Either[E, O], C]
- class EventuallyRing[E, O] extends EventuallyGroup[E, O] with Ring[Either[E, O]]
- See also
EventuallySemigroup
- class EventuallySemigroup[E, O] extends Semigroup[Either[E, O]]
Classes that support algebraic structures with dynamic switching between two representations, the original type O and the eventual type E. In the case of Semigroup, we specify
- Two Semigroups eventualSemigroup and originalSemigroup
- A Semigroup homomorphism convert: O => E
- A conditional mustConvert: O => Boolean
Then we get a Semigroup[Either[E,O]], where:
Left(x) + Left(y) = Left(x+y)
Left(x) + Right(y) = Left(x+convert(y))
Right(x) + Left(y) = Left(convert(x)+y)
Right(x) + Right(y) = Left(convert(x+y)) if mustConvert(x+y), Right(x+y) otherwise.
EventuallyMonoid, EventuallyGroup, and EventuallyRing are defined analogously, with the contract that convert respect the appropriate structure.
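The four cases above can be traced with a toy instance. In this sketch the original type O is List[Int] under concatenation and the eventual type E is Int under addition, holding only an element count. EventuallyCount, limit, and the choice of types are assumptions made purely for illustration.

```scala
// Illustrative sketch; EventuallyCount is a hypothetical name, not Algebird's EventuallySemigroup.
// O = List[Int] under concatenation (exact), E = Int under addition (only a count).
object EventuallyCount {
  val limit = 3

  // Semigroup homomorphism: convert(a ++ b) == convert(a) + convert(b).
  def convert(o: List[Int]): Int = o.size

  // Switch to the compact representation once the exact one grows past the limit.
  def mustConvert(o: List[Int]): Boolean = o.size > limit

  def plus(a: Either[Int, List[Int]], b: Either[Int, List[Int]]): Either[Int, List[Int]] =
    (a, b) match {
      case (Left(x), Left(y))   => Left(x + y)
      case (Left(x), Right(y))  => Left(x + convert(y))
      case (Right(x), Left(y))  => Left(convert(x) + y)
      case (Right(x), Right(y)) =>
        val xy = x ++ y
        if (mustConvert(xy)) Left(convert(xy)) else Right(xy)
    }
}
```

Once a value has become a Left it never returns to Right, which is the "eventual" part of the name.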
- case class ExclusiveLower[T](lower: T) extends Interval[T] with Lower[T] with Product with Serializable
- case class ExclusiveUpper[T](upper: T) extends Interval[T] with Upper[T] with Product with Serializable
- case class ExpHist(conf: Config, buckets: Vector[Bucket], total: Long, time: Timestamp) extends Product with Serializable
Exponential Histogram algorithm from http://www-cs-students.stanford.edu/~datar/papers/sicomp_streams.pdf
An Exponential Histogram is a sliding window counter that can guarantee a bounded relative error. You configure the data structure with
- epsilon, the relative error you're willing to tolerate
- windowSize, the number of time ticks that you want to track
You interact with the data structure by adding (number, timestamp) pairs into the exponential histogram and querying it for an approximate count with guess.
The approximate count is guaranteed to be within conf.epsilon relative error of the true count seen across the supplied windowSize.
Next steps:
- efficient serialization
- Query EH with a shorter window than the configured window
- Discussion of epsilon vs memory tradeoffs
- conf
the config values for this instance.
- buckets
Vector of timestamps of each (powers of 2) ticks. This is the key to the exponential histogram representation. See ExpHist.Canonical for more info.
- total
total ticks tracked.
total == buckets.map(_.size).sum
- time
current timestamp of this instance.
- type Field[V] = algebra.ring.Field[V]
To keep code using algebird.Field compiling, we export algebra Field
- case class First[+T](get: T) extends Product with Serializable
Tracks the "least recent", or earliest, wrapped instance of T by the order in which items are seen.
- get
wrapped instance of T
- case class FirstAggregator[T]() extends Aggregator[T, T, T] with Product with Serializable
Aggregator that selects the first instance of T in the aggregated stream.
- trait FlatMapPreparer[A, T] extends Preparer[A, T]
A Preparer that has had one or more flatMap operations applied. It can only accept MonoidAggregators.
- sealed trait Fold[-I, +O] extends Serializable
Folds are first-class representations of "Traversable.foldLeft." They have the nice property that they can be fused to work in parallel over an input sequence.
A Fold accumulates inputs (I) into some internal type (X), converting to a defined output type (O) when done. We use existential types to hide internal details and to allow for internal and external (X and O) types to differ for "map" and "join."
In discussing this type we draw parallels to Function1 and related types. You can think of a fold as a function "Seq[I] => O" but in reality we do not have to materialize the input sequence at once to "run" the fold.
The traversal of the input data structure is NOT done by Fold itself. Instead we expose some methods like "overTraversable" that know how to iterate through various sequence types and drive the fold. We also expose some internal state so library authors can fold over their own types.
See the companion object for constructors.
- class FoldApplicative[I] extends Applicative[[β$1$]Fold[I, β$1$]]
Folds are Applicatives!
- final class FoldState[X, -I, +O] extends Serializable
A FoldState defines a left fold with a "hidden" accumulator type. It is exposed so library authors can run Folds over their own sequence types.
The fold can be executed correctly according to the properties of "add" and your traversed data structure. For example, the "add" function of a monoidal fold will be associative. A FoldState is valid for only one iteration because the accumulator (seeded by "start") may be mutable.
The three components of a fold are:
- add: (X, I) => X - updates and returns internal state for every input I
- start: X - the initial state
- end: X => O - transforms internal state to a final result
Folding over Seq(x, y) would produce the result end(add(add(start, x), y))
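The add/start/end triple can be sketched as a small class with a driver method. SimpleFoldState and FoldExamples are hypothetical names, and the run method is an assumption of this sketch. In Algebird the traversal is driven by methods on Fold such as overTraversable.

```scala
// Illustrative sketch; SimpleFoldState is a hypothetical name, not Algebird's FoldState.
final class SimpleFoldState[X, I, O](val add: (X, I) => X, val start: X, val end: X => O) {
  // Drives the fold over any Iterable: end(add(...add(add(start, i1), i2)...)).
  def run(inputs: Iterable[I]): O = end(inputs.foldLeft(start)(add))
}

object FoldExamples {
  // Mean of Doubles; the accumulator (sum, count) is an internal detail of the fold.
  val mean: SimpleFoldState[(Double, Long), Double, Double] =
    new SimpleFoldState[(Double, Long), Double, Double](
      (acc, x) => (acc._1 + x, acc._2 + 1L),
      (0.0, 0L),
      acc => if (acc._2 == 0L) 0.0 else acc._1 / acc._2
    )
}
```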
- class FromAlgebraGroup[T] extends FromAlgebraMonoid[T] with Group[T]
- class FromAlgebraMonoid[T] extends FromAlgebraSemigroup[T] with Monoid[T]
- class FromAlgebraRing[T] extends Ring[T]
- class FromAlgebraSemigroup[T] extends Semigroup[T]
- class Function1Monoid[T] extends Monoid[(T) => T]
Function1 monoid. plus means function composition, zero is the identity function
- trait Functor[M[_]] extends AnyRef
Simple implementation of a Functor type-class.
Laws Functors must follow:
- map(m)(id) == m
- map(m)(f andThen g) == map(map(m)(f))(g)
- Annotations
- @implicitNotFound()
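The two Functor laws can be spot-checked for a concrete instance. SimpleFunctor, OptionFunctor and FunctorLaws below are hypothetical names for a minimal stand-in type-class. The check covers individual values and is not a proof.

```scala
import scala.language.higherKinds

// Illustrative type-class; SimpleFunctor is a hypothetical name, not Algebird's Functor.
trait SimpleFunctor[M[_]] {
  def map[T, U](m: M[T])(fn: T => U): M[U]
}

object OptionFunctor extends SimpleFunctor[Option] {
  def map[T, U](m: Option[T])(fn: T => U): Option[U] = m.map(fn)
}

object FunctorLaws {
  // Identity law: mapping the identity function changes nothing.
  def identityLaw[M[_], T](F: SimpleFunctor[M])(m: M[T]): Boolean =
    F.map(m)((t: T) => t) == m

  // Composition law: mapping (f andThen g) once equals mapping f and then g.
  def compositionLaw[M[_], T, U, V](F: SimpleFunctor[M])(m: M[T])(f: T => U, g: U => V): Boolean =
    F.map(m)(f andThen g) == F.map(F.map(m)(f))(g)
}
```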
- class FunctorOperators[A, M[_]] extends AnyRef
This enrichment allows us to use our Functor instances in for expressions, provided that import Functor._ has been done.
- case class GenHLLAggregator[K](hllMonoid: HyperLogLogMonoid, hash: Hash128[K]) extends MonoidAggregator[K, HLL, HLL] with Product with Serializable
- trait GeneratedGroupImplicits extends AnyRef
- trait GeneratedMonoidImplicits extends AnyRef
- trait GeneratedRingImplicits extends AnyRef
- trait GeneratedSemigroupImplicits extends AnyRef
- trait GeneratedTupleAggregator extends AnyRef
- abstract class GenericMapMonoid[K, V, M <: Map[K, V]] extends Monoid[M] with MapOperations[K, V, M]
- trait GenericMapRing[K, V, M <: Map[K, V]] extends Rng[M] with MapOperations[K, V, M]
You can think of this as a Sparse vector ring
- trait Group[T] extends algebra.Group[T] with Monoid[T] with AdditiveGroup[T]
Group: this is a monoid that also has subtraction (and negation): So, you can do (a-b), or -a (which is equal to 0 - a).
- Annotations
- @implicitNotFound()
- sealed abstract class HLL extends Serializable
- case class HLLSeries(bits: Int, rows: Vector[Map[Int, Long]]) extends Product with Serializable
HLLSeries can produce a HyperLogLog counter for any window into the past, using a constant factor more space than HyperLogLog.
For each hash bucket, rather than keeping a single max RhoW value, it keeps every RhoW value it has seen, and the max timestamp where it saw that value. This allows it to reconstruct an HLL as it would be had it started at zero at any given point in the past, and seen the same updates this structure has seen.
- bits
The number of bits to use
- rows
Vector of maps of RhoW -> max timestamp where it was seen
- returns
New HLLSeries
- trait Hash128[-K] extends Serializable
A typeclass to represent hashing to 128 bits. Used for HLL, but possibly other applications
- class HashingTrickMonoid[V] extends Monoid[AdaptiveVector[V]]
- case class HeavyHitter[K](item: K, count: Long) extends Serializable with Product
- case class HeavyHitters[K](hhs: Set[HeavyHitter[K]]) extends Serializable with Product
Containers for holding heavy hitter items and their associated counts.
- abstract class HeavyHittersLogic[K] extends Serializable
Controls how a CMS that implements CMSHeavyHitters tracks heavy hitters.
- case class HyperLogLogAggregator(hllMonoid: HyperLogLogMonoid) extends MonoidAggregator[Array[Byte], HLL, HLL] with Product with Serializable
- class HyperLogLogMonoid extends Monoid[HLL] with BoundedSemilattice[HLL]
- class HyperLogLogSeriesMonoid extends Monoid[HLLSeries]
Example Usage
val hllSeriesMonoid = new HyperLogLogSeriesMonoid(bits)
// examples holds (bytes, timestamp) pairs supplied by the caller
val examples: Seq[(Array[Byte], Long)]
val series = examples
  .map { case (bytes, timestamp) => hllSeriesMonoid.create(bytes, timestamp) }
  .reduce { hllSeriesMonoid.plus(_, _) }
// estimate the distinct count seen since each timestamp
val estimate1 = series.since(timestamp1.toLong).toHLL.estimatedSize
val estimate2 = series.since(timestamp2.toLong).toHLL.estimatedSize
- case class Identity[T](get: T) extends Product with Serializable
- sealed trait Implicits extends LowPrioImpicits
- case class InclusiveLower[T](lower: T) extends Interval[T] with Lower[T] with Product with Serializable
- case class InclusiveUpper[T](upper: T) extends Interval[T] with Upper[T] with Product with Serializable
- class IndexedSeqGroup[T] extends IndexedSeqMonoid[T] with Group[IndexedSeq[T]]
- class IndexedSeqMonoid[T] extends IndexedSeqSemigroup[T] with Monoid[IndexedSeq[T]]
- class IndexedSeqRing[T] extends IndexedSeqGroup[T] with Ring[IndexedSeq[T]]
- class IndexedSeqSemigroup[T] extends Semigroup[IndexedSeq[T]]
Note that this works like Semigroup[Map[Int,T]], not like Semigroup[List[T]]. It does element-wise operations, like standard vector math, not concatenation, like Semigroup[String] or Semigroup[List[T]].
If l.size != r.size, then it only sums the elements up to the index min(l.size, r.size) and appends the remainder of the longer sequence to the result.
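The element-wise behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: the object name IndexedSeqPlusSketch and its plus helper are hypothetical, and Int addition stands in for an arbitrary Semigroup[T].

```scala
// Hypothetical stand-in for the behaviour described above; not the library's code.
object IndexedSeqPlusSketch {
  // Sum element-wise up to the shorter length, then append the longer side's remainder.
  def plus(l: IndexedSeq[Int], r: IndexedSeq[Int]): IndexedSeq[Int] = {
    val common = math.min(l.size, r.size)
    val summed = l.take(common).zip(r.take(common)).map { case (a, b) => a + b }
    val remainder = if (l.size > r.size) l.drop(common) else r.drop(common)
    summed ++ remainder
  }

  val sameLength = plus(Vector(1, 2, 3), Vector(10, 20, 30)) // Vector(11, 22, 33)
  val unequal    = plus(Vector(1, 2), Vector(10, 20, 30))    // Vector(11, 22, 30)
}
```

Contrast this with concatenation, where the second result would have five elements instead of three.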
- class IntegralPredecessible[T] extends Predecessible[T]
- class IntegralSuccessible[T] extends Successible[T]
- case class Intersection[L[t] <: Lower[t], U[t] <: Upper[t], T](lower: L[T], upper: U[T]) extends Interval[T] with Product with Serializable
- sealed trait Interval[T] extends Serializable
Represents a single interval on a T with an Ordering
- class InvariantGroup[T, U] extends InvariantMonoid[T, U] with Group[U]
- class InvariantMonoid[T, U] extends InvariantSemigroup[T, U] with Monoid[U]
- class InvariantRing[T, U] extends InvariantGroup[T, U] with Ring[U]
- class InvariantSemigroup[T, U] extends Semigroup[U]
- class JListMonoid[T] extends Monoid[List[T]]
Since Java Lists are mutable, this always makes a full copy. Prefer Scala immutable Lists: with those, the tail of the result of plus is always the right argument.
- class JMapMonoid[K, V] extends Monoid[Map[K, V]]
Since Java maps are mutable, this always makes a full copy. Prefer Scala immutable maps: with those, this operation is much faster. TODO: extend this to Group, Ring.
- case class Last[+T](get: T) extends Product with Serializable
Tracks the "most recent", or last, wrapped instance of T by the order in which items are seen.
- get
wrapped instance of T
- case class LastAggregator[T]() extends Aggregator[T, T, T] with Product with Serializable
Aggregator that selects the last instance of T in the aggregated stream.
- class ListMonoid[T] extends Monoid[List[T]]
List concatenation monoid. plus means concatenation, zero is the empty list.
- sealed trait LowPrioImpicits extends AnyRef
- sealed trait Lower[T] extends Interval[T]
- trait MapAggregator[A, B, K, C] extends Aggregator[A, B, Map[K, C]]
- class MapGroup[K, V] extends MapMonoid[K, V] with Group[Map[K, V]]
You can think of this as a Sparse vector group
- class MapMonoid[K, V] extends GenericMapMonoid[K, V, Map[K, V]]
- trait MapMonoidAggregator[A, B, K, C] extends MonoidAggregator[A, B, Map[K, C]]
- trait MapOperations[K, V, M <: Map[K, V]] extends AnyRef
- trait MapPreparer[A, T] extends Preparer[A, T]
A Preparer that has had zero or more map transformations applied, but no flatMaps. This can produce any type of Aggregator.
- class MapRing[K, V] extends MapGroup[K, V] with GenericMapRing[K, V, Map[K, V]]
- case class Max[+T](get: T) extends Product with Serializable
Tracks the maximum wrapped instance of some ordered type T.
- case class MaxAggregator[T]()(implicit ord: Ordering[T]) extends Aggregator[T, T, T] with Product with Serializable
Aggregator that selects the maximum instance of T in the aggregated stream.
- trait Metric[-V] extends Serializable
- Annotations
- @implicitNotFound()
- case class Min[+T](get: T) extends Product with Serializable
Tracks the minimum wrapped instance of some ordered type T.
- case class MinAggregator[T]()(implicit ord: Ordering[T]) extends Aggregator[T, T, T] with Product with Serializable
Aggregator that selects the minimum instance of T in the aggregated stream.
- final case class MinHashSignature(bytes: Array[Byte]) extends AnyVal with Product with Serializable
MinHasher as a Monoid operates on this class to avoid the too generic Array[Byte]. The bytes are assumed to be never modified. The only reason we did not use IndexedSeq[Byte] instead of Array[Byte] is because a ByteBuffer is used internally in MinHasher and it can wrap Array[Byte].
- abstract class MinHasher[H] extends Monoid[MinHashSignature]
Instances of MinHasher can create, combine, and compare fixed-sized signatures of arbitrarily sized sets.
A signature is represented by a byte array of approximately maxBytes size. You can initialize a signature with a single element, usually a Long or String. You can combine any two sets' signatures to produce the signature of their union. You can compare any two sets' signatures to estimate their Jaccard similarity. You can use a set's signature to estimate the number of distinct values in the set. You can also use a combination of the above to estimate the size of the intersection of two sets from their signatures. The more bytes in the signature, the more accurate all of the above will be.
You can also use these signatures to quickly find similar sets without doing n^2 comparisons. Each signature is assigned to several buckets; sets whose signatures end up in the same bucket are likely to be similar. The targetThreshold controls the desired level of similarity - the higher the threshold, the more efficiently you can find all the similar sets.
This abstract superclass is generic with regard to the size of the hash used. Depending on the number of unique values in the domain of the sets, you may want a MinHasher16, a MinHasher32, or a new custom subclass.
This implementation is modeled after Chapter 3 of Ullman and Rajaraman's Mining of Massive Datasets: http://infolab.stanford.edu/~ullman/mmds/ch3a.pdf
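To make the signature operations above concrete, here is a minimal sketch of the general MinHash idea using only the Scala standard library. It does not use MinHasher or reproduce its byte layout: the object MinHashSketch, the choice of 64 seeded MurmurHash3 functions, and the Vector[Int] signature are all assumptions made for illustration, and the sketch assumes non-empty sets.

```scala
import scala.util.hashing.MurmurHash3

// Hypothetical illustration of MinHash; not the MinHasher implementation.
object MinHashSketch {
  val numHashes = 64 // assumed signature length; more slots give better estimates

  // One slot per seeded hash function: keep the minimum hash over the set's elements.
  def signature(set: Set[String]): Vector[Int] =
    Vector.tabulate(numHashes)(seed => set.map(s => MurmurHash3.stringHash(s, seed)).min)

  // The signature of a union is the slot-wise minimum of the two signatures.
  def union(a: Vector[Int], b: Vector[Int]): Vector[Int] =
    a.zip(b).map { case (x, y) => math.min(x, y) }

  // The fraction of matching slots estimates the Jaccard similarity.
  def similarity(a: Vector[Int], b: Vector[Int]): Double =
    a.zip(b).count { case (x, y) => x == y }.toDouble / a.size

  val left     = Set("a", "b", "c", "d")
  val right    = Set("c", "d", "e", "f")
  val merged   = union(signature(left), signature(right))
  val estimate = similarity(signature(left), signature(right))
}
```

Because the union is a slot-wise minimum, merged equals the signature computed directly from left ++ right, which is what makes the signatures combinable as a Monoid.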
- class MinHasher16 extends MinHasher[Char]
- class MinHasher32 extends MinHasher[Int]
- sealed trait MinPlus[+V] extends Serializable
- class MinPlusSemiring[V] extends Rig[MinPlus[V]]
- final case class MinPlusValue[V](get: V) extends AnyVal with MinPlus[V] with Product with Serializable
- sealed class Moments extends Product with Serializable
A class to calculate the first five central moments over a sequence of Doubles. Given the first five central moments, we can then calculate metrics like skewness and kurtosis.
m{i} denotes the ith central moment.
This code manually inlines code to make it look like a case class. This is done because we changed the count from a Long to a Double to enable the scale method, which allows exponential decays of moments, but we didn't want to break backwards binary compatibility.
- class MomentsMonoid extends Monoid[Moments] with CommutativeMonoid[Moments]
- trait Monad[M[_]] extends Applicative[M]
Simple implementation of a Monad type-class. Subclasses only need to override apply and flatMap, but they should override map, join, joinWith, and sequence if there are better implementations.
Laws Monads must follow:
identities:
  flatMap(apply(x))(fn) == fn(x)
  flatMap(m)(apply _) == m
associativity on flatMap (you can either flatMap f first, or f to g):
  flatMap(flatMap(m)(f))(g) == flatMap(m) { x => flatMap(f(x))(g) }
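As a quick check of these laws, the sketch below evaluates them for Option using only the standard library. The names pure and flatMap are hypothetical helpers standing in for the type class's apply and flatMap, and passing for one sample value is an illustration, not a proof.

```scala
// Hypothetical illustration with Option; not the library's Monad instance.
object MonadLawsSketch {
  // pure plays the role of apply in the laws above.
  def pure[A](x: A): Option[A] = Some(x)
  def flatMap[A, B](m: Option[A])(fn: A => Option[B]): Option[B] = m.flatMap(fn)

  val f: Int => Option[Int] = x => if (x > 0) Some(x * 2) else None
  val g: Int => Option[Int] = x => Some(x + 1)
  val m: Option[Int] = Some(3)

  // Identities.
  val leftIdentity  = flatMap(pure(3))(f) == f(3)
  val rightIdentity = flatMap(m)((x: Int) => pure(x)) == m

  // Associativity: nesting the flatMaps either way gives the same result.
  val associativity =
    flatMap(flatMap(m)(f))(g) == flatMap(m)((x: Int) => flatMap(f(x))(g))
}
```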
- Annotations
- @implicitNotFound()
- class MonadOperators[A, M[_]] extends ApplicativeOperators[A, M]
This enrichment allows us to use our Monad instances in for expressions, provided that import Monad._ has been done.
- trait Monoid[T] extends Semigroup[T] with algebra.Monoid[T] with AdditiveMonoid[T]
Monoid (take a deep breath, and relax about the weird name): This is a semigroup that has an additive identity (called zero), such that a+0=a, 0+a=a, for every a
- Annotations
- @implicitNotFound()
- trait MonoidAggregator[-A, B, +C] extends Aggregator[A, B, C]
- class MonoidCombinator[A, B] extends SemigroupCombinator[A, B] with Monoid[(A, B)]
- final case class MurmurHash128(seed: Long) extends AnyVal with Product with Serializable
- class NumericRing[T] extends Ring[T]
- trait NumericRingProvider extends AnyRef
- class OptionGroup[T] extends OptionMonoid[T] with Group[Option[T]]
Some(5) - Some(3) == Some(2)
Some(5) - Some(5) == None
negate Some(5) == Some(-5)
Note: Some(0) and None are equivalent under this Group
- class OptionMonoid[T] extends Monoid[Option[T]]
Some(5) + Some(3) == Some(8)
Some(5) + None == Some(5)
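The two equations above can be reproduced with a small standard-library sketch. The object OptionPlusSketch and its plus function are hypothetical, and Int addition stands in for the inner semigroup; this is not the library's OptionMonoid.

```scala
// Hypothetical model of the Option addition rules shown above.
object OptionPlusSketch {
  // None acts as the identity; two defined values are combined with the inner plus.
  def plus(l: Option[Int], r: Option[Int]): Option[Int] = (l, r) match {
    case (Some(a), Some(b)) => Some(a + b)
    case (None, other)      => other
    case (some, None)       => some
  }

  val both = plus(Some(5), Some(3)) // Some(8)
  val one  = plus(Some(5), None)    // Some(5)
  val zero = plus(None, None)       // None
}
```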
- final case class OrVal(get: Boolean) extends AnyVal with Product with Serializable
- trait Predecessible[T] extends Serializable
This is a typeclass to represent things which are countable down. Note that it is important that a value prev(t) is always less than t. Note that prev returns Option because this class comes with the notion that some items may reach a minimum key, which is None.
- sealed trait Preparer[A, T] extends Serializable
Preparer is a way to build up an Aggregator through composition using a more natural API: it allows you to start with the input type and describe a series of transformations and aggregations from there, rather than starting from the aggregation and composing "outwards" in both directions.
Uses of Preparer will always start with a call to Preparer[A], and end with a call to monoidAggregate or a related method, to produce an Aggregator instance.
- sealed trait Priority[+P, +F] extends AnyRef
Priority is a type class for prioritized implicit search.
This type class will attempt to provide an implicit instance of P (the preferred type). If that type is not available it will fall back to F (the fallback type). If neither type is available then a Priority[P, F] instance will not be available.
This type can be useful for problems where multiple algorithms can be used, depending on the type classes available.
Taken from non/algebra until we make algebird depend on non/algebra.
- class Product10Group[X, A, B, C, D, E, F, G, H, I, J] extends Product10Monoid[X, A, B, C, D, E, F, G, H, I, J] with Group[X]
Combine 10 groups into a product group
- class Product10Monoid[X, A, B, C, D, E, F, G, H, I, J] extends Product10Semigroup[X, A, B, C, D, E, F, G, H, I, J] with Monoid[X]
Combine 10 monoids into a product monoid
- class Product10Ring[X, A, B, C, D, E, F, G, H, I, J] extends Product10Group[X, A, B, C, D, E, F, G, H, I, J] with Ring[X]
Combine 10 rings into a product ring
- class Product10Semigroup[X, A, B, C, D, E, F, G, H, I, J] extends Semigroup[X]
Combine 10 semigroups into a product semigroup
- class Product11Group[X, A, B, C, D, E, F, G, H, I, J, K] extends Product11Monoid[X, A, B, C, D, E, F, G, H, I, J, K] with Group[X]
Combine 11 groups into a product group
- class Product11Monoid[X, A, B, C, D, E, F, G, H, I, J, K] extends Product11Semigroup[X, A, B, C, D, E, F, G, H, I, J, K] with Monoid[X]
Combine 11 monoids into a product monoid
- class Product11Ring[X, A, B, C, D, E, F, G, H, I, J, K] extends Product11Group[X, A, B, C, D, E, F, G, H, I, J, K] with Ring[X]
Combine 11 rings into a product ring
- class Product11Semigroup[X, A, B, C, D, E, F, G, H, I, J, K] extends Semigroup[X]
Combine 11 semigroups into a product semigroup
- class Product12Group[X, A, B, C, D, E, F, G, H, I, J, K, L] extends Product12Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L] with Group[X]
Combine 12 groups into a product group
- class Product12Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L] extends Product12Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L] with Monoid[X]
Combine 12 monoids into a product monoid
- class Product12Ring[X, A, B, C, D, E, F, G, H, I, J, K, L] extends Product12Group[X, A, B, C, D, E, F, G, H, I, J, K, L] with Ring[X]
Combine 12 rings into a product ring
- class Product12Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L] extends Semigroup[X]
Combine 12 semigroups into a product semigroup
- class Product13Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M] extends Product13Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M] with Group[X]
Combine 13 groups into a product group
- class Product13Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M] extends Product13Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M] with Monoid[X]
Combine 13 monoids into a product monoid
- class Product13Ring[X, A, B, C, D, E, F, G, H, I, J, K, L, M] extends Product13Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M] with Ring[X]
Combine 13 rings into a product ring
- class Product13Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M] extends Semigroup[X]
Combine 13 semigroups into a product semigroup
- class Product14Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N] extends Product14Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N] with Group[X]
Combine 14 groups into a product group
- class Product14Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N] extends Product14Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N] with Monoid[X]
Combine 14 monoids into a product monoid
- class Product14Ring[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N] extends Product14Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N] with Ring[X]
Combine 14 rings into a product ring
- class Product14Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N] extends Semigroup[X]
Combine 14 semigroups into a product semigroup
- class Product15Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] extends Product15Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] with Group[X]
Combine 15 groups into a product group
- class Product15Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] extends Product15Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] with Monoid[X]
Combine 15 monoids into a product monoid
- class Product15Ring[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] extends Product15Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] with Ring[X]
Combine 15 rings into a product ring
- class Product15Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] extends Semigroup[X]
Combine 15 semigroups into a product semigroup
- class Product16Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] extends Product16Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] with Group[X]
Combine 16 groups into a product group
- class Product16Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] extends Product16Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] with Monoid[X]
Combine 16 monoids into a product monoid
- class Product16Ring[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] extends Product16Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] with Ring[X]
Combine 16 rings into a product ring
- class Product16Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] extends Semigroup[X]
Combine 16 semigroups into a product semigroup
- class Product17Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] extends Product17Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] with Group[X]
Combine 17 groups into a product group
- class Product17Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] extends Product17Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] with Monoid[X]
Combine 17 monoids into a product monoid
- class Product17Ring[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] extends Product17Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] with Ring[X]
Combine 17 rings into a product ring
- class Product17Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] extends Semigroup[X]
Combine 17 semigroups into a product semigroup
- class Product18Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] extends Product18Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] with Group[X]
Combine 18 groups into a product group
- class Product18Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] extends Product18Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] with Monoid[X]
Combine 18 monoids into a product monoid
- class Product18Ring[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] extends Product18Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] with Ring[X]
Combine 18 rings into a product ring
- class Product18Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] extends Semigroup[X]
Combine 18 semigroups into a product semigroup
- class Product19Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] extends Product19Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] with Group[X]
Combine 19 groups into a product group
- class Product19Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] extends Product19Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] with Monoid[X]
Combine 19 monoids into a product monoid
- class Product19Ring[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] extends Product19Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] with Ring[X]
Combine 19 rings into a product ring
- class Product19Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] extends Semigroup[X]
Combine 19 semigroups into a product semigroup
- class Product20Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] extends Product20Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] with Group[X]
Combine 20 groups into a product group
- class Product20Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] extends Product20Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] with Monoid[X]
Combine 20 monoids into a product monoid
- class Product20Ring[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] extends Product20Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] with Ring[X]
Combine 20 rings into a product ring
- class Product20Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] extends Semigroup[X]
Combine 20 semigroups into a product semigroup
- class Product21Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] extends Product21Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] with Group[X]
Combine 21 groups into a product group
- class Product21Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] extends Product21Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] with Monoid[X]
Combine 21 monoids into a product monoid
- class Product21Ring[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] extends Product21Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] with Ring[X]
Combine 21 rings into a product ring
- class Product21Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] extends Semigroup[X]
Combine 21 semigroups into a product semigroup
- class Product22Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] extends Product22Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] with Group[X]
Combine 22 groups into a product group
- class Product22Monoid[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] extends Product22Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] with Monoid[X]
Combine 22 monoids into a product monoid
- class Product22Ring[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] extends Product22Group[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] with Ring[X]
Combine 22 rings into a product ring
- class Product22Semigroup[X, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] extends Semigroup[X]
Combine 22 semigroups into a product semigroup
- class Product2Group[X, A, B] extends Product2Monoid[X, A, B] with Group[X]
Combine 2 groups into a product group
- class Product2Monoid[X, A, B] extends Product2Semigroup[X, A, B] with Monoid[X]
Combine 2 monoids into a product monoid
- class Product2Ring[X, A, B] extends Product2Group[X, A, B] with Ring[X]
Combine 2 rings into a product ring
- class Product2Semigroup[X, A, B] extends Semigroup[X]
Combine 2 semigroups into a product semigroup
- class Product3Group[X, A, B, C] extends Product3Monoid[X, A, B, C] with Group[X]
Combine 3 groups into a product group
- class Product3Monoid[X, A, B, C] extends Product3Semigroup[X, A, B, C] with Monoid[X]
Combine 3 monoids into a product monoid
- class Product3Ring[X, A, B, C] extends Product3Group[X, A, B, C] with Ring[X]
Combine 3 rings into a product ring
- class Product3Semigroup[X, A, B, C] extends Semigroup[X]
Combine 3 semigroups into a product semigroup
- class Product4Group[X, A, B, C, D] extends Product4Monoid[X, A, B, C, D] with Group[X]
Combine 4 groups into a product group
- class Product4Monoid[X, A, B, C, D] extends Product4Semigroup[X, A, B, C, D] with Monoid[X]
Combine 4 monoids into a product monoid
- class Product4Ring[X, A, B, C, D] extends Product4Group[X, A, B, C, D] with Ring[X]
Combine 4 rings into a product ring
- class Product4Semigroup[X, A, B, C, D] extends Semigroup[X]
Combine 4 semigroups into a product semigroup
- class Product5Group[X, A, B, C, D, E] extends Product5Monoid[X, A, B, C, D, E] with Group[X]
Combine 5 groups into a product group
- class Product5Monoid[X, A, B, C, D, E] extends Product5Semigroup[X, A, B, C, D, E] with Monoid[X]
Combine 5 monoids into a product monoid
- class Product5Ring[X, A, B, C, D, E] extends Product5Group[X, A, B, C, D, E] with Ring[X]
Combine 5 rings into a product ring
- class Product5Semigroup[X, A, B, C, D, E] extends Semigroup[X]
Combine 5 semigroups into a product semigroup
- class Product6Group[X, A, B, C, D, E, F] extends Product6Monoid[X, A, B, C, D, E, F] with Group[X]
Combine 6 groups into a product group
- class Product6Monoid[X, A, B, C, D, E, F] extends Product6Semigroup[X, A, B, C, D, E, F] with Monoid[X]
Combine 6 monoids into a product monoid
- class Product6Ring[X, A, B, C, D, E, F] extends Product6Group[X, A, B, C, D, E, F] with Ring[X]
Combine 6 rings into a product ring
- class Product6Semigroup[X, A, B, C, D, E, F] extends Semigroup[X]
Combine 6 semigroups into a product semigroup
- class Product7Group[X, A, B, C, D, E, F, G] extends Product7Monoid[X, A, B, C, D, E, F, G] with Group[X]
Combine 7 groups into a product group
- class Product7Monoid[X, A, B, C, D, E, F, G] extends Product7Semigroup[X, A, B, C, D, E, F, G] with Monoid[X]
Combine 7 monoids into a product monoid
- class Product7Ring[X, A, B, C, D, E, F, G] extends Product7Group[X, A, B, C, D, E, F, G] with Ring[X]
Combine 7 rings into a product ring
- class Product7Semigroup[X, A, B, C, D, E, F, G] extends Semigroup[X]
Combine 7 semigroups into a product semigroup
- class Product8Group[X, A, B, C, D, E, F, G, H] extends Product8Monoid[X, A, B, C, D, E, F, G, H] with Group[X]
Combine 8 groups into a product group
- class Product8Monoid[X, A, B, C, D, E, F, G, H] extends Product8Semigroup[X, A, B, C, D, E, F, G, H] with Monoid[X]
Combine 8 monoids into a product monoid
- class Product8Ring[X, A, B, C, D, E, F, G, H] extends Product8Group[X, A, B, C, D, E, F, G, H] with Ring[X]
Combine 8 rings into a product ring
- class Product8Semigroup[X, A, B, C, D, E, F, G, H] extends Semigroup[X]
Combine 8 semigroups into a product semigroup
- class Product9Group[X, A, B, C, D, E, F, G, H, I] extends Product9Monoid[X, A, B, C, D, E, F, G, H, I] with Group[X]
Combine 9 groups into a product group
- class Product9Monoid[X, A, B, C, D, E, F, G, H, I] extends Product9Semigroup[X, A, B, C, D, E, F, G, H, I] with Monoid[X]
Combine 9 monoids into a product monoid
- class Product9Ring[X, A, B, C, D, E, F, G, H, I] extends Product9Group[X, A, B, C, D, E, F, G, H, I] with Ring[X]
Combine 9 rings into a product ring
- class Product9Semigroup[X, A, B, C, D, E, F, G, H, I] extends Semigroup[X]
Combine 9 semigroups into a product semigroup
- trait ProductGroups extends AnyRef
- trait ProductMonoids extends AnyRef
- trait ProductRings extends AnyRef
- trait ProductSemigroups extends AnyRef
- final class PureOp[A] extends AnyVal
- class QTree[A] extends Product6[Long, Int, Long, A, Option[QTree[A]], Option[QTree[A]]] with Serializable
- case class QTreeAggregator[T](percentile: Double, k: Int = QTreeAggregator.DefaultK)(implicit num: Numeric[T]) extends Aggregator[T, QTree[Unit], Intersection[InclusiveLower, InclusiveUpper, Double]] with QTreeAggregatorLike[T] with Product with Serializable
QTree aggregator is an aggregator that can be used to find the approximate percentile bounds. The items that are iterated over to produce this approximation cannot be negative. Returns an Intersection which represents the bounded approximation.
- trait QTreeAggregatorLike[T] extends AnyRef
- case class QTreeAggregatorLowerBound[T](percentile: Double, k: Int = QTreeAggregator.DefaultK)(implicit num: Numeric[T]) extends Aggregator[T, QTree[Unit], Double] with QTreeAggregatorLike[T] with Product with Serializable
QTreeAggregatorLowerBound is an aggregator that is used to find an approximate percentile. It is similar to a QTreeAggregator, but as a convenience it returns the lower bound of the percentile instead of an Intersection. Like a QTreeAggregator, the items that are iterated over to produce this approximation cannot be negative.
- class QTreeSemigroup[A] extends Semigroup[QTree[A]]
- sealed trait ResetState[+A] extends AnyRef
Used to represent cases where we need to periodically reset
a + b = a + b
|a + b = |(a + b)
a + |b = |b
|a + |b = |b
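The four rules above can be modelled in a few lines of plain Scala. In this sketch the types Keep and Reset are hypothetical names for a plain value and for the "|x" form respectively, and Int addition stands in for the underlying monoid; the library's own ResetState types are not used.

```scala
// Hypothetical model of the reset rules above; not the library's ResetStateMonoid.
object ResetStateSketch {
  sealed trait State
  final case class Keep(get: Int) extends State  // a plain value "a"
  final case class Reset(get: Int) extends State // the reset form "|a"

  def plus(l: State, r: State): State = (l, r) match {
    case (Keep(a), Keep(b))  => Keep(a + b)  // a + b = a + b
    case (Reset(a), Keep(b)) => Reset(a + b) // |a + b = |(a + b)
    case (_, Reset(b))       => Reset(b)     // a + |b = |b and |a + |b = |b
  }

  // Everything before the most recent reset is discarded.
  val events: List[State] = List(Keep(1), Keep(2), Reset(10), Keep(3))
  val total: State = events.reduce(plus) // Reset(13)
}
```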
- class ResetStateMonoid[A] extends Monoid[ResetState[A]]
- case class ResetValue[+A](get: A) extends ResetState[A] with Product with Serializable
- final class RichCBitSet extends AnyVal
- class RichTraversable[T] extends AnyRef
- sealed abstract class RightFolded[+In, +Out] extends AnyRef
- sealed abstract class RightFolded2[+In, +Out, +Acc] extends AnyRef
- class RightFolded2Monoid[In, Out, Acc] extends Monoid[RightFolded2[In, Out, Acc]]
- case class RightFoldedToFold[+In](in: List[In]) extends RightFolded[In, Nothing] with Product with Serializable
- case class RightFoldedToFold2[+In](in: List[In]) extends RightFolded2[In, Nothing, Nothing] with Product with Serializable
- case class RightFoldedValue[+Out](v: Out) extends RightFolded[Nothing, Out] with Product with Serializable
- case class RightFoldedValue2[+In, +Out, +Acc](v: Out, acc: Acc, rvals: List[In]) extends RightFolded2[In, Out, Acc] with Product with Serializable
- trait Ring[T] extends Group[T] with CommutativeGroup[T] with algebra.ring.Ring[T]
Ring: Group + multiplication (see: http://en.wikipedia.org/wiki/Ring_%28mathematics%29) and the three elements it defines:
- additive identity aka zero
- addition
- multiplication
Note, if you have distributive property, additive inverses, and multiplicative identity you can prove you have a commutative group under the ring:
(a + 1)*(b + 1) = a(b + 1) + (b + 1)
                = ab + a + b + 1
or:
(a + 1)*(b + 1) = (a + 1)b + (a + 1)
                = ab + b + a + 1
So: ab + a + b + 1 == ab + b + a + 1
Using the fact that -(ab) and -1 exist, we get:
a + b == b + a
- Annotations
- @implicitNotFound()
- trait RingAggregator[-A, B, +C] extends MonoidAggregator[A, B, C]
- sealed abstract class SGD[+Pos] extends AnyRef
- class SGDMonoid[Pos] extends Monoid[SGD[Pos]]
Basically a specific implementation of the RightFoldedMonoid. Here, gradient is the gradient of the function to be minimized. To use this, you need to insert an initial weight SGDWeights before you start adding SGDPos objects. Otherwise you will just be doing list concatenation.
- case class SGDPos[+Pos](pos: List[Pos]) extends SGD[Pos] with Product with Serializable
- case class SGDWeights(count: Long, weights: IndexedSeq[Double]) extends SGD[Nothing] with Product with Serializable
- case class SSMany[T] extends SpaceSaver[T] with Product with Serializable
- case class SSOne[T] extends SpaceSaver[T] with Product with Serializable
- class ScMapGroup[K, V] extends ScMapMonoid[K, V] with Group[Map[K, V]]
- class ScMapMonoid[K, V] extends GenericMapMonoid[K, V, Map[K, V]]
- class ScMapRing[K, V] extends ScMapGroup[K, V] with GenericMapRing[K, V, Map[K, V]]
- sealed abstract class Scan[-I, +O] extends Serializable
The Scan trait is an alternative to the scanLeft method on iterators/other collections for a range of use-cases where scanLeft is awkward to use. At a high level it provides some of the same functionality as scanLeft, but with a separation of "what is the state of the scan" from "what are the elements that I'm scanning over?". In particular, when scanning over an iterator with N elements, the output is an iterator with N elements (in contrast to scanLeft's N+1).
If you find yourself writing a scanLeft over pairs of elements, where you only use one element of the pair within the scanLeft, then throw that element away in a map immediately after the scanLeft is done, then this abstraction is for you.
The canonical method to use a scan is apply.
- I
The type of elements that the computation is scanning over.
- O
The output type of the scan (typically distinct from the hidden State of the scan).
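The N versus N+1 difference described above can be seen with plain Scala collections. The following is a minimal sketch only: scanWithState is a hypothetical helper written for illustration, not the Scan API itself, and it assumes the state is threaded as a step function of shape (input, state) => (output, nextState).

```scala
object ScanSketch {
  val input: List[Int] = List(1, 2, 3)

  // scanLeft emits the initial state as well, so N inputs give N+1 outputs.
  val viaScanLeft: List[Int] = input.scanLeft(0)(_ + _)

  // Hypothetical helper (not part of Algebird): threads a hidden state through
  // the iterator and emits exactly one output per input element.
  def scanWithState[I, S, O](items: Iterator[I], init: S)(step: (I, S) => (O, S)): Iterator[O] = {
    var state = init
    items.map { i =>
      val (out, next) = step(i, state)
      state = next
      out
    }
  }

  // Running sum: the state is the sum so far, the output is the updated sum.
  val running: List[Int] =
    scanWithState(input.iterator, 0) { (i, sum) =>
      val next = sum + i
      (next, next)
    }.toList
}
```

Here viaScanLeft has four elements while running has three, one per input element.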
- class ScanApplicative[I] extends Applicative[[β$1$]Scan[I, β$1$]]
- class ScopedTopNCMSMonoid[K1, K2] extends TopCMSMonoid[(K1, K2)]
- case class ScopedTopNLogic[K1, K2](heavyHittersN: Int) extends HeavyHittersLogic[(K1, K2)] with Product with Serializable
K1 defines a scope for the CMS. For each k1, keep the top heavyHittersN associated k2 values.
- trait Semigroup[T] extends algebra.Semigroup[T] with AdditiveSemigroup[T]
A semigroup is any type T with an associative operation (plus):
a plus (b plus c) = (a plus b) plus c
Example instances:
- Semigroup[Int]: plus is Int#+
- Semigroup[List[T]]: plus is List#++
- Annotations
- @implicitNotFound()
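The associativity law for plus can be checked on concrete values. This is a minimal sketch using a hypothetical ToySemigroup trait defined here for illustration; it is not Algebird's Semigroup and only mirrors the single plus operation described above.

```scala
object SemigroupSketch {
  // Hypothetical stand-in for a semigroup: one binary operation.
  trait ToySemigroup[T] { def plus(a: T, b: T): T }

  val intPlus: ToySemigroup[Int] = new ToySemigroup[Int] {
    def plus(a: Int, b: Int): Int = a + b
  }

  def listPlus[T]: ToySemigroup[List[T]] = new ToySemigroup[List[T]] {
    def plus(a: List[T], b: List[T]): List[T] = a ++ b
  }

  // Subtraction has the right type but is NOT associative, so it is not a valid semigroup.
  val intMinus: ToySemigroup[Int] = new ToySemigroup[Int] {
    def plus(a: Int, b: Int): Int = a - b
  }

  // Checks a plus (b plus c) == (a plus b) plus c for one triple of values.
  def isAssociative[T](sg: ToySemigroup[T])(a: T, b: T, c: T): Boolean =
    sg.plus(a, sg.plus(b, c)) == sg.plus(sg.plus(a, b), c)
}
```

A check on a single triple can only refute associativity, never prove it; a property-based test over many triples is the usual way to gain confidence.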
- class SemigroupCombinator[A, B] extends Semigroup[(A, B)]
This is a combinator on semigroups: after you do the plus, you transform B with a fold function. This will not be valid for all fold functions. You need to prove that it is still associative.
Clearly only values of (a,b) are valid if fold(a,b) == b, so keep that in mind.
I have not yet found a sufficient condition on (A,B) => B that makes it correct. Clearly a (trivial) constant function {(l,r) => r} works. Also, if B is List[T], and (l:A,r:List[T]) => r.sortBy(fn(l)) this works as well (due to the associativity on A, and the fact that the list never loses data).
For approximate lists (like top-K applications) this might work (or be close enough to associative that for approximation algorithms it is fine), and in fact, that is the main motivation of this code: Produce some ordering in A, and use it to do sorted-topK on the list in B.
Seems like an open topic here.... you are obliged to think on your own about this.
- class SentinelCache[K, V] extends AnyRef
This is a summing cache whose goal is to grow until we run out of memory, at which point it clears itself and stops growing. Note that we can lose the values in this cache at any point; we don't put anything here we care about.
- class SeqMonoid[T] extends Monoid[Seq[T]]
- sealed abstract case class SetDiff[T] extends Product with Serializable
SetDiff is a class that represents changes applied to a set. It is in fact a Set[T] => Set[T], but doesn't extend Function1 since that brings in a pack of methods that we don't necessarily want.
- class SetMonoid[T] extends Monoid[Set[T]]
Set union monoid: plus means union, zero is the empty set.
- case class SetSizeAggregator[A](hllBits: Int, maxSetSize: Int = 10)(implicit toBytes: (A) => Array[Byte]) extends SetSizeAggregatorBase[A] with Product with Serializable
- abstract class SetSizeAggregatorBase[A] extends EventuallyMonoidAggregator[A, HLL, Set[A], Long]
convert is not implemented here
- case class SetSizeHashAggregator[A](hllBits: Int, maxSetSize: Int = 10)(implicit hash: Hash128[A]) extends SetSizeAggregatorBase[A] with Product with Serializable
Use a Hash128 when converting to HLL, rather than an implicit conversion to Array[Byte]. Unifying with SetSizeAggregator would be nice, but since they only differ in an implicit parameter, scala seems to be giving me errors.
- case class SetValue[+A](get: A) extends ResetState[A] with Product with Serializable
- case class SketchMap[K, V](valuesTable: AdaptiveMatrix[V], heavyHitterKeys: List[K], totalValue: V) extends Serializable with Product
- case class SketchMapAggregator[K, V](params: SketchMapParams[K], skmMonoid: SketchMapMonoid[K, V])(implicit evidence$3: Ordering[V], evidence$4: Monoid[V]) extends MonoidAggregator[(K, V), SketchMap[K, V], SketchMap[K, V]] with Product with Serializable
An Aggregator for the SketchMap. Can be created using SketchMap.aggregator
- case class SketchMapHash[K](hasher: CMSHash[Long], seed: Int)(implicit serialization: (K) => Array[Byte]) extends Product with Serializable
Hashes an arbitrary key type to one that the Sketch Map can use.
- class SketchMapMonoid[K, V] extends Monoid[SketchMap[K, V]] with CommutativeMonoid[SketchMap[K, V]]
Responsible for creating instances of SketchMap.
- case class SketchMapParams[K](seed: Int, width: Int, depth: Int, heavyHittersCount: Int)(implicit serialization: (K) => Array[Byte]) extends Product with Serializable
Convenience class for holding constant parameters of a Sketch Map.
- sealed abstract class SpaceSaver[T] extends AnyRef
Data structure used in the Space-Saving Algorithm to find the approximate most frequent and top-k elements. The algorithm is described in "Efficient Computation of Frequent and Top-k Elements in Data Streams". See here: www.cs.ucsb.edu/research/tech_reports/reports/2005-23.pdf In the paper the data structure is called StreamSummary but we chose to call it SpaceSaver instead. Note that the adaptation to hadoop and parallelization were not described in the article and have not been proven to be mathematically correct or preserve the guarantees or benefits of the algorithm.
- class SpaceSaverSemigroup[T] extends Semigroup[SpaceSaver[T]]
- case class SparseCMS[K](exactCountTable: Map[K, Long], totalCount: Long, params: CMSParams[K]) extends CMS[K] with Product with Serializable
A sparse Count-Min sketch structure, used for situations where the key is highly skewed.
- case class SparseHLL(bits: Int, maxRhow: Map[Int, Max[Byte]]) extends HLL with Product with Serializable
- case class SparseVector[V](map: Map[Int, V], sparseValue: V, length: Int) extends AdaptiveVector[V] with Product with Serializable
- trait StatefulSummer[V] extends Buffered[V, V]
A Stateful summer is something that is potentially more efficient (a buffer, a cache, etc...) that has the same result as a sum:
Law 1: Semigroup.sumOption(items) == (Monoid.plus(items.map { stateful.put(_) }.filter { _.isDefined }, stateful.flush) && stateful.isFlushed)
Law 2: isFlushed == flush.isEmpty
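The two laws can be illustrated with a small buffered summer. This is a sketch under assumptions: ToySummer and BufferedIntSummer are hypothetical names defined here, and the put/flush/isFlushed shape is inferred from the laws as stated rather than copied from the library.

```scala
object StatefulSummerSketch {
  // Hypothetical trait with the three operations the laws mention.
  trait ToySummer[V] {
    def put(v: V): Option[V] // may emit a partial sum
    def flush: Option[V]     // emits whatever is still buffered
    def isFlushed: Boolean
  }

  // Buffers up to `capacity` Ints, then emits their sum.
  final class BufferedIntSummer(capacity: Int) extends ToySummer[Int] {
    private var buffer: List[Int] = Nil

    def put(v: Int): Option[Int] = {
      buffer = v :: buffer
      if (buffer.size >= capacity) flush else None
    }

    def flush: Option[Int] = {
      val out = if (buffer.isEmpty) None else Some(buffer.sum)
      buffer = Nil
      out
    }

    def isFlushed: Boolean = buffer.isEmpty
  }

  val items: List[Int] = List(1, 2, 3, 4, 5)
  val summer = new BufferedIntSummer(2)

  // Partial sums emitted while putting, followed by the final flush.
  val emitted: List[Int] = items.flatMap(v => summer.put(v).toList)
  val remainder: Option[Int] = summer.flush
  val total: Int = (emitted ++ remainder.toList).sum
}
```

Summing the emitted partial sums together with the final flush gives the same total as summing the items directly, which is the content of Law 1.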
- trait Successible[T] extends Serializable
This is a typeclass to represent things which increase. Note that it is important that a value after being incremented is always larger than it was before. Note that next returns Option because this class comes with the notion of the "greatest" key, which is None. Ints, for example, will cycle if next(java.lang.Integer.MAX_VALUE) is called, therefore we need a notion of what happens when we hit the bounds at which our ordering is violated. This is also useful for closed sets which have a fixed progression.
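Why next returns an Option can be shown with Int. The sketch below uses a hypothetical ToySuccessible trait defined here for illustration, not the library's Successible; it only models the next operation described above.

```scala
object SuccessibleSketch {
  // Hypothetical stand-in: next is None once the greatest value is reached.
  trait ToySuccessible[T] { def next(t: T): Option[T] }

  val intSucc: ToySuccessible[Int] = new ToySuccessible[Int] {
    def next(t: Int): Option[Int] =
      if (t == Int.MaxValue) None else Some(t + 1)
  }

  // Plain Int arithmetic wraps around, which would violate the ordering.
  val wrapped: Int = Int.MaxValue + 1

  // Walks forward from `start` until there is no next value.
  def from[T](start: T)(succ: ToySuccessible[T]): Iterator[T] =
    Iterator
      .iterate(Option(start))(_.flatMap(succ.next))
      .takeWhile(_.isDefined)
      .map(_.get)
}
```

Without the Option, incrementing Int.MaxValue would silently produce Int.MinValue, a value smaller than its predecessor.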
- class SumAll[V] extends StatefulSummer[V]
Sum the entire iterator one item at a time. Only emits on flush. You should probably prefer BufferedSumAll.
- class SummingCache[K, V] extends StatefulSummer[Map[K, V]]
A Stateful Summer on Map[K,V] that keeps a cache of recent keys
- class SummingIterator[V] extends Serializable with Iterator[V]
- class SummingQueue[V] extends StatefulSummer[V]
- class SummingWithHitsCache[K, V] extends SummingCache[K, V]
A SummingCache that also tracks the number of key hits
- sealed abstract class TopCMS[K] extends Serializable with CMSCounting[K, TopCMS] with CMSHeavyHitters[K]
A Count-Min sketch data structure that allows for (a) counting and frequency estimation of elements in a data stream and (b) tracking the heavy hitters among these elements.
The logic of how heavy hitters are computed is pluggable, see HeavyHittersLogic.
Tip: If you do not need to track heavy hitters, take a look at CMS, which is more efficient in this case.
Usage
This example demonstrates how to count Long elements with TopCMS, i.e. K=Long.
Note that the actual counting is always performed with a Long, regardless of your choice of K. That is, the counting table behind the scenes is backed by Long values (at least in the current implementation), and thus the returned frequency estimates are always instances of Approximate[Long].
- K
The type used to identify the elements to be counted.
Example:
    // Creates a monoid for a CMS that can count `Long` elements.
    val topPctCMSMonoid: TopPctCMSMonoid[Long] = {
      val eps = 0.001
      val delta = 1E-10
      val seed = 1
      val heavyHittersPct = 0.1
      TopPctCMS.monoid[Long](eps, delta, seed, heavyHittersPct)
    }
    // Creates a TopCMS instance that has counted the element `1L`.
    val topCMS: TopCMS[Long] = topPctCMSMonoid.create(1L)
    // Estimates the frequency of `1L`
    val estimate: Approximate[Long] = topCMS.frequency(1L)
    // What are the heavy hitters so far?
    val heavyHitters: Set[Long] = topCMS.heavyHitters
- class TopCMSAggregator[K] extends MonoidAggregator[K, TopCMS[K], TopCMS[K]]
- case class TopCMSInstance[K](cms: CMS[K], hhs: HeavyHitters[K], params: TopCMSParams[K]) extends TopCMS[K] with Product with Serializable
- case class TopCMSItem[K](item: K, cms: CMS[K], params: TopCMSParams[K]) extends TopCMS[K] with Product with Serializable
Used for holding a single element, to avoid repeatedly adding elements from sparse counts tables.
- class TopCMSMonoid[K] extends Monoid[TopCMS[K]]
- case class TopCMSParams[K](logic: HeavyHittersLogic[K]) extends Product with Serializable
- case class TopCMSZero[K](cms: CMS[K], params: TopCMSParams[K]) extends TopCMS[K] with Product with Serializable
Zero element. Used for initialization.
- case class TopK[N](size: Int, items: List[N], max: Option[N]) extends Product with Serializable
- class TopKMonoid[T] extends Monoid[TopK[T]]
A top-k monoid that is much faster than SortedListTake. Equivalent to: (left ++ right).sorted.take(k) but doesn't do a total sort. If you can handle the mutability, mutable.PriorityQueueMonoid is even faster.
NOTE!!!! This assumes the inputs are already sorted! Resorting each time kills speed.
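To make the "equivalent to (left ++ right).sorted.take(k) but without a total sort" claim concrete, here is a minimal sketch. It is not the library's implementation: mergeTake is a hypothetical helper that merges two already-sorted lists and stops after k elements, and naive is the reference expression from the description.

```scala
object TopKSketch {
  // Reference behaviour from the description: concatenate, sort everything, take k.
  def naive(k: Int)(left: List[Int], right: List[Int]): List[Int] =
    (left ++ right).sorted.take(k)

  // Hypothetical helper: assumes both inputs are already sorted ascending,
  // and walks them in lockstep, stopping as soon as k elements are collected.
  def mergeTake(k: Int)(left: List[Int], right: List[Int]): List[Int] = {
    @annotation.tailrec
    def loop(l: List[Int], r: List[Int], remaining: Int, acc: List[Int]): List[Int] =
      if (remaining == 0) acc.reverse
      else (l, r) match {
        case (Nil, Nil)    => acc.reverse
        case (h :: t, Nil) => loop(t, Nil, remaining - 1, h :: acc)
        case (Nil, h :: t) => loop(Nil, t, remaining - 1, h :: acc)
        case (lh :: lt, rh :: rt) =>
          if (lh <= rh) loop(lt, r, remaining - 1, lh :: acc)
          else loop(l, rt, remaining - 1, rh :: acc)
      }
    loop(left, right, k, Nil)
  }
}
```

The merge touches at most k elements, whereas the naive version sorts all of left ++ right; the two agree only when the inputs are sorted, which is the assumption the note above warns about.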
- class TopKToListAggregator[A] extends MonoidAggregator[A, TopK[A], List[A]]
- case class TopNCMSAggregator[K](cmsMonoid: TopNCMSMonoid[K]) extends TopCMSAggregator[K] with Product with Serializable
An Aggregator for TopNCMS. Can be created using TopNCMS.aggregator.
- class TopNCMSMonoid[K] extends TopCMSMonoid[K]
Monoid for top-N based TopCMS sketches. Use with care! (see warning below)
Warning: Adding top-N CMS instances (++) is an unsafe operation
Top-N computations are not associative. The effect is that a top-N CMS has an ordering bias (with regard to heavy hitters) when merging CMS instances (e.g. via ++). This means merging heavy hitters across CMS instances may lead to incorrect, biased results: the outcome is biased by the order in which CMS instances / heavy hitters are being merged, with the rule of thumb being that the earlier a set of heavy hitters is being merged, the more likely is the end result biased towards these heavy hitters.
The warning above only applies when adding CMS instances (think: cms1 ++ cms2). In comparison, heavy hitters are correctly computed when:
- a top-N CMS instance is created from a single data stream, i.e. Seq[K]
- items are added/counted individually, i.e. cms + item or cms + (item, count).
See the discussion in Algebird issue 353 for further details.
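The ordering bias can be reproduced with a deliberately simplified toy model. Everything below is an assumption made for illustration: ToyTopN keeps exact counts plus a truncated set of heavy-hitter keys, and merge only considers keys that are already heavy hitters on either side. It is not how the library's sketches are implemented, but it shows the same failure mode with N = 1.

```scala
object TopNBiasSketch {
  // Toy model (hypothetical): exact counts, plus only the top-n keys remembered.
  final case class ToyTopN(counts: Map[String, Long], heavyHitters: Set[String])

  def merge(n: Int)(l: ToyTopN, r: ToyTopN): ToyTopN = {
    // Counts merge exactly.
    val counts: Map[String, Long] =
      (l.counts.keySet ++ r.counts.keySet)
        .map(k => k -> (l.counts.getOrElse(k, 0L) + r.counts.getOrElse(k, 0L)))
        .toMap
    // Only keys remembered as heavy hitters on either side are candidates.
    val candidates = l.heavyHitters ++ r.heavyHitters
    val top = candidates.toList.sortBy(k => (-counts.getOrElse(k, 0L), k)).take(n).toSet
    ToyTopN(counts, top)
  }

  val p1 = ToyTopN(Map("a" -> 5L), Set("a"))
  val p2 = ToyTopN(Map("b" -> 3L), Set("b"))
  val p3 = ToyTopN(Map("b" -> 3L, "c" -> 4L), Set("c"))

  // (p1 ++ p2) ++ p3: "b" is dropped in the first merge and never reconsidered.
  val leftFirst: Set[String] = merge(1)(merge(1)(p1, p2), p3).heavyHitters
  // p1 ++ (p2 ++ p3): "b" reaches a total of 6 before it meets "a".
  val rightFirst: Set[String] = merge(1)(p1, merge(1)(p2, p3)).heavyHitters
}
```

Both groupings end with identical counts (a = 5, b = 6, c = 4), yet they report different heavy hitters, so the merge in this model is not associative.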
Alternatives
The following, alternative data structures may be better picks than a top-N based CMS given the warning above:
- TopPctCMS: Has safe merge semantics for its instances including heavy hitters.
- SpaceSaver: Has the same ordering bias as a top-N CMS, but at least it provides bounds on the bias.
Usage
The type K is the type of items you want to count. You must provide an implicit CMSHasher[K] for K, and Algebird ships with several such implicits for commonly used types such as Long and BigInt.
If your type K is not supported out of the box, you have two options: 1) You provide a "translation" function to convert items of your (unsupported) type K to a supported type such as Double, and then use the contramap function of CMSHasher to create the required CMSHasher[K] for your type (see the documentation of CMSHasher for an example); 2) You implement a CMSHasher[K] from scratch, using the existing CMSHasher implementations as a starting point.
Note: Because Arrays in Scala/Java do not have sane equals and hashCode implementations, you cannot safely use types such as Array[Byte]. Extra work is required for Arrays. For example, you may opt to convert Array[T] to a Seq[T] via toSeq, or you can provide appropriate wrapper classes. Algebird provides one such wrapper class, Bytes, to safely wrap an Array[Byte] for use with CMS.
- K
The type used to identify the elements to be counted. For example, if you want to count the occurrence of user names, you could map each username to a unique numeric ID expressed as a Long, and then count the occurrences of those Longs with a CMS of type K=Long. Note that this mapping between the elements of your problem domain and their identifiers used for counting via CMS should be bijective. We require a CMSHasher context bound for K, see CMSHasher for available implicits that can be imported. Which type K should you pick in practice? For domains that have less than 2^64 unique elements, you'd typically use Long. For larger domains you can try BigInt, for example.
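The note about Arrays can be checked with the standard library alone. This sketch uses only plain Scala collections (no Algebird types) to show why an Array makes a poor key and why converting via toSeq helps.

```scala
object ArrayKeySketch {
  val a: Array[Byte] = Array[Byte](1, 2, 3)
  val b: Array[Byte] = Array[Byte](1, 2, 3)

  // Arrays compare by reference, so two arrays with equal contents are not equal.
  val arraysEqual: Boolean = a == b

  // Seqs compare structurally and hash consistently with that equality.
  val seqsEqual: Boolean = a.toSeq == b.toSeq
  val seqHashesEqual: Boolean = a.toSeq.hashCode == b.toSeq.hashCode

  // The same difference shows up when the values are used as map keys.
  val foundByArray: Boolean = Map(a -> 1L).contains(b)
  val foundBySeq: Boolean = Map(a.toSeq -> 1L).contains(b.toSeq)
}
```

Any structure that relies on key equality, a counting sketch included, inherits this behaviour, which is why a structural wrapper is needed for byte arrays.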
- case class TopNLogic[K](heavyHittersN: Int) extends HeavyHittersLogic[K] with Product with Serializable
Tracks the top N heavy hitters, where N is defined by heavyHittersN.
Warning: top-N computations are not associative. The effect is that a top-N CMS has an ordering bias (with regard to heavy hitters) when merging instances. This means merging heavy hitters across CMS instances may lead to incorrect, biased results: the outcome is biased by the order in which CMS instances / heavy hitters are being merged, with the rule of thumb being that the earlier a set of heavy hitters is being merged, the more likely is the end result biased towards these heavy hitters.
- See also
Discussion in Algebird issue 353
- case class TopPctCMSAggregator[K](cmsMonoid: TopPctCMSMonoid[K]) extends TopCMSAggregator[K] with Product with Serializable
An Aggregator for TopPctCMS. Can be created using TopPctCMS.aggregator.
- class TopPctCMSMonoid[K] extends TopCMSMonoid[K]
Monoid for Top-% based TopCMS sketches.
Usage
The type K is the type of items you want to count. You must provide an implicit CMSHasher[K] for K, and Algebird ships with several such implicits for commonly used types such as Long and BigInt.
If your type K is not supported out of the box, you have two options: 1) You provide a "translation" function to convert items of your (unsupported) type K to a supported type such as Double, and then use the contramap function of CMSHasher to create the required CMSHasher[K] for your type (see the documentation of CMSHasher for an example); 2) You implement a CMSHasher[K] from scratch, using the existing CMSHasher implementations as a starting point.
Note: Because Arrays in Scala/Java do not have sane equals and hashCode implementations, you cannot safely use types such as Array[Byte]. Extra work is required for Arrays. For example, you may opt to convert Array[T] to a Seq[T] via toSeq, or you can provide appropriate wrapper classes. Algebird provides one such wrapper class, Bytes, to safely wrap an Array[Byte] for use with CMS.
- K
The type used to identify the elements to be counted. For example, if you want to count the occurrence of user names, you could map each username to a unique numeric ID expressed as a Long, and then count the occurrences of those Longs with a CMS of type K=Long. Note that this mapping between the elements of your problem domain and their identifiers used for counting via CMS should be bijective. We require a CMSHasher context bound for K, see CMSHasher for available implicits that can be imported. Which type K should you pick in practice? For domains that have less than 2^64 unique elements, you'd typically use Long. For larger domains you can try BigInt, for example.
- case class TopPctLogic[K](heavyHittersPct: Double) extends HeavyHittersLogic[K] with Product with Serializable
Finds all heavy hitters, i.e., elements in the stream that appear at least (heavyHittersPct * totalCount) times.
Every item that appears at least (heavyHittersPct * totalCount) times is output, and with probability p >= 1 - delta, no item whose count is less than (heavyHittersPct - eps) * totalCount is output.
This also means that this parameter is an upper bound on the number of heavy hitters that will be tracked: the set of heavy hitters contains at most 1 / heavyHittersPct elements. For example, if heavyHittersPct=0.01 (or 0.25), then at most 1 / 0.01 = 100 items (or 1 / 0.25 = 4 items) will be tracked/returned as heavy hitters. This parameter can thus control the memory footprint required for tracking heavy hitters.
- class Tuple10Group[A, B, C, D, E, F, G, H, I, J] extends Tuple10Monoid[A, B, C, D, E, F, G, H, I, J] with Group[(A, B, C, D, E, F, G, H, I, J)]
Combine 10 groups into a product group
- class Tuple10Monoid[A, B, C, D, E, F, G, H, I, J] extends Tuple10Semigroup[A, B, C, D, E, F, G, H, I, J] with Monoid[(A, B, C, D, E, F, G, H, I, J)]
Combine 10 monoids into a product monoid
- class Tuple10Ring[A, B, C, D, E, F, G, H, I, J] extends Tuple10Group[A, B, C, D, E, F, G, H, I, J] with Ring[(A, B, C, D, E, F, G, H, I, J)]
Combine 10 rings into a product ring
- class Tuple10Semigroup[A, B, C, D, E, F, G, H, I, J] extends Semigroup[(A, B, C, D, E, F, G, H, I, J)]
Combine 10 semigroups into a product semigroup
- class Tuple11Group[A, B, C, D, E, F, G, H, I, J, K] extends Tuple11Monoid[A, B, C, D, E, F, G, H, I, J, K] with Group[(A, B, C, D, E, F, G, H, I, J, K)]
Combine 11 groups into a product group
- class Tuple11Monoid[A, B, C, D, E, F, G, H, I, J, K] extends Tuple11Semigroup[A, B, C, D, E, F, G, H, I, J, K] with Monoid[(A, B, C, D, E, F, G, H, I, J, K)]
Combine 11 monoids into a product monoid
- class Tuple11Ring[A, B, C, D, E, F, G, H, I, J, K] extends Tuple11Group[A, B, C, D, E, F, G, H, I, J, K] with Ring[(A, B, C, D, E, F, G, H, I, J, K)]
Combine 11 rings into a product ring
- class Tuple11Semigroup[A, B, C, D, E, F, G, H, I, J, K] extends Semigroup[(A, B, C, D, E, F, G, H, I, J, K)]
Combine 11 semigroups into a product semigroup
- class Tuple12Group[A, B, C, D, E, F, G, H, I, J, K, L] extends Tuple12Monoid[A, B, C, D, E, F, G, H, I, J, K, L] with Group[(A, B, C, D, E, F, G, H, I, J, K, L)]
Combine 12 groups into a product group
- class Tuple12Monoid[A, B, C, D, E, F, G, H, I, J, K, L] extends Tuple12Semigroup[A, B, C, D, E, F, G, H, I, J, K, L] with Monoid[(A, B, C, D, E, F, G, H, I, J, K, L)]
Combine 12 monoids into a product monoid
- class Tuple12Ring[A, B, C, D, E, F, G, H, I, J, K, L] extends Tuple12Group[A, B, C, D, E, F, G, H, I, J, K, L] with Ring[(A, B, C, D, E, F, G, H, I, J, K, L)]
Combine 12 rings into a product ring
- class Tuple12Semigroup[A, B, C, D, E, F, G, H, I, J, K, L] extends Semigroup[(A, B, C, D, E, F, G, H, I, J, K, L)]
Combine 12 semigroups into a product semigroup
- class Tuple13Group[A, B, C, D, E, F, G, H, I, J, K, L, M] extends Tuple13Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M] with Group[(A, B, C, D, E, F, G, H, I, J, K, L, M)]
Combine 13 groups into a product group
- class Tuple13Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M] extends Tuple13Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M] with Monoid[(A, B, C, D, E, F, G, H, I, J, K, L, M)]
Combine 13 monoids into a product monoid
- class Tuple13Ring[A, B, C, D, E, F, G, H, I, J, K, L, M] extends Tuple13Group[A, B, C, D, E, F, G, H, I, J, K, L, M] with Ring[(A, B, C, D, E, F, G, H, I, J, K, L, M)]
Combine 13 rings into a product ring
- class Tuple13Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M] extends Semigroup[(A, B, C, D, E, F, G, H, I, J, K, L, M)]
Combine 13 semigroups into a product semigroup
- class Tuple14Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N] extends Tuple14Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N] with Group[(A, B, C, D, E, F, G, H, I, J, K, L, M, N)]
Combine 14 groups into a product group
- class Tuple14Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N] extends Tuple14Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N] with Monoid[(A, B, C, D, E, F, G, H, I, J, K, L, M, N)]
Combine 14 monoids into a product monoid
- class Tuple14Ring[A, B, C, D, E, F, G, H, I, J, K, L, M, N] extends Tuple14Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N] with Ring[(A, B, C, D, E, F, G, H, I, J, K, L, M, N)]
Combine 14 rings into a product ring
- class Tuple14Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N] extends Semigroup[(A, B, C, D, E, F, G, H, I, J, K, L, M, N)]
Combine 14 semigroups into a product semigroup
- class Tuple15Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] extends Tuple15Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] with Group[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O)]
Combine 15 groups into a product group
- class Tuple15Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] extends Tuple15Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] with Monoid[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O)]
Combine 15 monoids into a product monoid
- class Tuple15Ring[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] extends Tuple15Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] with Ring[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O)]
Combine 15 rings into a product ring
- class Tuple15Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] extends Semigroup[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O)]
Combine 15 semigroups into a product semigroup
- class Tuple16Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] extends Tuple16Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] with Group[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P)]
Combine 16 groups into a product group
- class Tuple16Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] extends Tuple16Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] with Monoid[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P)]
Combine 16 monoids into a product monoid
- class Tuple16Ring[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] extends Tuple16Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] with Ring[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P)]
Combine 16 rings into a product ring
- class Tuple16Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] extends Semigroup[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P)]
Combine 16 semigroups into a product semigroup
- class Tuple17Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] extends Tuple17Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] with Group[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q)]
Combine 17 groups into a product group
- class Tuple17Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] extends Tuple17Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] with Monoid[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q)]
Combine 17 monoids into a product monoid
- class Tuple17Ring[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] extends Tuple17Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] with Ring[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q)]
Combine 17 rings into a product ring
- class Tuple17Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] extends Semigroup[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q)]
Combine 17 semigroups into a product semigroup
- class Tuple18Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] extends Tuple18Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] with Group[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R)]
Combine 18 groups into a product group
- class Tuple18Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] extends Tuple18Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] with Monoid[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R)]
Combine 18 monoids into a product monoid
- class Tuple18Ring[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] extends Tuple18Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] with Ring[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R)]
Combine 18 rings into a product ring
- class Tuple18Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] extends Semigroup[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R)]
Combine 18 semigroups into a product semigroup
- class Tuple19Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] extends Tuple19Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] with Group[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S)]
Combine 19 groups into a product group
- class Tuple19Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] extends Tuple19Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] with Monoid[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S)]
Combine 19 monoids into a product monoid
- class Tuple19Ring[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] extends Tuple19Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] with Ring[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S)]
Combine 19 rings into a product ring
- class Tuple19Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] extends Semigroup[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S)]
Combine 19 semigroups into a product semigroup
- class Tuple20Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] extends Tuple20Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] with Group[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T)]
Combine 20 groups into a product group
- class Tuple20Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] extends Tuple20Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] with Monoid[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T)]
Combine 20 monoids into a product monoid
- class Tuple20Ring[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] extends Tuple20Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] with Ring[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T)]
Combine 20 rings into a product ring
- class Tuple20Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] extends Semigroup[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T)]
Combine 20 semigroups into a product semigroup
- class Tuple21Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] extends Tuple21Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] with Group[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U)]
Combine 21 groups into a product group
- class Tuple21Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] extends Tuple21Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] with Monoid[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U)]
Combine 21 monoids into a product monoid
- class Tuple21Ring[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] extends Tuple21Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] with Ring[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U)]
Combine 21 rings into a product ring
- class Tuple21Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] extends Semigroup[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U)]
Combine 21 semigroups into a product semigroup
- class Tuple22Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] extends Tuple22Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] with Group[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V)]
Combine 22 groups into a product group
- class Tuple22Monoid[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] extends Tuple22Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] with Monoid[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V)]
Combine 22 monoids into a product monoid
- class Tuple22Ring[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] extends Tuple22Group[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] with Ring[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V)]
Combine 22 rings into a product ring
- class Tuple22Semigroup[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] extends Semigroup[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V)]
Combine 22 semigroups into a product semigroup
- class Tuple2Group[A, B] extends Tuple2Monoid[A, B] with Group[(A, B)]
Combine 2 groups into a product group
- class Tuple2Monoid[A, B] extends Tuple2Semigroup[A, B] with Monoid[(A, B)]
Combine 2 monoids into a product monoid
- class Tuple2Ring[A, B] extends Tuple2Group[A, B] with Ring[(A, B)]
Combine 2 rings into a product ring
- class Tuple2Semigroup[A, B] extends Semigroup[(A, B)]
Combine 2 semigroups into a product semigroup
- class Tuple3Group[A, B, C] extends Tuple3Monoid[A, B, C] with Group[(A, B, C)]
Combine 3 groups into a product group
- class Tuple3Monoid[A, B, C] extends Tuple3Semigroup[A, B, C] with Monoid[(A, B, C)]
Combine 3 monoids into a product monoid
- class Tuple3Ring[A, B, C] extends Tuple3Group[A, B, C] with Ring[(A, B, C)]
Combine 3 rings into a product ring
- class Tuple3Semigroup[A, B, C] extends Semigroup[(A, B, C)]
Combine 3 semigroups into a product semigroup
- class Tuple4Group[A, B, C, D] extends Tuple4Monoid[A, B, C, D] with Group[(A, B, C, D)]
Combine 4 groups into a product group
- class Tuple4Monoid[A, B, C, D] extends Tuple4Semigroup[A, B, C, D] with Monoid[(A, B, C, D)]
Combine 4 monoids into a product monoid
- class Tuple4Ring[A, B, C, D] extends Tuple4Group[A, B, C, D] with Ring[(A, B, C, D)]
Combine 4 rings into a product ring
- class Tuple4Semigroup[A, B, C, D] extends Semigroup[(A, B, C, D)]
Combine 4 semigroups into a product semigroup
- class Tuple5Group[A, B, C, D, E] extends Tuple5Monoid[A, B, C, D, E] with Group[(A, B, C, D, E)]
Combine 5 groups into a product group
- class Tuple5Monoid[A, B, C, D, E] extends Tuple5Semigroup[A, B, C, D, E] with Monoid[(A, B, C, D, E)]
Combine 5 monoids into a product monoid
- class Tuple5Ring[A, B, C, D, E] extends Tuple5Group[A, B, C, D, E] with Ring[(A, B, C, D, E)]
Combine 5 rings into a product ring
- class Tuple5Semigroup[A, B, C, D, E] extends Semigroup[(A, B, C, D, E)]
Combine 5 semigroups into a product semigroup
- class Tuple6Group[A, B, C, D, E, F] extends Tuple6Monoid[A, B, C, D, E, F] with Group[(A, B, C, D, E, F)]
Combine 6 groups into a product group
- class Tuple6Monoid[A, B, C, D, E, F] extends Tuple6Semigroup[A, B, C, D, E, F] with Monoid[(A, B, C, D, E, F)]
Combine 6 monoids into a product monoid
- class Tuple6Ring[A, B, C, D, E, F] extends Tuple6Group[A, B, C, D, E, F] with Ring[(A, B, C, D, E, F)]
Combine 6 rings into a product ring
- class Tuple6Semigroup[A, B, C, D, E, F] extends Semigroup[(A, B, C, D, E, F)]
Combine 6 semigroups into a product semigroup
- class Tuple7Group[A, B, C, D, E, F, G] extends Tuple7Monoid[A, B, C, D, E, F, G] with Group[(A, B, C, D, E, F, G)]
Combine 7 groups into a product group
- class Tuple7Monoid[A, B, C, D, E, F, G] extends Tuple7Semigroup[A, B, C, D, E, F, G] with Monoid[(A, B, C, D, E, F, G)]
Combine 7 monoids into a product monoid
- class Tuple7Ring[A, B, C, D, E, F, G] extends Tuple7Group[A, B, C, D, E, F, G] with Ring[(A, B, C, D, E, F, G)]
Combine 7 rings into a product ring
- class Tuple7Semigroup[A, B, C, D, E, F, G] extends Semigroup[(A, B, C, D, E, F, G)]
Combine 7 semigroups into a product semigroup
- class Tuple8Group[A, B, C, D, E, F, G, H] extends Tuple8Monoid[A, B, C, D, E, F, G, H] with Group[(A, B, C, D, E, F, G, H)]
Combine 8 groups into a product group
- class Tuple8Monoid[A, B, C, D, E, F, G, H] extends Tuple8Semigroup[A, B, C, D, E, F, G, H] with Monoid[(A, B, C, D, E, F, G, H)]
Combine 8 monoids into a product monoid
- class Tuple8Ring[A, B, C, D, E, F, G, H] extends Tuple8Group[A, B, C, D, E, F, G, H] with Ring[(A, B, C, D, E, F, G, H)]
Combine 8 rings into a product ring
- class Tuple8Semigroup[A, B, C, D, E, F, G, H] extends Semigroup[(A, B, C, D, E, F, G, H)]
Combine 8 semigroups into a product semigroup
- class Tuple9Group[A, B, C, D, E, F, G, H, I] extends Tuple9Monoid[A, B, C, D, E, F, G, H, I] with Group[(A, B, C, D, E, F, G, H, I)]
Combine 9 groups into a product group
- class Tuple9Monoid[A, B, C, D, E, F, G, H, I] extends Tuple9Semigroup[A, B, C, D, E, F, G, H, I] with Monoid[(A, B, C, D, E, F, G, H, I)]
Combine 9 monoids into a product monoid
- class Tuple9Ring[A, B, C, D, E, F, G, H, I] extends Tuple9Group[A, B, C, D, E, F, G, H, I] with Ring[(A, B, C, D, E, F, G, H, I)]
Combine 9 rings into a product ring
- class Tuple9Semigroup[A, B, C, D, E, F, G, H, I] extends Semigroup[(A, B, C, D, E, F, G, H, I)]
Combine 9 semigroups into a product semigroup
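The product construction used by the TupleN classes above can be sketched for the two-element case as follows. This is a minimal illustration, not the library's code: `SimpleSemigroup`, `product2`, `intAdd` and `stringConcat` are hypothetical names introduced here so the snippet is self-contained.

```scala
object ProductSemigroupSketch {
  // Hypothetical stand-in for a semigroup type class (an assumption, not the library trait).
  trait SimpleSemigroup[T] {
    def plus(a: T, b: T): T
  }

  val intAdd: SimpleSemigroup[Int] = new SimpleSemigroup[Int] {
    def plus(a: Int, b: Int): Int = a + b
  }

  val stringConcat: SimpleSemigroup[String] = new SimpleSemigroup[String] {
    def plus(a: String, b: String): String = a + b
  }

  // Product semigroup: each tuple component is combined with its own semigroup.
  def product2[A, B](sa: SimpleSemigroup[A], sb: SimpleSemigroup[B]): SimpleSemigroup[(A, B)] =
    new SimpleSemigroup[(A, B)] {
      def plus(x: (A, B), y: (A, B)): (A, B) =
        (sa.plus(x._1, y._1), sb.plus(x._2, y._2))
    }

  val combined: (Int, String) = product2(intAdd, stringConcat).plus((1, "a"), (2, "b"))
}
```

The same component-wise pattern extends to monoids (component-wise zero), groups (component-wise negate) and rings (component-wise times).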
- case class Universe[T]() extends Interval[T] with Product with Serializable
- class UnsafeFromAlgebraRig[T] extends Ring[T]
In some legacy cases, we have implemented Rings where we lacked the full laws. This allows you to be precise (only implement the structure you have), but unsafely use it as a Ring in legacy code that is expecting a Ring.
- class UnsafeFromAlgebraRng[T] extends Ring[T]
In some legacy cases, we have implemented Rings where we lacked the full laws. This allows you to be precise (only implement the structure you have), but unsafely use it as a Ring in legacy code that is expecting a Ring.
- sealed trait Upper[T] extends Interval[T]
- trait VectorSpace[F, C[_]] extends Serializable
- Annotations
- @implicitNotFound()
- sealed trait VectorSpaceOps extends AnyRef
- case class Window[T](total: T, items: Queue[T]) extends Product with Serializable
Convenience case class defined with a monoid for aggregating elements over a finite window.
- total
Known running total of T
- items
Queue of known trailing elements.
Example usage:
case class W28[T](window: Window[T]) {
  def total = this.window.total
  def items = this.window.items
  def size = this.window.size
}
object W28 {
  val windowSize = 28
  def apply[T](v: T): W28[T] = W28[T](Window(v))
  implicit def w28Monoid[T](implicit p: Priority[Group[T], Monoid[T]]): Monoid[W28[T]] =
    new Monoid[W28[T]] {
      private val WT: Monoid[Window[T]] = Window.monoid[T](windowSize)
      def zero = W28[T](WT.zero)
      def plus(a: W28[T], b: W28[T]): W28[T] = W28[T](WT.plus(a.window, b.window))
    }
}
val elements = getElements()
val trailing28Totals = elements
  .map { W28(_) }
  .scanLeft(W28(0)) { (a, b) => a + b }
  .map { _.total }
- abstract class WindowMonoid[T] extends Monoid[Window[T]]
Provides a natural monoid for combining windows truncated to some window size.
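As a rough illustration of what "truncated to some window size" means, the sketch below combines two windows by concatenating their items and keeping only the most recent ones. It is an assumption-laden simplification: `IntWindow` and `plus` are hypothetical names, the element type is fixed to Int, and the total is simply recomputed from the kept items rather than maintained incrementally.

```scala
import scala.collection.immutable.Queue

object WindowSketch {
  // Hypothetical simplified window over Int; the real Window[T] is generic.
  final case class IntWindow(total: Int, items: Queue[Int])

  // Combine two windows, keeping only the last `windowSize` items.
  def plus(windowSize: Int)(a: IntWindow, b: IntWindow): IntWindow = {
    val kept = (a.items ++ b.items).takeRight(windowSize)
    IntWindow(kept.sum, kept)
  }

  val left: IntWindow = IntWindow(6, Queue(1, 2, 3))
  val right: IntWindow = IntWindow(9, Queue(4, 5))

  // With windowSize = 3 the kept items are 3, 4, 5 and the total is 12.
  val merged: IntWindow = plus(3)(left, right)
}
```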
- final case class WindowMonoidFromGroup[T](windowSize: Int)(implicit group: Group[T]) extends WindowMonoid[T] with Product with Serializable
- final case class WindowMonoidFromMonoid[T](windowSize: Int)(implicit m: Monoid[T]) extends WindowMonoid[T] with Product with Serializable
Deprecated Type Members
- case class BitSetLite(in: Array[Byte]) extends Product with Serializable
A super lightweight (hopefully) version of BitSet
- Annotations
- @deprecated
- Deprecated
(Since version 0.12.3) This is no longer used.
- class MinusOp[T] extends AnyRef
- Annotations
- @deprecated
- Deprecated
(Since version 0.13.8) use Operators.Ops
- class PlusOp[T] extends AnyRef
- Annotations
- @deprecated
- Deprecated
(Since version 0.13.8) use Operators.Ops
- class TimesOp[T] extends AnyRef
- Annotations
- @deprecated
- Deprecated
(Since version 0.13.8) use Operators.Ops
Value Members
- val Field: algebra.ring.Field.type
- object AdaptiveVector
Some functions to create or convert AdaptiveVectors
- object AdjoinedUnit extends Serializable
- object AffineFunction extends Serializable
- object Aggregator extends Serializable
Aggregators compose well.
To create a parallel aggregator that operates on a single input in parallel, use: GeneratedTupleAggregator.from2((agg1, agg2))
- object AndVal extends Serializable
- object AndValMonoid extends Monoid[AndVal]
Boolean AND monoid. plus means logical AND, zero is true.
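The behaviour described above (plus is logical AND, zero is true) can be sketched on plain Booleans. The object and method names here are hypothetical; the library wraps the Boolean in AndVal, which this sketch omits.

```scala
object AndMonoidSketch {
  // Identity element: combining with true leaves the other operand unchanged.
  val zero: Boolean = true

  // The monoid operation is logical AND.
  def plus(a: Boolean, b: Boolean): Boolean = a && b

  // Summing a sequence answers "are all values true?"; the empty sum is zero, i.e. true.
  def sum(xs: Seq[Boolean]): Boolean = xs.foldLeft(zero)(plus)
}
```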
- object Applicative
Follows the type-class pattern for the Applicative trait
- object Approximate extends Serializable
- object ApproximateBoolean extends Serializable
- object ArrayBufferedOperation extends Serializable
- object AveragedGroup extends Group[AveragedValue] with CommutativeGroup[AveragedValue]
Group implementation for AveragedValue.
- object AveragedValue extends Serializable
Provides a set of operations needed to create and use AveragedValue instances.
- object Averager extends MonoidAggregator[Double, AveragedValue, Double]
Aggregator that uses AveragedValue to calculate the mean of all Double values in the stream. Each Double value receives a count of 1 during aggregation.
- object BF extends Serializable
- object BFInstance extends Serializable
- object Batched extends Serializable
- object BigDecimalRing extends NumericRing[BigDecimal]
- object BigIntRing extends NumericRing[BigInt]
- object BloomFilter
- object BloomFilterAggregator extends Serializable
- object BooleanRing extends Ring[Boolean]
- object Bytes extends Serializable
- object CMS extends Serializable
- object CMSFunctions
Helper functions to generate or to translate between various CMS parameters (cf. CMSParams).
- object CMSHasher extends Serializable
- object CMSHasherImplicits
This formerly held the instances that moved to object CMSHasher
These instances are slow, but here for compatibility with old serialized data. For new code, avoid these and instead use the implicits found in the CMSHasher companion object.
- object CMSInstance extends Serializable
- object Correlation extends Serializable
- object CorrelationAggregator extends MonoidAggregator[(Double, Double), Correlation, Correlation]
- object CorrelationMonoid extends Monoid[Correlation]
- object DecayedValue extends Serializable
- object DecayedVector extends CompatDecayedVector with Serializable
Represents a container class together with time. Its monoid consists of exponentially scaling the older value and summing with the newer one.
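The idea of "exponentially scaling the older value and summing with the newer one" might look like the sketch below for a single scalar. This is not the library's implementation: `Decayed`, `plus` and the `decayRate` parameter are assumptions made for illustration, and the real class works on a vector container rather than one Double.

```scala
object DecaySketch {
  // Hypothetical simplified type: a scalar value stamped with a time.
  final case class Decayed(value: Double, time: Double)

  // decayRate is an assumed parameter (decay per unit of time).
  def plus(decayRate: Double)(a: Decayed, b: Decayed): Decayed = {
    val (older, newer) = if (a.time <= b.time) (a, b) else (b, a)
    // Scale the older value down by the time elapsed until the newer one.
    val scaled = older.value * math.exp(-decayRate * (newer.time - older.time))
    Decayed(scaled + newer.value, newer.time)
  }

  // With rate ln(2), a value halves every unit of time.
  val rate: Double = math.log(2.0)

  // 8.0 decays over 2 time units to about 2.0, then 1.0 is added.
  val result: Decayed = plus(rate)(Decayed(8.0, 0.0), Decayed(1.0, 2.0))
}
```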
- object DecayingCMS extends Serializable
- object DoubleRing extends Ring[Double]
- object ExpHist extends Serializable
- object First extends FirstInstances with Serializable
Provides a set of operations and typeclass instances needed to use First instances.
- object FlatMapPreparer extends Serializable
- object FloatRing extends Ring[Float]
- object Fold extends CompatFold with Serializable
Methods to create and run Folds.
The Folds defined here are immutable and serializable, which we expect by default. It is important that you as a user indicate mutability or non-serializability when defining new Folds. Additionally, it is recommended that "end" functions not mutate the accumulator in order to support scans (producing a stream of intermediate outputs by calling "end" at each step).
- object Functor
Follows the type-class pattern for the Functor trait
- object GeneratedTupleAggregator extends GeneratedTupleAggregator
- object Group extends GeneratedGroupImplicits with ProductGroups with FromAlgebraGroupImplicit0 with Serializable
- object Hash128 extends Serializable
This gives default hashes using Murmur128 with a seed of 12345678 (for no good reason, but it should not be changed lest we break serialized HLLs)
- object HeavyHitters extends Serializable
- object HyperLogLog
Implementation of the HyperLogLog approximate counting as a Monoid
- See also
http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm Philippe Flajolet and Éric Fusy and Olivier Gandouet and Frédéric Meunier
- object HyperLogLogAggregator extends Serializable
This object makes it easier to create Aggregator instances that use HLL
- object Identity extends Serializable
- object IdentityMonad extends Monad[Identity]
- object IntRing extends Ring[Int]
- object IntegralPredecessible extends Serializable
- object IntegralSuccessible extends Serializable
- object Interval extends Serializable
- object JBoolRing extends Ring[Boolean]
- object JDoubleRing extends Ring[Double]
- object JFloatRing extends Ring[Float]
- object JIntRing extends Ring[Integer]
- object JLongRing extends Ring[Long]
- object JShortRing extends Ring[Short]
- object Last extends LastInstances with Serializable
Provides a set of operations and typeclass instances needed to use Last instances.
- object LongRing extends Ring[Long]
- object MapAggregator extends Serializable
- object MapAlgebra
- object MapPreparer extends Serializable
- object Max extends MaxInstances with Serializable
Provides a set of operations and typeclass instances needed to use Max instances.
- object Metric extends Serializable
A Metric[V] m is a function (V, V) => Double that satisfies the following properties:
1. m(v1, v2) >= 0
2. m(v1, v2) == 0 iff v1 == v2
3. m(v1, v2) == m(v2, v1)
4. m(v1, v3) <= m(v1, v2) + m(v2, v3)
If you implement this trait, make sure that you follow these rules.
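The four properties can be spot-checked for a candidate metric on sample points. The sketch below does this for absolute difference on Double; `absDiff` and `satisfiesLaws` are hypothetical helpers, and checking a few points is only a sanity test, not a proof that the laws hold.

```scala
object MetricSketch {
  // Candidate metric: absolute difference between two Doubles.
  def absDiff(a: Double, b: Double): Double = math.abs(a - b)

  // Checks the four metric properties for one triple of points.
  def satisfiesLaws(m: (Double, Double) => Double, v1: Double, v2: Double, v3: Double): Boolean = {
    val nonNegative = m(v1, v2) >= 0
    val zeroIffEqual = (m(v1, v2) == 0) == (v1 == v2)
    val symmetric = m(v1, v2) == m(v2, v1)
    val triangle = m(v1, v3) <= m(v1, v2) + m(v2, v3)
    nonNegative && zeroIffEqual && symmetric && triangle
  }
}
```

A function such as `(a, b) => a - b` fails the first property whenever a < b, so it would not qualify.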
- object Min extends MinInstances with Serializable
Provides a set of operations and typeclass instances needed to use Min instances.
- object MinHasher extends Serializable
- object MinPlus extends Serializable
- case object MinPlusZero extends MinPlus[Nothing] with Product with Serializable
- object Moments extends Serializable
- object MomentsAggregator extends MonoidAggregator[Double, Moments, Moments]
- object Monad
Follows the type-class pattern for the Monad trait
- object Monoid extends GeneratedMonoidImplicits with ProductMonoids with FromAlgebraMonoidImplicit0 with Serializable
- object MultiAggregator
- object NullGroup extends ConstantGroup[Null]
- object Operators
- object OrVal extends Serializable
- object OrValMonoid extends Monoid[OrVal]
Boolean OR monoid. plus means logical OR, zero is false.
- object Predecessible extends Serializable
- object Preparer extends Serializable
- object Priority extends FindPreferred
- object QTree extends Serializable
A QTree provides an approximate Map[Double,A:Monoid] suitable for range queries, quantile queries, and combinations of these (for example, if you use a numeric A, you can derive the inter-quartile mean).
It is loosely related to the Q-Digest data structure from http://www.cs.virginia.edu/~son/cs851/papers/ucsb.sensys04.pdf, but using an immutable tree structure, and carrying a generalized sum (of type A) at each node instead of just a count.
The basic idea is to keep a binary tree, where the root represents the entire range of the input keys, and each child node represents either the lower or upper half of its parent's range. Ranges are constrained to be dyadic intervals (https://en.wikipedia.org/wiki/Interval_(mathematics)#Dyadic_intervals) for ease of merging.
To keep the size bounded, the total count carried by any sub-tree must be at least 1/(2^k) of the total count at the root. Any sub-trees that do not meet this criterion have their children pruned and become leaves. (It's important that they not be pruned away entirely, but that we keep a fringe of low-count leaves that can gain weight over time and ultimately split again when warranted.)
Quantile and range queries both give hard upper and lower bounds; the true result will be somewhere in the range given.
Keys must be >= 0.
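The dyadic-interval structure described above can be sketched independently of the tree itself. In this illustration an interval is identified by an offset and a level and covers [offset * 2^level, (offset + 1) * 2^level); `Dyadic` and `containing` are hypothetical names, and counts, sums and pruning are left out.

```scala
object DyadicSketch {
  // Represents the interval [offset * 2^level, (offset + 1) * 2^level).
  final case class Dyadic(offset: Long, level: Int) {
    def lower: Double = offset * math.pow(2.0, level)
    def upper: Double = (offset + 1) * math.pow(2.0, level)
    // The parent covers twice the range; this node is its lower or upper half.
    def parent: Dyadic = Dyadic(offset >> 1, level + 1)
    def contains(x: Double): Boolean = lower <= x && x < upper
  }

  // The unique dyadic interval at the given level that contains key x.
  def containing(x: Double, level: Int): Dyadic = {
    require(x >= 0, "keys must be >= 0")
    Dyadic(math.floor(x / math.pow(2.0, level)).toLong, level)
  }

  val leaf: Dyadic = containing(5.3, 0) // covers [5, 6)
  val up: Dyadic = leaf.parent          // covers [4, 6)
}
```

Because every interval at one level is exactly half of an interval at the next level, two trees built over the same key space line up node for node, which is what makes merging straightforward.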
- object QTreeAggregator extends Serializable
- object ResetState
- object RichCBitSet
- object RightFolded
This is an associative, but not commutative, monoid. You must start on the right with a value, and all subsequent RightFolded values must be RightFoldedToFold objects or zero.
If you add two Folded values together, you always get the one on the left, so this forms a kind of reset of the fold.
- object RightFolded2
This monoid takes a list of values of type In or Out, and folds to the right all the Ins into Out values, leaving you with a list of Out values, then finally, maps those outs onto Acc, where there is a group, and adds all the Accs up. So, if you have a list: I I I O I O O I O I O the monoid is equivalent to the computation:
map(fold(List(I,I,I),O)) + map(fold(List(I),O)) + map(fold(List(),O)) + map(fold(List(I),O)) + map(fold(List(I),O))
This models a version of the map/reduce paradigm, where the fold happens on the mappers for each group on Ins, and then they are mapped to Accs, sent to a single reducer and all the Accs are added up.
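The equivalent computation spelled out above can be written directly over a list, which may help when reasoning about what the monoid produces. This is a sketch of that computation only, not of the monoid: `reduce` and its parameters are hypothetical, Ins are modelled as Left and Outs as Right, and any trailing Ins with no Out to their right are simply dropped here.

```scala
object RightFold2Sketch {
  def reduce[In, Out, Acc](items: List[Either[In, Out]])(
      foldfn: (In, Out) => Out,
      mapfn: Out => Acc,
      zero: Acc,
      plus: (Acc, Acc) => Acc
  ): Acc = {
    val (total, _) = items.foldLeft((zero, List.empty[In])) {
      // Collect Ins until the next Out; the pending list is kept in reverse order.
      case ((sum, pending), Left(in)) => (sum, in :: pending)
      case ((sum, pending), Right(out)) =>
        // Walking the reversed list applies the right-most In first,
        // which is a right fold over the Ins in their original order.
        val folded = pending.foldLeft(out)((o, i) => foldfn(i, o))
        (plus(sum, mapfn(folded)), List.empty[In])
    }
    total
  }

  private val I: Either[Int, Int] = Left(1)
  private val O: Either[Int, Int] = Right(0)

  // The list I I I O I O O I O I O from the description, with In = 1 and Out = 0.
  val example: Int =
    reduce[Int, Int, Int](List(I, I, I, O, I, O, O, I, O, I, O))(
      (i, o) => i + o, // fold an In into an Out
      o => o,          // map Out to Acc
      0,
      (a, b) => a + b
    )
}
```

With these choices each group contributes its number of Ins, so the five groups yield 3 + 1 + 0 + 1 + 1.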
- case object RightFoldedZero extends RightFolded[Nothing, Nothing] with Product with Serializable
- case object RightFoldedZero2 extends RightFolded2[Nothing, Nothing, Nothing] with Product with Serializable
- object Ring extends GeneratedRingImplicits with ProductRings with RingImplicits0 with Serializable
- object SGD
- object SGDPos extends Serializable
- object SGDWeights extends Serializable
- case object SGDZero extends SGD[Nothing] with Product with Serializable
- object SSMany extends Serializable
- object Scan extends Serializable
- object ScopedTopNCMS
- object Semigroup extends GeneratedSemigroupImplicits with ProductSemigroups with FromAlgebraSemigroupImplicit0 with Serializable
- object SetDiff extends Serializable
- object ShortRing extends Ring[Short]
- object SketchMap extends Serializable
Data structure representing an approximation of Map[K, V], where V has an implicit ordering and monoid. This is a more generic version of CountMinSketch.
Values are stored in valuesTable, a 2D vector containing aggregated sums of values inserted to the Sketch Map.
The data structure stores top non-zero values, called Heavy Hitters. The values are sorted by an implicit reverse ordering for the value, and the number of heavy hitters stored is based on the heavyHittersCount set in params.
Use SketchMapMonoid to create instances of this class.
- object SketchMapParams extends Serializable
- object SpaceSaver
- object SparseCMS extends Serializable
- object StringMonoid extends Monoid[String]
- object Successible extends Serializable
- object SummingCache extends Serializable
- object SummingIterator extends Serializable
Creates an Iterator that emits partial sums of an input Iterator[V]. Generally this is useful to change from processing individual Vs to processing blocks of V (see SummingQueue), or to keep a cache of recent keys when V = Map[K, W] (see SummingCache).
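The block-wise case can be approximated with the standard library alone. The sketch below groups an iterator into fixed-size blocks and emits one partial sum per block; `blockSums` is a hypothetical helper fixed to Int, whereas the real object works with any V that has a semigroup and delegates the buffering strategy to a summer.

```scala
object BlockSumSketch {
  // Emit one partial sum per block of `blockSize` inputs (the last block may be shorter).
  def blockSums(input: Iterator[Int], blockSize: Int): Iterator[Int] =
    input.grouped(blockSize).map(_.sum)

  // Blocks are (1, 2), (3, 4), (5), so the partial sums are 3, 7, 5.
  val sums: List[Int] = blockSums(Iterator(1, 2, 3, 4, 5), 2).toList
}
```

The total of the emitted partial sums equals the total of the inputs, so downstream code can keep summing while handling fewer, larger values.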
- object SummingQueue extends Serializable
- object SummingWithHitsCache extends Serializable
- object TopCMSInstance extends Serializable
- object TopKMonoid extends Serializable
- object TopNCMS
- object TopPctCMS
- object UnitGroup extends ConstantGroup[Unit]
- object VectorSpace extends VectorSpaceOps with Implicits with Serializable
This class represents a vector space. For the required properties see:
http://en.wikipedia.org/wiki/Vector_space#Definition
- object Window extends Serializable
- object field
This is here to ease transition to using algebra.Field as the field type. Intended use is to do:
import com.twitter.algebird.field._
Note that these are not strictly lawful, since floating point arithmetic using IEEE-754 is only approximately associative and distributive.
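A small example of the associativity caveat, using plain Double addition (the names are illustrative only):

```scala
object FloatAssocSketch {
  // The same three numbers, grouped two different ways.
  val left: Double = (0.1 + 0.2) + 0.3
  val right: Double = 0.1 + (0.2 + 0.3)

  // The two groupings differ in the last bit, so strict equality fails.
  val sameResult: Boolean = left == right

  // They still agree to within a small tolerance.
  val closeEnough: Boolean = math.abs(left - right) < 1e-15
}
```

Law checks for these instances therefore need approximate equality rather than `==`.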
Deprecated Value Members
- object MomentsGroup extends MomentsMonoid with Group[Moments] with CommutativeGroup[Moments]
This should not be used as a group (avoid negate and minus). It was wrongly believed for several years that this was a group; however, it was only being tested with positive counts (which is to say the generators were too weak). The minus and negate operations are not totally wrong, but the group laws do not hold in general: (a - a) + b will not equal a - (a - b), which it should.
- Annotations
- @deprecated
- Deprecated
(Since version 0.13.8) use Moments.momentsMonoid, this isn't lawful for negative counts