class CMSMonoid[K] extends Monoid[CMS[K]] with CommutativeMonoid[CMS[K]]
Monoid for adding CMS sketches.
Usage
eps
and delta
are parameters that bound the error of each query estimate. For example, errors in
answering point queries (e.g., how often has element x appeared in the stream described by the sketch?) are
often of the form: "with probability p >= 1 - delta, the estimate is close to the truth by some factor
depending on eps."
The type K
is the type of items you want to count. You must provide an implicit CMSHasher[K]
for K
,
and Algebird ships with several such implicits for commonly used types such as Long
and BigInt
.
If your type K
is not supported out of the box, you have two options: 1) You provide a "translation"
function to convert items of your (unsupported) type K
to a supported type such as Double, and then use
the contramap
function of CMSHasher to create the required CMSHasher[K]
for your type (see the
documentation of CMSHasher for an example); 2) You implement a CMSHasher[K]
from scratch, using the
existing CMSHasher implementations as a starting point.
Note: Because Arrays in Scala/Java not have sane equals
and hashCode
implementations, you cannot safely
use types such as Array[Byte]
. Extra work is required for Arrays. For example, you may opt to convert
Array[T]
to a Seq[T]
via toSeq
, or you can provide appropriate wrapper classes. Algebird provides one
such wrapper class, Bytes, to safely wrap an Array[Byte]
for use with CMS.
- K
The type used to identify the elements to be counted. For example, if you want to count the occurrence of user names, you could map each username to a unique numeric ID expressed as a
Long
, and then count the occurrences of thoseLong
s with a CMS of typeK=Long
. Note that this mapping between the elements of your problem domain and their identifiers used for counting via CMS should be bijective. We require a CMSHasher context bound forK
, see CMSHasherImplicits for available implicits that can be imported. Which type K should you pick in practice? For domains that have less than2^64
unique elements, you'd typically use
Long. For larger domains you can try
BigInt, for example. Other possibilities include Spire's
SafeLongand
Numericaldata types (https://github.com/non/spire), though Algebird does not include the required implicits for CMS-hashing (cf. CMSHasherImplicits.
- Alphabetic
- By Inheritance
- CMSMonoid
- CommutativeMonoid
- CommutativeSemigroup
- Monoid
- AdditiveMonoid
- Monoid
- Semigroup
- AdditiveSemigroup
- Semigroup
- Serializable
- AnyRef
- Any
- Hide All
- Show All
- Public
- Protected
Instance Constructors
- new CMSMonoid(eps: Double, delta: Double, seed: Int, maxExactCountOpt: Option[Int] = None)(implicit arg0: CMSHasher[K])
- eps
One-sided error bound on the error of each point query, i.e. frequency estimate.
- delta
A bound on the probability that a query estimate does not lie within some small interval (an interval that depends on
eps
) around the truth.- seed
A seed to initialize the random number generator used to create the pairwise independent hash functions.
- maxExactCountOpt
An Option parameter about how many exact counts a sparse CMS wants to keep.
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- def additive: algebra.Monoid[CMS[K]]
These are from algebra.Monoid
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def assertNotZero(v: CMS[K]): Unit
- Definition Classes
- Monoid
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native()
- def combine(l: CMS[K], r: CMS[K]): CMS[K]
- Definition Classes
- Semigroup → Semigroup
- def combineAll(t: TraversableOnce[CMS[K]]): CMS[K]
- Definition Classes
- Monoid → Monoid
- def combineAllOption(as: IterableOnce[CMS[K]]): Option[CMS[K]]
- Definition Classes
- Monoid → Semigroup
- def combineN(a: CMS[K], n: Int): CMS[K]
- Definition Classes
- Monoid → Semigroup
- def create(data: Seq[K]): CMS[K]
Creates a sketch out of multiple items.
- def create(item: K): CMS[K]
Creates a sketch out of a single item.
- def empty: CMS[K]
- Definition Classes
- Monoid → Monoid
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable])
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def isEmpty(a: CMS[K])(implicit ev: Eq[CMS[K]]): Boolean
- Definition Classes
- Monoid
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def isNonZero(v: CMS[K]): Boolean
- Definition Classes
- Monoid
- def isZero(a: CMS[K])(implicit ev: Eq[CMS[K]]): Boolean
- Definition Classes
- AdditiveMonoid
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def nonZeroOption(v: CMS[K]): Option[CMS[K]]
- Definition Classes
- Monoid
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- val params: CMSParams[K]
- def plus(left: CMS[K], right: CMS[K]): CMS[K]
Combines the two sketches.
Combines the two sketches.
The sketches must use the same hash functions.
- Definition Classes
- CMSMonoid → AdditiveSemigroup
- def positiveSumN(a: CMS[K], n: Int): CMS[K]
- Attributes
- protected[this]
- Definition Classes
- AdditiveSemigroup
- def repeatedCombineN(a: CMS[K], n: Int): CMS[K]
- Attributes
- protected[this]
- Definition Classes
- Semigroup
- def sum(sketches: TraversableOnce[CMS[K]]): CMS[K]
- def sumN(a: CMS[K], n: Int): CMS[K]
- Definition Classes
- AdditiveMonoid → AdditiveSemigroup
- def sumOption(sketches: TraversableOnce[CMS[K]]): Option[CMS[K]]
Returns an instance of
T
calculated by summing all instances initer
in one pass.Returns an instance of
T
calculated by summing all instances initer
in one pass. ReturnsNone
ifiter
is empty, elseSome[ T]
.- returns
None
ifiter
is empty, else an option value containing the summedT
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- def trySum(as: TraversableOnce[CMS[K]]): Option[CMS[K]]
- Definition Classes
- AdditiveMonoid → AdditiveSemigroup
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- val zero: CMS[K]
- Definition Classes
- CMSMonoid → AdditiveMonoid