sealed abstract class SpaceSaver[T] extends AnyRef
Data structure used in the Space-Saving Algorithm to find the approximate most frequent and top-k elements. The algorithm is described in "Efficient Computation of Frequent and Top-k Elements in Data Streams". See here: www.cs.ucsb.edu/research/tech_reports/reports/2005-23.pdf In the paper the data structure is called StreamSummary but we chose to call it SpaceSaver instead. Note that the adaptation to hadoop and parallelization were not described in the article and have not been proven to be mathematically correct or preserve the guarantees or benefits of the algorithm.
- Alphabetic
- By Inheritance
- SpaceSaver
- AnyRef
- Any
- Hide All
- Show All
- Public
- Protected
Abstract Value Members
- abstract def ++(other: SpaceSaver[T]): SpaceSaver[T]
- abstract def capacity: Int
Maximum number of counters to keep (parameter "m" in the research paper).
- abstract def counters: Map[T, (Long, Long)]
Map of item to counter, where each counter consists of an observed count and possible over-estimation (error)
- abstract def min: Long
Current lowest value for count
Concrete Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native()
- def consistentWith(that: SpaceSaver[T]): Boolean
Check consistency with other SpaceSaver, useful for testing.
Check consistency with other SpaceSaver, useful for testing. Returns boolean indicating if they are consistent
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable])
- def frequency(item: T): Approximate[Long]
returns the frequency estimate for the item
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def mostFrequent(thres: Int): Seq[(T, Approximate[Long], Boolean)]
Get the elements that show up more than thres times.
Get the elements that show up more than thres times. Returns sorted in descending order: (item, Approximate[Long], guaranteed)
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- def topK(k: Int): Seq[(T, Approximate[Long], Boolean)]
Get the top-k elements.
Get the top-k elements. Returns sorted in descending order: (item, Approximate[Long], guaranteed)
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()