Packages

sealed abstract class SpaceSaver[T] extends AnyRef

Data structure used in the Space-Saving Algorithm to find the approximate most frequent and top-k elements. The algorithm is described in "Efficient Computation of Frequent and Top-k Elements in Data Streams". See here: www.cs.ucsb.edu/research/tech_reports/reports/2005-23.pdf In the paper the data structure is called StreamSummary but we chose to call it SpaceSaver instead. Note that the adaptation to hadoop and parallelization were not described in the article and have not been proven to be mathematically correct or preserve the guarantees or benefits of the algorithm.

Linear Supertypes
AnyRef, Any
Known Subclasses
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. SpaceSaver
  2. AnyRef
  3. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. Protected

Abstract Value Members

  1. abstract def ++(other: SpaceSaver[T]): SpaceSaver[T]
  2. abstract def capacity: Int

    Maximum number of counters to keep (parameter "m" in the research paper).

  3. abstract def counters: Map[T, (Long, Long)]

    Map of item to counter, where each counter consists of an observed count and possible over-estimation (error)

  4. abstract def min: Long

    Current lowest value for count

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @native()
  6. def consistentWith(that: SpaceSaver[T]): Boolean

    Check consistency with other SpaceSaver, useful for testing.

    Check consistency with other SpaceSaver, useful for testing. Returns boolean indicating if they are consistent

  7. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  8. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  9. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable])
  10. def frequency(item: T): Approximate[Long]

    returns the frequency estimate for the item

  11. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  12. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  13. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  14. def mostFrequent(thres: Int): Seq[(T, Approximate[Long], Boolean)]

    Get the elements that show up more than thres times.

    Get the elements that show up more than thres times. Returns sorted in descending order: (item, Approximate[Long], guaranteed)

  15. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  16. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  17. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  18. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  19. def toString(): String
    Definition Classes
    AnyRef → Any
  20. def topK(k: Int): Seq[(T, Approximate[Long], Boolean)]

    Get the top-k elements.

    Get the top-k elements. Returns sorted in descending order: (item, Approximate[Long], guaranteed)

  21. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  22. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  23. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()

Inherited from AnyRef

Inherited from Any

Ungrouped