com.twitter.algebird

SpaceSaver

Related Docs: object SpaceSaver | package algebird

sealed abstract class SpaceSaver[T] extends AnyRef

Data structure used in the Space-Saving Algorithm to find the approximate most frequent and top-k elements. The algorithm is described in "Efficient Computation of Frequent and Top-k Elements in Data Streams". See here: www.cs.ucsb.edu/research/tech_reports/reports/2005-23.pdf In the paper the data structure is called StreamSummary but we chose to call it SpaceSaver instead. Note that the adaptation to hadoop and parallelization were not described in the article and have not been proven to be mathematically correct or preserve the guarantees or benefits of the algorithm.

Linear Supertypes
AnyRef, Any
Known Subclasses
Ordering
  1. Alphabetic
  2. By inheritance
Inherited
  1. SpaceSaver
  2. AnyRef
  3. Any
  1. Hide All
  2. Show all
Learn more about member selection
Visibility
  1. Public
  2. All

Abstract Value Members

  1. abstract def ++(other: SpaceSaver[T]): SpaceSaver[T]

  2. abstract def capacity: Int

    Maximum number of counters to keep (parameter "m" in the research paper).

  3. abstract def counters: Map[T, (Long, Long)]

    Map of item to counter, where each counter consists of an observed count and possible over-estimation (error)

  4. abstract def min: Long

    Current lowest value for count

Concrete Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. def consistentWith(that: SpaceSaver[T]): Boolean

    Check consistency with other SpaceSaver, useful for testing.

    Check consistency with other SpaceSaver, useful for testing. Returns boolean indicating if they are consistent

  7. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  8. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  9. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. def frequency(item: T): Approximate[Long]

    returns the frequency estimate for the item

  11. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  12. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  13. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  14. def mostFrequent(thres: Int): Seq[(T, Approximate[Long], Boolean)]

    Get the elements that show up more than thres times.

    Get the elements that show up more than thres times. Returns sorted in descending order: (item, Approximate[Long], guaranteed)

  15. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  16. final def notify(): Unit

    Definition Classes
    AnyRef
  17. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  18. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  19. def toString(): String

    Definition Classes
    AnyRef → Any
  20. def topK(k: Int): Seq[(T, Approximate[Long], Boolean)]

    Get the top-k elements.

    Get the top-k elements. Returns sorted in descending order: (item, Approximate[Long], guaranteed)

  21. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  22. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  23. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )

Inherited from AnyRef

Inherited from Any

Ungrouped