object Aggregator extends Serializable
Aggregators compose well.
To create a parallel aggregator that operates on a single input in parallel, use: GeneratedTupleAggregator.from2((agg1, agg2))
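A minimal sketch of such composition (assumes algebird is on the classpath; value names are illustrative):

```scala
import com.twitter.algebird.{ Aggregator, GeneratedTupleAggregator }

val data = List(3, 1, 4, 1, 5)

// Fuse min and max into one aggregator that makes a single pass over the input.
val minAndMax = GeneratedTupleAggregator.from2((Aggregator.min[Int], Aggregator.max[Int]))
println(minAndMax(data)) // (1,5)

// join is the pairwise combinator defined on Aggregator itself.
val joined = Aggregator.min[Int].join(Aggregator.max[Int])
println(joined(data)) // (1,5)
```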
Value Members
- def appendMonoid[F, T, P](appnd: (T, F) ⇒ T, pres: (T) ⇒ P)(implicit m: Monoid[T]): MonoidAggregator[F, T, P]
  Obtain a MonoidAggregator that uses an efficient append operation for faster aggregation.
  - F: data input type
  - T: aggregating Monoid type
  - P: presentation (output) type
  - appnd: function that appends an input to the running Monoid value; defines the MonoidAggregator.append method for this aggregator. Analogous to the 'seqop' function of Scala's sequence 'aggregate' method.
  - pres: the presentation function
  - m: the Monoid type class
  Note: 'appnd' is expected to obey the law appnd(t, f) == m.plus(t, appnd(m.zero, f)).
- def appendMonoid[F, T](appnd: (T, F) ⇒ T)(implicit m: Monoid[T]): MonoidAggregator[F, T, T]
  Obtain a MonoidAggregator that uses an efficient append operation for faster aggregation. Equivalent to appendMonoid(appnd, identity[T] _)(m).
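A minimal sketch of the single-argument form (assumes algebird on the classpath; value names are illustrative), summing string lengths with the implicit Monoid[Int]:

```scala
import com.twitter.algebird.Aggregator

// appnd folds each String straight into the running Int total, avoiding
// an intermediate Int per element; present is the identity.
val totalLength = Aggregator.appendMonoid[String, Int]((acc, s) => acc + s.length)

println(totalLength(List("foo", "ba", "z"))) // 6

// The law holds: appnd(t, f) == m.plus(t, appnd(m.zero, f)), since both equal t + f.length.
```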
- def appendSemigroup[F, T, P](prep: (F) ⇒ T, appnd: (T, F) ⇒ T, pres: (T) ⇒ P)(implicit sg: Semigroup[T]): Aggregator[F, T, P]
  Obtain an Aggregator that uses an efficient append operation for faster aggregation.
  - F: data input type
  - T: aggregating Semigroup type
  - P: presentation (output) type
  - prep: the preparation function; expected to construct an instance of type T from a single data element
  - appnd: function that appends an input to the running Semigroup value; defines the Aggregator.append method for this aggregator. Analogous to the 'seqop' function of Scala's sequence 'aggregate' method.
  - pres: the presentation function
  - sg: the Semigroup type class
  Note: 'appnd' and 'prep' are expected to obey the law appnd(t, f) == sg.plus(t, prep(f)).
- def appendSemigroup[F, T](prep: (F) ⇒ T, appnd: (T, F) ⇒ T)(implicit sg: Semigroup[T]): Aggregator[F, T, T]
  Obtain an Aggregator that uses an efficient append operation for faster aggregation. Equivalent to appendSemigroup(prep, appnd, identity[T] _)(sg).
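A minimal sketch of this form (assumes algebird on the classpath; value names are illustrative): prep lifts the first element into the Semigroup, and appnd folds later elements in directly.

```scala
import com.twitter.algebird.Aggregator

// With the implicit Semigroup[Int] (addition) this sums string lengths.
// The law appnd(t, f) == sg.plus(t, prep(f)) holds: both equal t + f.length.
val totalLength =
  Aggregator.appendSemigroup[String, Int](_.length, (acc, s) => acc + s.length)

println(totalLength(List("foo", "ba", "z"))) // 6
```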
- implicit def applicative[I]: Applicative[[O]Aggregator[I, _, O]]
- def approximatePercentile[T](percentile: Double, k: Int = QTreeAggregator.DefaultK)(implicit num: Numeric[T]): QTreeAggregatorLowerBound[T]
  Returns the lower bound of a given percentile, where the percentile is in (0, 1]. The items iterated over cannot be negative.
- def approximatePercentileBounds[T](percentile: Double, k: Int = QTreeAggregator.DefaultK)(implicit num: Numeric[T]): QTreeAggregator[T]
  Returns an interval bounding a given percentile, where the percentile is in (0, 1]. The items iterated over cannot be negative.
- def approximateUniqueCount[T](implicit arg0: Hash128[T]): MonoidAggregator[T, Either[HLL, Set[T]], Long]
  Using a constant amount of memory, gives an approximate unique count (~1% error). This uses an exact Set for up to 100 items, then switches to HyperLogLog (HLL) with a 1.2% standard error, which uses at most 8192 bytes per HLL. For more control, see HyperLogLogAggregator.
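A quick sketch (assumes algebird on the classpath; value names are illustrative). Below the 100-item threshold described above the aggregator is still on its exact-Set branch, so the "approximate" count here is exact:

```scala
import com.twitter.algebird.Aggregator

// Hash128[String] is resolved from the Hash128 companion object.
val approxUniques = Aggregator.approximateUniqueCount[String]

println(approxUniques(List("a", "b", "a", "c", "b"))) // 3
```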
- def const[T](t: T): MonoidAggregator[Any, Unit, T]
  A trivial aggregator that always returns a single value.
- def count[T](pred: (T) ⇒ Boolean): MonoidAggregator[T, Long, Long]
  Counts how many items satisfy the predicate.
- def exists[T](pred: (T) ⇒ Boolean): MonoidAggregator[T, Boolean, Boolean]
  Do any items satisfy the predicate?
- def forall[T](pred: (T) ⇒ Boolean): MonoidAggregator[T, Boolean, Boolean]
  Do all items satisfy the predicate?
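The three predicate-based aggregators above can be sketched together (assumes algebird on the classpath; value names are illustrative):

```scala
import com.twitter.algebird.Aggregator

val xs = List(1, 2, 3, 4, 5)

val evens  = Aggregator.count[Int](_ % 2 == 0)
val anyBig = Aggregator.exists[Int](_ > 4)
val allPos = Aggregator.forall[Int](_ > 0)

println(evens(xs))  // 2
println(anyBig(xs)) // true
println(allPos(xs)) // true
```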
- def fromMonoid[F, T](implicit mon: Monoid[T], prep: (F) ⇒ T): MonoidAggregator[F, T, T]
- def fromMonoid[T](implicit mon: Monoid[T]): MonoidAggregator[T, T, T]
- def fromReduce[T](red: (T, T) ⇒ T): Aggregator[T, T, T]
  Builds an aggregator from a reduce function; prepare and present steps can be added to this aggregator afterwards.
- def fromRing[F, T](implicit rng: Ring[T], prep: (F) ⇒ T): RingAggregator[F, T, T]
- def fromRing[T](implicit rng: Ring[T]): RingAggregator[T, T, T]
- def fromSemigroup[T](implicit sg: Semigroup[T]): Aggregator[T, T, T]
- def head[T]: Aggregator[T, T, T]
  Takes the first (left-most in reduce order) item found.
- def immutableSortedReverseTake[T](count: Int)(implicit arg0: Ordering[T]): MonoidAggregator[T, TopK[T], Seq[T]]
  Immutable version of sortedReverseTake, for frameworks that check immutability of reduce functions.
- def immutableSortedTake[T](count: Int)(implicit arg0: Ordering[T]): MonoidAggregator[T, TopK[T], Seq[T]]
  Immutable version of sortedTake, for frameworks that check immutability of reduce functions.
- def last[T]: Aggregator[T, T, T]
  Takes the last (right-most in reduce order) item found.
- def max[T](implicit arg0: Ordering[T]): Aggregator[T, T, T]
  Gets the maximum item.
- def maxBy[U, T](fn: (U) ⇒ T)(implicit arg0: Ordering[T]): Aggregator[U, U, U]
- def min[T](implicit arg0: Ordering[T]): Aggregator[T, T, T]
  Gets the minimum item.
- def minBy[U, T](fn: (U) ⇒ T)(implicit arg0: Ordering[T]): Aggregator[U, U, U]
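A quick sketch of min, max, minBy, and maxBy (assumes algebird on the classpath; value names are illustrative):

```scala
import com.twitter.algebird.Aggregator

val words = List("pear", "fig", "banana")

// minBy/maxBy compare by the value of the function, but return the input item.
val shortest = Aggregator.minBy[String, Int](_.length)
val longest  = Aggregator.maxBy[String, Int](_.length)

println(Aggregator.min[String].apply(words)) // banana (alphabetically first)
println(Aggregator.max[String].apply(words)) // pear
println(shortest(words))                     // fig
println(longest(words))                      // banana
```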
- def numericSum[T](implicit num: Numeric[T]): MonoidAggregator[T, Double, Double]
  An aggregator that sums Numeric values into Doubles. This is really no more than converting to Double and then summing; the conversion to Double means we don't have the overflow semantics of integer types on the JVM (e.g. Int.MaxValue + 1 == Int.MinValue). Note that if you instead want to aggregate Numeric values of a type T into the same type T (i.e. a MonoidAggregator[T, T, T] for some Numeric type T), you can use Aggregator.fromMonoid[T] directly after importing the numericRing implicit:
    import com.twitter.algebird.Ring.numericRing
    def numericAggregator[T: Numeric]: MonoidAggregator[T, T, T] = Aggregator.fromMonoid[T]
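The overflow point above can be sketched directly (assumes algebird on the classpath; value names are illustrative):

```scala
import com.twitter.algebird.Aggregator

// Summing as Double sidesteps Int overflow: the result is 2147483648.0,
// not Int.MinValue.
val sumAsDouble = Aggregator.numericSum[Int]

println(sumAsDouble(List(Int.MaxValue, 1))) // 2.147483648E9
```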
- def prepareMonoid[F, T](prep: (F) ⇒ T)(implicit m: Monoid[T]): MonoidAggregator[F, T, T]
- def prepareSemigroup[F, T](prep: (F) ⇒ T)(implicit sg: Semigroup[T]): Aggregator[F, T, T]
- def randomSample[T](prob: Double, seed: Int = DefaultSeed): MonoidAggregator[T, Option[Batched[T]], List[T]]
  Randomly selects input items, where each item has an independent probability 'prob' of being selected. This assumes that all sampled records can fit in memory, so use it only when the expected number of sampled values is small.
- def reservoirSample[T](count: Int, seed: Int = DefaultSeed): MonoidAggregator[T, PriorityQueue[(Double, T)], Seq[T]]
  Selects exactly 'count' of the input records at random (or all of the records if there are fewer than 'count' in total). This assumes that all 'count' records can fit in memory, so use it only for small values of 'count'.
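A quick sketch (assumes algebird on the classpath; value names are illustrative). The exact items chosen depend on the seed, but the size and membership guarantees hold:

```scala
import com.twitter.algebird.Aggregator

val sampler = Aggregator.reservoirSample[Int](count = 3)
val sample  = sampler((1 to 100).toList)

println(sample.size)                        // 3
println(sample.forall((1 to 100).contains)) // true
```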
- def size: MonoidAggregator[Any, Long, Long]
  Returns the number of items encountered.
- def sortByReverseTake[T, U](count: Int)(fn: (T) ⇒ U)(implicit arg0: Ordering[U]): MonoidAggregator[T, PriorityQueue[T], Seq[T]]
  Same as sortedReverseTake, but using a function that returns a value that has an Ordering. Equivalent to writing list.sortBy(fn).reverse.take(count).
- def sortByTake[T, U](count: Int)(fn: (T) ⇒ U)(implicit arg0: Ordering[U]): MonoidAggregator[T, PriorityQueue[T], Seq[T]]
  Same as sortedTake, but using a function that returns a value that has an Ordering. Equivalent to writing list.sortBy(fn).take(count).
- def sortedReverseTake[T](count: Int)(implicit arg0: Ordering[T]): MonoidAggregator[T, PriorityQueue[T], Seq[T]]
  Takes the largest 'count' items using a heap.
- def sortedTake[T](count: Int)(implicit arg0: Ordering[T]): MonoidAggregator[T, PriorityQueue[T], Seq[T]]
  Takes the smallest 'count' items using a heap.
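Both take variants can be sketched as follows (assumes algebird on the classpath; value names are illustrative):

```scala
import com.twitter.algebird.Aggregator

val xs = List(5, 1, 4, 2, 3)

val smallestThree = Aggregator.sortedTake[Int](3)
val largestThree  = Aggregator.sortedReverseTake[Int](3)

println(smallestThree(xs)) // the three smallest, ascending: 1, 2, 3
println(largestThree(xs))  // the three largest, descending: 5, 4, 3
```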
- def toList[T]: MonoidAggregator[T, Option[Batched[T]], List[T]]
  Puts everything in a List. Note: this could exhaust memory if the List is very large.
- def toSet[T]: MonoidAggregator[T, Set[T], Set[T]]
  Puts everything in a Set. Note: this could exhaust memory if the Set is very large.
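A quick sketch of both collectors (assumes algebird on the classpath; value names are illustrative; that toList preserves input order is my reading of Batched, not stated above):

```scala
import com.twitter.algebird.Aggregator

val xs = List(1, 2, 2, 3)

val asList = Aggregator.toList[Int]
val asSet  = Aggregator.toSet[Int]

println(asList(xs)) // List(1, 2, 2, 3), duplicates kept
println(asSet(xs))  // Set(1, 2, 3), duplicates dropped
```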
- def uniqueCount[T]: MonoidAggregator[T, Set[T], Int]
  Builds an in-memory Set, then returns its size. This may not scale if the number of unique values is very large; see approximateUniqueCount or the HyperLogLog aggregators for a scalable approximate version.
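A quick sketch of the exact variant (assumes algebird on the classpath; value names are illustrative):

```scala
import com.twitter.algebird.Aggregator

// Exact distinct count via an in-memory Set; fine for small cardinalities.
val distinct = Aggregator.uniqueCount[String]

println(distinct(List("a", "b", "a", "c"))) // 3
```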