Aggregate the values of each key, using given combine functions and a neutral "zero value".
Aggregate the values of each key, using given combine functions and a neutral "zero value". This function can return a different result type, U, than the type of the values in this SCollection, V. Thus, we need one operation for merging a V into a U and one operation for merging two U's. To avoid memory allocation, both of these functions are allowed to modify and return their first argument instead of creating a new U.
Generic function to combine the elements for each key using a custom set of aggregation functions.
Generic function to combine the elements for each key using a custom set of aggregation functions. Turns an SCollection[(K, V)] into a result of type SCollection[(K, C)], for a "combined type" C Note that V and C can be different -- for example, one might group an SCollection of type (Int, Int) into an RDD of type (Int, Seq[Int]). Users provide three functions:
- createCombiner
, which turns a V into a C (e.g., creates a one-element list)
- mergeValue
, to merge a V into a C (e.g., adds it to the end of a list)
- mergeCombiners
, to combine two C's into a single one.
Fold by key with Monoid, which defines the associative function and "zero value" for V.
Fold by key with Monoid, which defines the associative function and "zero value" for V. This could be more powerful and better optimized in some cases.
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
Merge the values for each key using an associative reduce function.
Merge the values for each key using an associative reduce function. This will also perform the merging locally on each mapper before sending results to a reducer, similarly to a "combiner" in MapReduce.
Reduce by key with Semigroup.
Reduce by key with Semigroup. This could be more powerful and better optimized in some cases.
Set a custom name for the next transform to be applied.
Set a custom name for the next transform to be applied.
An enhanced SCollection that uses an intermediate node to combine "hot" keys partially before performing the full combine.