The Batch is the fundamental work unit of the Hadoop portion of
Summingbird. Batches are processed offline and pushed into a
persistent store for serving. The offline Batches include the sum
of all Values for each Key. A Key whose summed Value is zero is
omitted (i.e. zero[Value] is indistinguishable from having never
seen a Value for that Key). Each Batch has a unique BatchID, a
concrete type isomorphic to Long, and each event falls into
exactly one Batch.
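
The behaviour described above can be illustrated with a minimal sketch. It is
not Summingbird's implementation: the `Monoid` trait is a small stand-in for
the one Summingbird takes from Algebird, `BatchID` is redefined here as a plain
wrapper around `Long`, and `batchOf`, `sumBatches`, and the fixed-width time
window are hypothetical names and assumptions chosen for illustration. The text
does not say how events are assigned to Batches or whether stored sums are
per-Batch or cumulative, so the sketch assumes fixed-duration windows and sums
within each Batch.

```scala
// Stand-in for a monoid (assumption: Summingbird itself uses Algebird's Monoid).
trait Monoid[V] {
  def zero: V
  def plus(a: V, b: V): V
}

object Monoid {
  implicit val longMonoid: Monoid[Long] = new Monoid[Long] {
    val zero: Long = 0L
    def plus(a: Long, b: Long): Long = a + b
  }
}

// Illustrative BatchID: a concrete type isomorphic to Long.
final case class BatchID(id: Long)

object BatchSketch {
  // Hypothetical batcher: every timestamp maps to exactly one fixed-width window.
  // floorDiv keeps the mapping consistent for negative timestamps as well.
  def batchOf(timestampMs: Long, batchMs: Long): BatchID =
    BatchID(Math.floorDiv(timestampMs, batchMs))

  // Sum the Values for each Key within each Batch, then drop Keys whose sum
  // equals zero[Value], so a zero sum looks the same as a Key never seen.
  def sumBatches[K, V](events: Seq[(Long, K, V)], batchMs: Long)(
      implicit m: Monoid[V]): Map[BatchID, Map[K, V]] =
    events
      .groupBy { case (ts, _, _) => batchOf(ts, batchMs) }
      .map { case (batch, evs) =>
        val sums = evs
          .groupBy(_._2)
          .map { case (k, kvs) => k -> kvs.map(_._3).foldLeft(m.zero)(m.plus) }
          .filter { case (_, v) => v != m.zero }
        batch -> sums
      }
}
```

Given one-second Batches and events `(1500L, "c", 3L)` and `(1999L, "c", -3L)`,
the Key `"c"` sums to zero and therefore does not appear in the output for that
Batch at all.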