com.twitter.summingbird.scalding.store
The batcher for this store
Record a computed batch of data.
These functions convert back and forth between a specific BatchID and the earliest time of the BatchID just after it.
The version numbers are the exclusive upper bound of time covered by this store, while the batchIDs are the inclusive upper bound. Put another way, all events that occurred before the version are included in this store.
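The relationship between a batchID and its version can be sketched as follows. This is a minimal illustration only: `BatchID` and `FixedBatcher` here are hypothetical stand-ins defined locally (not the Summingbird classes), and the sketch assumes fixed-length batches that start at the epoch.

```scala
object VersionSketch {
  // Hypothetical stand-in for a batch identifier.
  final case class BatchID(id: Long) {
    def next: BatchID = BatchID(id + 1)
    def prev: BatchID = BatchID(id - 1)
  }

  // Hypothetical batcher: fixed-length batches starting at time 0.
  final case class FixedBatcher(batchMillis: Long) {
    def batchOf(timeMillis: Long): BatchID =
      BatchID(Math.floorDiv(timeMillis, batchMillis))
    def earliestTimeOf(b: BatchID): Long = b.id * batchMillis
  }

  // Version = earliest time of the batch just after b (exclusive upper bound).
  def batchIDToVersion(batcher: FixedBatcher, b: BatchID): Long =
    batcher.earliestTimeOf(b.next)

  // Inverse: take the batch containing the version time, then step back one.
  def versionToBatchID(batcher: FixedBatcher, version: Long): BatchID =
    batcher.batchOf(version).prev
}
```

With hourly batches, batch 2 covers events up to (but not including) the start of batch 3, so its version would be the first millisecond of batch 3.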
Instances of this trait MAY NOT change the logic here. This always follows the rule that we first look for existing data (and avoid reading deltas in that case); otherwise we fall back to the last checkpointed output by calling readLast and compute the results by rolling forward.
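The lookup rule described here might be sketched with in-memory maps. Everything below is an assumption made for illustration: the `Store` type, its `snapshots` and `deltas` fields, and the use of Long addition to merge values are hypothetical, and a snapshot keyed by batch b is taken to cover everything through batch b.

```scala
object ReadSketch {
  type Snapshot = Map[String, Long]

  // Hypothetical store: checkpointed aggregates plus per-batch deltas.
  final case class Store(snapshots: Map[Long, Snapshot], deltas: Map[Long, Snapshot])

  // Merge two aggregates by adding values key by key.
  private def merge(a: Snapshot, b: Snapshot): Snapshot =
    b.foldLeft(a) { case (acc, (k, v)) => acc.updated(k, acc.getOrElse(k, 0L) + v) }

  // 1. If data for `batch` already exists, return it without reading deltas.
  // 2. Otherwise start from the last checkpoint before `batch` and roll
  //    forward by merging in each later delta up to and including `batch`.
  def read(store: Store, batch: Long): Option[Snapshot] =
    store.snapshots.get(batch).orElse {
      val earlier = store.snapshots.keys.filter(_ < batch)
      if (earlier.isEmpty) None
      else {
        val last = earlier.max
        val rolled = ((last + 1) to batch).foldLeft(store.snapshots(last)) { (acc, b) =>
          merge(acc, store.deltas.getOrElse(b, Map.empty[String, Long]))
        }
        Some(rolled)
      }
    }
}
```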
For each batch, collect up values with the same key on mapside before the keys are expanded.
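As a rough illustration of map-side collection, values sharing a (batch, key) pair can be combined before anything else happens to the keys. The triple layout and the use of Long addition as the combine step are assumptions of this sketch, not the store's actual types.

```scala
object MapsideSketch {
  // Each event is (batch number, key, value); all three are hypothetical.
  // Values with the same batch and key are summed into a single entry.
  def sumByBatchAndKey[K](events: Seq[(Long, K, Long)]): Map[(Long, K), Long] =
    events
      .groupBy { case (batch, key, _) => (batch, key) }
      .map { case (batchAndKey, group) => batchAndKey -> group.map(_._3).sum }
}
```

Combining early like this reduces the number of records that later stages have to shuffle.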
Override this to set up store pruning. By default, no (key, value) pairs are pruned. This is a housekeeping function that permanently removes entries matching a criterion.
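One way to picture pruning is as a predicate applied to each pair before it is written. The `Prune` alias, the `writeTime` parameter, and `applyPruning` are hypothetical names introduced for this sketch; the store's real pruning hook may look different.

```scala
object PruneSketch {
  // Hypothetical predicate: returns true when the pair should be removed.
  type Prune[K, V] = ((K, V), Long) => Boolean

  // The default described in the text: nothing is pruned.
  def noPruning[K, V]: Prune[K, V] = (_, _) => false

  // Keep only the pairs the predicate does not reject.
  def applyPruning[K, V](pairs: Seq[(K, V)], writeTime: Long, prune: Prune[K, V]): Seq[(K, V)] =
    pairs.filterNot(kv => prune(kv, writeTime))
}
```

For example, a predicate that rejects pairs whose value is zero would drop counters that carry no information.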
Returns a snapshot of the store's (K, V) pairs aggregated up to (but not including!) the time covered by the supplied batchID.
Aggregating the readLast for a particular batchID with the stream stored for the same batchID will return the aggregate up to (but not including) batchID.next. Streams use an inclusive upper bound.
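The bound arithmetic can be checked with a small model. The `Event` type, the 10 ms batch length, and the helper names below are all assumptions of this sketch; only the relationship between readLast, the stream, and batchID.next comes from the description.

```scala
object ReadLastSketch {
  final case class Event(timeMillis: Long, key: String, value: Long)

  // Hypothetical fixed batch length, with batches starting at time 0.
  val batchMillis = 10L
  def earliestTimeOf(batch: Long): Long = batch * batchMillis

  private def sum(events: Seq[Event]): Map[String, Long] =
    events.groupBy(_.key).map { case (k, es) => k -> es.map(_.value).sum }

  // Aggregate of everything strictly before the start of `batch`.
  def readLast(events: Seq[Event], batch: Long): Map[String, Long] =
    sum(events.filter(_.timeMillis < earliestTimeOf(batch)))

  // Events that fall inside `batch` itself.
  def stream(events: Seq[Event], batch: Long): Seq[Event] =
    events.filter { e =>
      e.timeMillis >= earliestTimeOf(batch) && e.timeMillis < earliestTimeOf(batch + 1)
    }

  // readLast(batch) merged with stream(batch); should equal readLast(batch + 1).
  def merged(events: Seq[Event], batch: Long): Map[String, Long] =
    sum(stream(events, batch)).foldLeft(readLast(events, batch)) {
      case (acc, (k, v)) => acc.updated(k, acc.getOrElse(k, 0L) + v)
    }
}
```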
Override select if you don't want to materialize every batch. Note that select MUST return a list containing the final batch in the supplied list; otherwise data would be lost.
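A select that materializes only some batches while honoring the final-batch rule might look like the sketch below. The name `selectEvery`, the sampling policy, and the use of plain Long batch numbers are assumptions for illustration.

```scala
object SelectSketch {
  // Keep every n-th batch (n >= 1), and always keep the final batch
  // so that no data is lost.
  def selectEvery(n: Int)(batches: List[Long]): List[Long] =
    batches.lastOption match {
      case None => Nil
      case Some(last) =>
        val sampled = batches.zipWithIndex.collect { case (b, i) if i % n == 0 => b }
        if (sampled.lastOption.contains(last)) sampled else sampled :+ last
    }
}
```

Whatever sampling policy is chosen, appending the last element when it was skipped is what preserves the guarantee.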
For (firstNonZero - 1) we read empty. For all batches before that, we error on read. For all later batches, we proxy. On write, we throw if batchID is less than firstNonZero.
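These rules can be modeled with a small wrapper around a backing map. The class name `InitialStore`, the `Either` return type, and the mutable map used as the proxied store are hypothetical choices made for this sketch.

```scala
import scala.collection.mutable

object InitialStoreSketch {
  type Snapshot = Map[String, Long]

  // `proxy` stands in for the underlying store that later batches go to.
  final class InitialStore(firstNonZero: Long, proxy: mutable.Map[Long, Snapshot]) {

    def readLast(batch: Long): Either[String, Snapshot] =
      if (batch == firstNonZero - 1) Right(Map.empty[String, Long]) // read empty
      else if (batch < firstNonZero - 1) Left(s"batch $batch precedes the first batch") // error
      else proxy.get(batch).toRight(s"no data for batch $batch") // proxy the read

    def writeLast(batch: Long, data: Snapshot): Unit = {
      if (batch < firstNonZero)
        throw new IllegalArgumentException(s"cannot write batch $batch before $firstNonZero")
      proxy.update(batch, data)
    }
  }
}
```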
Allows subclasses to share the means of reading version numbers but plug in methods to actually read or write the data.