Package

com.twitter.summingbird.batch

state

package state

Type Members

  1. trait CheckpointState[T] extends WaitingState[Interval[Timestamp]]

    State machine for checkpoint states. It creates the requested time interval by asking the CheckpointStore for startBatch and endBatch. After the flow planner narrows the time interval (by checking which data is available), it accepts the interval only if there is no hole between the start time of the interval and the end time of the previous batch run.

    It requires a CheckpointStore to be provided.

  2. trait CheckpointStore[T] extends AnyRef

    To create an implementation of CheckpointState, you first need to define a CheckpointStore class; see com.twitter.summingbird.batch.state.HDFSCheckpointStore for an example.

    A subclass of CheckpointStore is responsible for determining startBatch from the checkpoints of the previous batch run, and endBatch from the number of batches the client asks to run.

    The CheckpointStore should provide a concrete implementation of how to read the previous batch and how to checkpoint the current batch.

    Type T is the token for each batch run, created by startBatch(). The token is then passed back to the CheckpointStore to checkpoint success or failure.

  3. class HDFSCheckpointStore extends CheckpointStore[Iterable[BatchID]]

  4. class HDFSState extends CheckpointState[Iterable[BatchID]]

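The interplay between CheckpointState and CheckpointStore described above can be illustrated with a minimal, self-contained sketch. Everything in it is a hypothetical stand-in: BatchRange, SketchCheckpointStore, InMemoryStore and SketchState are invented for illustration and do not reproduce the Summingbird types or method signatures. For simplicity it works on batch numbers rather than on Interval[Timestamp], and it assumes that a failed run records nothing, so the same batches are requested again.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for illustration only; not the Summingbird API.
final case class BatchRange(startInclusive: Long, endExclusive: Long)

trait SketchCheckpointStore[T] {
  def startBatch: Long                 // first batch that has no checkpoint yet
  def endBatch: Long                   // startBatch plus the batches requested for this run
  def begin(accepted: BatchRange): T   // creates the token for one batch run
  def success(token: T): Unit          // token handed back after a successful run
  def failure(token: T, err: Throwable): Unit
}

// In-memory store whose token is the list of batches covered by the run.
final class InMemoryStore(firstBatch: Long, batchesPerRun: Long)
    extends SketchCheckpointStore[List[Long]] {
  private val completed = mutable.SortedSet.empty[Long]

  def startBatch: Long = completed.lastOption.map(_ + 1).getOrElse(firstBatch)
  def endBatch: Long = startBatch + batchesPerRun
  def begin(accepted: BatchRange): List[Long] =
    (accepted.startInclusive until accepted.endExclusive).toList
  def success(token: List[Long]): Unit = { completed ++= token; () }
  // Assumption: nothing is recorded on failure, so the next run retries the same batches.
  def failure(token: List[Long], err: Throwable): Unit = ()
}

object SketchState {
  // The range the state asks the planner for.
  def requested[T](store: SketchCheckpointStore[T]): BatchRange =
    BatchRange(store.startBatch, store.endBatch)

  // Accept the narrowed range only if it starts exactly where the previous run ended.
  def accept[T](store: SketchCheckpointStore[T], narrowed: BatchRange): Either[String, T] =
    if (narrowed.startInclusive != store.startBatch)
      Left(s"hole between batch ${store.startBatch} and batch ${narrowed.startInclusive}")
    else if (narrowed.endExclusive <= narrowed.startInclusive)
      Left("narrowed range is empty")
    else
      Right(store.begin(narrowed))
}
```

In this sketch the planner may shorten the end of the requested range freely, but moving the start forward is rejected, because the skipped batches would never be processed.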

Value Members

  1. object HDFSState

    State implementation that uses an HDFS folder as a crude key-value store that tracks the batches currently processed.
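
The idea of using a folder as a crude key-value store can be sketched with a local directory standing in for HDFS. This is an assumption-laden illustration, not how HDFSState is implemented: the class name FolderBatchLog, the one-empty-file-per-batch layout and the use of java.nio.file are all hypothetical, and the sketch targets Scala 2.13 or later.

```scala
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters._

// Hypothetical local-directory stand-in; the real HDFSState file layout is not shown here.
final class FolderBatchLog(root: Path) {
  Files.createDirectories(root)

  // Record a batch by creating an empty marker file named after its number.
  def markDone(batch: Long): Unit = {
    val marker = root.resolve(batch.toString)
    if (!Files.exists(marker)) Files.createFile(marker)
    ()
  }

  // List recorded batches in ascending order, ignoring files whose names are not numbers.
  def done: List[Long] = {
    val stream = Files.list(root)
    try stream.iterator().asScala.map(_.getFileName.toString).flatMap(_.toLongOption).toList.sorted
    finally stream.close()
  }

  // The most recent recorded batch, if any.
  def latest: Option[Long] = done.lastOption
}
```

The file name acts as the key and the file's existence as the value, which is why such a store is "crude": it offers no transactions and relies on listing the folder to recover state.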
