State machine for checkpoint states.
To create an implementation of CheckpointState, you first need to define a CheckpointStore class;
see com.twitter.summingbird.batch.state.HDFSCheckpointStore
for an example.
A subclass of CheckpointStore is responsible for determining startBatch by checking the checkpoints of the previous batch run, and endBatch from the number of batches the client asks to run.
The CheckpointStore should provide a concrete implementation of how to read the previous batch and how to checkpoint the current batch.
Type T is the token for each batch run, created by startBatch(); the token is then passed back to the CheckpointStore to checkpoint success or failure.
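The responsibilities above can be sketched with a small in-memory store. This is a minimal illustration, not the Summingbird API: the names SimpleCheckpointStore, InMemoryCheckpointStore, BatchRange, beginRun, markSuccess and markFailure are hypothetical, batches are modelled as plain Long ids, and issuing the token is split out from computing the start batch for simplicity.

```scala
// Hypothetical, simplified model: a half-open range of batch ids [start, end).
final case class BatchRange(start: Long, end: Long)

// Hypothetical trait mirroring the responsibilities described above.
// T is the token issued when a run begins and handed back when it finishes.
trait SimpleCheckpointStore[T] {
  def numBatches: Long                          // batches the client asked to run
  def startBatch: Long                          // derived from the previous checkpoint
  def endBatch: Long = startBatch + numBatches  // exclusive upper bound
  def beginRun(range: BatchRange): T
  def markSuccess(token: T): Unit
  def markFailure(token: T, err: Throwable): Unit
}

// In-memory implementation; the token is simply the range being run.
final class InMemoryCheckpointStore(firstBatch: Long, val numBatches: Long)
    extends SimpleCheckpointStore[BatchRange] {

  // Exclusive end of the last successful run, if there was one.
  private var lastCompleted: Option[Long] = None

  def startBatch: Long = lastCompleted.getOrElse(firstBatch)

  def beginRun(range: BatchRange): BatchRange = range

  def markSuccess(token: BatchRange): Unit = {
    lastCompleted = Some(token.end)
  }

  // A failed run records nothing, so the next run starts at the same batch.
  def markFailure(token: BatchRange, err: Throwable): Unit = ()
}
```

With a store created as new InMemoryCheckpointStore(0L, 3L), the first run covers batches 0 up to (but not including) 3. After markSuccess, startBatch moves to 3; after markFailure it stays where it was, so the same batches are retried.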
State implementation that uses an HDFS folder as a crude key-value store that tracks the batches currently processed.
State machine for checkpoint states. It creates the requested time interval by asking
CheckpointStore for startBatch and endBatch. After the flow planner narrows the time interval (by checking which data is available), it decides whether to accept the interval by making sure there is no hole between the start time of the interval and the end time of the previous batch run.
It requires a CheckpointStore to be provided.
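The acceptance check can be illustrated as follows. This is a rough sketch under simplifying assumptions, not the actual implementation: Interval, Decision, Accepted, Rejected and CheckpointDecision.willAccept are hypothetical names, timestamps are plain Long values, and intervals are half-open. It assumes the requested interval starts exactly where the previous run ended, so any narrowed interval that starts later would leave a hole.

```scala
// Hypothetical, simplified model: a half-open time interval [start, end).
final case class Interval(start: Long, end: Long)

sealed trait Decision
final case class Accepted(interval: Interval) extends Decision
final case class Rejected(reason: String) extends Decision

object CheckpointDecision {
  // `requested` begins where the previous run ended.
  // `available` is what the flow planner can cover after checking the data.
  def willAccept(requested: Interval, available: Interval): Decision =
    if (available.start != requested.start)
      // Data before available.start would never be processed.
      Rejected(s"hole between ${requested.start} and ${available.start}")
    else if (available.end <= available.start)
      Rejected("nothing to run")
    else if (available.end > requested.end)
      Rejected("available interval extends past the requested end")
    else
      Accepted(available)
}
```

For example, with a request of Interval(10, 40), a narrowed interval of Interval(10, 30) is accepted because it still starts at 10, while Interval(20, 40) is rejected because the data between 10 and 20 would be skipped.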