Class OffsetRangeTracker
- java.lang.Object
-
- org.apache.beam.sdk.io.range.OffsetRangeTracker
-
- All Implemented Interfaces:
RangeTracker<java.lang.Long>
public class OffsetRangeTracker extends java.lang.Object implements RangeTracker<java.lang.Long>
-
-
Field Summary
static long OFFSET_INFINITY
    Offset corresponding to infinity.
-
Constructor Summary
OffsetRangeTracker(long startOffset, long stopOffset)
    Creates an OffsetRangeTracker for the specified range.
-
Method Summary
double getFractionConsumed()
    Returns the approximate fraction of positions in the source that have been consumed by successful RangeTracker.tryReturnRecordAt(boolean, PositionT) calls, or 0.0 if no such calls have happened.
long getPositionForFractionConsumed(double fraction)
    Returns a position P such that the range [start, P) represents approximately the given fraction of the range [start, end).
long getSplitPointsProcessed()
    Returns the total number of split points that have been processed.
java.lang.Long getStartPosition()
    Returns the starting position of the current range, inclusive.
java.lang.Long getStopPosition()
    Returns the ending position of the current range, exclusive.
boolean isDone()
boolean isStarted()
boolean markDone()
    Marks this range tracker as being done.
java.lang.String toString()
boolean tryReturnRecordAt(boolean isAtSplitPoint, long recordStart)
boolean tryReturnRecordAt(boolean isAtSplitPoint, java.lang.Long recordStart)
    Atomically determines whether a record at the given position can be returned and updates internal state.
boolean trySplitAtPosition(long splitOffset)
boolean trySplitAtPosition(java.lang.Long splitOffset)
    Atomically splits the current range [RangeTracker.getStartPosition(), RangeTracker.getStopPosition()) into a "primary" part [RangeTracker.getStartPosition(), splitPosition) and a "residual" part [splitPosition, RangeTracker.getStopPosition()), assuming the current last-consumed position is within [RangeTracker.getStartPosition(), splitPosition) (i.e., splitPosition has not been consumed yet).
-
-
-
Field Detail
-
OFFSET_INFINITY
public static final long OFFSET_INFINITY
Offset corresponding to infinity. This can only be used as the upper bound of a range, and indicates reading all of the records until the end without specifying exactly what the end is.
Infinite ranges cannot be split because it is impossible to estimate progress within them.
- See Also:
- Constant Field Values
-
-
Method Detail
-
isStarted
public boolean isStarted()
-
isDone
public boolean isDone()
-
getStartPosition
public java.lang.Long getStartPosition()
Description copied from interface: RangeTracker
Returns the starting position of the current range, inclusive.
- Specified by:
getStartPosition in interface RangeTracker<java.lang.Long>
-
getStopPosition
public java.lang.Long getStopPosition()
Description copied from interface: RangeTracker
Returns the ending position of the current range, exclusive.
- Specified by:
getStopPosition in interface RangeTracker<java.lang.Long>
-
tryReturnRecordAt
public boolean tryReturnRecordAt(boolean isAtSplitPoint, java.lang.Long recordStart)
Description copied from interface: RangeTracker
Atomically determines whether a record at the given position can be returned and updates internal state. In particular:
- If isAtSplitPoint is true, and recordStart is outside the current range, returns false;
- Otherwise, updates the last-consumed position to recordStart and returns true.
This method MUST be called on all split point records. It may be called on every record.
- Specified by:
tryReturnRecordAt in interface RangeTracker<java.lang.Long>
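The two-branch rule above can be modeled in a few lines. The following is a minimal sketch, not Beam's implementation: the class RecordGateSketch, its fields, and the lastRecordStart() accessor are hypothetical names introduced for illustration, and the real class may perform additional validation that is not modeled here.

```java
/** Simplified, hypothetical model of the tryReturnRecordAt rule; not Beam's code. */
class RecordGateSketch {
    private final long start;           // inclusive start of the current range
    private final long stop;            // exclusive end of the current range
    private long lastRecordStart = -1;  // -1 means no record has been returned yet

    RecordGateSketch(long start, long stop) {
        this.start = start;
        this.stop = stop;
    }

    /** Returns false only for a split point outside [start, stop); otherwise records the position. */
    synchronized boolean tryReturnRecordAt(boolean isAtSplitPoint, long recordStart) {
        if (isAtSplitPoint && (recordStart < start || recordStart >= stop)) {
            return false; // a split point outside the range ends the read
        }
        lastRecordStart = recordStart; // every accepted record moves the last-consumed position
        return true;
    }

    synchronized long lastRecordStart() {
        return lastRecordStart;
    }
}
```

Note that in this model a record that is not at a split point is accepted even when it starts beyond the end of the range, which follows from the "Otherwise" branch above.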
-
tryReturnRecordAt
public boolean tryReturnRecordAt(boolean isAtSplitPoint, long recordStart)
-
trySplitAtPosition
public boolean trySplitAtPosition(java.lang.Long splitOffset)
Description copied from interface: RangeTracker
Atomically splits the current range [RangeTracker.getStartPosition(), RangeTracker.getStopPosition()) into a "primary" part [RangeTracker.getStartPosition(), splitPosition) and a "residual" part [splitPosition, RangeTracker.getStopPosition()), assuming the current last-consumed position is within [RangeTracker.getStartPosition(), splitPosition) (i.e., splitPosition has not been consumed yet).
Updates the current range to be the primary and returns true. This means that all further calls on the current object will interpret their arguments relative to the primary range.
If the split position has already been consumed, or if no RangeTracker.tryReturnRecordAt(boolean, PositionT) call was made yet, returns false. The second condition is to prevent dynamic splitting during reader start-up.
- Specified by:
trySplitAtPosition in interface RangeTracker<java.lang.Long>
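The split rule can be illustrated with a small stand-alone model. This is a sketch under stated assumptions rather than Beam's code: SplitSketch, recordReturnedAt, and getStop are hypothetical names, and the rejection of split positions at or beyond the current end is an assumption added for the sketch, not something the description above states.

```java
/** Simplified, hypothetical model of trySplitAtPosition; not Beam's code. */
class SplitSketch {
    private final long start;           // inclusive start, unchanged by splits
    private long stop;                  // exclusive end, shrinks on a successful split
    private long lastRecordStart = -1;  // -1 means no record has been returned yet

    SplitSketch(long start, long stop) {
        this.start = start;
        this.stop = stop;
    }

    /** Hypothetical helper standing in for a successful tryReturnRecordAt call. */
    synchronized void recordReturnedAt(long position) {
        lastRecordStart = position;
    }

    synchronized boolean trySplitAtPosition(long splitPosition) {
        if (lastRecordStart == -1) {
            return false; // no record returned yet: refuse to split during start-up
        }
        if (splitPosition <= lastRecordStart) {
            return false; // the split position has already been consumed
        }
        if (splitPosition >= stop) {
            return false; // assumption: a split must fall inside the current range
        }
        stop = splitPosition; // keep the primary [start, splitPosition)
        return true;
    }

    synchronized long getStop() {
        return stop;
    }
}
```

After a successful split, the caller is expected to hand the residual [splitPosition, old stop) to another reader; the sketch only tracks the primary.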
-
trySplitAtPosition
public boolean trySplitAtPosition(long splitOffset)
-
getPositionForFractionConsumed
public long getPositionForFractionConsumed(double fraction)
Returns a position P such that the range [start, P) represents approximately the given fraction of the range [start, end). Assumes that the density of records in the range is approximately uniform.
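Under the uniform-density assumption, the position is a linear interpolation between start and end. A minimal sketch follows; the class and method names are hypothetical, and rounding down with Math.floor is an assumption of the sketch rather than something the description above specifies.

```java
/** Hypothetical helper illustrating linear interpolation over an offset range. */
class FractionToPositionSketch {
    /** Returns P so that [start, P) covers roughly the given fraction of [start, end). */
    static long positionForFraction(long start, long end, double fraction) {
        // Assumption: round down so that P never overshoots the requested fraction.
        return (long) Math.floor(start + fraction * (end - start));
    }
}
```

For example, with start = 100, end = 200, and fraction = 0.5, the sketch returns 150.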
-
getFractionConsumed
public double getFractionConsumed()
Description copied from interface: RangeTracker
Returns the approximate fraction of positions in the source that have been consumed by successful RangeTracker.tryReturnRecordAt(boolean, PositionT) calls, or 0.0 if no such calls have happened.
- Specified by:
getFractionConsumed in interface RangeTracker<java.lang.Long>
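One plausible way to compute such a fraction is to compare the last-consumed position with the range bounds. The sketch below is an assumption-laden illustration, not Beam's implementation: the names are hypothetical, and clamping the result to [0.0, 1.0] is a choice made for the sketch.

```java
/** Hypothetical helper estimating consumed progress within a finite offset range. */
class FractionConsumedSketch {
    static double fractionConsumed(long start, long stop, long lastRecordStart, boolean started) {
        if (!started) {
            return 0.0; // no successful tryReturnRecordAt call has happened yet
        }
        double fraction = (double) (lastRecordStart - start) / (stop - start);
        // Assumption: clamp, since a record may start at or beyond the end of the range.
        return Math.max(0.0, Math.min(1.0, fraction));
    }
}
```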
-
getSplitPointsProcessed
public long getSplitPointsProcessed()
Returns the total number of split points that have been processed.
A split point at a particular offset has been seen if there has been a corresponding call to tryReturnRecordAt(boolean, long) with isAtSplitPoint true. It has been processed if there has been a subsequent call to tryReturnRecordAt(boolean, long) with isAtSplitPoint true and at a larger offset.
Note that for correctness when implementing BoundedSource.BoundedReader.getSplitPointsConsumed(), if a reader finishes before tryReturnRecordAt(boolean, long) returns false, the reader should add an additional call to markDone(). This will indicate that processing for the last seen split point has been finished.
- See Also:
for a implemented using .
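The distinction between a split point that has been seen and one that has been processed can be modeled with a counter. This is a minimal sketch with hypothetical names (SplitPointCounterSketch, onSplitPointReturned); it only illustrates the counting rule described for getSplitPointsProcessed() and does not reproduce Beam's code.

```java
/** Hypothetical counter modeling "seen" versus "processed" split points. */
class SplitPointCounterSketch {
    private long splitPointsSeen = 0;
    private boolean done = false;

    /** Stands in for a successful tryReturnRecordAt call with isAtSplitPoint true. */
    void onSplitPointReturned() {
        splitPointsSeen++;
    }

    /** Marks the last seen split point as finished; returns false like markDone(). */
    boolean markDone() {
        done = true;
        return false;
    }

    long getSplitPointsProcessed() {
        if (splitPointsSeen == 0) {
            return 0;
        }
        // The most recent split point is still in progress until a later one is
        // seen or the tracker is marked done.
        return done ? splitPointsSeen : splitPointsSeen - 1;
    }
}
```

In this model, a reader that stops after its last record without calling markDone() would report one split point too few, which is the situation the note above warns about.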
-
markDone
public boolean markDone()
Marks this range tracker as being done. Specifically, this will mark the current split point, if one exists, as being finished.
Always returns false, so that it can be used in an implementation of Source.Reader.start() or Source.Reader.advance() as follows:
    public boolean start() {
      return startImpl() && rangeTracker.tryReturnRecordAt(isAtSplitPoint, position)
          || rangeTracker.markDone();
    }
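The idiom relies on Java's short-circuit evaluation: && binds tighter than ||, so markDone() runs only when either the reader has no record or the tracker rejects it, and its false return value becomes the result. The following stand-alone sketch demonstrates this; FakeTracker and the advance helper are hypothetical stand-ins, not Beam classes.

```java
/** Demonstrates the "a && b || markDone()" idiom with hypothetical stand-in types. */
class MarkDoneIdiomSketch {
    /** Hypothetical tracker: rejects split points at or beyond stop. */
    static class FakeTracker {
        final long stop;
        boolean done = false;

        FakeTracker(long stop) {
            this.stop = stop;
        }

        boolean tryReturnRecordAt(boolean isAtSplitPoint, long position) {
            return !(isAtSplitPoint && position >= stop);
        }

        boolean markDone() {
            done = true;
            return false; // always false, so the overall expression stays false
        }
    }

    /** Mirrors the shape of the start()/advance() expression shown above. */
    static boolean advance(FakeTracker tracker, boolean hasRecord, long position) {
        return hasRecord && tracker.tryReturnRecordAt(true, position) || tracker.markDone();
    }
}
```

When a record is available and accepted, markDone() is never evaluated; in every other case the tracker is marked done exactly when the method reports that there are no more records.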
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
-
-