public abstract static class ByteOffsetBasedSource.ByteOffsetBasedReader<T> extends BoundedSource.AbstractBoundedReader<T>
A Source.Reader that implements code common to readers of all ByteOffsetBasedSources.

| Modifier and Type | Field and Description |
|---|---|
| protected OffsetRangeTracker | rangeTracker: The OffsetRangeTracker managing the range and current position of the source. |

| Constructor and Description |
|---|
| ByteOffsetBasedReader(ByteOffsetBasedSource<T> source) |

| Modifier and Type | Method and Description |
|---|---|
| protected abstract long | getCurrentOffset(): Returns the starting offset of the current record, which has been read by the last successful Source.Reader.start() or Source.Reader.advance() call. |
| ByteOffsetBasedSource<T> | getCurrentSource(): Returns a Source describing the same input that this Reader reads (including items already read). |
| Double | getFractionConsumed(): Returns a value in [0, 1] representing approximately what fraction of the source (BoundedSource.BoundedReader.getCurrentSource()) this reader has read so far. |
| ByteOffsetBasedSource<T> | splitAtFraction(double fraction): Tells the reader to narrow the range of the input it's going to read and give up the remainder, so that the new range would contain approximately the given fraction of the amount of data in the current range. |

Methods inherited from class BoundedSource.AbstractBoundedReader: getCurrentTimestamp
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Source.Reader: advance, close, getCurrent, getCurrentTimestamp, start

protected final OffsetRangeTracker rangeTracker
The OffsetRangeTracker managing the range and current position of the source.
Subclasses MUST use it before returning records from Source.Reader.start() or Source.Reader.advance(): see documentation of RangeTracker.

public ByteOffsetBasedReader(ByteOffsetBasedSource<T> source)
Parameters:
source - the ByteOffsetBasedSource to be read by the current reader.

protected abstract long getCurrentOffset()
Returns the starting offset of the current record, which has been read by the last successful Source.Reader.start() or Source.Reader.advance() call.
If no such call has been made yet, the return value is unspecified.
See RangeTracker for description of offset semantics.
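
The requirement that a subclass consult rangeTracker before returning a record, and that getCurrentOffset() report the start of the last returned record, can be illustrated with a minimal, self-contained sketch. It does not use the SDK: SketchRangeTracker and FixedWidthReader are hypothetical stand-ins invented for this example, and the sketch assumes fixed-width ASCII records and a range that begins on a record boundary.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical, simplified stand-in for a range tracker over [startOffset, stopOffset). */
class SketchRangeTracker {
    private final long startOffset;
    private final long stopOffset;

    SketchRangeTracker(long startOffset, long stopOffset) {
        this.startOffset = startOffset;
        this.stopOffset = stopOffset;
    }

    long getStartOffset() {
        return startOffset;
    }

    /** Returns true if a record starting at recordStart belongs to this range. */
    synchronized boolean tryReturnRecordAt(long recordStart) {
        return recordStart >= startOffset && recordStart < stopOffset;
    }
}

/** Hypothetical reader over fixed-width ASCII records held in memory. */
class FixedWidthReader {
    private final byte[] data;
    private final int recordWidth;
    private final SketchRangeTracker rangeTracker;
    private long currentOffset = -1; // not meaningful before a successful start()
    private String current;

    FixedWidthReader(byte[] data, int recordWidth, long startOffset, long stopOffset) {
        this.data = data;
        this.recordWidth = recordWidth;
        this.rangeTracker = new SketchRangeTracker(startOffset, stopOffset);
    }

    boolean start() {
        // Assumption: the range starts on a record boundary.
        return readRecordAt(rangeTracker.getStartOffset());
    }

    boolean advance() {
        return readRecordAt(currentOffset + recordWidth);
    }

    private boolean readRecordAt(long offset) {
        if (offset + recordWidth > data.length) {
            return false; // no complete record left in the input
        }
        // Consult the tracker BEFORE exposing the record to the caller.
        if (!rangeTracker.tryReturnRecordAt(offset)) {
            return false; // the record starts outside this reader's range
        }
        currentOffset = offset;
        current = new String(data, (int) offset, recordWidth, StandardCharsets.US_ASCII);
        return true;
    }

    /** Starting offset of the record returned by the last successful start() or advance(). */
    long getCurrentOffset() {
        return currentOffset;
    }

    String getCurrent() {
        return current;
    }
}
```

In this sketch a record belongs to the reader if it starts inside the range, so a failed advance() leaves getCurrentOffset() pointing at the last record that was actually returned.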

public ByteOffsetBasedSource<T> getCurrentSource()
Description copied from interface: Source.Reader
Returns a Source describing the same input that this Reader reads (including items already read).
A reader created from the result of getCurrentSource, if consumed, MUST
return the same data items as the current reader.
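
public Double getFractionConsumed()
Description copied from interface: BoundedSource.BoundedReader
Returns a value in [0, 1] representing approximately what fraction of the source (BoundedSource.BoundedReader.getCurrentSource()) this reader has read so far.
It is recommended that this method satisfy the following properties:
- Should return 0 before the Source.Reader.start() call.
- Should return 1 after a Source.Reader.start() or Source.Reader.advance() call that returns false.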
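
One way to meet these properties for an offset range is to interpolate linearly between the start and stop offsets. The sketch below is an assumption about how such an estimate could be computed, not the SDK's implementation; FractionSketch, its parameters, and the use of Long.MAX_VALUE to mean "end unknown" are hypothetical.

```java
/** Hypothetical helper illustrating a fraction-consumed estimate over [startOffset, stopOffset). */
class FractionSketch {
    /**
     * @param startOffset   inclusive start of the range
     * @param stopOffset    exclusive end of the range; Long.MAX_VALUE is taken to mean "unknown"
     * @param currentOffset starting offset of the current record
     * @param started       whether start() has been called
     * @param done          whether start() or advance() has returned false
     * @return an estimate in [0, 1], or null if no estimate is available
     */
    static Double fractionConsumed(
            long startOffset, long stopOffset, long currentOffset, boolean started, boolean done) {
        if (!started) {
            return 0.0; // nothing has been read yet
        }
        if (done) {
            return 1.0; // the reader has exhausted its range
        }
        if (stopOffset == Long.MAX_VALUE) {
            return null; // unbounded end: no meaningful estimate
        }
        if (stopOffset <= startOffset) {
            return 1.0; // empty range, avoid dividing by zero
        }
        double fraction = (double) (currentOffset - startOffset) / (stopOffset - startOffset);
        return Math.max(0.0, Math.min(1.0, fraction)); // clamp into [0, 1]
    }
}
```

For example, a reader over offsets 100 to 200 whose current record starts at offset 150 would report 0.5 under this scheme.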
Specified by: getFractionConsumed in interface BoundedSource.BoundedReader<T>
Overrides: getFractionConsumed in class BoundedSource.AbstractBoundedReader<T>
Returns: null if such an estimate is not available.

public ByteOffsetBasedSource<T> splitAtFraction(double fraction)
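Description copied from interface: BoundedSource.BoundedReader
Tells the reader to narrow the range of the input it's going to read and give up the remainder, so that the new range would contain approximately the given fraction of the amount of data in the current range.
Returns a BoundedSource representing the remainder.
In the following sequence of calls, residual is the remainder given up by the reader, and primary describes the narrowed range that the reader keeps:
    BoundedSource<T> initial = reader.getCurrentSource();
    BoundedSource<T> residual = reader.splitAtFraction(fraction);
    BoundedSource<T> primary = reader.getCurrentSource();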
This method should return null if the split cannot be performed for this fraction
while satisfying the semantics above. E.g., a reader that reads a range of offsets
in a file should return null if it is already past the position in its range
corresponding to the given fraction. In this case, the method MUST have no effect
(the reader must behave as if the method hadn't been called at all).
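
The initial/residual/primary relationship and the "return null and change nothing" rule can be sketched over plain offset ranges. This is a single-threaded illustration under stated assumptions, not the SDK's code: OffsetRange and SplittableOffsetReader are hypothetical, and the split position is computed by simple linear interpolation.

```java
/** Hypothetical value type for a half-open offset range [start, stop). */
class OffsetRange {
    final long start;
    final long stop;

    OffsetRange(long start, long stop) {
        this.start = start;
        this.stop = stop;
    }
}

/** Hypothetical reader state: a range plus the offset of the current record. */
class SplittableOffsetReader {
    private final long start;
    private long stop;
    private final long currentOffset;

    SplittableOffsetReader(long start, long stop, long currentOffset) {
        this.start = start;
        this.stop = stop;
        this.currentOffset = currentOffset;
    }

    OffsetRange getCurrentSource() {
        return new OffsetRange(start, stop);
    }

    /**
     * Narrows this reader to [start, splitOffset) and returns the residual
     * [splitOffset, stop), or returns null without touching any state when
     * the requested position cannot be honored.
     */
    OffsetRange splitAtFraction(double fraction) {
        long splitOffset = start + (long) Math.ceil(fraction * (stop - start));
        if (splitOffset <= currentOffset || splitOffset >= stop) {
            return null; // already past the split point, or nothing would be given up
        }
        OffsetRange residual = new OffsetRange(splitOffset, stop);
        stop = splitOffset; // the only state change, made after all checks passed
        return residual;
    }
}
```

With a reader over [0, 100) currently at offset 40, splitAtFraction(0.25) returns null and leaves the range unchanged, while splitAtFraction(0.5) yields a residual of [50, 100) and a primary of [0, 50), which together cover the initial range.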
It is also very important that this method always completes quickly. In particular, it should not perform or wait on any blocking operations such as I/O or RPCs. Violating this requirement may stall completion of the work item or even cause it to fail.
E.g. it is incorrect to make both this method and Source.Reader.start()/Source.Reader.advance()
synchronized, because those methods can perform blocking operations, and then
this method would have to wait for those calls to complete.
RangeTracker makes it easy to implement
this method safely and correctly.
Specified by: splitAtFraction in interface BoundedSource.BoundedReader<T>
Overrides: splitAtFraction in class BoundedSource.AbstractBoundedReader<T>
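
The advice against synchronizing both the split method and start()/advance() can be sketched as follows: the blocking work stays outside any lock, and only a short tracker update is synchronized. LockScopedTracker and SlowReader are hypothetical names, Thread.sleep stands in for blocking I/O, and records are assumed to sit at consecutive integer offsets; this is an illustration of the locking pattern, not the SDK's RangeTracker.

```java
/** Hypothetical tracker whose synchronized sections contain no blocking work. */
class LockScopedTracker {
    private final long start;
    private long stop;
    private long lastClaimed = -1;

    LockScopedTracker(long start, long stop) {
        this.start = start;
        this.stop = stop;
    }

    /** Claims the record at offset; fails if it lies at or beyond the current stop. */
    synchronized boolean tryClaim(long offset) {
        if (offset >= stop) {
            return false;
        }
        lastClaimed = offset;
        return true;
    }

    /** Moves the stop to splitOffset if that position has not been reached yet. */
    synchronized boolean trySplitAt(long splitOffset) {
        if (splitOffset <= start || splitOffset <= lastClaimed || splitOffset >= stop) {
            return false; // no state is modified on failure
        }
        stop = splitOffset;
        return true;
    }
}

/** Hypothetical reader whose advance() blocks, but never while holding the tracker's lock. */
class SlowReader {
    private final LockScopedTracker tracker;
    private long next;

    SlowReader(long start, long stop) {
        this.tracker = new LockScopedTracker(start, stop);
        this.next = start;
    }

    /** Deliberately NOT synchronized: the slow part runs without any lock held. */
    boolean advance() {
        try {
            Thread.sleep(5); // stand-in for blocking I/O
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        if (!tracker.tryClaim(next)) {
            return false;
        }
        next++;
        return true;
    }

    /** Only enters the tracker's brief critical section, so it never waits on the sleep above. */
    Long splitAtOffset(long splitOffset) {
        return tracker.trySplitAt(splitOffset) ? Long.valueOf(splitOffset) : null;
    }
}
```

Because splitAtOffset never contends with the sleeping part of advance(), a split request issued from another thread would wait at most for one of the short synchronized tracker calls.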