@Experimental(value=SOURCE_SINK)
protected abstract static class BlockBasedSource.BlockBasedReader<T>
extends FileBasedSource.FileBasedReader<T>
Reader that reads records from a BlockBasedSource. If the source is a
subrange of a file, the blocks that will be read by this reader are those such that the first
byte of the block is within the range [start, end).

Inherited fields: rangeTracker

| Modifier | Constructor and Description |
|---|---|
| protected | BlockBasedReader(BlockBasedSource<T> source) |
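The ownership rule in the class description (a reader over a subrange reads the blocks whose first byte lies in [start, end)) can be illustrated with a minimal sketch. `BlockOwnershipSketch` and `ownedBlocks` are hypothetical names used only for illustration; they are not part of the SDK, and the block offsets are made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the SDK: models which blocks a subrange reader owns. */
final class BlockOwnershipSketch {
  /** Returns the block start offsets whose first byte lies in [start, end). */
  static List<Long> ownedBlocks(long[] blockStartOffsets, long start, long end) {
    List<Long> owned = new ArrayList<>();
    for (long offset : blockStartOffsets) {
      // Only the position of the block's first byte matters, not where the block ends.
      if (offset >= start && offset < end) {
        owned.add(offset);
      }
    }
    return owned;
  }

  public static void main(String[] args) {
    long[] blocks = {0L, 100L, 250L, 400L};
    // Two adjacent subranges of the same file: each block is owned by exactly one of them.
    System.out.println(ownedBlocks(blocks, 0L, 200L));
    System.out.println(ownedBlocks(blocks, 200L, 500L));
  }
}
```

In this sketch the block starting at offset 100 belongs to the first range even though it extends past offset 200, so adjacent ranges neither skip nor duplicate a block.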
| Modifier and Type | Method and Description |
|---|---|
| T | getCurrent(): Returns the value of the data item that was read by the last Source.Reader.start() or Source.Reader.advance() call. |
| abstract BlockBasedSource.Block<T> | getCurrentBlock(): Returns the current block (the block that was read by the previous call to readNextBlock()). |
| abstract long | getCurrentBlockOffset(): Returns the largest offset such that starting to read from that offset includes the current block. |
| abstract long | getCurrentBlockSize(): Returns the size of the current block in bytes as it is represented in the underlying file, if possible. |
| protected long | getCurrentOffset(): Returns the starting offset of the current record, which has been read by the last successful Source.Reader.start() or Source.Reader.advance() call. |
| Double | getFractionConsumed(): Returns a value in [0, 1] representing approximately what fraction of the source (BoundedSource.BoundedReader.getCurrentSource()) this reader has read so far. |
| protected boolean | isAtSplitPoint(): Returns true if the reader is at a split point. |
| abstract boolean | readNextBlock(): Read the next block from the input. |
| protected boolean | readNextRecord(): Reads the next record from the channel provided by FileBasedSource.FileBasedReader.startReading(java.nio.channels.ReadableByteChannel). |
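Inherited methods (grouped by declaring type): advance, close, getCurrentSource, start, startReading; splitAtFraction; getCurrentTimestamp; clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait; getCurrentTimestamp

protected BlockBasedReader(BlockBasedSource<T> source)

public abstract boolean readNextBlock()
throws IOException
Read the next block from the input.
Throws: IOException

public abstract BlockBasedSource.Block<T> getCurrentBlock() throws NoSuchElementException
Returns the current block (the block that was read by the previous call to readNextBlock()).
Throws: NoSuchElementException

public abstract long getCurrentBlockSize()
Returns the size of the current block in bytes as it is represented in the underlying file, if possible.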
The size returned by this method must be such that for two successive blocks A and B,
offset(A) + size(A) <= offset(B). If this is not satisfied, the progress reported
by the BlockBasedReader will be non-monotonic and will interfere with the quality
(but not correctness) of dynamic work rebalancing.
This method and BlockBasedSource.Block.getFractionOfBlockConsumed() are used to provide an estimate
of progress within a block (currentBlock.getFractionOfBlockConsumed() *
getCurrentBlockSize()). It is acceptable for the result of this computation to be 0, but
progress estimation will be inaccurate.
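The arithmetic above can be sketched as follows. `BlockProgressSketch` and its methods are hypothetical and not part of the SDK. Adding the within-block estimate to the block offset is an assumption made here to show why a violated size constraint produces non-monotonic progress.

```java
/** Hypothetical helper, not part of the SDK: models the within-block progress estimate. */
final class BlockProgressSketch {
  /** Estimated bytes consumed inside the current block: fraction consumed times block size. */
  static double bytesConsumedInBlock(double fractionOfBlockConsumed, long currentBlockSize) {
    return fractionOfBlockConsumed * currentBlockSize;
  }

  /** Assumed combination: approximate file position is block offset plus the estimate. */
  static double approximatePosition(long blockOffset, double fraction, long blockSize) {
    return blockOffset + bytesConsumedInBlock(fraction, blockSize);
  }

  /** Checks offset(A) + size(A) <= offset(B) for every pair of successive blocks. */
  static boolean sizesAreConsistent(long[] offsets, long[] sizes) {
    for (int i = 0; i + 1 < offsets.length; i++) {
      if (offsets[i] + sizes[i] > offsets[i + 1]) {
        return false; // the position at the end of block i would exceed the start of block i+1
      }
    }
    return true;
  }
}
```

With blocks at offsets 0 and 100 and a reported size of 120 for the first block, the estimated position at the end of the first block (120) is ahead of the start of the second block (100), so reported progress would move backwards. A reported size of 0 keeps the estimate pinned to the block offset, which is monotonic but coarse.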

public abstract long getCurrentBlockOffset()
Returns the largest offset such that starting to read from that offset includes the current block.

public final T getCurrent() throws NoSuchElementException
Description copied from: Source.Reader
Returns the value of the data item that was read by the last Source.Reader.start() or
Source.Reader.advance() call. The returned value must be effectively immutable and remain valid
indefinitely.
Multiple calls to this method without an intervening call to Source.Reader.advance() should
return the same result.
Throws: NoSuchElementException - if the reader is at the beginning of the input and
Source.Reader.start() or Source.Reader.advance() wasn't called, or if the last Source.Reader.start() or
Source.Reader.advance() returned false.

protected boolean isAtSplitPoint()
Returns true if the reader is at a split point. A BlockBasedReader is at a split
point if the current record is the first record in a block. In other words, split points
are block boundaries.
isAtSplitPoint in class FileBasedSource.FileBasedReader<T>

protected final boolean readNextRecord()
throws IOException
Description copied from: FileBasedSource.FileBasedReader
Reads the next record from the channel provided by
FileBasedSource.FileBasedReader.startReading(java.nio.channels.ReadableByteChannel). Methods
Source.Reader.getCurrent(), ByteOffsetBasedSource.ByteOffsetBasedReader.getCurrentOffset(), and FileBasedSource.FileBasedReader.isAtSplitPoint() should return
the corresponding information about the record read by the last invocation of this method.
Note that this method will be called the same way for reading the first record in the source (file or offset range in the file) and for reading subsequent records. It is up to the subclass to do anything special for locating and reading the first record, if necessary.
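How records, blocks, and split points relate can be pictured with a simplified model that does not use the SDK. `BlockReaderSketch` and all of its members are hypothetical: in-memory lists stand in for blocks read from a channel, and the loop is one plausible way to drive record reads from block reads, not the SDK's implementation.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Simplified, SDK-free model of a block-based reader; all names here are hypothetical. */
final class BlockReaderSketch<T> {
  private final Iterator<List<T>> blocks; // advancing this stands in for readNextBlock()
  private Iterator<T> currentBlock = null;
  private T current = null;
  private boolean hasCurrent = false;
  private boolean atSplitPoint = false;

  BlockReaderSketch(List<List<T>> blocks) {
    this.blocks = blocks.iterator();
  }

  /** Same code path for the first record and for all subsequent records. */
  boolean readNextRecord() {
    atSplitPoint = false;
    // Move past exhausted or empty blocks until a record is available.
    while (currentBlock == null || !currentBlock.hasNext()) {
      if (!blocks.hasNext()) {
        hasCurrent = false;
        atSplitPoint = false;
        return false; // end of input
      }
      currentBlock = blocks.next().iterator();
      atSplitPoint = true; // the next record returned is the first record in its block
    }
    current = currentBlock.next();
    hasCurrent = true;
    return true;
  }

  boolean isAtSplitPoint() {
    return atSplitPoint;
  }

  T getCurrent() {
    if (!hasCurrent) {
      throw new NoSuchElementException();
    }
    return current;
  }
}
```

For the blocks [a, b] and [c], this model reports a split point at a and c but not at b, matching the statement that split points are block boundaries.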
readNextRecord in class FileBasedSource.FileBasedReader<T>
Returns: true if a record was successfully read, false if the end of the
channel was reached before successfully reading a new record.
Throws: IOException

public Double getFractionConsumed()
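Description copied from: BoundedSource.BoundedReader
Returns a value in [0, 1] representing approximately what fraction of the source
(BoundedSource.BoundedReader.getCurrentSource()) this reader has read so far.
It is recommended that this method should satisfy the following properties:
- Should return 0 before the Source.Reader.start() call.
- Should return 1 after a Source.Reader.start() or Source.Reader.advance() call that returns false.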
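One way to picture these recommendations is the sketch below. It assumes they mean that the estimate is 0 before start() and 1 once start() or advance() has returned false. `FractionConsumedSketch` is hypothetical and not the SDK implementation; it derives the estimate from the current record offset within a byte range [start, end).

```java
/** Hypothetical model, not the SDK implementation: a fraction-consumed estimate over [start, end). */
final class FractionConsumedSketch {
  private final long start;
  private final long end;
  private boolean started = false;
  private boolean done = false;
  private long currentOffset;

  FractionConsumedSketch(long start, long end) {
    if (end <= start) {
      throw new IllegalArgumentException("end must be greater than start");
    }
    this.start = start;
    this.end = end;
    this.currentOffset = start;
  }

  /** Call after a start() or advance() that returned true, with the record's offset. */
  void onRecord(long offset) {
    started = true;
    currentOffset = offset;
  }

  /** Call after a start() or advance() that returned false. */
  void onEnd() {
    started = true;
    done = true;
  }

  double fractionConsumed() {
    if (!started) {
      return 0.0; // nothing read yet
    }
    if (done) {
      return 1.0; // input exhausted
    }
    double fraction = (double) (currentOffset - start) / (end - start);
    // Clamp: a block that starts inside the range may contain records past its end.
    return Math.max(0.0, Math.min(1.0, fraction));
  }
}
```

The clamp keeps the result in [0, 1] even when the last block of a range extends beyond the range's end offset.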
getFractionConsumed in interface BoundedSource.BoundedReader<T>
getFractionConsumed in class ByteOffsetBasedSource.ByteOffsetBasedReader<T>
Returns: null if such an estimate is not available.

protected long getCurrentOffset()
Description copied from: ByteOffsetBasedSource.ByteOffsetBasedReader
Returns the starting offset of the current record,
which has been read by the last successful Source.Reader.start() or
Source.Reader.advance() call.
If no such call has been made yet, the return value is unspecified.
See RangeTracker for description of offset semantics.
getCurrentOffset in class ByteOffsetBasedSource.ByteOffsetBasedReader<T>