Class Source.Reader<T>
java.lang.Object
  org.apache.beam.sdk.io.Source.Reader<T>
- All Implemented Interfaces: java.lang.AutoCloseable
- Direct Known Subclasses: BoundedSource.BoundedReader, UnboundedSource.UnboundedReader
public abstract static class Source.Reader<T> extends java.lang.Object implements java.lang.AutoCloseable
The interface that readers of custom input sources must implement.

This interface is deliberately distinct from Iterator because the current model tends to be easier to program and more efficient in practice for iterating over sources such as files, databases, etc. (rather than pure collections).

Reading data from the Source.Reader must obey the following access pattern:
- One call to start()
  - If start() returned true, any number of calls to getCurrent* methods
- Repeatedly, a call to advance(). This may be called regardless of what the previous start()/advance() returned.
  - If advance() returned true, any number of calls to getCurrent* methods
For example, if the reader is reading a fixed set of data:
    try {
      for (boolean available = reader.start(); available; available = reader.advance()) {
        T item = reader.getCurrent();
        Instant timestamp = reader.getCurrentTimestamp();
        ...
      }
    } finally {
      reader.close();
    }
If the set of data being read is continually growing:
    try {
      boolean available = reader.start();
      while (true) {
        if (available) {
          T item = reader.getCurrent();
          Instant timestamp = reader.getCurrentTimestamp();
          ...
          resetExponentialBackoff();
        } else {
          exponentialBackoff();
        }
        available = reader.advance();
      }
    } finally {
      reader.close();
    }
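The access pattern can also be viewed from the implementer's side. The following is a minimal sketch of a reader over an in-memory list, not a definitive implementation. SketchReader and ListReader are hypothetical names: SketchReader only mirrors the start()/advance()/getCurrent()/close() method shapes listed on this page (it omits getCurrentSource() and getCurrentTimestamp()) so that the example compiles without the Beam SDK on the classpath. A real reader would extend BoundedSource.BoundedReader or UnboundedSource.UnboundedReader instead.

```java
import java.io.IOException;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the method shapes of Source.Reader<T>; not the Beam class.
abstract class SketchReader<T> implements AutoCloseable {
  public abstract boolean start() throws IOException;

  public abstract boolean advance() throws IOException;

  public abstract T getCurrent() throws NoSuchElementException;

  @Override
  public abstract void close() throws IOException;
}

// Hypothetical reader that yields the elements of a fixed in-memory list.
class ListReader<T> extends SketchReader<T> {
  private final List<T> items;
  private int index = -1; // -1 means start() has not been called yet
  private boolean hasCurrent = false;

  ListReader(List<T> items) {
    this.items = items;
  }

  @Override
  public boolean start() {
    index = 0;
    hasCurrent = index < items.size();
    return hasCurrent;
  }

  @Override
  public boolean advance() {
    if (index < 0) {
      throw new IllegalStateException("advance() called before start()");
    }
    // advance() may be called again after it returned false, so never move past the end.
    if (index < items.size()) {
      index++;
    }
    hasCurrent = index < items.size();
    return hasCurrent;
  }

  @Override
  public T getCurrent() {
    if (!hasCurrent) {
      throw new NoSuchElementException("no current record");
    }
    // Repeated calls without an intervening advance() return the same element.
    return items.get(index);
  }

  @Override
  public void close() {}
}
```

In this sketch, getCurrent() throws NoSuchElementException whenever the last start() or advance() returned false, which is one plausible reading of the throws clause in the method signature.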
Note: this interface is a work-in-progress and may change.
All Reader functions except getCurrentSource() do not need to be thread-safe; they may only be accessed by a single thread at once. However, getCurrentSource() needs to be thread-safe, and other functions should assume that its returned value can change asynchronously.
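One way to meet this requirement is to keep the source description in an immutable object behind a volatile field, and to have the single-threaded methods take one snapshot of it per call. The sketch below illustrates only that idea under stated assumptions: RangeSketch, RangeReaderSketch, and replaceSource() are hypothetical names, not Beam classes or methods, and the sketch does not try to coordinate a replacement with the current reading position.

```java
// Hypothetical immutable description of an input range [start, end); not a Beam class.
final class RangeSketch {
  final long start;
  final long end;

  RangeSketch(long start, long end) {
    this.start = start;
    this.end = end;
  }
}

// Hypothetical reader over a numeric range whose source description may be swapped.
class RangeReaderSketch {
  // volatile: a write by one thread is visible to subsequent reads by other threads.
  private volatile RangeSketch source;
  private long position; // accessed only by the single reading thread

  RangeReaderSketch(RangeSketch source) {
    this.source = source;
    this.position = source.start - 1;
  }

  // Safe to call from any thread: one volatile read of an immutable object.
  public RangeSketch getCurrentSource() {
    return source;
  }

  // Hypothetical hook through which another thread replaces the source description.
  public void replaceSource(RangeSketch newSource) {
    this.source = newSource;
  }

  // Called only by the reading thread; uses a single snapshot of the source per call,
  // because the field may change between two reads.
  public boolean advance() {
    RangeSketch snapshot = source;
    if (position + 1 >= snapshot.end) {
      return false;
    }
    position++;
    return true;
  }

  public long getCurrent() {
    return position;
  }
}
```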
Constructor Summary
Constructor    Description
Reader()
Method Summary
Modifier and Type                Method                 Description
abstract boolean                 advance()              Advances the reader to the next valid record.
abstract void                    close()                Closes the reader.
abstract T                       getCurrent()
abstract Source<T>               getCurrentSource()     Returns a Source describing the same input that this Reader currently reads (including items already read).
abstract org.joda.time.Instant   getCurrentTimestamp()  Returns the timestamp associated with the current data item.
abstract boolean                 start()                Initializes the reader and advances the reader to the first record.
Method Detail
start
public abstract boolean start() throws java.io.IOException
Initializes the reader and advances the reader to the first record.

This method should be called exactly once. The invocation should occur prior to calling advance() or getCurrent(). This method may perform expensive operations that are needed to initialize the reader.
- Returns: true if a record was read, false if there is no more input available.
- Throws: java.io.IOException
advance
public abstract boolean advance() throws java.io.IOException
Advances the reader to the next valid record.

It is an error to call this without having called start() first.
- Returns: true if a record was read, false if there is no more input available.
- Throws: java.io.IOException
getCurrent
public abstract T getCurrent() throws java.util.NoSuchElementException
getCurrentTimestamp
public abstract org.joda.time.Instant getCurrentTimestamp() throws java.util.NoSuchElementException
Returns the timestamp associated with the current data item.

If the source does not support timestamps, this should return BoundedWindow.TIMESTAMP_MIN_VALUE.

Multiple calls to this method without an intervening call to advance() should return the same result.
close
public abstract void close() throws java.io.IOException
Closes the reader. The reader cannot be used after this method is called.
- Specified by: close in interface java.lang.AutoCloseable
- Throws: java.io.IOException
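Because the reader implements java.lang.AutoCloseable, the try/finally idiom from the class description can also be written with try-with-resources, which calls close() automatically when the block exits, whether normally or by exception. The sketch below uses a hypothetical stand-in, CountingReaderSketch, rather than a Beam reader; unlike the real methods, its start(), advance(), and close() do not declare java.io.IOException, which keeps the example short.

```java
// Hypothetical stand-in that yields a fixed number of records; not a Beam class.
class CountingReaderSketch implements AutoCloseable {
  private int remaining;
  private boolean closed = false;

  CountingReaderSketch(int records) {
    this.remaining = records;
  }

  public boolean start() {
    return remaining > 0;
  }

  public boolean advance() {
    if (remaining > 0) {
      remaining--;
    }
    return remaining > 0;
  }

  public boolean isClosed() {
    return closed;
  }

  @Override
  public void close() {
    closed = true;
  }
}

class CloseSketch {
  // Counts the records of a reader; close() runs when the try block exits.
  static int countRecords(CountingReaderSketch reader) {
    int count = 0;
    try (CountingReaderSketch r = reader) {
      for (boolean available = r.start(); available; available = r.advance()) {
        count++;
      }
    }
    return count;
  }
}
```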
getCurrentSource
public abstract Source<T> getCurrentSource()
Returns a Source describing the same input that this Reader currently reads (including items already read).

Usually, an implementation will simply return the immutable Source object from which the current Source.Reader was constructed, or delegate to the base class. However, when using or implementing this method on a BoundedSource.BoundedReader, special considerations apply; see the documentation for BoundedSource.BoundedReader.getCurrentSource().