Class Source.Reader<T>

  • All Implemented Interfaces:
    java.lang.AutoCloseable
    Direct Known Subclasses:
    BoundedSource.BoundedReader, UnboundedSource.UnboundedReader
    Enclosing class:
    Source<T>

    public abstract static class Source.Reader<T>
    extends java.lang.Object
    implements java.lang.AutoCloseable
    The interface that readers of custom input sources must implement.

    This interface is deliberately distinct from Iterator because the current model tends to be easier to program and more efficient in practice for iterating over sources such as files and databases, as opposed to pure in-memory collections.

    Reading data from the Source.Reader must obey the following access pattern:

    • One call to start()
      • If start() returned true, any number of calls to getCurrent* methods
    • Repeatedly, a call to advance(). This may be called regardless of what the previous start()/advance() returned.
      • If advance() returned true, any number of calls to getCurrent* methods

    For example, if the reader is reading a fixed set of data:

       try {
         for (boolean available = reader.start(); available; available = reader.advance()) {
           T item = reader.getCurrent();
           Instant timestamp = reader.getCurrentTimestamp();
           ...
         }
       } finally {
         reader.close();
       }
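
    As a minimal, self-contained sketch of the access pattern above: SketchReader, ListSketchReader and BoundedReadSketch are hypothetical names introduced only for illustration. SketchReader merely mirrors the method signatures documented on this page and is not the real class; timestamps are left out to keep the sketch short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for the Source.Reader methods used below; not the real class. */
abstract class SketchReader<T> implements AutoCloseable {
  public abstract boolean start() throws IOException;
  public abstract boolean advance() throws IOException;
  public abstract T getCurrent() throws NoSuchElementException;
  @Override
  public abstract void close() throws IOException;
}

/** Reads a fixed in-memory list, one element per start()/advance() call. */
class ListSketchReader<T> extends SketchReader<T> {
  private final List<T> items;
  private int index = -1; // -1 until start() has been called

  ListSketchReader(List<T> items) {
    this.items = new ArrayList<>(items);
  }

  @Override
  public boolean start() {
    index = 0;
    return index < items.size();
  }

  @Override
  public boolean advance() {
    if (index < 0) {
      throw new IllegalStateException("advance() called before start()");
    }
    if (index < items.size()) {
      index++;
    }
    return index < items.size();
  }

  @Override
  public T getCurrent() {
    if (index < 0 || index >= items.size()) {
      throw new NoSuchElementException("no current item");
    }
    return items.get(index);
  }

  @Override
  public void close() {
    // Nothing to release for an in-memory list.
  }
}

class BoundedReadSketch {
  /** Drains a reader with the bounded access pattern and always closes it. */
  static <T> List<T> readAll(SketchReader<T> reader) {
    List<T> out = new ArrayList<>();
    try {
      try {
        for (boolean available = reader.start(); available; available = reader.advance()) {
          out.add(reader.getCurrent());
        }
      } finally {
        reader.close();
      }
    } catch (IOException e) {
      // Wrapped only to keep this sketch's signature free of checked exceptions.
      throw new UncheckedIOException(e);
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(readAll(new ListSketchReader<>(Arrays.asList("a", "b", "c"))));
  }
}
```

    The helper keeps the try/finally shape of the example so that close() runs even if start(), advance() or getCurrent() throws.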
     

    If the set of data being read is continually growing:

       try {
         boolean available = reader.start();
         while (true) {
           if (available) {
             T item = reader.getCurrent();
             Instant timestamp = reader.getCurrentTimestamp();
             ...
             resetExponentialBackoff();
           } else {
             exponentialBackoff();
           }
           available = reader.advance();
         }
       } finally {
         reader.close();
       }
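
    The loop above calls exponentialBackoff() and resetExponentialBackoff() without defining them. One possible shape for such helpers is sketched below. The class name BackoffSketch, its method names, and the doubling-with-cap policy are all assumptions made for illustration; this page does not prescribe any of them.

```java
/** Hypothetical helper; the names and the doubling policy are illustrative assumptions. */
class BackoffSketch {
  private final long initialMillis;
  private final long maxMillis;
  private long currentMillis;

  BackoffSketch(long initialMillis, long maxMillis) {
    this.initialMillis = initialMillis;
    this.maxMillis = maxMillis;
    this.currentMillis = initialMillis;
  }

  /** Returns the delay to wait now, then doubles the next delay up to the cap. */
  long nextDelayMillis() {
    long delay = currentMillis;
    currentMillis = Math.min(maxMillis, currentMillis * 2);
    return delay;
  }

  /** Call after a successful read so the next empty poll waits only the initial delay. */
  void reset() {
    currentMillis = initialMillis;
  }

  /** Sleeps for the current delay; plays the role of exponentialBackoff() in the loop. */
  void backoff() throws InterruptedException {
    Thread.sleep(nextDelayMillis());
  }

  public static void main(String[] args) {
    BackoffSketch backoff = new BackoffSketch(100, 1000);
    for (int i = 0; i < 6; i++) {
      System.out.println(backoff.nextDelayMillis());
    }
  }
}
```

    In the loop, backoff() would be called in the else branch and reset() after each item that was read successfully.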
     

    Note: this interface is a work-in-progress and may change.

    No Reader method other than getCurrentSource() needs to be thread-safe; those methods are only ever accessed by a single thread at a time. getCurrentSource(), however, must be thread-safe, and the other methods should assume that the value it returns can change asynchronously.
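
    One way to meet this threading requirement is to publish an immutable description of the input through an atomic reference, as in the sketch below. RangeSketch, RangeReaderSketch and truncateTo() are hypothetical names, not part of this API. The sketch also does not coordinate truncation with the reader's current position, which a real implementation would have to do.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical immutable description of the input: the half-open range [from, to). */
final class RangeSketch {
  final long from;
  final long to;

  RangeSketch(long from, long to) {
    this.from = from;
    this.to = to;
  }
}

class RangeReaderSketch {
  // Shared with other threads, so it is published through an AtomicReference.
  private final AtomicReference<RangeSketch> source;
  // Touched only by the single reading thread, so no synchronization is needed.
  private long position;

  RangeReaderSketch(RangeSketch initial) {
    this.source = new AtomicReference<>(initial);
  }

  /** Safe to call from any thread: returns the latest immutable snapshot. */
  RangeSketch getCurrentSource() {
    return source.get();
  }

  /** Hypothetical operation another thread might use to shrink the remaining input. */
  void truncateTo(long newTo) {
    source.updateAndGet(s -> new RangeSketch(s.from, Math.min(s.to, newTo)));
  }

  boolean start() {
    RangeSketch snapshot = source.get(); // read once for a consistent view
    position = snapshot.from;
    return position < snapshot.to;
  }

  boolean advance() {
    position++;
    // Re-read the source on every call because it may have changed asynchronously.
    return position < source.get().to;
  }

  /** Returns the current position; the NoSuchElementException checks are omitted here. */
  long getCurrent() {
    return position;
  }
}
```

    Because RangeSketch is immutable, a caller of getCurrentSource() always sees a complete snapshot, even while another thread replaces it.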

    • Constructor Summary

      Constructors 
      Constructor Description
      Reader()  
    • Method Summary

      Modifier and Type Method Description
      abstract boolean advance()
      Advances the reader to the next valid record.
      abstract void close()
      Closes the reader.
      abstract T getCurrent()
      Returns the value of the data item that was read by the last start() or advance() call.
      abstract Source<T> getCurrentSource()
      Returns a Source describing the same input that this Reader currently reads (including items already read).
      abstract org.joda.time.Instant getCurrentTimestamp()
      Returns the timestamp associated with the current data item.
      abstract boolean start()
      Initializes the reader and advances the reader to the first record.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • Reader

        public Reader()
    • Method Detail

      • start

        public abstract boolean start()
                               throws java.io.IOException
        Initializes the reader and advances the reader to the first record.

        This method should be called exactly once, before any call to advance() or getCurrent(). It may perform expensive operations that are needed to initialize the reader.

        Returns:
        true if a record was read, false if there is no more input available.
        Throws:
        java.io.IOException
      • advance

        public abstract boolean advance()
                                 throws java.io.IOException
        Advances the reader to the next valid record.

        It is an error to call this without having called start() first.

        Returns:
        true if a record was read, false if there is no more input available.
        Throws:
        java.io.IOException
      • getCurrent

        public abstract T getCurrent()
                              throws java.util.NoSuchElementException
        Returns the value of the data item that was read by the last start() or advance() call. The returned value must be effectively immutable and remain valid indefinitely.

        Multiple calls to this method without an intervening call to advance() should return the same result.

        Throws:
        java.util.NoSuchElementException - if start() was never called, or if the last start() or advance() returned false.
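
        The getCurrent() contract can be illustrated with a small line-based reader that caches the record read by the last start() or advance(). LineReaderSketch is a hypothetical class that only mirrors the method names on this page. It returns String values, which are immutable and therefore remain valid after the reader moves on.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.NoSuchElementException;

/** Hypothetical reader over text lines; not part of the documented API. */
class LineReaderSketch implements AutoCloseable {
  private final BufferedReader in;
  private String current; // null whenever no record is available

  LineReaderSketch(BufferedReader in) {
    this.in = in;
  }

  public boolean start() throws IOException {
    return advance();
  }

  public boolean advance() throws IOException {
    current = in.readLine(); // readLine() returns null at end of input
    return current != null;
  }

  /** Returns the cached record, so repeated calls give the same result. */
  public String getCurrent() {
    if (current == null) {
      throw new NoSuchElementException("no current record");
    }
    return current;
  }

  @Override
  public void close() throws IOException {
    in.close();
  }

  public static void main(String[] args) throws IOException {
    LineReaderSketch reader = new LineReaderSketch(new BufferedReader(new StringReader("x\ny")));
    try {
      for (boolean available = reader.start(); available; available = reader.advance()) {
        System.out.println(reader.getCurrent());
      }
    } finally {
      reader.close();
    }
  }
}
```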
      • getCurrentTimestamp

        public abstract org.joda.time.Instant getCurrentTimestamp()
                                                           throws java.util.NoSuchElementException
        Returns the timestamp associated with the current data item.

        If the source does not support timestamps, this should return BoundedWindow.TIMESTAMP_MIN_VALUE.

        Multiple calls to this method without an intervening call to advance() should return the same result.

        Throws:
        java.util.NoSuchElementException - if the reader is at the beginning of the input and neither start() nor advance() has been called, or if the last start() or advance() returned false.
      • close

        public abstract void close()
                            throws java.io.IOException
        Closes the reader. The reader cannot be used after this method is called.
        Specified by:
        close in interface java.lang.AutoCloseable
        Throws:
        java.io.IOException
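
        Because the reader implements java.lang.AutoCloseable, a try-with-resources statement can replace the explicit try/finally used in the class-level examples. The sketch below uses a hypothetical CountdownReaderSketch that mirrors the method names on this page; it is not the real class.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.NoSuchElementException;

/** Hypothetical reader that yields count, count - 1, ..., 1. */
class CountdownReaderSketch implements AutoCloseable {
  private int next;
  private int current;
  private boolean hasCurrent = false;
  boolean closed = false; // exposed so the sketch can show that close() ran

  CountdownReaderSketch(int count) {
    this.next = count;
  }

  public boolean start() throws IOException {
    return advance();
  }

  public boolean advance() throws IOException {
    if (next <= 0) {
      hasCurrent = false;
      return false;
    }
    current = next--;
    hasCurrent = true;
    return true;
  }

  public int getCurrent() {
    if (!hasCurrent) {
      throw new NoSuchElementException("no current item");
    }
    return current;
  }

  @Override
  public void close() throws IOException {
    closed = true;
  }
}

class TryWithResourcesSketch {
  /** Sums all items; the reader is closed automatically when the try block exits. */
  static int sum(CountdownReaderSketch reader) {
    int total = 0;
    try (CountdownReaderSketch r = reader) {
      for (boolean available = r.start(); available; available = r.advance()) {
        total += r.getCurrent();
      }
    } catch (IOException e) {
      // Wrapped only to keep this sketch's signature free of checked exceptions.
      throw new UncheckedIOException(e);
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(sum(new CountdownReaderSketch(3)));
  }
}
```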
      • getCurrentSource

        public abstract Source<T> getCurrentSource()
        Returns a Source describing the same input that this Reader currently reads (including items already read).

        Usually, an implementation will simply return the immutable Source object from which the current Source.Reader was constructed, or delegate to the base class. However, special considerations apply when using or implementing this method on a BoundedSource.BoundedReader; see the documentation for BoundedSource.BoundedReader.getCurrentSource().