Class Iterations


  • @Experimental
    public class Iterations
    extends Object
    A helper class to create iterations. To construct an iteration, users are required to provide:
    • initVariableStreams: the initial values of the variable data streams, which are updated in each round.
    • dataStreams: the other data streams used inside the iteration, which are not updated.
    • iterationBody: specifies the subgraph that updates the variable streams and produces the outputs.

    The iteration body is invoked with two parameters. The first parameter is a list of input variable streams, each created as the union of an initial variable stream and the corresponding feedback variable stream (returned by the iteration body). The second parameter is the list of data streams given to this method.

    During the execution of the iteration body, each record involved in the iteration has an epoch attached, which marks the progress of the iteration. The epoch is computed as follows:

    • All records in the initial variable streams and initial data streams have epoch = 0.
    • For any record emitted by this operator into a non-feedback stream, the epoch of this emitted record = the epoch of the input record that triggers this emission. If this record is emitted by onEpochWatermarkIncremented(), then the epoch of this record = epochWatermark.
    • For any record emitted by this operator into a feedback variable stream, the epoch of the emitted record = the epoch of the input record that triggers this emission + 1.
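    The three epoch rules above can be illustrated with a small plain-Java model. This is only a sketch of the arithmetic, not Flink ML code: the class EpochRulesSketch, the Tagged type, and the helper method names are hypothetical and exist only for this illustration.

```java
/** Plain-Java model of the epoch rules; hypothetical names, not part of the library. */
public class EpochRulesSketch {

    /** A value together with the epoch attached to it. */
    public static final class Tagged {
        public final int value;
        public final int epoch;

        public Tagged(int value, int epoch) {
            this.value = value;
            this.epoch = epoch;
        }

        @Override
        public String toString() {
            return "Tagged(value=" + value + ", epoch=" + epoch + ")";
        }
    }

    /** Records in the initial variable and data streams start at epoch 0. */
    public static Tagged initial(int value) {
        return new Tagged(value, 0);
    }

    /** Emission into a non-feedback stream keeps the epoch of the triggering input. */
    public static Tagged emitToOutput(Tagged input, int newValue) {
        return new Tagged(newValue, input.epoch);
    }

    /** Emission from the epoch-watermark callback carries epoch = epochWatermark. */
    public static Tagged emitOnWatermark(int epochWatermark, int newValue) {
        return new Tagged(newValue, epochWatermark);
    }

    /** Emission into a feedback variable stream adds one to the triggering input's epoch. */
    public static Tagged emitToFeedback(Tagged input, int newValue) {
        return new Tagged(newValue, input.epoch + 1);
    }

    public static void main(String[] args) {
        Tagged current = initial(10);
        for (int round = 0; round < 3; round++) {
            // The output record shares the epoch of the record that triggered it.
            System.out.println(emitToOutput(current, current.value * 2));
            // The fed-back record belongs to the next epoch.
            current = emitToFeedback(current, current.value + 1);
        }
    }
}
```

    In this model a record that has travelled through the feedback edge n times carries epoch n, which is why the epoch can serve as a round counter.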

    At the end of each epoch, the framework notifies the operators and UDFs that implement IterationListener.
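    A common use of an end-of-epoch notification is to aggregate the records of one epoch and emit the result once the epoch is complete. The following plain-Java sketch models that pattern only. The EpochEndListener interface, the SumPerEpoch operator, and the run driver are hypothetical stand-ins, and the real IterationListener signature should be taken from its own documentation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of per-epoch notification; hypothetical names, not the library's API. */
public class EpochNotificationSketch {

    /** Hypothetical callback standing in for the role an iteration listener plays. */
    public interface EpochEndListener {
        void onEpochEnd(int epoch, List<String> out);
    }

    /** Sums the values seen during an epoch and reports the sum when the epoch ends. */
    public static class SumPerEpoch implements EpochEndListener {
        private long sum = 0;

        public void process(int value) {
            sum += value;
        }

        @Override
        public void onEpochEnd(int epoch, List<String> out) {
            out.add("epoch " + epoch + " sum " + sum);
            sum = 0; // start the next epoch from a clean state
        }
    }

    /** Toy driver: delivers the records of each epoch in order, then notifies the operator. */
    public static List<String> run(Map<Integer, List<Integer>> recordsByEpoch) {
        SumPerEpoch operator = new SumPerEpoch();
        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, List<Integer>> e : new TreeMap<>(recordsByEpoch).entrySet()) {
            for (int value : e.getValue()) {
                operator.process(value);
            }
            operator.onEpochEnd(e.getKey(), out);
        }
        return out;
    }
}
```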

    The limitations on constructing the subgraph inside the iteration body are described in IterationBody.

    An example of an iteration:

    
     DataStreamList result = Iterations.iterateUnboundedStreams(
         DataStreamList.of(first, second),
         DataStreamList.of(third),
         (variableStreams, dataStreams) -> {
             ...
             return new IterationBodyResult(
                 DataStreamList.of(firstFeedback, secondFeedback),
                 DataStreamList.of(output));
         });
     result.<Integer>get(0).addSink(...);
     
    • Constructor Detail

      • Iterations

        public Iterations()
    • Method Detail

      • iterateUnboundedStreams

        public static DataStreamList iterateUnboundedStreams(DataStreamList initVariableStreams,
                                                             DataStreamList dataStreams,
                                                             IterationBody body)
        This method uses an iteration body to process records in possibly unbounded data streams. The iteration does not terminate if at least one of its inputs is unbounded. Otherwise, it terminates after all inputs have terminated and no more records are iterating.
        Parameters:
        initVariableStreams - The initial variable streams, which are merged with the feedback variable streams before being used as the first parameter to invoke the iteration body.
        dataStreams - The non-variable streams that are also referenced in the body.
        body - The computation logic, which takes variable/data streams and returns feedback/output streams.
        Returns:
        The list of output streams returned by the iteration body.
      • iterateBoundedStreamsUntilTermination

        public static DataStreamList iterateBoundedStreamsUntilTermination(DataStreamList initVariableStreams,
                                                                           ReplayableDataStreamList dataStreams,
                                                                           IterationConfig config,
                                                                           IterationBody body)
        This method uses an iteration body to process records in bounded data streams iteratively, until no more records are iterating or the given termination criteria stream is empty in one round.
        Parameters:
        initVariableStreams - The initial variable streams, which are merged with the feedback variable streams before being used as the first parameter to invoke the iteration body.
        dataStreams - The non-variable streams that are also referenced in the body, together with whether each of them needs to be replayed in each round.
        config - The config for the iteration, such as whether to re-create the operators in each round.
        body - The computation logic, which takes variable/data streams and returns feedback/output streams.
        Returns:
        The list of output streams returned by the iteration body.
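
        The two stopping conditions described for iterateBoundedStreamsUntilTermination can be modelled with ordinary lists. The sketch below is a plain-Java approximation under stated assumptions, not the framework's runtime: the class BoundedTerminationSketch, the halving feedback rule, and the threshold-based criteria are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** List-based model of "iterate until termination"; hypothetical names and rules. */
public class BoundedTerminationSketch {

    /** Toy iteration body: halves each value and feeds it back only while it is above 1. */
    public static List<Integer> feedback(List<Integer> variables) {
        List<Integer> next = new ArrayList<>();
        for (int v : variables) {
            if (v > 1) {
                next.add(v / 2);
            }
        }
        return next;
    }

    /** Toy termination criteria: one record per fed-back value that still exceeds the threshold. */
    public static List<Integer> criteria(List<Integer> fedBack, int threshold) {
        List<Integer> remaining = new ArrayList<>();
        for (int v : fedBack) {
            if (v > threshold) {
                remaining.add(v);
            }
        }
        return remaining;
    }

    /** Runs rounds until nothing is iterating or the criteria are empty in a round. */
    public static int runRounds(List<Integer> initial, int threshold) {
        List<Integer> variables = initial;
        int rounds = 0;
        while (!variables.isEmpty()) {          // stop: no more records are iterating
            rounds++;
            List<Integer> fedBack = feedback(variables);
            if (criteria(fedBack, threshold).isEmpty()) {
                break;                          // stop: criteria empty in this round
            }
            variables = fedBack;
        }
        return rounds;
    }
}
```

        With the initial values 8 and 3, a threshold of 2 stops this model after two rounds because the criteria list becomes empty, while a threshold of 0 lets it run four rounds until no value is fed back.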