Class OperatorSubtaskState

  • All Implemented Interfaces:
    Serializable, CompositeStateHandle, StateObject
    Direct Known Subclasses:
    FinishedOperatorSubtaskState

    public class OperatorSubtaskState
    extends Object
    implements CompositeStateHandle
    This class encapsulates the state for one parallel instance of an operator. The complete state of a (logical) operator (e.g. a flatmap operator) is the union of the OperatorSubtaskStates from all parallel tasks that execute physical instances of the operator.

    The full state of the logical operator is represented by OperatorState which consists of OperatorSubtaskStates.

    Typically, we expect all collections in this class to be of size 0 or 1, because there is up to one state handle produced per state type (e.g. managed-keyed, raw-operator, ...). In particular, this holds when taking a snapshot. The purpose of having the state handles in collections is that this class is also reused in restoring state. Under normal circumstances, the expected size of each collection is still 0 or 1, except for scale-down. In scale-down, one operator subtask can become responsible for the state of multiple previous subtasks. The collections can then store all the state handles that are relevant to build up the new subtask state.
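    The scale-down case described above can be illustrated with a minimal, self-contained sketch. It does not use Flink's classes: Handle, assign, and the contiguous-range assignment rule are hypothetical stand-ins chosen for illustration, and Flink's actual redistribution logic depends on the state type.

```java
import java.util.ArrayList;
import java.util.List;

final class ScaleDownSketch {

    /** Hypothetical stand-in for a state handle; not a Flink type. */
    static final class Handle {
        final int oldSubtaskIndex;
        final long sizeInBytes;

        Handle(int oldSubtaskIndex, long sizeInBytes) {
            this.oldSubtaskIndex = oldSubtaskIndex;
            this.sizeInBytes = sizeInBytes;
        }
    }

    /**
     * Groups the handles of the old subtasks (one handle each) into one
     * collection per new subtask. Assumes a simple contiguous-range
     * assignment, which is an illustrative choice only.
     */
    static List<List<Handle>> assign(List<Handle> oldHandles, int newParallelism) {
        int oldParallelism = oldHandles.size();
        List<List<Handle>> perNewSubtask = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            perNewSubtask.add(new ArrayList<>());
        }
        for (Handle handle : oldHandles) {
            // Integer division maps a contiguous block of old subtasks to each new subtask.
            int target = handle.oldSubtaskIndex * newParallelism / oldParallelism;
            perNewSubtask.get(target).add(handle);
        }
        return perNewSubtask;
    }
}
```

    With unchanged parallelism every collection in this sketch holds exactly one handle. When four old subtasks are reduced to two, each new subtask receives two handles, which is why the fields are collections rather than single references.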

    See Also:
    Serialized Form
    • Method Detail

      • discardState

        public void discardState()
        Description copied from interface: StateObject
        Discards the state referred to and solely owned by this handle, to free up resources in the persistent storage. This method is called when the state represented by this object will no longer be used.
        Specified by:
        discardState in interface StateObject
      • registerSharedStates

        public void registerSharedStates​(SharedStateRegistry sharedStateRegistry,
                                         long checkpointID)
        Description copied from interface: CompositeStateHandle
        Register both newly created and already referenced shared states in the given SharedStateRegistry. This method is called when the checkpoint successfully completes or is recovered from failures.

        After this is completed, newly created shared state is considered published and is no longer owned by this handle. This means that it should no longer be deleted as part of calls to StateObject.discardState(). Instead, StateObject.discardState() will trigger an unregistration from the registry.

        Specified by:
        registerSharedStates in interface CompositeStateHandle
        Parameters:
        sharedStateRegistry - The registry where shared states are registered.
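        The ownership change described for registerSharedStates can be sketched with a simplified model. The Registry and Handle classes below are hypothetical and only mimic the documented contract (direct deletion before registration, unregistration afterwards); they are not Flink's SharedStateRegistry or its handle implementations, and the reference counting is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class SharedStateOwnershipSketch {

    /** Hypothetical reference-counting registry; not Flink's SharedStateRegistry. */
    static final class Registry {
        private final Map<String, Integer> refCounts = new HashMap<>();

        void register(String key) {
            refCounts.merge(key, 1, Integer::sum);
        }

        /** Drops one reference; returns true if no references remain. */
        boolean unregister(String key) {
            int remaining = refCounts.getOrDefault(key, 0) - 1;
            if (remaining <= 0) {
                refCounts.remove(key);
                return true;
            }
            refCounts.put(key, remaining);
            return false;
        }

        int references(String key) {
            return refCounts.getOrDefault(key, 0);
        }
    }

    /** Hypothetical handle that refers to one piece of shared state. */
    static final class Handle {
        private final String key;
        private Registry registry; // null until the state has been published
        boolean deletedDirectly;

        Handle(String key) {
            this.key = key;
        }

        void registerSharedStates(Registry sharedStateRegistry) {
            sharedStateRegistry.register(key);
            this.registry = sharedStateRegistry;
        }

        void discardState() {
            if (registry == null) {
                // Still solely owned by this handle: delete the data directly.
                deletedDirectly = true;
            } else {
                // Published: only give up this handle's reference.
                registry.unregister(key);
            }
        }
    }
}
```

        In this model, discarding one of two registered handles for the same key leaves the shared data referenced by the other, while a handle that was never registered deletes its own data.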
      • getCheckpointedSize

        public long getCheckpointedSize()
        Description copied from interface: CompositeStateHandle
        Returns the size, in bytes, of the data persisted during checkpoint execution. If incremental checkpointing is enabled, this value represents the incrementally persisted data size and is usually smaller than StateObject.getStateSize(). If the size is unknown, this method returns the same result as StateObject.getStateSize().
        Specified by:
        getCheckpointedSize in interface CompositeStateHandle
        Returns:
        The persisted data size during checkpoint execution in bytes.
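        The fallback rule above can be written out as a small sketch. The UNKNOWN sentinel and the checkpointedSize helper are hypothetical names introduced here; how an implementation actually tracks an unknown size is not stated in this documentation.

```java
final class CheckpointedSizeSketch {

    /** Hypothetical sentinel meaning "persisted size was not recorded". */
    static final long UNKNOWN = -1L;

    /**
     * Returns the incrementally persisted size if it is known,
     * otherwise falls back to the full state size.
     */
    static long checkpointedSize(long stateSize, long persistedThisCheckpoint) {
        return persistedThisCheckpoint == UNKNOWN ? stateSize : persistedThisCheckpoint;
    }
}
```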
      • getStateSize

        public long getStateSize()
        Description copied from interface: StateObject
        Returns the size of the state in bytes. If the size is not known, this method should return 0.

        The values produced by this method are only used for informational purposes and for metrics/monitoring. If this method returns wrong values, the checkpoints and recovery will still behave correctly. However, efficiency may be impacted (wrong space pre-allocation) and functionality that depends on metrics (like monitoring) will be impacted.

        Note for implementors: This method should not perform any I/O operations while obtaining the state size (hence it does not declare throwing an IOException). Instead, the state size should be stored in the state object, or should be computable from the state stored in this object. The reason is that this method is called frequently by several parts of the checkpointing mechanism, and issuing I/O requests from it would accumulate a heavy I/O load on the storage system at higher scale.

        Specified by:
        getStateSize in interface StateObject
        Returns:
        Size of the state in bytes.
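        The note for implementors can be illustrated with a minimal sketch in which the size is recorded when the handle is created and a composite simply sums its parts. The Sized interface and both classes are hypothetical and stand in for StateObject implementations; they are not Flink types.

```java
import java.util.ArrayList;
import java.util.List;

final class StateSizeSketch {

    /** Hypothetical stand-in for the size-reporting part of StateObject. */
    interface Sized {
        long getStateSize();
    }

    /** Leaf handle: the size is captured at creation time, so no I/O is needed later. */
    static final class FileHandle implements Sized {
        private final String path;
        private final long sizeInBytes;

        FileHandle(String path, long sizeInBytes) {
            this.path = path;
            this.sizeInBytes = sizeInBytes;
        }

        @Override
        public long getStateSize() {
            return sizeInBytes; // stored value; the file at 'path' is never touched
        }
    }

    /** Composite handle: the size is computed from the sizes stored in its parts. */
    static final class Composite implements Sized {
        private final List<Sized> parts;

        Composite(List<? extends Sized> parts) {
            this.parts = new ArrayList<>(parts);
        }

        @Override
        public long getStateSize() {
            long total = 0L;
            for (Sized part : parts) {
                total += part.getStateSize();
            }
            return total;
        }
    }
}
```

        Because every call only reads fields held in memory, the method stays cheap even when it is invoked repeatedly for metrics.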
      • isFinished

        public boolean isFinished()
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object
      • hasState

        public boolean hasState()