Class QueryChangeStreamAction


  • public class QueryChangeStreamAction
    extends java.lang.Object
    Main action class for querying a partition change stream. This class performs the change stream query and, depending on the type of each record received, dispatches its processing to one of the following: ChildPartitionsRecordAction, HeartbeatRecordAction or DataChangeRecordAction.

    This class also mirrors the current watermark (the latest event timestamp processed) in the Connector's metadata tables, by registering a bundle finalizer callback that runs after the bundle is committed.

    When the change stream query for the partition finishes, this class updates the state of the partition in the metadata tables to FINISHED, indicating completion.

    • Method Summary

      Modifier and Type Method Description
      org.apache.beam.sdk.transforms.DoFn.ProcessContinuation run(PartitionMetadata partition, org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<TimestampRange,com.google.cloud.Timestamp> tracker, org.apache.beam.sdk.transforms.DoFn.OutputReceiver<DataChangeRecord> receiver, org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator, org.apache.beam.sdk.transforms.DoFn.BundleFinalizer bundleFinalizer)
      Dispatches a change stream query for the given partition, delegates the processing of each record to the corresponding registered action class, and keeps the state of the partition up to date in the Connector's metadata table.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • run

        public org.apache.beam.sdk.transforms.DoFn.ProcessContinuation run(PartitionMetadata partition,
                                                                           org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<TimestampRange,com.google.cloud.Timestamp> tracker,
                                                                           org.apache.beam.sdk.transforms.DoFn.OutputReceiver<DataChangeRecord> receiver,
                                                                           org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator,
                                                                           org.apache.beam.sdk.transforms.DoFn.BundleFinalizer bundleFinalizer)
        This method dispatches a change stream query for the given partition, delegates the processing of each record to the corresponding registered action class, and keeps the state of the partition up to date in the Connector's metadata table.

        The algorithm is as follows:

        1. A change stream query for the partition is performed.
        2. For each record, we check the type of the record and dispatch the processing to one of the actions registered.
        3. If an Optional with a DoFn.ProcessContinuation.stop() is returned from the actions, we stop processing and return.
        4. Before returning we register a bundle finalizer callback to update the watermark of the partition in the metadata tables to the latest processed timestamp.
        5. When a change stream query finishes successfully (no more records) we update the partition state to FINISHED.
        A split at the exact end timestamp of a partition's change stream query can leave this function with a residual that has an invalid timestamp. In this case, the error is ignored and no work is done for the residual.
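
        The five steps above can be sketched without any Beam or Spanner dependency. The following is a simplified model under stated assumptions, not the connector's implementation: QueryLoopSketch, ChangeRecord, Tracker, Kind, Continuation and all field names are hypothetical stand-ins for the real types, timestamps are plain long values, and the residual case with an invalid timestamp is not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Dependency-free model of the documented steps. Every name below is hypothetical
// and only stands in for the Beam and connector types named in the text.
final class QueryLoopSketch {
  enum Continuation { STOP } // stands in for DoFn.ProcessContinuation.stop()
  enum Kind { DATA_CHANGE, HEARTBEAT, CHILD_PARTITIONS }

  static final class ChangeRecord {
    final Kind kind;
    final long timestamp;
    ChangeRecord(Kind kind, long timestamp) {
      this.kind = kind;
      this.timestamp = timestamp;
    }
  }

  // Stands in for the restriction tracker: a claim succeeds only before the end timestamp.
  static final class Tracker {
    private final long endExclusive;
    Tracker(long endExclusive) { this.endExclusive = endExclusive; }
    boolean tryClaim(long timestamp) { return timestamp < endExclusive; }
  }

  final List<ChangeRecord> emitted = new ArrayList<>(); // stands in for the output receiver
  final List<Runnable> finalizers = new ArrayList<>();  // stands in for the bundle finalizer
  long storedWatermark = -1L;                           // stands in for the metadata table watermark
  String partitionState = "RUNNING";                    // stands in for the metadata table state

  Continuation run(List<ChangeRecord> queryResults, Tracker tracker) {
    long lastProcessed = -1L;
    boolean stopped = false;
    for (ChangeRecord record : queryResults) {          // step 1: iterate the query results
      Optional<Continuation> maybeStop = dispatch(record, tracker); // step 2: dispatch by type
      if (maybeStop.isPresent()) {                      // step 3: an action asked to stop
        stopped = true;
        break;
      }
      lastProcessed = record.timestamp;
    }
    if (lastProcessed >= 0) {                           // step 4: defer the watermark update
      final long watermark = lastProcessed;
      finalizers.add(() -> storedWatermark = watermark);
    }
    if (!stopped) {
      partitionState = "FINISHED";                      // step 5: no more records
    }
    return Continuation.STOP;
  }

  // Assumption: each action first claims the record timestamp and stops if the claim fails.
  private Optional<Continuation> dispatch(ChangeRecord record, Tracker tracker) {
    if (!tracker.tryClaim(record.timestamp)) {
      return Optional.of(Continuation.STOP);
    }
    switch (record.kind) {
      case DATA_CHANGE:
        emitted.add(record); // only data change records reach the output
        break;
      case HEARTBEAT:
        break;               // only advances the processed timestamp
      case CHILD_PARTITIONS:
        break;               // the real action would record the child partitions
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    QueryLoopSketch sketch = new QueryLoopSketch();
    sketch.run(
        List.of(new ChangeRecord(Kind.DATA_CHANGE, 10L), new ChangeRecord(Kind.HEARTBEAT, 20L)),
        new Tracker(100L));
    sketch.finalizers.forEach(Runnable::run); // simulate the runner committing the bundle
    System.out.println(sketch.partitionState + " watermark=" + sketch.storedWatermark);
  }
}
```

        In this sketch the stored watermark changes only when the registered callbacks run, which mirrors the idea of updating the metadata tables after the bundle is committed. The method returns the stop signal both when a claim fails and when the query is exhausted, matching the documented return value; only the partition state distinguishes the two outcomes.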
        Parameters:
        partition - the current partition being processed
        tracker - the restriction tracker of the ReadChangeStreamPartitionDoFn SDF
        receiver - the output receiver of the ReadChangeStreamPartitionDoFn SDF
        watermarkEstimator - the watermark estimator of the ReadChangeStreamPartitionDoFn SDF
        bundleFinalizer - the bundle finalizer for ReadChangeStreamPartitionDoFn SDF bundles
        Returns:
        a DoFn.ProcessContinuation.stop() if a record timestamp could not be claimed or if the partition processing has finished