Class DetectNewPartitionsAction
- java.lang.Object
-
- org.apache.beam.sdk.io.gcp.spanner.changestreams.action.DetectNewPartitionsAction
-
public class DetectNewPartitionsAction extends java.lang.Object
This class is responsible for scheduling partitions. It obtains the partitions to be scheduled from the partition metadata table. The full algorithm is described in run(RestrictionTracker, OutputReceiver, ManualWatermarkEstimator).
Constructor Summary
Constructor: DetectNewPartitionsAction(PartitionMetadataDao dao, PartitionMetadataMapper mapper, ChangeStreamMetrics metrics, org.joda.time.Duration resumeDuration)
Description: Constructs an action class for detecting / scheduling new partitions.
-
Method Summary
Modifier and Type: org.apache.beam.sdk.transforms.DoFn.ProcessContinuation
Method: run(org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<TimestampRange,com.google.cloud.Timestamp> tracker, org.apache.beam.sdk.transforms.DoFn.OutputReceiver<PartitionMetadata> receiver, org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator)
Description: Executes the main logic to schedule new partitions.
Constructor Detail
-
DetectNewPartitionsAction
public DetectNewPartitionsAction(PartitionMetadataDao dao, PartitionMetadataMapper mapper, ChangeStreamMetrics metrics, org.joda.time.Duration resumeDuration)
Constructs an action class for detecting / scheduling new partitions.
Method Detail
-
run
public org.apache.beam.sdk.transforms.DoFn.ProcessContinuation run(org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<TimestampRange,com.google.cloud.Timestamp> tracker, org.apache.beam.sdk.transforms.DoFn.OutputReceiver<PartitionMetadata> receiver, org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator)
Executes the main logic to schedule new partitions. It follows this procedure periodically:
- Fetches the minimum watermark from all the unfinished partitions in the metadata tables.
- If there are no unfinished partitions, this function will stop and not be re-scheduled.
- Updates the component's watermark to the minimum fetched.
- Fetches the read timestamp from the restriction.
- Fetches all the partitions with a createdAt timestamp > read timestamp.
- Groups the partitions by createdAt timestamp.
- Processes the groups in ascending order of createdAt timestamp (oldest first).
- For each group, updates the state to PartitionMetadata.State.SCHEDULED.
- Tries to claim the createdAt timestamp of the group within the restriction.
- If it is possible to claim the timestamp, outputs each partition to the next stage. It then proceeds to process the next group. When there are no more groups to process, schedules the function to resume after the configured resume duration.
- If it is not possible to claim the timestamp, stops.
- Parameters:
tracker - an instance of DetectNewPartitionsRangeTracker
receiver - a PartitionMetadata DoFn.OutputReceiver
watermarkEstimator - a ManualWatermarkEstimator of Instant
- Returns:
- DoFn.ProcessContinuation.stop() if there are no more partitions to process, or DoFn.ProcessContinuation.resume() to re-schedule the function after the configured interval.
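The grouping and claiming steps of the procedure above can be illustrated with a minimal, dependency-free sketch. It is not the Beam implementation: the Partition and RangeTracker classes are hypothetical stand-ins for PartitionMetadata and DetectNewPartitionsRangeTracker, timestamps are plain long values, and the watermark update and the check for unfinished partitions are left out. The boolean result stands in for resume (true) versus stop (false).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DetectNewPartitionsSketch {

  /** Hypothetical stand-in for PartitionMetadata: a token, a creation time, a state. */
  public static final class Partition {
    public final String token;
    public final long createdAt;
    public String state = "CREATED";

    public Partition(String token, long createdAt) {
      this.token = token;
      this.createdAt = createdAt;
    }
  }

  /** Hypothetical stand-in for the tracker: a claim succeeds only below the end bound. */
  public static final class RangeTracker {
    private long position;
    private final long to;

    public RangeTracker(long from, long to) {
      this.position = from;
      this.to = to;
    }

    public long readTimestamp() {
      return position;
    }

    public boolean tryClaim(long timestamp) {
      if (timestamp >= to) {
        return false;
      }
      position = timestamp;
      return true;
    }
  }

  /** Returns true for "resume later" and false for "stop". */
  public static boolean run(List<Partition> table, RangeTracker tracker, List<String> output) {
    long readTimestamp = tracker.readTimestamp();

    // Group partitions created after the read timestamp.
    // A TreeMap iterates its keys in ascending order, so the oldest group comes first.
    TreeMap<Long, List<Partition>> groups = new TreeMap<>();
    for (Partition p : table) {
      if (p.createdAt > readTimestamp) {
        groups.computeIfAbsent(p.createdAt, k -> new ArrayList<>()).add(p);
      }
    }

    for (Map.Entry<Long, List<Partition>> group : groups.entrySet()) {
      // Mark the group as scheduled before trying to claim its timestamp.
      for (Partition p : group.getValue()) {
        p.state = "SCHEDULED";
      }
      if (!tracker.tryClaim(group.getKey())) {
        return false; // claim failed: stop without emitting this group
      }
      for (Partition p : group.getValue()) {
        output.add(p.token);
      }
    }
    return true; // nothing left to process: resume after the configured duration
  }
}
```

In this sketch, the state update happens before the claim, mirroring the order of the documented steps, so a group whose claim fails is already marked as scheduled even though it is not emitted.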