@DoFn.UnboundedPerElement public class ReadChangeStreamPartitionDoFn extends org.apache.beam.sdk.transforms.DoFn<PartitionMetadata,DataChangeRecord> implements java.io.Serializable
The processing of a partition is delegated to the QueryChangeStreamAction.
org.apache.beam.sdk.transforms.DoFn.AlwaysFetched, org.apache.beam.sdk.transforms.DoFn.BoundedPerElement, org.apache.beam.sdk.transforms.DoFn.BundleFinalizer, org.apache.beam.sdk.transforms.DoFn.Element, org.apache.beam.sdk.transforms.DoFn.FieldAccess, org.apache.beam.sdk.transforms.DoFn.FinishBundle, org.apache.beam.sdk.transforms.DoFn.FinishBundleContext, org.apache.beam.sdk.transforms.DoFn.GetInitialRestriction, org.apache.beam.sdk.transforms.DoFn.GetInitialWatermarkEstimatorState, org.apache.beam.sdk.transforms.DoFn.GetRestrictionCoder, org.apache.beam.sdk.transforms.DoFn.GetSize, org.apache.beam.sdk.transforms.DoFn.GetWatermarkEstimatorStateCoder, org.apache.beam.sdk.transforms.DoFn.Key, org.apache.beam.sdk.transforms.DoFn.MultiOutputReceiver, org.apache.beam.sdk.transforms.DoFn.NewTracker, org.apache.beam.sdk.transforms.DoFn.NewWatermarkEstimator, org.apache.beam.sdk.transforms.DoFn.OnTimer, org.apache.beam.sdk.transforms.DoFn.OnTimerContext, org.apache.beam.sdk.transforms.DoFn.OnTimerFamily, org.apache.beam.sdk.transforms.DoFn.OnWindowExpiration, org.apache.beam.sdk.transforms.DoFn.OnWindowExpirationContext, org.apache.beam.sdk.transforms.DoFn.OutputReceiver<T>, org.apache.beam.sdk.transforms.DoFn.ProcessContext, org.apache.beam.sdk.transforms.DoFn.ProcessContinuation, org.apache.beam.sdk.transforms.DoFn.ProcessElement, org.apache.beam.sdk.transforms.DoFn.RequiresStableInput, org.apache.beam.sdk.transforms.DoFn.RequiresTimeSortedInput, org.apache.beam.sdk.transforms.DoFn.Restriction, org.apache.beam.sdk.transforms.DoFn.Setup, org.apache.beam.sdk.transforms.DoFn.SideInput, org.apache.beam.sdk.transforms.DoFn.SplitRestriction, org.apache.beam.sdk.transforms.DoFn.StartBundle, org.apache.beam.sdk.transforms.DoFn.StartBundleContext, org.apache.beam.sdk.transforms.DoFn.StateId, org.apache.beam.sdk.transforms.DoFn.Teardown, org.apache.beam.sdk.transforms.DoFn.TimerFamily, org.apache.beam.sdk.transforms.DoFn.TimerId, org.apache.beam.sdk.transforms.DoFn.Timestamp, 
org.apache.beam.sdk.transforms.DoFn.TruncateRestriction, org.apache.beam.sdk.transforms.DoFn.UnboundedPerElement, org.apache.beam.sdk.transforms.DoFn.WatermarkEstimatorState, org.apache.beam.sdk.transforms.DoFn.WindowedContext
Constructor and Description |
---|
ReadChangeStreamPartitionDoFn(DaoFactory daoFactory, MapperFactory mapperFactory, ActionFactory actionFactory, ChangeStreamMetrics metrics) This class needs a DaoFactory to build DAOs to access the partition metadata tables and to perform the change streams query. |
Modifier and Type | Method and Description |
---|---|
org.joda.time.Instant | getInitialWatermarkEstimatorState(PartitionMetadata partition) |
double | getSize(PartitionMetadata partition, TimestampRange range) |
TimestampRange | initialRestriction(PartitionMetadata partition) The restriction for a partition will be defined from the start and end timestamp to query the partition for. |
ReadChangeStreamPartitionRangeTracker | newTracker(PartitionMetadata partition, TimestampRange range) |
org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> | newWatermarkEstimator(org.joda.time.Instant watermarkEstimatorState) |
org.apache.beam.sdk.transforms.DoFn.ProcessContinuation | processElement(PartitionMetadata partition, org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<TimestampRange,com.google.cloud.Timestamp> tracker, org.apache.beam.sdk.transforms.DoFn.OutputReceiver<DataChangeRecord> receiver, org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator, org.apache.beam.sdk.transforms.DoFn.BundleFinalizer bundleFinalizer) Performs a change stream query for a given partition. |
void | setThroughputEstimator(BytesThroughputEstimator<DataChangeRecord> throughputEstimator) Sets the estimator to calculate the backlog of this function. |
void | setup() Constructs instances for the PartitionMetadataDao, ChangeStreamDao, ChangeStreamRecordMapper, PartitionMetadataMapper, DataChangeRecordAction, HeartbeatRecordAction, ChildPartitionsRecordAction and QueryChangeStreamAction. |
public ReadChangeStreamPartitionDoFn(DaoFactory daoFactory, MapperFactory mapperFactory, ActionFactory actionFactory, ChangeStreamMetrics metrics)
This class needs a DaoFactory to build DAOs to access the partition metadata tables and to perform the change streams query. It uses mappers to transform database rows into the ChangeStreamRecord model. It uses the ActionFactory to construct the action dispatchers, which will perform the change stream query and process each type of record received. It emits metrics for the partition using the ChangeStreamMetrics.
Parameters:
daoFactory - the DaoFactory to construct PartitionMetadataDaos and ChangeStreamDaos
mapperFactory - the MapperFactory to construct ChangeStreamRecordMappers
actionFactory - the ActionFactory to construct actions
metrics - the ChangeStreamMetrics to emit partition related metrics
@DoFn.GetInitialWatermarkEstimatorState public org.joda.time.Instant getInitialWatermarkEstimatorState(@DoFn.Element PartitionMetadata partition)
@DoFn.NewWatermarkEstimator public org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> newWatermarkEstimator(@DoFn.WatermarkEstimatorState org.joda.time.Instant watermarkEstimatorState)
@DoFn.GetInitialRestriction public TimestampRange initialRestriction(@DoFn.Element PartitionMetadata partition)
The restriction for a partition will be defined from the start and end timestamp to query the partition for. The TimestampRange restriction represents a closed-open interval, while the start / end timestamps represent a closed-closed interval, so we add 1 nanosecond to the end timestamp to convert it to closed-open.
In this function we also update the partition state to PartitionMetadata.State#RUNNING.
Parameters:
partition - the partition to be queried
@DoFn.GetSize public double getSize(@DoFn.Element PartitionMetadata partition, @DoFn.Restriction TimestampRange range) throws java.lang.Exception
Throws:
java.lang.Exception
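The closed-closed to closed-open conversion described for initialRestriction can be illustrated with a minimal sketch. It uses java.time.Instant as a stand-in for the connector's timestamp type, and the class and method names (ClosedOpenRangeSketch, toExclusiveEnd, contains) are hypothetical, not part of the Beam API.

```java
import java.time.Instant;

/** Hypothetical helper for illustration only; not part of the Beam connector. */
final class ClosedOpenRangeSketch {

  /** Turns an inclusive end timestamp into the exclusive end of a closed-open range. */
  static Instant toExclusiveEnd(Instant inclusiveEnd) {
    // Adding 1 nanosecond keeps the original end timestamp inside [start, end).
    return inclusiveEnd.plusNanos(1);
  }

  /** Returns true if ts lies in the closed-open range [start, exclusiveEnd). */
  static boolean contains(Instant start, Instant exclusiveEnd, Instant ts) {
    return !ts.isBefore(start) && ts.isBefore(exclusiveEnd);
  }

  public static void main(String[] args) {
    Instant start = Instant.parse("2024-01-01T00:00:00Z");
    Instant inclusiveEnd = Instant.parse("2024-01-01T00:00:10Z");
    Instant exclusiveEnd = toExclusiveEnd(inclusiveEnd);

    System.out.println(contains(start, exclusiveEnd, inclusiveEnd)); // true
    System.out.println(contains(start, exclusiveEnd, exclusiveEnd)); // false
  }
}
```

With this convention the partition's original end timestamp is still covered by the restriction, while the exclusive bound itself is not.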
@DoFn.NewTracker public ReadChangeStreamPartitionRangeTracker newTracker(@DoFn.Element PartitionMetadata partition, @DoFn.Restriction TimestampRange range)
@DoFn.Setup public void setup()
Constructs instances for the PartitionMetadataDao, ChangeStreamDao, ChangeStreamRecordMapper, PartitionMetadataMapper, DataChangeRecordAction, HeartbeatRecordAction, ChildPartitionsRecordAction and QueryChangeStreamAction.
@DoFn.ProcessElement public org.apache.beam.sdk.transforms.DoFn.ProcessContinuation processElement(@DoFn.Element PartitionMetadata partition, org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<TimestampRange,com.google.cloud.Timestamp> tracker, org.apache.beam.sdk.transforms.DoFn.OutputReceiver<DataChangeRecord> receiver, org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator, org.apache.beam.sdk.transforms.DoFn.BundleFinalizer bundleFinalizer)
Performs a change stream query for a given partition. The processing of a partition is delegated to the QueryChangeStreamAction.
Parameters:
partition - the partition to be queried
tracker - an instance of ReadChangeStreamPartitionRangeTracker
receiver - a DataChangeRecord OutputReceiver
watermarkEstimator - a ManualWatermarkEstimator of Instant
bundleFinalizer - the bundle finalizer
Returns:
ProcessContinuation#stop() if a record timestamp could not be claimed or if the partition processing has finished
public void setThroughputEstimator(BytesThroughputEstimator<DataChangeRecord> throughputEstimator)
Sets the estimator to calculate the backlog of this function.
Parameters:
throughputEstimator - an estimator to calculate local throughput.
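The return contract documented for processElement (stop when a record timestamp cannot be claimed, or when the partition has finished) follows the usual splittable DoFn claim loop. The sketch below is a simplified illustration under stated assumptions: RangeTracker, Continuation and process are hypothetical stand-ins using long timestamps, not the connector's ReadChangeStreamPartitionRangeTracker or Beam's ProcessContinuation, and the real logic lives in QueryChangeStreamAction.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins for illustration; the real types live in Beam and the connector. */
final class ClaimLoopSketch {

  /** Minimal tracker over a closed-open range [from, to) of timestamps. */
  static final class RangeTracker {
    private final long from;
    private final long to;

    RangeTracker(long from, long to) {
      this.from = from;
      this.to = to;
    }

    /** Returns true only if the position falls inside [from, to). */
    boolean tryClaim(long position) {
      return position >= from && position < to;
    }
  }

  /** Stand-in for a process continuation signal; only the stop case is modeled. */
  enum Continuation { STOP }

  /** Emits each record whose timestamp can be claimed; stops on the first failed claim. */
  static Continuation process(List<Long> recordTimestamps, RangeTracker tracker, List<Long> output) {
    for (long ts : recordTimestamps) {
      if (!tracker.tryClaim(ts)) {
        return Continuation.STOP; // timestamp could not be claimed
      }
      output.add(ts);
    }
    return Continuation.STOP; // all records processed, partition finished
  }

  public static void main(String[] args) {
    List<Long> output = new ArrayList<>();
    Continuation result = process(List.of(1L, 5L, 12L), new RangeTracker(0L, 10L), output);
    System.out.println(result + " " + output); // STOP [1, 5]
  }
}
```

In this sketch the record at timestamp 12 falls outside the restriction, so it is never emitted and processing stops.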