Class MySqlReadOnlyIncrementalSnapshotChangeEventSource<T extends DataCollectionId>
- java.lang.Object
  - io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource<MySqlPartition,T>
    - io.debezium.connector.mysql.MySqlReadOnlyIncrementalSnapshotChangeEventSource<T>
- All Implemented Interfaces:
IncrementalSnapshotChangeEventSource<MySqlPartition,T>
public class MySqlReadOnlyIncrementalSnapshotChangeEventSource<T extends DataCollectionId> extends AbstractIncrementalSnapshotChangeEventSource<MySqlPartition,T>
A MySQL-specific read-only incremental snapshot change event source. Uses the executed GTID set as the low/high watermarks for the incremental snapshot window to support a read-only connection.
Prerequisites
- gtid_mode=ON
- enforce_gtid_consistency=ON
- If the connector reads from a replica and that replica is multithreaded (replica_parallel_workers is set to a value greater than 0), it is required to set replica_preserve_commit_order=1 (or slave_preserve_commit_order=1 on older MySQL versions); a verification sketch follows this list
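These settings can be checked with plain SQL before the connector is started. A minimal sketch, assuming MySQL Connector/J on the classpath and placeholder connection settings:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class GtidPrerequisiteCheck {
        public static void main(String[] args) throws Exception {
            // Placeholder URL and credentials; replace with the connector's connection settings.
            try (Connection conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/", "user", "password");
                 Statement stmt = conn.createStatement()) {
                for (String variable : new String[]{ "gtid_mode", "enforce_gtid_consistency", "replica_preserve_commit_order" }) {
                    try (ResultSet rs = stmt.executeQuery("SHOW GLOBAL VARIABLES LIKE '" + variable + "'")) {
                        // SHOW VARIABLES returns two columns: Variable_name and Value
                        System.out.println(rs.next() ? rs.getString(1) + " = " + rs.getString(2) : variable + " is not available");
                    }
                }
            }
        }
    }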
When a chunk should be snapshotted
1. streaming is paused (this is implicit while the watermarks are handled)
2. a SHOW MASTER STATUS query is executed and the low watermark is set to its executed_gtid_set
3. a new data chunk is read from the database by generating a SELECT statement and is placed into a window buffer keyed by primary key
4. a SHOW MASTER STATUS query is executed again and the high watermark is set to its executed_gtid_set minus the low watermark; if the high watermark contains more than one unique server UUID, steps 2-4 are redone
5. streaming is resumed (see the sketch below)
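The steps above can be illustrated with a small sketch. It is not the connector's implementation: GTID sets are reduced to per-server-UUID interval strings, the generated chunk SELECT is elided, and only the watermark captures plus the redo condition from step 4 are shown.

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    class ChunkWindowSketch {

        // Reads executed_gtid_set via SHOW MASTER STATUS.
        static String executedGtidSet(Connection connection) throws SQLException {
            try (Statement stmt = connection.createStatement();
                 ResultSet rs = stmt.executeQuery("SHOW MASTER STATUS")) {
                return rs.next() ? rs.getString("Executed_Gtid_Set") : "";
            }
        }

        // Splits "uuid:intervals,uuid:intervals" into a map keyed by server UUID.
        static Map<String, String> parse(String gtidSet) {
            Map<String, String> result = new HashMap<>();
            for (String uuidSet : gtidSet.replace("\n", "").split(",")) {
                int idx = uuidSet.indexOf(':');
                if (idx > 0) {
                    result.put(uuidSet.substring(0, idx).trim(), uuidSet.substring(idx + 1).trim());
                }
            }
            return result;
        }

        // Server UUIDs whose intervals grew between the two captures, i.e. the UUIDs present in the high watermark.
        static Set<String> changedServerUuids(String lowGtidSet, String afterGtidSet) {
            Map<String, String> low = parse(lowGtidSet);
            Set<String> changed = new HashSet<>();
            parse(afterGtidSet).forEach((uuid, intervals) -> {
                if (!intervals.equals(low.get(uuid))) {
                    changed.add(uuid);
                }
            });
            return changed;
        }

        static void snapshotOneChunk(Connection connection) throws SQLException {
            Set<String> highWatermarkUuids;
            do {
                String low = executedGtidSet(connection);              // step 2: low watermark before the chunk select
                // ... step 3: run the generated chunk SELECT and buffer the rows keyed by primary key ...
                String after = executedGtidSet(connection);            // step 4: capture executed_gtid_set again
                highWatermarkUuids = changedServerUuids(low, after);   // high watermark = after minus low
            } while (highWatermarkUuids.size() > 1);                   // redo steps 2-4 if more than one server UUID changed
            // step 5: streaming resumes and the buffered chunk is deduplicated within the window
        }
    }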
During the subsequent streaming
- if a binlog event is received and its GTID is outside of the low watermark GTID set, window processing mode is enabled
- if a binlog event is received and its GTID is outside of the high watermark GTID set, window processing mode is disabled and the rest of the window's buffer is streamed
- if a server heartbeat event is received and its GTID has reached the largest transaction id of the high watermark, window processing mode is disabled and the rest of the window's buffer is streamed
- if window processing mode is enabled and the event key is contained in the window buffer, the key is removed from the window buffer (deduplication)
- the event is streamed (see the sketch below)
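The per-event checks can be sketched as follows; GtidSetView, the in-memory window buffer and the dispatch stubs are simplified stand-ins, not Debezium types:

    import java.util.Map;

    class StreamingWindowSketch {

        // Minimal view of a GTID set: membership test only.
        interface GtidSetView {
            boolean contains(String gtid);
        }

        private final GtidSetView lowWatermark;
        private final GtidSetView highWatermark;
        private final Map<Object, Object[]> windowBuffer; // chunk rows keyed by primary key
        private boolean windowOpened;

        StreamingWindowSketch(GtidSetView lowWatermark, GtidSetView highWatermark, Map<Object, Object[]> windowBuffer) {
            this.lowWatermark = lowWatermark;
            this.highWatermark = highWatermark;
            this.windowBuffer = windowBuffer;
        }

        void onBinlogEvent(String gtid, Object key) {
            if (!windowOpened && !lowWatermark.contains(gtid)) {
                windowOpened = true;                 // first event past the low watermark enables window processing
            }
            if (windowOpened && !highWatermark.contains(gtid)) {
                windowOpened = false;                // first event past the high watermark disables it
                streamRemainingWindowBuffer();       // the deduplicated chunk is emitted before this event
            }
            if (windowOpened) {
                windowBuffer.remove(key);            // deduplicate: the binlog event supersedes the chunk row
            }
            streamEvent(gtid, key);                  // the binlog event itself is always streamed
        }

        private void streamRemainingWindowBuffer() {
            windowBuffer.values().forEach(row -> { /* dispatch the chunk row */ });
            windowBuffer.clear();
        }

        private void streamEvent(String gtid, Object key) {
            /* dispatch the change event */
        }
    }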
Watermark checks
If a watermark's GTID set doesn't contain a binlog event's GTID, the watermark is passed and the window processing mode gets updated. Multiple binlog events can have the same GTID; this is why the algorithm waits for a binlog event with a GTID outside of the watermark's GTID set to close the window, instead of closing it as soon as the largest transaction id is reached.
The deduplication starts with the first event after the low watermark: as long as an event's GTID is contained in the low watermark (the executed_gtid_set captured before the chunk select statement), the event was committed before the chunk selection and its changes are already visible to the chunk query, so it needs no deduplication. A COMMIT issued after the low watermark is captured makes sure the chunk selection sees the changes that were committed before its execution.
The deduplication continues for all the events whose GTIDs fall within the high watermark. The deduplicated chunk events are inserted right before the first event that falls outside of the high watermark.
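A short walkthrough with made-up GTIDs (server UUID shortened to aaaa), reusing the StreamingWindowSketch class from the previous sketch, shows where the window opens, where a chunk row is deduplicated and where the remaining chunk rows are emitted:

    import java.util.HashMap;
    import java.util.Map;

    class WatermarkWalkthrough {
        public static void main(String[] args) {
            // Low watermark covers aaaa:1-100; high watermark covers aaaa:101-103 (committed during the chunk select).
            StreamingWindowSketch.GtidSetView low = gtid -> txId(gtid) <= 100;
            StreamingWindowSketch.GtidSetView high = gtid -> txId(gtid) >= 101 && txId(gtid) <= 103;

            Map<Object, Object[]> chunk = new HashMap<>();
            chunk.put("pk-1", new Object[]{ "pk-1", "chunk value" });
            chunk.put("pk-2", new Object[]{ "pk-2", "chunk value" });

            StreamingWindowSketch window = new StreamingWindowSketch(low, high, chunk);
            window.onBinlogEvent("aaaa:99", "pk-7");   // inside the low watermark: window stays closed, no deduplication
            window.onBinlogEvent("aaaa:101", "pk-1");  // past the low watermark: window opens, pk-1 is removed from the chunk
            window.onBinlogEvent("aaaa:104", "pk-9");  // past the high watermark: window closes, pk-2 is emitted before this event
        }

        private static long txId(String gtid) {
            return Long.parseLong(gtid.substring(gtid.indexOf(':') + 1));
        }
    }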
No binlog events
Server heartbeat events (events sent by a primary to a replica to let the replica know that the primary is still alive) are used to update the window processing mode when the rate of binlog updates is low. A server heartbeat is sent only if there are no binlog events for the duration of the heartbeat interval.
The heartbeat has the same GTID as the latest binlog event at that moment (it's a technical event that doesn't get written into the output stream, but it can be used in event processing logic). If there are zero updates after the chunk selection, the server heartbeat's GTID will be within the high watermark. This is why it is enough for a server heartbeat event's GTID to reach the largest transaction id of the high watermark to disable window processing mode, send the chunk and proceed to the next one.
The server UUID part of the heartbeat's GTID is used to get the max transaction id of the high watermark for the same server UUID. The high watermark is set to the difference between the executed_gtid_set captured before and after the chunk selection. If the high watermark contains more than one unique server UUID, the chunk selection is redone and the watermarks are recaptured; this avoids the scenario where the window is closed too early by a heartbeat because the server UUID changes between the low and high watermarks. The heartbeat check doesn't need to verify the window processing mode; skipping that check doesn't affect correctness and simplifies handling of the cases when the binlog reader was already up to date with the low watermark and when there are no new events between the low and high watermarks.
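A sketch of the heartbeat check described above, with simplified interval parsing (the last number in a server UUID's interval list is taken as its largest transaction id); the names are illustrative, not the connector's API:

    import java.util.Map;

    class HeartbeatWindowCloseSketch {

        // Largest transaction id recorded in the high watermark for the given server UUID, or -1 if absent.
        static long maxTransactionId(Map<String, String> highWatermarkIntervals, String serverUuid) {
            String intervals = highWatermarkIntervals.get(serverUuid);
            if (intervals == null) {
                return -1;
            }
            // Intervals look like "101-103" or "101-103:105"; the last number is the largest transaction id.
            String[] parts = intervals.split("[:-]");
            return Long.parseLong(parts[parts.length - 1]);
        }

        // A heartbeat (GTID "serverUuid:transactionId") closes the window once it reaches that largest transaction id.
        static boolean closesWindow(String heartbeatGtid, Map<String, String> highWatermarkIntervals) {
            int idx = heartbeatGtid.indexOf(':');
            String serverUuid = heartbeatGtid.substring(0, idx);
            long transactionId = Long.parseLong(heartbeatGtid.substring(idx + 1));
            long max = maxTransactionId(highWatermarkIntervals, serverUuid);
            return max >= 0 && transactionId >= max;
        }
    }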
No changes between watermarks
A window can be opened and closed right away by the same event. This can happen when the high watermark is an empty set, which means there were no binlog events during the chunk select. The chunk is then inserted right after the low watermark and no events are deduplicated from it.
No updates for included tables
It's important to receive binlog events for the incremental snapshot to make progress. All binlog events are checked against the low and high watermarks, including events from tables that aren't included in the connector. This guarantees that the window processing mode gets updated even when none of the tables included in the connector receive binlog events.
-
-
Field Summary
- private KafkaSignalThread<T> kafkaSignal
- private static org.slf4j.Logger LOGGER
- private String showMasterStmt
-
Fields inherited from class io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource
context, dispatcher, jdbcConnection, window
-
-
Constructor Summary
- MySqlReadOnlyIncrementalSnapshotChangeEventSource(RelationalDatabaseConnectorConfig config, JdbcConnection jdbcConnection, EventDispatcher<MySqlPartition,T> dispatcher, DatabaseSchema<?> databaseSchema, Clock clock, SnapshotProgressListener<MySqlPartition> progressListener, DataChangeEventListener<MySqlPartition> dataChangeEventListener)
-
Method Summary
-
Methods inherited from class io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource
addDataCollectionNamesToSnapshot, addKeyColumnsToCondition, buildChunkQuery, buildChunkQuery, buildMaxPrimaryKeyQuery, closeWindow, deduplicateWindow, getChangeRecordEmitter, getSignalTableName, postIncrementalSnapshotCompleted, postReadChunk, preReadChunk, readChunk, readTableChunkStatement, refreshTableSchema, sendWindowEvents, setContext
-
-
Field Detail
-
LOGGER
private static final org.slf4j.Logger LOGGER
-
showMasterStmt
private final String showMasterStmt
- See Also:
- Constant Field Values
-
kafkaSignal
private final KafkaSignalThread<T extends DataCollectionId> kafkaSignal
-
-
Constructor Detail
-
MySqlReadOnlyIncrementalSnapshotChangeEventSource
public MySqlReadOnlyIncrementalSnapshotChangeEventSource(RelationalDatabaseConnectorConfig config, JdbcConnection jdbcConnection, EventDispatcher<MySqlPartition,T> dispatcher, DatabaseSchema<?> databaseSchema, Clock clock, SnapshotProgressListener<MySqlPartition> progressListener, DataChangeEventListener<MySqlPartition> dataChangeEventListener)
-
-
Method Detail
-
init
public void init(MySqlPartition partition, OffsetContext offsetContext)
- Specified by:
init
in interface IncrementalSnapshotChangeEventSource<MySqlPartition,T extends DataCollectionId>
- Overrides:
init
in class AbstractIncrementalSnapshotChangeEventSource<MySqlPartition,T extends DataCollectionId>
-
processMessage
public void processMessage(MySqlPartition partition, DataCollectionId dataCollectionId, Object key, OffsetContext offsetContext) throws InterruptedException
- Throws:
InterruptedException
-
processHeartbeat
public void processHeartbeat(MySqlPartition partition, OffsetContext offsetContext) throws InterruptedException
- Throws:
InterruptedException
-
readUntilGtidChange
private void readUntilGtidChange(MySqlPartition partition, OffsetContext offsetContext) throws InterruptedException
- Throws:
InterruptedException
-
processFilteredEvent
public void processFilteredEvent(MySqlPartition partition, OffsetContext offsetContext) throws InterruptedException
- Throws:
InterruptedException
-
enqueueDataCollectionNamesToSnapshot
public void enqueueDataCollectionNamesToSnapshot(List<String> dataCollectionIds, long signalOffset)
-
processTransactionStartedEvent
public void processTransactionStartedEvent(MySqlPartition partition, OffsetContext offsetContext) throws InterruptedException
- Throws:
InterruptedException
-
processTransactionCommittedEvent
public void processTransactionCommittedEvent(MySqlPartition partition, OffsetContext offsetContext) throws InterruptedException
- Throws:
InterruptedException
-
updateLowWatermark
protected void updateLowWatermark()
-
updateHighWatermark
protected void updateHighWatermark()
-
emitWindowOpen
protected void emitWindowOpen()
- Specified by:
emitWindowOpen
in class AbstractIncrementalSnapshotChangeEventSource<MySqlPartition,T extends DataCollectionId>
-
emitWindowClose
protected void emitWindowClose(MySqlPartition partition) throws InterruptedException
- Specified by:
emitWindowClose
in class AbstractIncrementalSnapshotChangeEventSource<MySqlPartition,T extends DataCollectionId>
- Throws:
InterruptedException
-
sendEvent
protected void sendEvent(MySqlPartition partition, EventDispatcher<MySqlPartition,T> dispatcher, OffsetContext offsetContext, Object[] row) throws InterruptedException
- Overrides:
sendEvent
in class AbstractIncrementalSnapshotChangeEventSource<MySqlPartition,T extends DataCollectionId>
- Throws:
InterruptedException
-
rereadChunk
public void rereadChunk(MySqlPartition partition) throws InterruptedException
- Throws:
InterruptedException
-
checkEnqueuedSnapshotSignals
private void checkEnqueuedSnapshotSignals(MySqlPartition partition, OffsetContext offsetContext) throws InterruptedException
- Throws:
InterruptedException
-
addDataCollectionNamesToSnapshot
private void addDataCollectionNamesToSnapshot(ExecuteSnapshotKafkaSignal executeSnapshotSignal, MySqlPartition partition, OffsetContext offsetContext) throws InterruptedException
- Throws:
InterruptedException
-
getContext
private MySqlReadOnlyIncrementalSnapshotContext<T> getContext()