Class MongoSourceTask
- All Implemented Interfaces:
org.apache.kafka.connect.connector.Task
Copy Existing Data
If configured, the connector copies the existing data from the collection, database or client. All namespaces that exist when the task starts are broadcast onto the topic as insert operations. The change stream cursor starts broadcasting new changes only after all the data from all namespaces has been broadcast. The logic for copying existing data is as follows:
- Get the latest resumeToken from MongoDB
- Create insert events for all configured namespaces using multiple threads. This step is completed only after all collections are successfully copied.
- Start a change stream cursor from the saved resumeToken
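The ordering of the three steps above can be illustrated with a minimal sketch. It is not the connector's implementation: the class `CopyThenStreamSketch`, the method `run`, and the string-based "token" and "copy" results are hypothetical stand-ins, and no MongoDB or Kafka API is called. The point is only that the token is saved first, all copies are awaited, and the change stream starts last.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of the copy-then-stream ordering; not the connector's code.
final class CopyThenStreamSketch {

    // Returns the ordered log of phases so the sequence can be inspected.
    static List<String> run(String latestResumeToken, List<String> namespaces, int threads) {
        List<String> log = new ArrayList<>();

        // Step 1: remember the resume token before any data is copied.
        String savedToken = latestResumeToken;
        log.add("saved token " + savedToken);

        // Step 2: copy every namespace on a thread pool and wait for all of them.
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> copies = new ArrayList<>();
            for (String ns : namespaces) {
                copies.add(pool.submit(() -> "copied " + ns));
            }
            for (Future<String> copy : copies) {
                log.add(copy.get()); // blocks until done; throws if the copy failed
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("copy did not complete", e);
        } finally {
            pool.shutdown();
        }

        // Step 3: only now open the change stream, starting from the saved token.
        log.add("change stream from " + savedToken);
        return log;
    }
}
```

Because the token is taken before the copy begins, any write made while the copy is running is also visible to the change stream once it starts.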
Reading all the data during the copy and then reading the subsequent change stream events may produce duplicate events. During the copy, clients can change the data in MongoDB, and such a change may be represented both by the copying process and by the change stream. However, because the change stream events are idempotent, the changes can be applied so that the data is eventually consistent.
Renaming a collection during the copying process is not supported.
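A minimal sketch of why duplicated events are harmless for a consumer that applies them by document key. The `Event` shape (a key plus the full document) and the in-memory `Map` used as the target are assumptions made for illustration; they are not types from the connector.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical consumer-side sketch; not part of the connector.
final class IdempotentApplySketch {

    // Assumed event shape: the document key and the full document after the change.
    static final class Event {
        final String id;
        final String document;

        Event(String id, String document) {
            this.id = id;
            this.document = document;
        }
    }

    // A keyed overwrite: replaying an event leaves the stored document unchanged.
    static void apply(Map<String, String> state, Event e) {
        state.put(e.id, e.document);
    }
}
```

For example, if the copy already saw version `v2` of a document and the change stream later replays the insert of `v1` followed by the update to `v2`, the target passes through `v1` again but ends at `v2`.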
Restarts
Restarting the connector during the copying phase causes the whole copy process to restart. Restarts after the copying process resume from the last seen resumeToken.
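The restart rule can be sketched as a small decision function. The offset keys `copying` and `resumeToken`, the `Action` enum, and the method `onStart` are hypothetical names chosen for this illustration; the keys the connector actually stores in its source offsets are not shown in this page.

```java
import java.util.Map;

// Hypothetical sketch of the restart decision; not the connector's code.
final class RestartSketch {

    enum Action { START_COPY, RESUME_FROM_TOKEN, OPEN_NEW_STREAM }

    // lastOffset is the last stored source offset, or null/empty if none exists.
    static Action onStart(Map<String, String> lastOffset, boolean copyConfigured) {
        boolean noOffset = lastOffset == null || lastOffset.isEmpty();
        // Assumed marker: offsets written during the copy carry copying=true.
        boolean copyInterrupted = !noOffset && "true".equals(lastOffset.get("copying"));

        if (copyConfigured && (noOffset || copyInterrupted)) {
            return Action.START_COPY; // an interrupted copy starts over from the beginning
        }
        if (!noOffset && lastOffset.containsKey("resumeToken")) {
            return Action.RESUME_FROM_TOKEN; // copy finished earlier; continue the stream
        }
        return Action.OPEN_NEW_STREAM;
    }
}
```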
Field Summary
Fields inherited from class org.apache.kafka.connect.source.SourceTask
context
Constructor Summary
Method Summary
Methods inherited from class org.apache.kafka.connect.source.SourceTask
commit, commitRecord, initialize
Field Details
ID_FIELD
- See Also:
Constructor Details
MongoSourceTask
public MongoSourceTask()
Method Details
version
start
- Specified by:
start in interface org.apache.kafka.connect.connector.Task
- Specified by:
start in class org.apache.kafka.connect.source.SourceTask
poll
- Specified by:
poll in class org.apache.kafka.connect.source.SourceTask
stop
public void stop()
- Specified by:
stop in interface org.apache.kafka.connect.connector.Task
- Specified by:
stop in class org.apache.kafka.connect.source.SourceTask
commitRecord
public void commitRecord(org.apache.kafka.connect.source.SourceRecord record, org.apache.kafka.clients.producer.RecordMetadata metadata)
- Overrides:
commitRecord in class org.apache.kafka.connect.source.SourceTask