Class MongoSourceTask

  • All Implemented Interfaces:
    org.apache.kafka.connect.connector.Task

    public final class MongoSourceTask
    extends org.apache.kafka.connect.source.SourceTask
    A Kafka Connect source task that uses change streams to broadcast changes made to a collection, database or client.

    Copy Existing Data

    If configured, the connector copies the existing data from the collection, database or client. All namespaces that exist when the task starts are broadcast onto the topic as insert operations. The change stream cursor starts broadcasting new changes only after all the data from all namespaces has been broadcast. The logic for copying existing data is as follows:

    1. Get the latest resumeToken from MongoDB and save it.
    2. Create insert events for all configured namespaces using multiple threads. This step is completed only after all collections are successfully copied.
    3. Start a change stream cursor from the saved resumeToken
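
    The three steps above can be sketched in plain Java. This is a minimal illustration, not the connector's implementation: the in-memory change log, the integer resume token (an index into that log) and the names CopyExistingSketch and copyThenStream are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CopyExistingSketch {

    /**
     * Hypothetical helper: emits insert events for every namespace, then the
     * change-log entries recorded at or after the saved resume token.
     * The resume token is modelled as an index into an in-memory change log.
     */
    static List<String> copyThenStream(Map<String, List<String>> namespaces,
                                       List<String> changeLog,
                                       int resumeToken) {
        List<String> out = new ArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Step 2: copy each namespace on a worker thread as insert events.
            List<Future<List<String>>> copies = new ArrayList<>();
            for (Map.Entry<String, List<String>> ns : namespaces.entrySet()) {
                copies.add(pool.submit(() -> {
                    List<String> inserts = new ArrayList<>();
                    for (String doc : ns.getValue()) {
                        inserts.add("insert " + ns.getKey() + " " + doc);
                    }
                    return inserts;
                }));
            }
            // The copy step is complete only once every namespace has finished.
            for (Future<List<String>> copy : copies) {
                out.addAll(copy.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Copy interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Copy failed", e);
        } finally {
            pool.shutdown();
        }
        // Step 3: stream changes starting from the token saved in step 1.
        out.addAll(changeLog.subList(resumeToken, changeLog.size()));
        return out;
    }
}
```

    In this sketch the caller performs step 1 by recording changeLog.size() before starting the copy, so any change appended while the copy runs is still replayed afterwards.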

    Reading all the data during the copy and then replaying the subsequent change stream events may produce duplicate events. Clients can change the data in MongoDB while the copy is running, so a single change may be represented both by the copying process and by the change stream. However, because the change stream events are idempotent, the changes can be applied so that the data is eventually consistent.
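
    As a rough illustration of why duplicates are harmless, the sketch below applies events as replace-by-id upserts. It assumes each event carries the document id and the full document state after the change; the Event class and the apply method are hypothetical and not part of the connector.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IdempotentApplySketch {

    /** Simplified, hypothetical event: a document id plus the full document after the change. */
    static final class Event {
        final String id;
        final String fullDocument;

        Event(String id, String fullDocument) {
            this.id = id;
            this.fullDocument = fullDocument;
        }
    }

    /** Applies events in order; each event overwrites the stored document with the same id. */
    static Map<String, String> apply(List<Event> events) {
        Map<String, String> target = new LinkedHashMap<>();
        for (Event event : events) {
            target.put(event.id, event.fullDocument);
        }
        return target;
    }
}
```

    Under these assumptions, a change that appears once from the copy and again from the change stream leaves the target in the same state as applying it once.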

    Renaming a collection during the copying process is not supported.

    Restarts

    Restarting the connector during the copying phase causes the whole copy process to restart. Restarts after the copying process resume from the last seen resumeToken.
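
    The restart rule can be expressed as a small decision function. This is a sketch only: the offset keys copyInProgress and resumeToken and the class RestartDecisionSketch are hypothetical names chosen for illustration and do not describe how the connector stores its offsets.

```java
import java.util.Map;

public class RestartDecisionSketch {

    enum Action { RESTART_COPY, RESUME_FROM_TOKEN }

    // Hypothetical offset keys, used only for this illustration.
    static final String COPY_IN_PROGRESS = "copyInProgress";
    static final String RESUME_TOKEN = "resumeToken";

    /** Chooses what to do on restart, assuming copying of existing data is configured. */
    static Action onRestart(Map<String, String> storedOffset) {
        boolean copying = Boolean.parseBoolean(storedOffset.get(COPY_IN_PROGRESS));
        boolean hasToken = storedOffset.containsKey(RESUME_TOKEN);
        // An unfinished copy, or no saved token at all, means the copy starts over.
        if (copying || !hasToken) {
            return Action.RESTART_COPY;
        }
        // Otherwise continue the change stream from the last seen token.
        return Action.RESUME_FROM_TOKEN;
    }
}
```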
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static java.lang.String ID_FIELD  
      • Fields inherited from class org.apache.kafka.connect.source.SourceTask

        context
    • Constructor Summary

      Constructors 
      Constructor Description
      MongoSourceTask()  
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      java.util.List<org.apache.kafka.connect.source.SourceRecord> poll()  
      void start(java.util.Map<java.lang.String,java.lang.String> props)  
      void stop()  
      java.lang.String version()  
      • Methods inherited from class org.apache.kafka.connect.source.SourceTask

        commit, commitRecord, commitRecord, initialize
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • MongoSourceTask

        public MongoSourceTask()
    • Method Detail

      • version

        public java.lang.String version()
      • start

        public void start(java.util.Map<java.lang.String,java.lang.String> props)
        Specified by:
        start in interface org.apache.kafka.connect.connector.Task
        Specified by:
        start in class org.apache.kafka.connect.source.SourceTask
      • poll

        public java.util.List<org.apache.kafka.connect.source.SourceRecord> poll()
        Specified by:
        poll in class org.apache.kafka.connect.source.SourceTask
      • stop

        public void stop()
        Specified by:
        stop in interface org.apache.kafka.connect.connector.Task
        Specified by:
        stop in class org.apache.kafka.connect.source.SourceTask