Class ByLogicalTableRouter<R extends org.apache.kafka.connect.connector.ConnectRecord<R>>

  • Type Parameters:
    R - the subtype of ConnectRecord on which this transformation will operate
    All Implemented Interfaces:
    Closeable, AutoCloseable, org.apache.kafka.common.Configurable, org.apache.kafka.connect.transforms.Transformation<R>

    public class ByLogicalTableRouter<R extends org.apache.kafka.connect.connector.ConnectRecord<R>>
    extends Object
    implements org.apache.kafka.connect.transforms.Transformation<R>
    A logical table consists of one or more physical tables with the same schema. A common use case is sharding: the two physical tables `db_shard1.my_table` and `db_shard2.my_table` together form one logical table.

    This Transformation changes a record's topic name so that change events from multiple physical tables can be sent to one topic. For instance, we might choose to send the two tables from the above example to the topic `db_shard.my_table`. The config option TOPIC_REGEX is matched against the record's topic name, and TOPIC_REPLACEMENT supplies the new name.
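
    The topic rewrite described above can be illustrated with plain `java.util.regex`. This is a minimal sketch, not the class's actual code: the class name `TopicRoutingSketch`, the method `route`, and the two pattern values are hypothetical stand-ins for whatever TOPIC_REGEX and TOPIC_REPLACEMENT are configured to. Returning null for a non-matching topic mirrors the contract documented for determineNewTopic.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TopicRoutingSketch {
    // Hypothetical values standing in for the TOPIC_REGEX and TOPIC_REPLACEMENT settings.
    static final Pattern TOPIC_REGEX = Pattern.compile("db_shard\\d+\\.(.*)");
    static final String TOPIC_REPLACEMENT = "db_shard.$1";

    // Returns the rewritten topic if the whole topic name matches the pattern,
    // otherwise null to signal that the regex does not apply.
    static String route(String oldTopic) {
        Matcher matcher = TOPIC_REGEX.matcher(oldTopic);
        return matcher.matches() ? matcher.replaceFirst(TOPIC_REPLACEMENT) : null;
    }

    public static void main(String[] args) {
        System.out.println(route("db_shard1.my_table")); // prints db_shard.my_table
        System.out.println(route("db_shard2.my_table")); // prints db_shard.my_table
        System.out.println(route("other.my_table"));     // prints null
    }
}
```

    Both shard topics collapse to the same name, which is why the key may need the extra identifier described next.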

    Once multiple physical tables share a topic, the fields of the record's primary/unique key are no longer guaranteed to be unique across tables, so the event's key may need an additional identifier that distinguishes the physical tables. The field named by the config option KEY_FIELD_NAME is added to the key schema for this purpose. By default its value is the old topic name; to use a custom value, set the config options KEY_FIELD_REGEX and KEY_FIELD_REPLACEMENT. In the example above, we might choose the identifiers `db_shard1` and `db_shard2` for the two physical tables.
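
    To make the key augmentation concrete, the sketch below models a key as a plain `Map` rather than a Kafka Connect `Struct`, so that it runs without any dependencies. Everything in it is illustrative: the class `KeyAugmentationSketch`, the field name `physical_table_id`, and the fallback to the old topic name when the pattern does not match are assumptions, not the class's confirmed behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeyAugmentationSketch {
    // Hypothetical field name standing in for the KEY_FIELD_NAME setting.
    static final String KEY_FIELD_NAME = "physical_table_id";

    // Derives the identifier: the old topic name by default, or the regex
    // replacement when a key-field pattern is supplied and matches.
    // Falling back to the old topic on a non-match is an assumption.
    static String identifier(String oldTopic, Pattern keyFieldRegex, String keyFieldReplacement) {
        if (keyFieldRegex == null) {
            return oldTopic;
        }
        Matcher matcher = keyFieldRegex.matcher(oldTopic);
        return matcher.matches() ? matcher.replaceFirst(keyFieldReplacement) : oldTopic;
    }

    // Copies the original key fields and appends the identifier field.
    static Map<String, Object> augmentKey(Map<String, Object> oldKey, String id) {
        Map<String, Object> newKey = new LinkedHashMap<>(oldKey);
        newKey.put(KEY_FIELD_NAME, id);
        return newKey;
    }

    public static void main(String[] args) {
        Map<String, Object> key = new LinkedHashMap<>();
        key.put("id", 42);

        Pattern regex = Pattern.compile("(db_shard\\d+)\\.(.*)");
        String id1 = identifier("db_shard1.my_table", regex, "$1");
        String id2 = identifier("db_shard2.my_table", regex, "$1");

        System.out.println(augmentKey(key, id1)); // prints {id=42, physical_table_id=db_shard1}
        System.out.println(augmentKey(key, id2)); // prints {id=42, physical_table_id=db_shard2}
    }
}
```

    The two rows share the primary key value 42 but end up with distinct keys on the shared topic.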

    Author:
    David Leibovic, Mario Mueller
    • Field Detail

      • TOPIC_REGEX

        private static final Field TOPIC_REGEX
      • TOPIC_REPLACEMENT

        private static final Field TOPIC_REPLACEMENT
      • KEY_ENFORCE_UNIQUENESS

        private static final Field KEY_ENFORCE_UNIQUENESS
      • KEY_FIELD_REGEX

        private static final Field KEY_FIELD_REGEX
      • KEY_FIELD_NAME

        private static final Field KEY_FIELD_NAME
      • KEY_FIELD_REPLACEMENT

        private static final Field KEY_FIELD_REPLACEMENT
      • LOGGER

        private static final org.slf4j.Logger LOGGER
      • topicRegex

        private Pattern topicRegex
      • topicReplacement

        private String topicReplacement
      • keyFieldRegex

        private Pattern keyFieldRegex
      • keyEnforceUniqueness

        private boolean keyEnforceUniqueness
      • keyFieldReplacement

        private String keyFieldReplacement
      • keyFieldName

        private String keyFieldName
      • keySchemaUpdateCache

        private final org.apache.kafka.common.cache.Cache<org.apache.kafka.connect.data.Schema,​org.apache.kafka.connect.data.Schema> keySchemaUpdateCache
      • envelopeSchemaUpdateCache

        private final org.apache.kafka.common.cache.Cache<org.apache.kafka.connect.data.Schema,​org.apache.kafka.connect.data.Schema> envelopeSchemaUpdateCache
      • keyRegexReplaceCache

        private final org.apache.kafka.common.cache.Cache<String,​String> keyRegexReplaceCache
      • topicRegexReplaceCache

        private final org.apache.kafka.common.cache.Cache<String,​String> topicRegexReplaceCache
      • smtManager

        private SmtManager<R extends org.apache.kafka.connect.connector.ConnectRecord<R>> smtManager
    • Constructor Detail

      • ByLogicalTableRouter

        public ByLogicalTableRouter()
    • Method Detail

      • validateKeyFieldReplacement

        private static int validateKeyFieldReplacement​(Configuration config,
                                                       Field field,
                                                       Field.ValidationOutput problems)
        If KEY_FIELD_REGEX is set to a regular expression, KEY_FIELD_REPLACEMENT must be set to a non-empty value.
      • configure

        public void configure​(Map<String,​?> props)
        Specified by:
        configure in interface org.apache.kafka.common.Configurable
      • apply

        public R apply​(R record)
        Specified by:
        apply in interface org.apache.kafka.connect.transforms.Transformation<R extends org.apache.kafka.connect.connector.ConnectRecord<R>>
      • close

        public void close()
        Specified by:
        close in interface AutoCloseable
        Specified by:
        close in interface Closeable
        Specified by:
        close in interface org.apache.kafka.connect.transforms.Transformation<R extends org.apache.kafka.connect.connector.ConnectRecord<R>>
      • config

        public org.apache.kafka.common.config.ConfigDef config()
        Specified by:
        config in interface org.apache.kafka.connect.transforms.Transformation<R extends org.apache.kafka.connect.connector.ConnectRecord<R>>
      • determineNewTopic

        private String determineNewTopic​(String oldTopic)
        Determine the new topic name.
        Parameters:
        oldTopic - the name of the old topic
        Returns:
        the new topic name if the regex applies; otherwise null.
      • updateKeySchema

        private org.apache.kafka.connect.data.Schema updateKeySchema​(org.apache.kafka.connect.data.Schema oldKeySchema,
                                                                     String newTopicName)
      • updateKey

        private org.apache.kafka.connect.data.Struct updateKey​(org.apache.kafka.connect.data.Schema newKeySchema,
                                                               org.apache.kafka.connect.data.Struct oldKey,
                                                               String oldTopic)
      • updateEnvelopeSchema

        private org.apache.kafka.connect.data.Schema updateEnvelopeSchema​(org.apache.kafka.connect.data.Schema oldEnvelopeSchema,
                                                                          String newTopicName)
      • updateEnvelope

        private org.apache.kafka.connect.data.Struct updateEnvelope​(org.apache.kafka.connect.data.Schema newEnvelopeSchema,
                                                                    org.apache.kafka.connect.data.Struct oldEnvelope)
      • updateValue

        private org.apache.kafka.connect.data.Struct updateValue​(org.apache.kafka.connect.data.Schema newValueSchema,
                                                                 org.apache.kafka.connect.data.Struct oldValue)
      • copySchemaExcludingName

        private org.apache.kafka.connect.data.SchemaBuilder copySchemaExcludingName​(org.apache.kafka.connect.data.Schema source,
                                                                                    org.apache.kafka.connect.data.SchemaBuilder builder)
      • copySchemaExcludingName

        private org.apache.kafka.connect.data.SchemaBuilder copySchemaExcludingName​(org.apache.kafka.connect.data.Schema source,
                                                                                    org.apache.kafka.connect.data.SchemaBuilder builder,
                                                                                    boolean copyFields)
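
    The rule documented for validateKeyFieldReplacement can be sketched as a standalone check. This is an illustration under stated assumptions, not the method's source: it takes plain strings instead of `Configuration` and `Field`, collects messages in a `List` instead of a `Field.ValidationOutput`, treats "has a value" as "is not blank", and assumes the int return value counts the problems found. The class name `ReplacementValidationSketch` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplacementValidationSketch {
    // Returns the number of problems found, appending one message per problem.
    static int validate(String keyFieldRegex, String keyFieldReplacement, List<String> problems) {
        boolean regexSet = keyFieldRegex != null && !keyFieldRegex.trim().isEmpty();
        boolean replacementSet = keyFieldReplacement != null && !keyFieldReplacement.trim().isEmpty();
        if (regexSet && !replacementSet) {
            problems.add("a key field replacement is required when a key field regex is set");
            return 1;
        }
        return 0;
    }

    public static void main(String[] args) {
        List<String> problems = new ArrayList<>();
        System.out.println(validate("(db_shard\\d+)\\..*", "", problems));   // prints 1
        System.out.println(validate("(db_shard\\d+)\\..*", "$1", problems)); // prints 0
        System.out.println(validate(null, null, problems));                  // prints 0
        System.out.println(problems.size());                                 // prints 1
    }
}
```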