Class KafkaStreamingSourceOptions
- java.lang.Object
- software.amazon.awssdk.services.glue.model.KafkaStreamingSourceOptions

All Implemented Interfaces:
Serializable, SdkPojo, ToCopyableBuilder<KafkaStreamingSourceOptions.Builder,KafkaStreamingSourceOptions>

@Generated("software.amazon.awssdk:codegen")
public final class KafkaStreamingSourceOptions extends Object implements SdkPojo, Serializable, ToCopyableBuilder<KafkaStreamingSourceOptions.Builder,KafkaStreamingSourceOptions>
Additional options for streaming.
- See Also:
- Serialized Form
Nested Class Summary
- static interface KafkaStreamingSourceOptions.Builder
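For example, a minimal construction sketch using the nested builder (the topic name below is a placeholder, not a value from this documentation):

    import software.amazon.awssdk.services.glue.model.KafkaStreamingSourceOptions;

    public class KafkaOptionsExample {
        public static void main(String[] args) {
            // Build an immutable options object through the generated builder.
            KafkaStreamingSourceOptions options = KafkaStreamingSourceOptions.builder()
                    .topicName("example-topic")     // placeholder topic name
                    .startingOffsets("latest")
                    .build();
            System.out.println(options);            // toString() redacts sensitive data
        }
    }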
-
Method Summary
- String addRecordTimestamp()
  When this option is set to 'true', the data output will contain an additional column named "__src_timestamp" that indicates the time when the corresponding record was received by the topic.
- String assign()
  The specific TopicPartitions to consume.
- String bootstrapServers()
  A list of bootstrap server URLs, for example, as b-1.vpc-test-2.o4q88o.c6.kafka.us-east-1.amazonaws.com:9094.
- static KafkaStreamingSourceOptions.Builder builder()
- String classification()
  An optional classification.
- String connectionName()
  The name of the connection.
- String delimiter()
  Specifies the delimiter character.
- String emitConsumerLagMetrics()
  When this option is set to 'true', for each batch it emits to CloudWatch the metrics for the duration between the oldest record received by the topic and the time it arrives in Glue.
- String endingOffsets()
  The end point when a batch query ends.
- boolean equals(Object obj)
- boolean equalsBySdkFields(Object obj)
- <T> Optional<T> getValueForField(String fieldName, Class<T> clazz)
- int hashCode()
- Boolean includeHeaders()
  Whether to include the Kafka headers.
- Long maxOffsetsPerTrigger()
  The rate limit on the maximum number of offsets that are processed per trigger interval.
- Integer minPartitions()
  The desired minimum number of partitions to read from Kafka.
- Integer numRetries()
  The number of times to retry before failing to fetch Kafka offsets.
- Long pollTimeoutMs()
  The timeout in milliseconds to poll data from Kafka in Spark job executors.
- Long retryIntervalMs()
  The time in milliseconds to wait before retrying to fetch Kafka offsets.
- List<SdkField<?>> sdkFields()
- String securityProtocol()
  The protocol used to communicate with brokers.
- static Class<? extends KafkaStreamingSourceOptions.Builder> serializableBuilderClass()
- String startingOffsets()
  The starting position in the Kafka topic to read data from.
- Instant startingTimestamp()
  The timestamp of the record in the Kafka topic to start reading data from.
- String subscribePattern()
  A Java regex string that identifies the topic list to subscribe to.
- KafkaStreamingSourceOptions.Builder toBuilder()
- String topicName()
  The topic name as specified in Apache Kafka.
- String toString()
  Returns a string representation of this object.
-
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface software.amazon.awssdk.utils.builder.ToCopyableBuilder
copy
Method Detail
-
bootstrapServers
public final String bootstrapServers()
A list of bootstrap server URLs, for example, as b-1.vpc-test-2.o4q88o.c6.kafka.us-east-1.amazonaws.com:9094. This option must be specified in the API call or defined in the table metadata in the Data Catalog.
- Returns:
- A list of bootstrap server URLs, for example, as b-1.vpc-test-2.o4q88o.c6.kafka.us-east-1.amazonaws.com:9094. This option must be specified in the API call or defined in the table metadata in the Data Catalog.
-
securityProtocol
public final String securityProtocol()
The protocol used to communicate with brokers. The possible values are "SSL" or "PLAINTEXT".
- Returns:
- The protocol used to communicate with brokers. The possible values are "SSL" or "PLAINTEXT".
-
connectionName
public final String connectionName()
The name of the connection.
- Returns:
- The name of the connection.
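As a hedged sketch, the three connection-level fields above are typically set together; the broker URL reuses this page's example, and the connection name is a hypothetical Data Catalog entry:

    KafkaStreamingSourceOptions conn = KafkaStreamingSourceOptions.builder()
            .bootstrapServers("b-1.vpc-test-2.o4q88o.c6.kafka.us-east-1.amazonaws.com:9094")
            .securityProtocol("SSL")                 // or "PLAINTEXT"
            .connectionName("my-kafka-connection")   // hypothetical connection name
            .build();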
-
topicName
public final String topicName()
The topic name as specified in Apache Kafka. You must specify at least one of "topicName", "assign", or "subscribePattern".
- Returns:
- The topic name as specified in Apache Kafka. You must specify at least one of "topicName", "assign", or "subscribePattern".
-
assign
public final String assign()
The specific TopicPartitions to consume. You must specify at least one of "topicName", "assign", or "subscribePattern".
- Returns:
- The specific TopicPartitions to consume. You must specify at least one of "topicName", "assign", or "subscribePattern".
-
subscribePattern
public final String subscribePattern()
A Java regex string that identifies the topic list to subscribe to. You must specify at least one of "topicName", "assign", or "subscribePattern".
- Returns:
- A Java regex string that identifies the topic list to subscribe to. You must specify at least one of "topicName", "assign", or "subscribePattern".
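The three fields "topicName", "assign", and "subscribePattern" are alternative ways to select a source, and at least one must be set. A sketch of each; the topic names and partitions are illustrative, and the JSON shape for "assign" follows the Spark Kafka source convention, which is an assumption not stated on this page:

    // Select a single named topic.
    KafkaStreamingSourceOptions byName = KafkaStreamingSourceOptions.builder()
            .topicName("orders")
            .build();

    // Select explicit TopicPartitions as a JSON string (assumed Spark-style shape).
    KafkaStreamingSourceOptions byAssign = KafkaStreamingSourceOptions.builder()
            .assign("{\"orders\":[0,1,2]}")
            .build();

    // Select every topic whose name matches a Java regex.
    KafkaStreamingSourceOptions byPattern = KafkaStreamingSourceOptions.builder()
            .subscribePattern("orders-.*")
            .build();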
-
classification
public final String classification()
An optional classification.
- Returns:
- An optional classification.
-
delimiter
public final String delimiter()
Specifies the delimiter character.
- Returns:
- Specifies the delimiter character.
-
startingOffsets
public final String startingOffsets()
The starting position in the Kafka topic to read data from. The possible values are "earliest" or "latest". The default value is "latest".
- Returns:
- The starting position in the Kafka topic to read data from. The possible values are "earliest" or "latest". The default value is "latest".
-
endingOffsets
public final String endingOffsets()
The end point when a batch query ends. Possible values are either "latest" or a JSON string that specifies an ending offset for each TopicPartition.
- Returns:
- The end point when a batch query ends. Possible values are either "latest" or a JSON string that specifies an ending offset for each TopicPartition.
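Together, startingOffsets and endingOffsets can pin both ends of a bounded read. In the sketch below, the per-TopicPartition JSON shape follows the Spark Kafka source convention (an assumption, as are the offset numbers):

    KafkaStreamingSourceOptions bounded = KafkaStreamingSourceOptions.builder()
            .topicName("orders")
            .startingOffsets("earliest")
            // Assumed Spark-style shape: topic -> {partition -> ending offset}.
            .endingOffsets("{\"orders\":{\"0\":50000,\"1\":50000}}")
            .build();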
-
pollTimeoutMs
public final Long pollTimeoutMs()
The timeout in milliseconds to poll data from Kafka in Spark job executors. The default value is 512.
- Returns:
- The timeout in milliseconds to poll data from Kafka in Spark job executors. The default value is 512.
-
numRetries
public final Integer numRetries()
The number of times to retry before failing to fetch Kafka offsets. The default value is 3.
- Returns:
- The number of times to retry before failing to fetch Kafka offsets. The default value is 3.
-
retryIntervalMs
public final Long retryIntervalMs()
The time in milliseconds to wait before retrying to fetch Kafka offsets. The default value is 10.
- Returns:
- The time in milliseconds to wait before retrying to fetch Kafka offsets. The default value is 10.
-
maxOffsetsPerTrigger
public final Long maxOffsetsPerTrigger()
The rate limit on the maximum number of offsets that are processed per trigger interval. The specified total number of offsets is proportionally split across topicPartitions of different volumes. The default value is null, which means that the consumer reads all offsets until the known latest offset.
- Returns:
- The rate limit on the maximum number of offsets that are processed per trigger interval. The specified total number of offsets is proportionally split across topicPartitions of different volumes. The default value is null, which means that the consumer reads all offsets until the known latest offset.
-
minPartitions
public final Integer minPartitions()
The desired minimum number of partitions to read from Kafka. The default value is null, which means that the number of Spark partitions is equal to the number of Kafka partitions.
- Returns:
- The desired minimum number of partitions to read from Kafka. The default value is null, which means that the number of Spark partitions is equal to the number of Kafka partitions.
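A sketch that overrides the documented defaults of the five tuning fields above; the values are arbitrary examples, not recommendations:

    KafkaStreamingSourceOptions tuned = KafkaStreamingSourceOptions.builder()
            .topicName("orders")
            .pollTimeoutMs(1000L)           // default is 512
            .numRetries(5)                  // default is 3
            .retryIntervalMs(100L)          // default is 10
            .maxOffsetsPerTrigger(100000L)  // default is null (read up to the latest offset)
            .minPartitions(64)              // default is null (one Spark partition per Kafka partition)
            .build();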
-
includeHeaders
public final Boolean includeHeaders()
Whether to include the Kafka headers. When the option is set to "true", the data output will contain an additional column named "glue_streaming_kafka_headers" with type Array[Struct(key: String, value: String)]. The default value is "false". This option is available in Glue version 3.0 or later only.
- Returns:
- Whether to include the Kafka headers. When the option is set to "true", the data output will contain an additional column named "glue_streaming_kafka_headers" with type Array[Struct(key: String, value: String)]. The default value is "false". This option is available in Glue version 3.0 or later only.
-
addRecordTimestamp
public final String addRecordTimestamp()
When this option is set to 'true', the data output will contain an additional column named "__src_timestamp" that indicates the time when the corresponding record was received by the topic. The default value is 'false'. This option is supported in Glue version 4.0 or later.
- Returns:
- When this option is set to 'true', the data output will contain an additional column named "__src_timestamp" that indicates the time when the corresponding record was received by the topic. The default value is 'false'. This option is supported in Glue version 4.0 or later.
-
emitConsumerLagMetrics
public final String emitConsumerLagMetrics()
When this option is set to 'true', for each batch it emits to CloudWatch the metrics for the duration between the oldest record received by the topic and the time it arrives in Glue. The metric's name is "glue.driver.streaming.maxConsumerLagInMs". The default value is 'false'. This option is supported in Glue version 4.0 or later.
- Returns:
- When this option is set to 'true', for each batch it emits to CloudWatch the metrics for the duration between the oldest record received by the topic and the time it arrives in Glue. The metric's name is "glue.driver.streaming.maxConsumerLagInMs". The default value is 'false'. This option is supported in Glue version 4.0 or later.
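Note the mixed types among the three output flags above: includeHeaders is a Boolean, while addRecordTimestamp and emitConsumerLagMetrics are String-valued. A sketch enabling all three:

    KafkaStreamingSourceOptions instrumented = KafkaStreamingSourceOptions.builder()
            .topicName("orders")
            .includeHeaders(true)            // Boolean; adds "glue_streaming_kafka_headers" (Glue 3.0+)
            .addRecordTimestamp("true")      // String flag; adds "__src_timestamp" (Glue 4.0+)
            .emitConsumerLagMetrics("true")  // String flag; emits the consumer-lag metric (Glue 4.0+)
            .build();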
-
startingTimestamp
public final Instant startingTimestamp()
The timestamp of the record in the Kafka topic to start reading data from. The possible values are a timestamp string in UTC format of the pattern yyyy-mm-ddTHH:MM:SSZ, where Z represents a UTC timezone offset with a +/- (for example, "2023-04-04T08:00:00+08:00"). Only one of StartingTimestamp or StartingOffsets must be set.
- Returns:
- The timestamp of the record in the Kafka topic to start reading data from. The possible values are a timestamp string in UTC format of the pattern yyyy-mm-ddTHH:MM:SSZ, where Z represents a UTC timezone offset with a +/- (for example, "2023-04-04T08:00:00+08:00"). Only one of StartingTimestamp or StartingOffsets must be set.
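Unlike startingOffsets, startingTimestamp takes a java.time.Instant rather than a String. A sketch; remember that the two are mutually exclusive:

    import java.time.Instant;

    KafkaStreamingSourceOptions fromTimestamp = KafkaStreamingSourceOptions.builder()
            .topicName("orders")
            .startingTimestamp(Instant.parse("2023-04-04T00:00:00Z"))  // do not also set startingOffsets
            .build();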
-
toBuilder
public KafkaStreamingSourceOptions.Builder toBuilder()
- Specified by:
- toBuilder in interface ToCopyableBuilder<KafkaStreamingSourceOptions.Builder,KafkaStreamingSourceOptions>
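Because the class is immutable, modifications go through a builder copy. A sketch, reusing the options instance from the earlier construction example:

    // Copy every field, then override just the starting position.
    KafkaStreamingSourceOptions rewound = options.toBuilder()
            .startingOffsets("earliest")
            .build();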
-
builder
public static KafkaStreamingSourceOptions.Builder builder()
-
serializableBuilderClass
public static Class<? extends KafkaStreamingSourceOptions.Builder> serializableBuilderClass()
-
equalsBySdkFields
public final boolean equalsBySdkFields(Object obj)
- Specified by:
- equalsBySdkFields in interface SdkPojo
-
toString
public final String toString()
Returns a string representation of this object. This is useful for testing and debugging. Sensitive data will be redacted from this string using a placeholder value.