Package

org.apache.spark.examples.sql

streaming

Permalink

package streaming

Visibility
  1. Public
  2. All

Type Members

  1. final class JavaStructuredKafkaWordCount extends AnyRef

    Permalink
  2. final class JavaStructuredNetworkWordCount extends AnyRef

    Permalink
  3. final class JavaStructuredNetworkWordCountWindowed extends AnyRef

    Permalink

Value Members

  1. object StructuredKafkaWordCount

    Permalink

    Consumes messages from one or more topics in Kafka and does wordcount.

    Consumes messages from one or more topics in Kafka and does wordcount. Usage: StructuredKafkaWordCount <bootstrap-servers> <subscribe-type> <topics> <bootstrap-servers> The Kafka "bootstrap.servers" configuration. A comma-separated list of host:port. <subscribe-type> There are three kinds of type, i.e. 'assign', 'subscribe', 'subscribePattern'. |- <assign> Specific TopicPartitions to consume. Json string | {"topicA":[0,1],"topicB":[2,4]}. |- <subscribe> The topic list to subscribe. A comma-separated list of | topics. |- <subscribePattern> The pattern used to subscribe to topic(s). | Java regex string. |- Only one of "assign, "subscribe" or "subscribePattern" options can be | specified for Kafka source. <topics> Different value format depends on the value of 'subscribe-type'.

    Example: $ bin/run-example \ sql.streaming.StructuredKafkaWordCount host1:port1,host2:port2 \ subscribe topic1,topic2

  2. object StructuredNetworkWordCount

    Permalink

    Counts words in UTF8 encoded, '\n' delimited text received from the network.

    Counts words in UTF8 encoded, '\n' delimited text received from the network.

    Usage: StructuredNetworkWordCount <hostname> <port> <hostname> and <port> describe the TCP server that Structured Streaming would connect to receive data.

    To run this on your local machine, you need to first run a Netcat server $ nc -lk 9999 and then run the example $ bin/run-example sql.streaming.StructuredNetworkWordCount localhost 9999

  3. object StructuredNetworkWordCountWindowed

    Permalink

    Counts words in UTF8 encoded, '\n' delimited text received from the network over a sliding window of configurable duration.

    Counts words in UTF8 encoded, '\n' delimited text received from the network over a sliding window of configurable duration. Each line from the network is tagged with a timestamp that is used to determine the windows into which it falls.

    Usage: StructuredNetworkWordCountWindowed <hostname> <port> <window duration> [<slide duration>] <hostname> and <port> describe the TCP server that Structured Streaming would connect to receive data. <window duration> gives the size of window, specified as integer number of seconds <slide duration> gives the amount of time successive windows are offset from one another, given in the same units as above. <slide duration> should be less than or equal to <window duration>. If the two are equal, successive windows have no overlap. If <slide duration> is not provided, it defaults to <window duration>.

    To run this on your local machine, you need to first run a Netcat server $ nc -lk 9999 and then run the example $ bin/run-example sql.streaming.StructuredNetworkWordCountWindowed localhost 9999 <window duration in seconds> [<slide duration in seconds>]

    One recommended <window duration>, <slide duration> pair is 10, 5

Ungrouped