Package org.apache.spark.examples.streaming

package streaming

Type Members

  1. class CustomReceiver extends Receiver[String] with Logging
  2. class JavaCustomReceiver extends Receiver[String]
  3. final class JavaDirectKafkaWordCount extends AnyRef
  4. final class JavaFlumeEventCount extends AnyRef
  5. final class JavaNetworkWordCount extends AnyRef
  6. final class JavaQueueStream extends AnyRef
  7. class JavaRecord extends Serializable
  8. final class JavaRecoverableNetworkWordCount extends AnyRef
  9. final class JavaSqlNetworkWordCount extends AnyRef
  10. class JavaStatefulNetworkWordCount extends AnyRef
  11. case class Record(word: String) extends Product with Serializable

    Case class for converting an RDD to a DataFrame.

Value Members

  1. object CustomReceiver extends Serializable

    Custom Receiver that receives data over a socket. Received bytes are interpreted as text and '\n' delimited lines are considered as records. They are then counted and printed.

    To run this on your local machine, you first need to run a Netcat server:
      $ nc -lk 9999
    and then run the example:
      $ bin/run-example org.apache.spark.examples.streaming.CustomReceiver localhost 9999
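
    A minimal sketch of what such a receiver looks like, assuming the standard Receiver contract (class and thread names here are illustrative, not the example's actual source):

      import java.io.{BufferedReader, InputStreamReader}
      import java.net.Socket
      import java.nio.charset.StandardCharsets

      import org.apache.spark.storage.StorageLevel
      import org.apache.spark.streaming.receiver.Receiver

      class SocketLineReceiver(host: String, port: Int)
        extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

        // onStart must not block, so read from the socket on a background thread.
        def onStart(): Unit = {
          new Thread("Socket Receiver") {
            override def run(): Unit = receive()
          }.start()
        }

        // Nothing to clean up: the reading thread checks isStopped and exits on its own.
        def onStop(): Unit = {}

        private def receive(): Unit = {
          try {
            val socket = new Socket(host, port)
            val reader = new BufferedReader(
              new InputStreamReader(socket.getInputStream, StandardCharsets.UTF_8))
            var line = reader.readLine()
            while (!isStopped && line != null) {
              store(line)                 // hand each '\n' delimited record to Spark
              line = reader.readLine()
            }
            reader.close()
            socket.close()
            restart("Trying to connect again")
          } catch {
            case e: java.io.IOException => restart("Error receiving data", e)
          }
        }
      }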

  2. object DirectKafkaWordCount

    Consumes messages from one or more topics in Kafka and does word count.

    Usage: DirectKafkaWordCount <brokers> <topics>
      <brokers> is a list of one or more Kafka brokers
      <topics> is a list of one or more Kafka topics to consume from

    Example:
      $ bin/run-example streaming.DirectKafkaWordCount broker1-host:port,broker2-host:port \
          topic1,topic2
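
    A sketch of the direct-stream pattern this example demonstrates, assuming the spark-streaming-kafka-0-8 API (argument handling and the 2-second batch interval are assumptions):

      import kafka.serializer.StringDecoder

      import org.apache.spark.SparkConf
      import org.apache.spark.streaming.{Seconds, StreamingContext}
      import org.apache.spark.streaming.kafka.KafkaUtils

      object DirectKafkaWordCountSketch {
        def main(args: Array[String]): Unit = {
          val Array(brokers, topics) = args

          val sparkConf = new SparkConf().setAppName("DirectKafkaWordCount")
          val ssc = new StreamingContext(sparkConf, Seconds(2))

          // Direct stream: no receiver; each batch reads its Kafka offset range directly.
          val topicsSet = topics.split(",").toSet
          val kafkaParams = Map[String, String]("metadata.broker.list" -> brokers)
          val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
            ssc, kafkaParams, topicsSet)

          // Word count over the message values.
          messages.map(_._2)
            .flatMap(_.split(" "))
            .map(word => (word, 1L))
            .reduceByKey(_ + _)
            .print()

          ssc.start()
          ssc.awaitTermination()
        }
      }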

  3. object DroppedWordsCounter

    Use this singleton to get or register an Accumulator.
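
    The get-or-register pattern behind such a singleton looks roughly like this (a sketch; the accumulator name is an assumption):

      import org.apache.spark.SparkContext
      import org.apache.spark.util.LongAccumulator

      object DroppedWordsCounterSketch {
        @volatile private var instance: LongAccumulator = null

        // Double-checked locking: register the accumulator once per driver,
        // including after a driver restart from checkpoint.
        def getInstance(sc: SparkContext): LongAccumulator = {
          if (instance == null) {
            synchronized {
              if (instance == null) {
                instance = sc.longAccumulator("DroppedWordsCounter")
              }
            }
          }
          instance
        }
      }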

  4. object FlumeEventCount

    Produces a count of events received from Flume.

    This should be used in conjunction with an AvroSink in Flume. It will start an Avro server at the requested host:port address and listen for requests. Your Flume AvroSink should be pointed to this address.

    Usage: FlumeEventCount <host> <port>
      <host> is the host on which the Flume receiver will be started - a receiver creates a server and listens for Flume events.
      <port> is the port the Flume receiver will listen on.

    To run this example:
      $ bin/run-example org.apache.spark.examples.streaming.FlumeEventCount <host> <port>
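
    A sketch of the push-based Flume pattern, assuming the spark-streaming-flume module (batch interval and app name are assumptions):

      import org.apache.spark.SparkConf
      import org.apache.spark.streaming.{Milliseconds, StreamingContext}
      import org.apache.spark.streaming.flume.FlumeUtils

      object FlumeEventCountSketch {
        def main(args: Array[String]): Unit = {
          val Array(host, port) = args

          val sparkConf = new SparkConf().setAppName("FlumeEventCount")
          val ssc = new StreamingContext(sparkConf, Milliseconds(2000))

          // Starts an Avro server at host:port; point the Flume AvroSink here.
          val stream = FlumeUtils.createStream(ssc, host, port.toInt)

          // Print a per-batch event count.
          stream.count().map(cnt => "Received " + cnt + " flume events.").print()

          ssc.start()
          ssc.awaitTermination()
        }
      }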

  5. object FlumePollingEventCount

    Produces a count of events received from Flume.

    This should be used in conjunction with the Spark Sink running in a Flume agent. See the Spark Streaming programming guide for more details.

    Usage: FlumePollingEventCount <host> <port>
      <host> is the host on which the Spark Sink is running.
      <port> is the port at which the Spark Sink is listening.

    To run this example:
      $ bin/run-example org.apache.spark.examples.streaming.FlumePollingEventCount [host] [port]
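
    The only structural difference from the push-based sketch above is the stream constructor: here Spark polls the Spark Sink rather than hosting an Avro server. A fragment, assuming the same ssc, host, and port as the previous sketch:

      import org.apache.spark.streaming.flume.FlumeUtils

      // Pull-based: the receiver polls the Spark Sink running inside the Flume agent.
      val stream = FlumeUtils.createPollingStream(ssc, host, port.toInt)
      stream.count().map(cnt => "Received " + cnt + " flume events.").print()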

  6. object HdfsWordCount

    Counts words in new text files created in the given directory.

    Usage: HdfsWordCount <directory>
      <directory> is the directory that Spark Streaming will use to find and read new text files.

    To run this on your local machine on directory localdir, run this example:
      $ bin/run-example \
          org.apache.spark.examples.streaming.HdfsWordCount localdir

    Then create a text file in localdir and the words in the file will get counted.
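
    A sketch of the file-stream pattern, assuming a 2-second batch interval (the interval and names are assumptions):

      import org.apache.spark.SparkConf
      import org.apache.spark.streaming.{Seconds, StreamingContext}

      object HdfsWordCountSketch {
        def main(args: Array[String]): Unit = {
          val sparkConf = new SparkConf().setAppName("HdfsWordCount")
          val ssc = new StreamingContext(sparkConf, Seconds(2))

          // Only files created in the directory after the stream starts are read.
          val lines = ssc.textFileStream(args(0))
          lines.flatMap(_.split(" ")).map(w => (w, 1)).reduceByKey(_ + _).print()

          ssc.start()
          ssc.awaitTermination()
        }
      }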

  7. object NetworkWordCount

    Counts words in UTF8 encoded, '\n' delimited text received from the network every second.

    Usage: NetworkWordCount <hostname> <port>
      <hostname> and <port> describe the TCP server that Spark Streaming would connect to receive data.

    To run this on your local machine, you first need to run a Netcat server:
      $ nc -lk 9999
    and then run the example:
      $ bin/run-example org.apache.spark.examples.streaming.NetworkWordCount localhost 9999
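
    A sketch of the classic socket word count; the 1-second batch interval follows the description above, everything else is an assumption:

      import org.apache.spark.SparkConf
      import org.apache.spark.storage.StorageLevel
      import org.apache.spark.streaming.{Seconds, StreamingContext}

      object NetworkWordCountSketch {
        def main(args: Array[String]): Unit = {
          val sparkConf = new SparkConf().setAppName("NetworkWordCount")
          val ssc = new StreamingContext(sparkConf, Seconds(1))

          // Each '\n' delimited line received on the socket becomes one record.
          val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_AND_DISK_SER)
          lines.flatMap(_.split(" ")).map(w => (w, 1)).reduceByKey(_ + _).print()

          ssc.start()
          ssc.awaitTermination()
        }
      }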

  8. object QueueStream

  9. object RawNetworkGrep

    Receives text from multiple rawNetworkStreams and counts how many '\n' delimited lines have the word 'the' in them. This is useful for benchmarking purposes. This will only work with spark.streaming.util.RawTextSender running on all worker nodes and with Spark using Kryo serialization (set Java property "spark.serializer" to "org.apache.spark.serializer.KryoSerializer").

    Usage: RawNetworkGrep <numStreams> <host> <port> <batchMillis>
      <numStreams> is the number of rawNetworkStreams, which should be the same as the number of worker nodes in the cluster
      <host> is "localhost"
      <port> is the port on which RawTextSender is running on the worker nodes
      <batchMillis> is the Spark Streaming batch duration in milliseconds
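
    A sketch of the union-of-raw-streams pattern; the Kryo setting mirrors the description, the rest is assumed:

      import org.apache.spark.SparkConf
      import org.apache.spark.storage.StorageLevel
      import org.apache.spark.streaming.{Milliseconds, StreamingContext}

      object RawNetworkGrepSketch {
        def main(args: Array[String]): Unit = {
          val Array(numStreams, host, port, batchMillis) = args

          val sparkConf = new SparkConf()
            .setAppName("RawNetworkGrep")
            .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
          val ssc = new StreamingContext(sparkConf, Milliseconds(batchMillis.toLong))

          // One raw socket stream per worker node, merged into a single DStream.
          val streams = (1 to numStreams.toInt).map { _ =>
            ssc.rawSocketStream[String](host, port.toInt, StorageLevel.MEMORY_ONLY_SER)
          }
          ssc.union(streams)
            .filter(_.contains("the"))
            .count()
            .foreachRDD(rdd => println("Grep count: " + rdd.collect().mkString))

          ssc.start()
          ssc.awaitTermination()
        }
      }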

  10. object RecoverableNetworkWordCount

    Counts words in text encoded with UTF8 received from the network every second. This example also shows how to use lazily instantiated singleton instances for Accumulator and Broadcast so that they can be registered on driver failures.

    Usage: RecoverableNetworkWordCount <hostname> <port> <checkpoint-directory> <output-file>
      <hostname> and <port> describe the TCP server that Spark Streaming would connect to receive data.
      <checkpoint-directory> is a directory in an HDFS-compatible file system to which checkpoint data will be written.
      <output-file> is the file to which the word counts will be appended.

    <checkpoint-directory> and <output-file> must be absolute paths.

    To run this on your local machine, you first need to run a Netcat server:

      $ nc -lk 9999

    and run the example as:

      $ ./bin/run-example org.apache.spark.examples.streaming.RecoverableNetworkWordCount \
          localhost 9999 ~/checkpoint/ ~/out

    If the directory ~/checkpoint/ does not exist (e.g. when running for the first time), it will create a new StreamingContext (and will print "Creating new context" to the console). Otherwise, if checkpoint data exists in ~/checkpoint/, it will create the StreamingContext from the checkpoint data.

    Refer to the online documentation for more details.
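
    A sketch of the checkpoint-recovery pattern this example is built around: the creating function runs only when no checkpoint exists, and on restart the whole DStream graph is rebuilt from the checkpoint directory instead (the details here are assumptions):

      import org.apache.spark.SparkConf
      import org.apache.spark.streaming.{Seconds, StreamingContext}

      object RecoverableSketch {
        def createContext(host: String, port: Int, checkpointDir: String): StreamingContext = {
          println("Creating new context")
          val sparkConf = new SparkConf().setAppName("RecoverableNetworkWordCount")
          val ssc = new StreamingContext(sparkConf, Seconds(1))
          ssc.checkpoint(checkpointDir)

          val lines = ssc.socketTextStream(host, port)
          lines.flatMap(_.split(" ")).map(w => (w, 1)).reduceByKey(_ + _).print()
          ssc
        }

        def main(args: Array[String]): Unit = {
          // <output-file> handling is omitted in this sketch.
          val Array(host, port, checkpointDir, _) = args

          // Recover from checkpoint if present, otherwise build a fresh context.
          val ssc = StreamingContext.getOrCreate(
            checkpointDir, () => createContext(host, port.toInt, checkpointDir))
          ssc.start()
          ssc.awaitTermination()
        }
      }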

  11. object SparkSessionSingleton

    Lazily instantiated singleton instance of SparkSession.
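
    A sketch of the lazy-singleton pattern, close to what such an object typically contains (the config handling is an assumption):

      import org.apache.spark.SparkConf
      import org.apache.spark.sql.SparkSession

      object SparkSessionSingletonSketch {
        @transient private var instance: SparkSession = _

        // Create the SparkSession on first use (e.g. inside foreachRDD) and reuse it.
        def getInstance(sparkConf: SparkConf): SparkSession = {
          if (instance == null) {
            instance = SparkSession.builder.config(sparkConf).getOrCreate()
          }
          instance
        }
      }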

  12. object SqlNetworkWordCount

    Use DataFrames and SQL to count words in UTF8 encoded, '\n' delimited text received from the network every second.

    Usage: SqlNetworkWordCount <hostname> <port>
      <hostname> and <port> describe the TCP server that Spark Streaming would connect to receive data.

    To run this on your local machine, you first need to run a Netcat server:
      $ nc -lk 9999
    and then run the example:
      $ bin/run-example org.apache.spark.examples.streaming.SqlNetworkWordCount localhost 9999
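
    A sketch of the per-batch DataFrame pattern, tying together the Record case class and SparkSessionSingleton listed on this page; words is assumed to be a DStream[String] built as in NetworkWordCount:

      import org.apache.spark.rdd.RDD
      import org.apache.spark.streaming.Time

      words.foreachRDD { (rdd: RDD[String], time: Time) =>
        val spark = SparkSessionSingleton.getInstance(rdd.sparkContext.getConf)
        import spark.implicits._

        // Convert RDD[String] to a DataFrame via the Record case class.
        val wordsDataFrame = rdd.map(w => Record(w)).toDF()
        wordsDataFrame.createOrReplaceTempView("words")

        // Count words with SQL over the per-batch view.
        val wordCountsDataFrame =
          spark.sql("select word, count(*) as total from words group by word")
        println(s"========= $time =========")
        wordCountsDataFrame.show()
      }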

  13. object StatefulNetworkWordCount

    Counts words cumulatively in UTF8 encoded, '\n' delimited text received from the network every second, starting with an initial value of word count.

    Usage: StatefulNetworkWordCount <hostname> <port>
      <hostname> and <port> describe the TCP server that Spark Streaming would connect to receive data.

    To run this on your local machine, you first need to run a Netcat server:
      $ nc -lk 9999
    and then run the example:
      $ bin/run-example org.apache.spark.examples.streaming.StatefulNetworkWordCount localhost 9999
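
    A sketch of cumulative counting with mapWithState; checkpointing must be enabled for state, ssc is a StreamingContext, and wordDstream is assumed to be a DStream[(String, Int)] of (word, 1) pairs:

      import org.apache.spark.streaming.{State, StateSpec}

      // Seed the state so counting starts from an initial value of word count.
      val initialRDD = ssc.sparkContext.parallelize(List(("hello", 1), ("world", 1)))

      val mappingFunc = (word: String, one: Option[Int], state: State[Int]) => {
        val sum = one.getOrElse(0) + state.getOption.getOrElse(0)
        state.update(sum)                 // carry the running total across batches
        (word, sum)
      }

      val stateDstream = wordDstream.mapWithState(
        StateSpec.function(mappingFunc).initialState(initialRDD))
      stateDstream.print()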

  14. object StreamingExamples extends Logging

    Utility functions for Spark Streaming examples.

  15. object WordBlacklist

    Use this singleton to get or register a Broadcast variable.
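
    The same get-or-register pattern as DroppedWordsCounter, applied to a Broadcast variable (the blacklisted words here are placeholders):

      import org.apache.spark.SparkContext
      import org.apache.spark.broadcast.Broadcast

      object WordBlacklistSketch {
        @volatile private var instance: Broadcast[Seq[String]] = null

        def getInstance(sc: SparkContext): Broadcast[Seq[String]] = {
          if (instance == null) {
            synchronized {
              if (instance == null) {
                instance = sc.broadcast(Seq("a", "b", "c"))
              }
            }
          }
          instance
        }
      }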

  16. package clickstream

