Package org.apache.spark.streaming.flume.sink


package sink


Type Members

  1. class EventBatch extends SpecificRecordBase with SpecificRecord

  2. trait SparkFlumeProtocol extends AnyRef

  3. class SparkSink extends AbstractSink with Logging with Configurable


    A sink that uses Avro RPC to run a server that can be polled by Spark's FlumePollingInputDStream. This sink has the following configuration parameters:

    hostname - The hostname to bind to. Default: 0.0.0.0
    port - The port to bind to. (No default - mandatory)
    timeout - Time in seconds after which a transaction is rolled back, if an ACK is not received from Spark within that time
    threads - Number of threads to use to receive requests from Spark (Default: 10)

    This sink is unlike other Flume sinks in that it does not push data; instead, the process method in this sink simply blocks the SinkRunner the first time it is called. This sink starts up an Avro IPC server that uses the SparkFlumeProtocol.

    Each time a getEventBatch call comes in, the sink creates a transaction and reads events from the channel. When enough events have been read, the events are sent to the Spark receiver, the calling thread is blocked, and a reference to it is saved off.

    When the ack for that batch is received, the thread which created the transaction is retrieved, and it commits the transaction with the channel from the same thread it was originally created in (since Flume transactions are thread-local). If a nack is received instead, the sink rolls back the transaction. If no ack is received within the specified timeout, the transaction is also rolled back. If an ack arrives after that, it is simply ignored and the events are re-sent. (Sketches of a Flume agent configuration for this sink and of polling it from a Spark application follow the type member list below.)

  4. class SparkSinkEvent extends SpecificRecordBase with SpecificRecord

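As a reference point for the configuration parameters listed above, here is a minimal sketch of how a SparkSink might be declared in a Flume agent's properties file. The agent name (agent), sink name (spark), channel name (memoryChannel), and port value are placeholders, and the timeout/threads lines simply restate the defaults; this is an illustrative sketch, not a complete agent configuration.

    # Declare the pull-based sink that Spark will poll (placeholder names).
    agent.sinks = spark
    agent.sinks.spark.type = org.apache.spark.streaming.flume.sink.SparkSink
    agent.sinks.spark.hostname = 0.0.0.0
    agent.sinks.spark.port = 9999
    # Optional: seconds to wait for an ACK before rolling back a transaction.
    agent.sinks.spark.timeout = 60
    # Optional: number of threads handling requests from Spark (default 10).
    agent.sinks.spark.threads = 10
    # The sink drains events from this channel inside a transaction.
    agent.sinks.spark.channel = memoryChannel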

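On the Spark side, the getEventBatch/ack/nack exchange described above is driven by the polling receiver created via FlumeUtils.createPollingStream in the companion org.apache.spark.streaming.flume package. The following is a minimal Scala sketch, assuming the sink above is reachable at sink-host:9999 and that events carry UTF-8 text bodies; host, port, and application name are placeholders.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.flume.FlumeUtils

    object SparkSinkPollingExample {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("SparkSinkPollingExample")
        val ssc = new StreamingContext(conf, Seconds(10))

        // Poll the SparkSink running inside the Flume agent at sink-host:9999.
        // Each poll issues a getEventBatch call against the sink's Avro IPC
        // server and acks (or nacks) the batch, which commits or rolls back
        // the channel transaction as described above.
        val events = FlumeUtils.createPollingStream(ssc, "sink-host", 9999)

        // Decode the Avro event bodies as UTF-8 strings and print a sample.
        events.map(e => new String(e.event.getBody.array(), "UTF-8")).print()

        ssc.start()
        ssc.awaitTermination()
      }
    }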