datastream

Type Members

trait Aggregation extends AnyRef
trait DataStream extends Logging

A DataStream is analogous to a table of data. It has fields (like columns) and rows of data. Each row has an entry for each field (an entry may be null, depending on the field definition).
It is a lazily evaluated data structure. Each operation on a stream creates a new derived stream, but those operations are only executed when a final action is performed.
You can create a DataStream from an IO source, such as a Parquet file or a Hive table, or you can create a fully evaluated one from an in-memory structure. In the former case, the data is only loaded on demand, when an action is performed.
A DataStream is split into one or more flows. Each flow can operate independently of the others. For example, if you filter a stream, each flow is filtered separately, which allows the work to be parallelized. If you write out a stream, each flow can be written to its own file, again allowing parallelization.
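The lazy, flow-based behaviour described above can be illustrated with a small self-contained sketch. This is not the library's DataStream API: `Row`, `SketchStream`, `LazySketch`, and the `loads` counter are hypothetical names introduced only for illustration, and the real implementation may differ in every detail.

```scala
// Hypothetical, simplified model of a lazily evaluated stream split into flows.
// None of these names come from the library itself.
final case class Row(values: Seq[Any])

// Each flow is a deferred computation that yields rows only when invoked.
final class SketchStream(val fields: Seq[String], flows: Seq[() => Iterator[Row]]) {

  // Transformation: returns a new derived stream and reads no data.
  // The predicate is attached to every flow separately.
  def filter(p: Row => Boolean): SketchStream =
    new SketchStream(fields, flows.map(f => () => f().filter(p)))

  // Action: only here are the flows opened and their rows read.
  def collect: Vector[Row] = flows.flatMap(f => f().toVector).toVector
}

object LazySketch {
  // Counts how many times a flow has actually been opened.
  var loads = 0

  // Builds a stream with one flow per supplied group of rows.
  def source(rows: Seq[Row]*): SketchStream =
    new SketchStream(
      Seq("name", "age"),
      rows.map(rs => () => { loads += 1; rs.iterator })
    )
}
```

In this sketch, calling `filter` leaves `LazySketch.loads` unchanged; the counter only increases once `collect` is called, which mirrors the "operations only occur when a final action is performed" rule. Because each flow carries its own deferred iterator, the flows could be handed to separate threads without coordinating with each other.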
class DataStreamPublisher extends DataStream

An implementation of DataStream for which items are emitted by calling publish. When no more items are to be published, call close() so that downstream subscribers can complete.
Subscribers to this publisher block as usual while waiting for items, so they should normally be run on a separate thread.
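The publish, close, and separate-thread pattern described above can be sketched as follows. This is a stand-in built on `java.util.concurrent.LinkedBlockingQueue`, not the library's DataStreamPublisher: `SketchPublisher`, `PublisherSketch`, and `demo` are hypothetical names, and the sketch assumes a single subscriber.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a publisher whose items are pushed by the caller.
// It supports exactly one subscriber.
final class SketchPublisher[T] {
  private val queue = new LinkedBlockingQueue[Option[T]]()

  // Emit one item to the subscriber.
  def publish(item: T): Unit = queue.put(Some(item))

  // Signal end of stream; None acts as the completion marker.
  def close(): Unit = queue.put(None)

  // Blocks the calling thread until the completion marker is received.
  def subscribe(onNext: T => Unit): Unit = {
    var next = queue.take()
    while (next.isDefined) {
      next.foreach(onNext)
      next = queue.take()
    }
  }
}

object PublisherSketch {
  def demo(): Seq[Int] = {
    val publisher = new SketchPublisher[Int]
    val received = ArrayBuffer.empty[Int]

    // The subscriber blocks, so it runs on its own thread.
    val consumer = new Thread(new Runnable {
      def run(): Unit = publisher.subscribe { i => received += i; () }
    })
    consumer.start()

    (1 to 3).foreach(i => publisher.publish(i))
    publisher.close()   // without this call the consumer thread would never finish
    consumer.join()     // wait for the subscriber to complete before reading results
    received.toList
  }
}
```

If `close()` were omitted, `consumer.join()` would block indefinitely, which is why the description asks callers to close the publisher once no more items will be published.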
class DataStreamSource extends DataStream with Using with Logging
abstract class DefaultAggregation extends Aggregation
class DelegateSubscriber[T] extends Subscriber[T]
class ExistsSubscriber extends Subscriber[Seq[Row]] with Logging
class FindSubscriber extends Subscriber[Seq[Row]] with Logging
trait GroupedDataStream extends AnyRef
case class IteratorAction(ds: DataStream) extends Product with Serializable
trait Publisher[T] extends AnyRef
case class SinkAction(ds: DataStream, sink: Sink, parallelism: Int) extends Logging with Product with Serializable
trait Subscriber[T] extends AnyRef
trait Subscription extends AnyRef

Value Members

object Aggregation
object DataStream
object GroupedDataStream
object Publisher extends Logging
object Subscription
