Class DataStream

org.apache.flink.streaming.api.scala

class DataStream[T] extends AnyRef

Annotations
@Public()
Linear Supertypes
AnyRef, Any

Instance Constructors

  1. new DataStream(stream: datastream.DataStream[T])


Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def addSink(fun: (T) ⇒ Unit): DataStreamSink[T]

    Adds the given sink to this DataStream. Only streams with sinks added will be executed once the StreamExecutionEnvironment.execute(...) method is called.

  5. def addSink(sinkFunction: SinkFunction[T]): DataStreamSink[T]

    Adds the given sink to this DataStream. Only streams with sinks added will be executed once the StreamExecutionEnvironment.execute(...) method is called.
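
    Example (a minimal sketch; the element values and the println sink are illustrative assumptions):

      import org.apache.flink.streaming.api.scala._

      val env = StreamExecutionEnvironment.getExecutionEnvironment
      val lines: DataStream[String] = env.fromElements("a", "b", "c")
      // Lambda variant of addSink: the function is invoked once per element.
      lines.addSink(line => println(line))
      env.execute("sink example")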

  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def assignAscendingTimestamps(extractor: (T) ⇒ Long): DataStream[T]

    Assigns timestamps to the elements in the data stream and periodically creates watermarks to signal event time progress.

    This method is a shortcut for data streams where the element timestamps are known to be monotonically ascending within each parallel stream. In that case, the system can generate watermarks automatically and perfectly by tracking the ascending timestamps.

    For cases where the timestamps are not monotonically increasing, use the more general methods assignTimestampsAndWatermarks(AssignerWithPeriodicWatermarks) and assignTimestampsAndWatermarks(AssignerWithPunctuatedWatermarks).

    Annotations
    @PublicEvolving()
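
    Example (a sketch; the SensorReading case class and the readings stream are assumptions made for illustration):

      case class SensorReading(id: String, timestampMillis: Long)

      // readings: DataStream[SensorReading] is assumed to exist; timestamps must ascend
      // monotonically within each parallel stream.
      val withTimestamps: DataStream[SensorReading] =
        readings.assignAscendingTimestamps(_.timestampMillis)
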
  8. def assignTimestampsAndWatermarks(assigner: AssignerWithPunctuatedWatermarks[T]): DataStream[T]

    Assigns timestamps to the elements in the data stream and creates watermarks to signal event time progress based on the elements themselves.

    This method creates watermarks based purely on stream elements. For each element that is handled via AssignerWithPunctuatedWatermarks#extractTimestamp(Object, long), the AssignerWithPunctuatedWatermarks#checkAndGetNextWatermark() method is called, and a new watermark is emitted if the returned watermark value is larger than the previous watermark.

    This method is useful when the data stream embeds watermark elements, or certain elements carry a marker that can be used to determine the current event time watermark. This operation gives the programmer full control over the watermark generation. Users should be aware that too aggressive watermark generation (i.e., generating hundreds of watermarks every second) can cost some performance.

    For cases where watermarks should be created in a regular fashion, for example every x milliseconds, use the AssignerWithPeriodicWatermarks.

    Annotations
    @PublicEvolving()
    See also

    #assignTimestampsAndWatermarks(AssignerWithPeriodicWatermarks)

    AssignerWithPeriodicWatermarks

    AssignerWithPunctuatedWatermarks
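
    Example (a sketch; the Event case class and its endOfBatch marker field are assumptions made for illustration):

      import org.apache.flink.streaming.api.functions.AssignerWithPunctuatedWatermarks
      import org.apache.flink.streaming.api.watermark.Watermark

      case class Event(timestamp: Long, endOfBatch: Boolean)

      class BatchMarkerAssigner extends AssignerWithPunctuatedWatermarks[Event] {
        override def extractTimestamp(element: Event, previousElementTimestamp: Long): Long =
          element.timestamp

        // A watermark is emitted only for elements carrying the marker; returning null emits nothing.
        override def checkAndGetNextWatermark(lastElement: Event, extractedTimestamp: Long): Watermark =
          if (lastElement.endOfBatch) new Watermark(extractedTimestamp) else null
      }

      // events: DataStream[Event] is assumed to exist.
      val withWatermarks = events.assignTimestampsAndWatermarks(new BatchMarkerAssigner)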

  9. def assignTimestampsAndWatermarks(assigner: AssignerWithPeriodicWatermarks[T]): DataStream[T]

    Assigns timestamps to the elements in the data stream and periodically creates watermarks to signal event time progress.

    This method creates watermarks periodically (for example every second), based on the watermarks indicated by the given watermark generator. Even when no new elements in the stream arrive, the given watermark generator will be periodically checked for new watermarks. The interval in which watermarks are generated is defined in org.apache.flink.api.common.ExecutionConfig#setAutoWatermarkInterval(long).

    Use this method for the common cases, where some characteristic over all elements should generate the watermarks, or where watermarks are simply trailing behind the wall clock time by a certain amount.

    For the second case and when the watermarks are required to lag behind the maximum timestamp seen so far in the elements of the stream by a fixed amount of time, and this amount is known in advance, use the BoundedOutOfOrdernessTimestampExtractor.

    For cases where watermarks should be created in an irregular fashion, for example based on certain markers that some element carry, use the AssignerWithPunctuatedWatermarks.

    Annotations
    @PublicEvolving()
    See also

    #assignTimestampsAndWatermarks(AssignerWithPunctuatedWatermarks)

    AssignerWithPunctuatedWatermarks

    AssignerWithPeriodicWatermarks
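
    Example (a sketch using BoundedOutOfOrdernessTimestampExtractor; the Event case class and the 10 second bound are assumptions made for illustration):

      import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor
      import org.apache.flink.streaming.api.windowing.time.Time

      case class Event(timestamp: Long)

      // events: DataStream[Event] is assumed to exist.
      val withWatermarks = events.assignTimestampsAndWatermarks(
        new BoundedOutOfOrdernessTimestampExtractor[Event](Time.seconds(10)) {
          override def extractTimestamp(element: Event): Long = element.timestamp
        })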

  10. def broadcast(broadcastStateDescriptors: MapStateDescriptor[_, _]*): BroadcastStream[T]

    Sets the partitioning of the DataStream so that the output elements are broadcast to every parallel instance of the next operation. In addition, it implicitly creates as many broadcast states as the specified descriptors, which can be used to store the elements of the stream.

    broadcastStateDescriptors

    the descriptors of the broadcast states to create.

    returns

    A BroadcastStream which can be used with DataStream.connect(BroadcastStream) to create a BroadcastConnectedStream for further processing of the elements.

    Annotations
    @PublicEvolving()
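
    Example (a sketch; the rules and events streams and the rule semantics are assumptions made for illustration):

      import org.apache.flink.api.common.state.MapStateDescriptor
      import org.apache.flink.api.common.typeinfo.BasicTypeInfo
      import org.apache.flink.streaming.api.scala._

      // One broadcast state per descriptor is created implicitly.
      val ruleDescriptor = new MapStateDescriptor[String, String](
        "rules", BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO)

      // rules: DataStream[String] and events: DataStream[String] are assumed to exist.
      val broadcastRules = rules.broadcast(ruleDescriptor)
      val connected = events.connect(broadcastRules)
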
  11. def broadcast: DataStream[T]

    Sets the partitioning of the DataStream so that the output tuples are broadcast to every parallel instance of the next component.

  12. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  13. def coGroup[T2](otherStream: DataStream[T2]): CoGroupedStreams[T, T2]

    Creates a co-group operation. See CoGroupedStreams for an example of how the keys and window can be specified.
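
    Example (a sketch; the Order case class and the two input streams are assumptions made for illustration):

      import org.apache.flink.streaming.api.scala._
      import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows
      import org.apache.flink.streaming.api.windowing.time.Time

      case class Order(key: String, amount: Long)

      // orders and reorders: DataStream[Order] are assumed to exist.
      val coGrouped: DataStream[String] = orders.coGroup(reorders)
        .where(_.key)
        .equalTo(_.key)
        .window(TumblingEventTimeWindows.of(Time.seconds(5)))
        .apply((xs, ys) => s"${xs.size} vs ${ys.size}")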

  14. def connect[R](broadcastStream: BroadcastStream[R]): BroadcastConnectedStream[T, R]

    Creates a new BroadcastConnectedStream by connecting the current DataStream or KeyedStream with a BroadcastStream.

    The latter can be created using the broadcast(MapStateDescriptor[]) method.

    The resulting stream can be further processed using the broadcastConnectedStream.process(myFunction) method, where myFunction can be either a org.apache.flink.streaming.api.functions.co.KeyedBroadcastProcessFunction or a org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction, depending on whether the current stream is a KeyedStream.

    broadcastStream

    The broadcast stream with the broadcast state to be connected with this stream.

    returns

    The BroadcastConnectedStream.

    Annotations
    @PublicEvolving()
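
    Example (a sketch continuing the broadcast example above; forwarding events unchanged while storing rules is an illustrative use):

      import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction
      import org.apache.flink.util.Collector

      class RuleApplier(ruleDescriptor: MapStateDescriptor[String, String])
          extends BroadcastProcessFunction[String, String, String] {

        override def processElement(value: String,
            ctx: BroadcastProcessFunction[String, String, String]#ReadOnlyContext,
            out: Collector[String]): Unit =
          out.collect(value) // read-only access to the broadcast state via ctx

        override def processBroadcastElement(rule: String,
            ctx: BroadcastProcessFunction[String, String, String]#Context,
            out: Collector[String]): Unit =
          ctx.getBroadcastState(ruleDescriptor).put(rule, rule) // update the broadcast state
      }

      val processed: DataStream[String] =
        events.connect(broadcastRules).process(new RuleApplier(ruleDescriptor))
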
  15. def connect[T2](dataStream: DataStream[T2]): ConnectedStreams[T, T2]

    Creates a new ConnectedStreams by connecting DataStream outputs of different types with each other. The DataStreams connected using this operator can be used with CoFunctions.

  16. def countWindowAll(size: Long): AllWindowedStream[T, GlobalWindow]

    Windows this DataStream into tumbling count windows.

    Note: This operation is inherently non-parallel since all elements have to pass through the same operator instance. (Only for special cases, such as aligned time windows, is it possible to perform this operation in parallel.)

    size

    The size of the windows in number of elements.

  17. def countWindowAll(size: Long, slide: Long): AllWindowedStream[T, GlobalWindow]

    Windows this DataStream into sliding count windows.

    Note: This operation is inherently non-parallel since all elements have to pass through the same operator instance. (Only for special cases, such as aligned time windows, is it possible to perform this operation in parallel.)

    size

    The size of the windows in number of elements.

    slide

    The slide interval in number of elements.
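
    Example (a sketch covering both the tumbling and sliding variants; stream is an existing DataStream):

      // Tumbling count windows of 1000 elements each:
      val tumbling = stream.countWindowAll(1000)
      // Sliding count windows of 1000 elements, evaluated every 100 elements:
      val sliding = stream.countWindowAll(1000, 100)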

  18. def dataType: TypeInformation[T]

    Returns the TypeInformation for the elements of this DataStream.

  19. def disableChaining(): DataStream[T]

    Turns off chaining for this operator so thread co-location will not be used as an optimization. Chaining can be turned off for the whole job via StreamExecutionEnvironment.disableOperatorChaining(); however, this is not advised for performance reasons.

    Annotations
    @PublicEvolving()
  20. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  21. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  22. def executionConfig: ExecutionConfig

    Returns the execution config.

  23. def executionEnvironment: StreamExecutionEnvironment

    Returns the StreamExecutionEnvironment associated with this data stream.

  24. def filter(fun: (T) ⇒ Boolean): DataStream[T]

    Creates a new DataStream that contains only the elements satisfying the given filter predicate.

  25. def filter(filter: FilterFunction[T]): DataStream[T]

    Creates a new DataStream that contains only the elements satisfying the given filter predicate.

  26. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  27. def flatMap[R](fun: (T) ⇒ TraversableOnce[R])(implicit arg0: TypeInformation[R]): DataStream[R]

    Creates a new DataStream by applying the given function to every element and flattening the results.

  28. def flatMap[R](fun: (T, Collector[R]) ⇒ Unit)(implicit arg0: TypeInformation[R]): DataStream[R]

    Creates a new DataStream by applying the given function to every element and flattening the results.

  29. def flatMap[R](flatMapper: FlatMapFunction[T, R])(implicit arg0: TypeInformation[R]): DataStream[R]

    Creates a new DataStream by applying the given function to every element and flattening the results.
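
    Example (a sketch; lines is an existing DataStream[String]):

      import org.apache.flink.streaming.api.scala._

      // Each input line yields zero or more words.
      val words: DataStream[String] =
        lines.flatMap(_.toLowerCase.split("\\W+").filter(_.nonEmpty))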

  30. def forward: DataStream[T]

    Sets the partitioning of the DataStream so that the output tuples are forwarded to the local subtask of the next component (whenever possible).

  31. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  32. def getSideOutput[X](tag: OutputTag[X])(implicit arg0: TypeInformation[X]): DataStream[X]

    Annotations
    @PublicEvolving()
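
    Example (a sketch; the tag name and an upstream function that emits to the tag via ctx.output(rejectedTag, value) are assumptions made for illustration):

      import org.apache.flink.streaming.api.scala._

      val rejectedTag = OutputTag[String]("rejected")
      // mainStream is assumed to be produced by an operator that emits to the tag.
      val rejected: DataStream[String] = mainStream.getSideOutput(rejectedTag)
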
  33. def global: DataStream[T]

    Sets the partitioning of the DataStream so that the output values all go to the first instance of the next processing operator. Use this setting with care since it might cause a serious performance bottleneck in the application.

    Annotations
    @PublicEvolving()
  34. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  35. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  36. def iterate[R, F](stepFunction: (ConnectedStreams[T, F]) ⇒ (DataStream[F], DataStream[R]), maxWaitTimeMillis: Long)(implicit arg0: TypeInformation[F]): DataStream[R]

    Initiates an iterative part of the program that creates a loop by feeding back data streams. To create a streaming iteration the user needs to define a transformation that creates two DataStreams. The first one is the output that will be fed back to the start of the iteration and the second is the output stream of the iterative part.

    The input stream of the iterate operator and the feedback stream will be treated as a ConnectedStreams where the input is connected with the feedback stream.

    This allows the user to distinguish standard input from feedback inputs.

    stepFunction: initialStream => (feedback, output)

    The user must set the max waiting time for the iteration head. If no data is received within the set time, the stream terminates. If this parameter is set to 0, the iteration sources will run indefinitely, and the job must be killed to stop.

    Annotations
    @PublicEvolving()
  37. def iterate[R](stepFunction: (DataStream[T]) ⇒ (DataStream[T], DataStream[R]), maxWaitTimeMillis: Long = 0): DataStream[R]

    Initiates an iterative part of the program that creates a loop by feeding back data streams. To create a streaming iteration the user needs to define a transformation that creates two DataStreams. The first one is the output that will be fed back to the start of the iteration and the second is the output stream of the iterative part.

    stepFunction: initialStream => (feedback, output)

    A common pattern is to use output splitting to create the feedback and output DataStreams. Please refer to the split method of the DataStream.

    By default a DataStream with iteration will never terminate, but the user can use the maxWaitTime parameter to set a max waiting time for the iteration head. If no data is received within the set time, the stream terminates.

    Parallelism of the feedback stream must match the parallelism of the original stream. Please refer to the setParallelism method for parallelism modification.

    Annotations
    @PublicEvolving()
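
    Example (a sketch; env is an existing StreamExecutionEnvironment, and the decrement loop is an illustrative use):

      val iterated: DataStream[Long] = env.generateSequence(1, 10).iterate(iteration => {
        val minusOne = iteration.map(_ - 1)
        val stillGreaterThanZero = minusOne.filter(_ > 0) // fed back into the loop
        val notGreaterThanZero = minusOne.filter(_ <= 0)  // emitted as the result stream
        (stillGreaterThanZero, notGreaterThanZero)
      })
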
  38. def javaStream: datastream.DataStream[T]

    Gets the underlying Java DataStream object.

  39. def join[T2](otherStream: DataStream[T2]): JoinedStreams[T, T2]

    Creates a join operation. See JoinedStreams for an example of how the keys and window can be specified.

  40. def keyBy[K](fun: (T) ⇒ K)(implicit arg0: TypeInformation[K]): KeyedStream[T, K]

    Groups the elements of a DataStream by the given K key to be used with grouped operators like grouped reduce or grouped aggregations.

  41. def keyBy(firstField: String, otherFields: String*): KeyedStream[T, Tuple]

    Groups the elements of a DataStream by the given field expressions to be used with grouped operators like grouped reduce or grouped aggregations.

  42. def keyBy(fields: Int*): KeyedStream[T, Tuple]

    Groups the elements of a DataStream by the given key positions (for tuple/array types) to be used with grouped operators like grouped reduce or grouped aggregations.
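
    Example (a sketch; the Click case class and the clicks stream are assumptions made for illustration):

      case class Click(userId: String, url: String)

      // Key by a function of the element; the field-expression and position-based
      // variants work analogously for case classes and tuples.
      val byUser: KeyedStream[Click, String] = clicks.keyBy(_.userId)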

  43. def map[R](mapper: MapFunction[T, R])(implicit arg0: TypeInformation[R]): DataStream[R]

    Creates a new DataStream by applying the given function to every element of this DataStream.

  44. def map[R](fun: (T) ⇒ R)(implicit arg0: TypeInformation[R]): DataStream[R]

    Creates a new DataStream by applying the given function to every element of this DataStream.
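
    Example (a sketch; names is an existing DataStream[String]):

      val lengths: DataStream[Int] = names.map(_.length)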

  45. def minResources: ResourceSpec

    Returns the minimum resources of this operation.

    Annotations
    @PublicEvolving()
  46. def name(name: String): DataStream[T]

    Sets the name of the current data stream. This name is used by the visualization and logging during runtime.

    returns

    The named operator.

  47. def name: String

    Gets the name of the current data stream. This name is used by the visualization and logging during runtime.

    returns

    Name of the stream.

  48. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  49. final def notify(): Unit

    Definition Classes
    AnyRef
  50. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  51. def parallelism: Int

    Returns the parallelism of this operation.

  52. def partitionCustom[K](partitioner: Partitioner[K], fun: (T) ⇒ K)(implicit arg0: TypeInformation[K]): DataStream[T]

    Partitions a DataStream on the key returned by the selector, using a custom partitioner. This method takes the key selector to get the key to partition on, and a partitioner that accepts the key type.

    Note: This method works only on single field keys, i.e. the selector cannot return tuples of fields.
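
    Example (a sketch; words is an existing DataStream[String], and the hash-based partitioner is an illustrative choice):

      import org.apache.flink.api.common.functions.Partitioner

      val hashPartitioner = new Partitioner[String] {
        override def partition(key: String, numPartitions: Int): Int =
          (key.hashCode & Integer.MAX_VALUE) % numPartitions // mask the sign bit
      }

      // The selector returns the single-field key passed to the partitioner.
      val partitioned: DataStream[String] =
        words.partitionCustom(hashPartitioner, (word: String) => word)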

  53. def partitionCustom[K](partitioner: Partitioner[K], field: String)(implicit arg0: TypeInformation[K]): DataStream[T]

    Partitions a POJO DataStream on the specified key fields using a custom partitioner. This method takes the key expression to partition on, and a partitioner that accepts the key type.

    Note: This method works only on single field keys.

  54. def partitionCustom[K](partitioner: Partitioner[K], field: Int)(implicit arg0: TypeInformation[K]): DataStream[T]

    Partitions a tuple DataStream on the specified key fields using a custom partitioner. This method takes the key position to partition on, and a partitioner that accepts the key type.

    Note: This method works only on single field keys.

  55. def preferredResources: ResourceSpec

    Returns the preferred resources of this operation.

    Annotations
    @PublicEvolving()
  56. def print(): DataStreamSink[T]

    Writes a DataStream to the standard output stream (stdout). For each element of the DataStream the result of .toString is written.

    Annotations
    @PublicEvolving()
  57. def printToErr(): DataStreamSink[T]

    Writes a DataStream to the standard error stream (stderr).

    For each element of the DataStream the result of AnyRef.toString() is written.

    returns

    The closed DataStream.

    Annotations
    @PublicEvolving()
  58. def process[R](processFunction: ProcessFunction[T, R])(implicit arg0: TypeInformation[R]): DataStream[R]

    Applies the given ProcessFunction on the input stream, thereby creating a transformed output stream.

    The function will be called for every element in the stream and can produce zero or more output elements.

    processFunction

    The ProcessFunction that is called for each element in the stream.

    Annotations
    @PublicEvolving()
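
    Example (a minimal sketch; dropping empty strings is an illustrative use, and lines is an existing DataStream[String]):

      import org.apache.flink.streaming.api.functions.ProcessFunction
      import org.apache.flink.util.Collector

      class NonEmptyFilter extends ProcessFunction[String, String] {
        override def processElement(value: String,
            ctx: ProcessFunction[String, String]#Context,
            out: Collector[String]): Unit =
          if (value.nonEmpty) out.collect(value) // zero or one output per input element
      }

      val filtered: DataStream[String] = lines.process(new NonEmptyFilter)
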
  59. def rebalance: DataStream[T]

    Sets the partitioning of the DataStream so that the output tuples are distributed evenly to the next component.

  60. def rescale: DataStream[T]

    Sets the partitioning of the DataStream so that the output tuples are distributed evenly to a subset of instances of the downstream operation.

    The subset of downstream operations to which the upstream operation sends elements depends on the degree of parallelism of both the upstream and downstream operation. For example, if the upstream operation has parallelism 2 and the downstream operation has parallelism 4, then one upstream operation would distribute elements to two downstream operations while the other upstream operation would distribute to the other two downstream operations. If, on the other hand, the downstream operation has parallelism 2 while the upstream operation has parallelism 4, then two upstream operations will distribute to one downstream operation while the other two upstream operations will distribute to the other downstream operation.

    In cases where the different parallelisms are not multiples of each other, one or several downstream operations will have a differing number of inputs from upstream operations.

    Annotations
    @PublicEvolving()
  61. def setBufferTimeout(timeoutMillis: Long): DataStream[T]

    Sets the maximum time frequency (ms) for the flushing of the output buffer. By default the output buffers flush only when they are full.

    timeoutMillis

    The maximum time between two output flushes.

    returns

    The operator with buffer timeout set.

  62. def setConfigItem(key: String, value: String): DataStream[T]

    Sets the value of the given option for the operator.

    key

    The name of the option to be updated.

    value

    The value of the option to be updated.

  63. def setConfigItem(key: ConfigOption[String], value: String): DataStream[T]

    Sets the value of the given option for the operator.

    key

    The option to be updated.

    value

    The value of the option to be updated.

  64. def setMaxParallelism(maxParallelism: Int): DataStream[T]

  65. def setParallelism(parallelism: Int): DataStream[T]

    Sets the parallelism of this operation. This must be at least 1.

  66. def setResourceConstraints(resourceConstraints: ResourceConstraints): DataStream[T]

    Annotations
    @PublicEvolving()
  67. def setResources(resources: ResourceSpec): DataStream[T]

    Sets the resources of this operation.

    Annotations
    @PublicEvolving()
  68. def setResources(minResources: ResourceSpec, preferredResources: ResourceSpec): DataStream[T]

    Sets the minimum and preferred resources of this operation.

    Annotations
    @PublicEvolving()
  69. def setUidHash(hash: String): DataStream[T]

    Sets a user-provided hash for this operator. This will be used AS IS to create the JobVertexID.

    The user-provided hash is an alternative to the generated hashes, used when identification of an operator through the default hash mechanics fails (e.g. because of changes between Flink versions).

    Important: this should be used as a workaround or for troubleshooting. The provided hash needs to be unique per transformation and job; otherwise, job submission will fail. Furthermore, you cannot assign a user-specified hash to an intermediate node in an operator chain, and attempting to do so will cause the job to fail.

    hash

    the user provided hash for this operator.

    returns

    The operator with the user provided hash.

    Annotations
    @PublicEvolving()
  70. def shuffle: DataStream[T]

    Sets the partitioning of the DataStream so that the output tuples are shuffled to the next component.

    Annotations
    @PublicEvolving()
  71. def slotSharingGroup(slotSharingGroup: String): DataStream[T]

    Sets the slot sharing group of this operation. Parallel instances of operations that are in the same slot sharing group will be co-located in the same TaskManager slot, if possible.

    Operations inherit the slot sharing group of input operations if all input operations are in the same slot sharing group and no slot sharing group was explicitly specified.

    Initially an operation is in the default slot sharing group. An operation can be put into the default group explicitly by setting the slot sharing group to "default".

    slotSharingGroup

    The slot sharing group name.

    Annotations
    @PublicEvolving()
  72. def split(fun: (T) ⇒ TraversableOnce[String]): SplitStream[T]

    Creates a new SplitStream that contains only the elements satisfying the given output selector predicate.
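
    Example (a sketch; numbers is an existing DataStream[Int]):

      val parity: SplitStream[Int] =
        numbers.split(n => if (n % 2 == 0) Seq("even") else Seq("odd"))

      val evens: DataStream[Int] = parity.select("even")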

  73. def split(selector: OutputSelector[T]): SplitStream[T]

    Operator used for directing tuples to specific named outputs using an OutputSelector. Calling this method on an operator creates a new SplitStream.

  74. def startNewChain(): DataStream[T]

    Starts a new task chain beginning at this operator. This operator will not be chained (thread co-located for increased performance) to any previous tasks even if possible.

    Annotations
    @PublicEvolving()
  75. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  76. def timeWindowAll(size: Time, slide: Time): AllWindowedStream[T, TimeWindow]

    Windows this DataStream into sliding time windows.

    This is a shortcut for either .window(SlidingEventTimeWindows.of(size, slide)) or .window(SlidingProcessingTimeWindows.of(size, slide)) depending on the time characteristic set using StreamExecutionEnvironment.setStreamTimeCharacteristic.

    Note: This operation is inherently non-parallel since all elements have to pass through the same operator instance. (Only for special cases, such as aligned time windows, is it possible to perform this operation in parallel.)

    size

    The size of the window.

    slide

    The slide interval.

  77. def timeWindowAll(size: Time): AllWindowedStream[T, TimeWindow]

    Windows this DataStream into tumbling time windows.

    This is a shortcut for either .window(TumblingEventTimeWindows.of(size)) or .window(TumblingProcessingTimeWindows.of(size)) depending on the time characteristic set using StreamExecutionEnvironment.setStreamTimeCharacteristic.

    Note: This operation is inherently non-parallel since all elements have to pass through the same operator instance. (Only for special cases, such as aligned time windows, is it possible to perform this operation in parallel.)

    size

    The size of the window.
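
    Example (a sketch; counting elements per 10-second window over an assumed events stream):

      import org.apache.flink.streaming.api.scala._
      import org.apache.flink.streaming.api.windowing.time.Time

      val countsPerWindow: DataStream[Long] = events
        .map(_ => 1L)
        .timeWindowAll(Time.seconds(10))
        .reduce(_ + _)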

  78. def toString(): String

    Definition Classes
    AnyRef → Any
  79. def transform[R](operatorName: String, operator: OneInputStreamOperator[T, R])(implicit arg0: TypeInformation[R]): DataStream[R]

    Transforms the DataStream by using a custom OneInputStreamOperator.

    R

    the type of elements emitted by the operator

    operatorName

    name of the operator, for logging purposes

    operator

    the object containing the transformation logic

    Annotations
    @PublicEvolving()
  80. def uid(uid: String): DataStream[T]

    Sets an ID for this operator.

    The specified ID is used to assign the same operator ID across job submissions (for example when starting a job from a savepoint).

    Important: this ID needs to be unique per transformation and job. Otherwise, job submission will fail.

    uid

    The unique user-specified ID of this transformation.

    returns

    The operator with the specified ID.

    Annotations
    @PublicEvolving()
  81. def union(dataStreams: DataStream[T]*): DataStream[T]

    Creates a new DataStream by merging DataStream outputs of the same type with each other. The DataStreams merged using this operator will be transformed simultaneously.
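
    Example (a sketch; the three streams are assumed to share the same element type):

      val merged: DataStream[String] = streamA.union(streamB, streamC)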

  82. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  83. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  84. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  85. def windowAll[W <: Window](assigner: WindowAssigner[_ >: T, W]): AllWindowedStream[T, W]

    Windows this data stream to an AllWindowedStream, which evaluates windows over a non-key-grouped stream. Elements are put into windows by a WindowAssigner. The grouping of elements is done by window.

    A org.apache.flink.streaming.api.windowing.triggers.Trigger can be defined to specify when windows are evaluated. However, WindowAssigners have a default Trigger that is used if a Trigger is not specified.

    Note: This operation is inherently non-parallel since all elements have to pass through the same operator instance. (Only for special cases, such as aligned time windows, is it possible to perform this operation in parallel.)

    assigner

    The WindowAssigner that assigns elements to windows.

    returns

    The trigger windows data stream.

    Annotations
    @PublicEvolving()
  86. def writeAsCsv(path: String, writeMode: WriteMode, rowDelimiter: String, fieldDelimiter: String): DataStreamSink[T]

    Writes the DataStream in CSV format to the file specified by the path parameter.

    path

    Path to the location of the CSV file

    writeMode

    Controls whether an existing file is overwritten or not

    rowDelimiter

    Delimiter for consecutive rows

    fieldDelimiter

    Delimiter for consecutive fields

    returns

    The closed DataStream

    Annotations
    @PublicEvolving()
  87. def writeAsCsv(path: String, writeMode: WriteMode): DataStreamSink[T]

    Writes the DataStream in CSV format to the file specified by the path parameter.

    path

    Path to the location of the CSV file

    writeMode

    Controls whether an existing file is overwritten or not

    returns

    The closed DataStream

    Annotations
    @PublicEvolving()
  88. def writeAsCsv(path: String): DataStreamSink[T]

    Writes the DataStream in CSV format to the file specified by the path parameter.

    path

    Path to the location of the CSV file

    returns

    The closed DataStream

    Annotations
    @PublicEvolving()
  89. def writeAsText(path: String, writeMode: WriteMode): DataStreamSink[T]

    Writes a DataStream to the file specified by path in text format. For every element of the DataStream the result of .toString is written.

    path

    The path pointing to the location the text file is written to

    writeMode

    Controls the behavior for existing files. Options are NO_OVERWRITE and OVERWRITE.

    returns

    The closed DataStream

    Annotations
    @PublicEvolving()
  90. def writeAsText(path: String): DataStreamSink[T]

    Writes a DataStream to the file specified by path in text format. For every element of the DataStream the result of .toString is written.

    path

    The path pointing to the location the text file is written to

    returns

    The closed DataStream

    Annotations
    @PublicEvolving()
  91. def writeToSocket(hostname: String, port: Integer, schema: SerializationSchema[T]): DataStreamSink[T]

    Writes the DataStream to a socket as a byte array. The format of the output is specified by a SerializationSchema.

    Annotations
    @PublicEvolving()
  92. def writeUsingOutputFormat(format: OutputFormat[T]): DataStreamSink[T]

    Writes a DataStream using the given OutputFormat.

    Annotations
    @PublicEvolving()

Deprecated Value Members

  1. def assignTimestamps(extractor: TimestampExtractor[T]): DataStream[T]

    Extracts a timestamp from an element and assigns it as the internal timestamp of that element. The internal timestamps are, for example, used for event-time window operations.

    If you know that the timestamps are strictly increasing you can use an AscendingTimestampExtractor. Otherwise, you should provide a TimestampExtractor that also implements TimestampExtractor#getCurrentWatermark to keep track of watermarks.

    Annotations
    @deprecated
    Deprecated
    See also

    org.apache.flink.streaming.api.watermark.Watermark

  2. def getExecutionConfig: ExecutionConfig

    Returns the execution config.

    Annotations
    @deprecated @PublicEvolving()
    Deprecated
  3. def getExecutionEnvironment: StreamExecutionEnvironment

    Returns the StreamExecutionEnvironment associated with the current DataStream.

    returns

    associated execution environment

    Annotations
    @deprecated @PublicEvolving()
    Deprecated
  4. def getName: String

    Gets the name of the current data stream. This name is used by the visualization and logging during runtime.

    returns

    Name of the stream.

    Annotations
    @deprecated @PublicEvolving()
    Deprecated
  5. def getParallelism: Int

    Returns the parallelism of this operation.

    Annotations
    @deprecated @PublicEvolving()
    Deprecated
  6. def getType(): TypeInformation[T]

    Returns the TypeInformation for the elements of this DataStream.

    Annotations
    @deprecated @PublicEvolving()
    Deprecated
