Interface DataStreamSinkProvider

  • All Superinterfaces:
    org.apache.flink.table.connector.sink.DynamicTableSink.SinkRuntimeProvider, org.apache.flink.table.connector.ParallelismProvider

    @PublicEvolving
    public interface DataStreamSinkProvider
    extends org.apache.flink.table.connector.sink.DynamicTableSink.SinkRuntimeProvider, org.apache.flink.table.connector.ParallelismProvider
    Provider that consumes a Java DataStream as a runtime implementation for DynamicTableSink.

    Note: This provider is intended only for advanced connector developers. Usually, a sink should consist of a single entity expressed via SinkFunctionProvider or OutputFormatProvider. When using a DataStream, an implementer needs to pay attention to how changes are shuffled so that the changelog stays consistent per parallel subtask.
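
    The shuffling concern can be illustrated with a small plain-Java model. This is a sketch only and does not use any Flink API: the class KeyedChangelogRouting, its Change type, and the hash-based subtaskFor function are hypothetical names introduced for illustration. It shows the property a keyed shuffle provides: all changes for one key reach the same parallel subtask in their original order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java model (not Flink API): routes changelog rows to parallel subtasks by key. */
final class KeyedChangelogRouting {

    /** A simplified changelog row: change kind (+I, -U, +U, -D), key, and value. */
    static final class Change {
        final String kind;
        final String key;
        final int value;

        Change(String kind, String key, int value) {
            this.kind = kind;
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return kind + "(" + key + "," + value + ")";
        }
    }

    /** Picks the subtask for a key; the same key always maps to the same subtask. */
    static int subtaskFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    /** Distributes the changes over the subtasks while keeping their arrival order. */
    static List<List<Change>> route(List<Change> changes, int parallelism) {
        List<List<Change>> subtasks = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            subtasks.add(new ArrayList<>());
        }
        for (Change change : changes) {
            subtasks.get(subtaskFor(change.key, parallelism)).add(change);
        }
        return subtasks;
    }

    public static void main(String[] args) {
        List<Change> changelog = Arrays.asList(
                new Change("+I", "a", 1),
                new Change("+I", "b", 7),
                new Change("-U", "a", 1),
                new Change("+U", "a", 2),
                new Change("-D", "b", 7));
        List<List<Change>> subtasks = route(changelog, 2);
        for (int i = 0; i < subtasks.size(); i++) {
            System.out.println("subtask " + i + ": " + subtasks.get(i));
        }
    }
}
```

    In this model the -U/+U pair for key "a" always follows its +I within a single subtask. A shuffle that ignores the key could deliver these rows to different subtasks, where their relative order is no longer defined.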

    • Method Detail

      • consumeDataStream

        default org.apache.flink.streaming.api.datastream.DataStreamSink<?> consumeDataStream(ProviderContext providerContext,
                                                                                              org.apache.flink.streaming.api.datastream.DataStream<org.apache.flink.table.data.RowData> dataStream)
        Consumes the given Java DataStream and returns the sink transformation DataStreamSink.

        Note: To support the CompiledPlan feature, this method MUST set a unique identifier for each transformation/operator in the data stream. This enables stateful Flink version upgrades for streaming jobs. The identifier is used to map state back from a savepoint to an actual operator in the topology. The framework can generate topology-wide unique identifiers with ProviderContext.generateUid(String).

        See Also:
        SingleOutputStreamOperator.uid(String)
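
        The role of the identifier can be pictured with a plain-Java model. This is a simplified sketch, not Flink's restore logic: the class UidStateRestoreSketch, the uid strings, and the string-valued "state" are hypothetical and exist only for illustration. State in a savepoint is looked up by operator uid, so an operator whose uid changed between plan versions does not find its previous state.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java model (not Flink API) of matching savepoint state to operators by uid. */
final class UidStateRestoreSketch {

    /**
     * Returns, for each operator uid in the new topology, the state stored under the
     * same uid in the savepoint. Operators without a matching uid get null (no state).
     */
    static Map<String, String> restore(Map<String, String> savepoint, Iterable<String> operatorUids) {
        Map<String, String> restored = new LinkedHashMap<>();
        for (String uid : operatorUids) {
            restored.put(uid, savepoint.get(uid));
        }
        return restored;
    }

    public static void main(String[] args) {
        // Hypothetical uids and state values, for illustration only.
        Map<String, String> savepoint = new LinkedHashMap<>();
        savepoint.put("my-sink_writer", "pending=3");
        savepoint.put("my-sink_committer", "committed=42");

        // Stable uids: each operator finds its state again, even if the order changed.
        System.out.println(restore(savepoint, Arrays.asList("my-sink_committer", "my-sink_writer")));

        // A uid that differs from the one in the savepoint no longer matches its old state.
        System.out.println(restore(savepoint, Arrays.asList("my-sink_writer", "operator-7f3a")));
    }
}
```

        The model suggests why identifiers that are stable and unique across the topology are needed: the lookup key is the only link between saved state and an operator.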
      • consumeDataStream

        @Deprecated
        default org.apache.flink.streaming.api.datastream.DataStreamSink<?> consumeDataStream(org.apache.flink.streaming.api.datastream.DataStream<org.apache.flink.table.data.RowData> dataStream)
        Deprecated.
        Use consumeDataStream(ProviderContext, DataStream) and correctly set a unique identifier for each data stream transformation.
        Consumes the given Java DataStream and returns the sink transformation DataStreamSink.
      • getParallelism

        default Optional<Integer> getParallelism()

        Note: If a custom parallelism is returned and consumeDataStream(ProviderContext, DataStream) applies multiple transformations, make sure to set the same custom parallelism on each operator so that the order of changes in the changelog is preserved.

        Specified by:
        getParallelism in interface org.apache.flink.table.connector.ParallelismProvider
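
        The parallelism note can be sketched with a plain-Java model. The class ParallelismMismatchSketch and its targets method are hypothetical and use no Flink API. The model assumes that records are forwarded to the same subtask index when two consecutive operators have equal parallelism, and are redistributed round-robin when the parallelism differs and no partitioning is set explicitly. Starting the round-robin at index 0 is a simplification.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model (not Flink API) of how records travel between two consecutive operators. */
final class ParallelismMismatchSketch {

    /**
     * Returns the downstream subtask index for each record emitted by one upstream subtask.
     * Equal parallelism: every record is forwarded to the same index.
     * Different parallelism: records are spread round-robin over the downstream subtasks.
     */
    static List<Integer> targets(
            int upstreamSubtask, int recordCount, int upstreamParallelism, int downstreamParallelism) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < recordCount; i++) {
            if (upstreamParallelism == downstreamParallelism) {
                result.add(upstreamSubtask);
            } else {
                result.add(i % downstreamParallelism);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Three changes for one key (+I, -U, +U), all emitted by upstream subtask 1.
        System.out.println("same parallelism:      " + targets(1, 3, 2, 2));
        System.out.println("different parallelism: " + targets(1, 3, 2, 3));
    }
}
```

        With equal parallelism, the three changes stay together in one downstream subtask. With different parallelism, the model spreads them over several subtasks, so their relative order at the sink is no longer guaranteed.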