Enum ArgumentTrait
java.lang.Object
  java.lang.Enum<ArgumentTrait>
    org.apache.flink.table.annotation.ArgumentTrait
All Implemented Interfaces:
Serializable, Comparable<ArgumentTrait>
@PublicEvolving public enum ArgumentTrait extends Enum<ArgumentTrait>
Declares traits for ArgumentHint. They enable basic validation by the framework.
Some traits depend on other traits, which is why this enum reflects a hierarchy in which SCALAR, ROW_SEMANTIC_TABLE, and SET_SEMANTIC_TABLE are the top-level roots.
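The dependencies between traits that this page describes can be sketched as a small validator. This is a minimal model in plain Java, not Flink's implementation: `TraitModel` and `TraitRules` are hypothetical stand-ins, the rules are transcribed from the "Note" lines of the individual constants, and the requirement of exactly one root per argument is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

final class TraitRules {
    /** Returns human-readable violations; an empty list means the combination is consistent. */
    static List<String> validate(Set<TraitModel> traits) {
        List<String> errors = new ArrayList<>();
        long roots = traits.stream().filter(TraitModel::isRoot).count();
        if (roots != 1) {
            // Assumption of this sketch: every argument declares exactly one root trait.
            errors.add("exactly one root trait is expected, found " + roots);
        }
        boolean set = traits.contains(TraitModel.SET_SEMANTIC_TABLE);
        boolean table = set || traits.contains(TraitModel.ROW_SEMANTIC_TABLE);
        boolean updates = traits.contains(TraitModel.SUPPORT_UPDATES);
        for (TraitModel t : traits) {
            switch (t) {
                case OPTIONAL_PARTITION_BY:
                    if (!set) errors.add(t + " requires SET_SEMANTIC_TABLE");
                    break;
                case PASS_COLUMNS_THROUGH:
                case SUPPORT_UPDATES:
                case REQUIRE_ON_TIME:
                    if (!table) errors.add(t + " requires a table argument");
                    break;
                case REQUIRE_UPDATE_BEFORE:
                case REQUIRE_FULL_DELETE:
                    if (!set || !updates) {
                        errors.add(t + " requires SET_SEMANTIC_TABLE with SUPPORT_UPDATES");
                    }
                    break;
                default:
                    break; // root traits have no dependencies
            }
        }
        if (updates && traits.contains(TraitModel.PASS_COLUMNS_THROUGH)) {
            errors.add("PASS_COLUMNS_THROUGH is not supported together with SUPPORT_UPDATES");
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate(
                EnumSet.of(TraitModel.SET_SEMANTIC_TABLE, TraitModel.OPTIONAL_PARTITION_BY)));
        System.out.println(validate(
                EnumSet.of(TraitModel.SCALAR, TraitModel.SUPPORT_UPDATES)));
    }
}

// Hypothetical stand-in for illustration; this is NOT Flink's ArgumentTrait.
enum TraitModel {
    SCALAR, ROW_SEMANTIC_TABLE, SET_SEMANTIC_TABLE,
    OPTIONAL_PARTITION_BY, PASS_COLUMNS_THROUGH, SUPPORT_UPDATES,
    REQUIRE_UPDATE_BEFORE, REQUIRE_FULL_DELETE, REQUIRE_ON_TIME;

    boolean isRoot() {
        return this == SCALAR || this == ROW_SEMANTIC_TABLE || this == SET_SEMANTIC_TABLE;
    }
}
```

The framework's own validation may apply further checks that this page does not spell out.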
-
Enum Constant Summary
Enum Constants
Enum Constant: Description
OPTIONAL_PARTITION_BY: Defines that a PARTITION BY clause is optional for SET_SEMANTIC_TABLE.
PASS_COLUMNS_THROUGH: Defines that all columns of a table argument (i.e. ROW_SEMANTIC_TABLE or SET_SEMANTIC_TABLE) are included in the output of the PTF.
REQUIRE_FULL_DELETE: Defines that a table argument which SUPPORT_UPDATES should include all fields in the RowKind.DELETE message if the updating table is backed by an upsert changelog.
REQUIRE_ON_TIME: Defines that an on_time argument must be provided, referencing a watermarked timestamp column in the given table.
REQUIRE_UPDATE_BEFORE: Defines that a table argument which SUPPORT_UPDATES should include a RowKind.UPDATE_BEFORE message when encoding updates.
ROW_SEMANTIC_TABLE: An argument that accepts a table with row semantics.
SCALAR: An argument that accepts a scalar value.
SET_SEMANTIC_TABLE: An argument that accepts a table with set semantics.
SUPPORT_UPDATES: Defines that updates are allowed as input to the given table argument.
-
Method Summary
Modifier and Type, Method: Description
boolean isRoot()
StaticArgumentTrait toStaticTrait()
static ArgumentTrait valueOf(String name): Returns the enum constant of this type with the specified name.
static ArgumentTrait[] values(): Returns an array containing the constants of this enum type, in the order they are declared.
-
Enum Constant Detail
-
SCALAR
public static final ArgumentTrait SCALAR
An argument that accepts a scalar value. For example: f(1), f(true), f('Some string').
It's the default if no ArgumentHint is provided.
-
ROW_SEMANTIC_TABLE
public static final ArgumentTrait ROW_SEMANTIC_TABLE
An argument that accepts a table with row semantics. This trait only applies to ProcessTableFunction (PTF).
For scalability, input tables are distributed across so-called "virtual processors". A virtual processor, as defined by the SQL standard, executes a PTF instance and has access only to a portion of the entire table. The argument declaration determines the size of the portion and the co-location of data. Conceptually, tables can be processed either "per row" (i.e. with row semantics) or "per set" (i.e. with set semantics).
A table with row semantics assumes that there is no correlation between rows and that each row can be processed independently. The framework is free to distribute rows across virtual processors in any way, and each virtual processor has access only to the currently processed row.
-
SET_SEMANTIC_TABLE
public static final ArgumentTrait SET_SEMANTIC_TABLE
An argument that accepts a table with set semantics. This trait only applies to ProcessTableFunction (PTF).
For scalability, input tables are distributed across so-called "virtual processors". A virtual processor, as defined by the SQL standard, executes a PTF instance and has access only to a portion of the entire table. The argument declaration determines the size of the portion and the co-location of data. Conceptually, tables can be processed either "per row" (i.e. with row semantics) or "per set" (i.e. with set semantics).
A table with set semantics assumes that there is a correlation between rows. When calling the function, the PARTITION BY clause defines the columns for correlation. The framework ensures that all rows belonging to the same set are co-located. A PTF instance is able to access all rows belonging to the same set. In other words: the virtual processor is scoped by a key context.
It is also possible not to provide a key (OPTIONAL_PARTITION_BY), in which case only one virtual processor handles the entire table, thereby losing scalability benefits.
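As a conceptual illustration of set semantics, the following sketch groups rows by a key so that every set ends up in one place, and collapses everything into a single set when no key is given. It is a plain-Java simulation under stated assumptions: `VirtualProcessorDemo`, `assignSets`, and the `String[] {key, value}` row layout are hypothetical and say nothing about how Flink actually distributes data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class VirtualProcessorDemo {
    /**
     * Groups rows into sets, one per key. Each row is a String[] {key, value}.
     * Passing partitionByKey = false models an omitted PARTITION BY clause:
     * a single set (and thus a single virtual processor) holds the entire table.
     */
    static Map<String, List<String[]>> assignSets(List<String[]> rows, boolean partitionByKey) {
        Map<String, List<String[]>> sets = new LinkedHashMap<>();
        for (String[] row : rows) {
            String setKey = partitionByKey ? row[0] : "<entire table>";
            sets.computeIfAbsent(setKey, k -> new ArrayList<>()).add(row);
        }
        return sets;
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
                new String[] {"Alice", "1"},
                new String[] {"Bob", "2"},
                new String[] {"Alice", "3"});
        // With a key: one set per distinct key, rows of the same key are co-located.
        assignSets(rows, true)
                .forEach((k, v) -> System.out.println(k + " -> " + v.size() + " row(s)"));
        // Without a key: everything lands in one set.
        assignSets(rows, false)
                .forEach((k, v) -> System.out.println(k + " -> " + v.size() + " row(s)"));
    }
}
```

With row semantics there is no grouping step at all: each row could be handed to any processor independently.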
-
OPTIONAL_PARTITION_BY
public static final ArgumentTrait OPTIONAL_PARTITION_BY
Defines that a PARTITION BY clause is optional for SET_SEMANTIC_TABLE. By default, the clause is mandatory so that the table is distributed by key, which improves parallel execution.
Note: This trait is only valid for SET_SEMANTIC_TABLE arguments.
-
PASS_COLUMNS_THROUGH
public static final ArgumentTrait PASS_COLUMNS_THROUGH
Defines that all columns of a table argument (i.e. ROW_SEMANTIC_TABLE or SET_SEMANTIC_TABLE) are included in the output of the PTF. By default, only columns of the PARTITION BY clause are passed through.
Given a table t (containing columns k and v), and a PTF f() (producing columns c1 and c2), the output of a SELECT * FROM f(table_arg => TABLE t PARTITION BY k) uses the following order:
    Default: | k | c1 | c2 |
    With pass-through columns: | k | v | c1 | c2 |
Pass-through columns are only available for append-only PTFs that take a single table argument and do not use timers.
Note: This trait is valid for ROW_SEMANTIC_TABLE and SET_SEMANTIC_TABLE arguments.
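The column order described for PASS_COLUMNS_THROUGH can be reproduced with a few lines of plain Java. This is only a sketch of the ordering rule: `PassThroughColumns` and `outputColumns` are hypothetical names, and the assumption that pass-through columns keep the table's own column order is inferred from the `| k | v | c1 | c2 |` example.

```java
import java.util.ArrayList;
import java.util.List;

final class PassThroughColumns {
    /**
     * Computes the output column order of a PTF call.
     * Default: PARTITION BY columns first, then the function's columns.
     * Pass-through: all columns of the table argument first, then the function's columns.
     */
    static List<String> outputColumns(
            List<String> tableColumns,
            List<String> partitionKeys,
            List<String> functionColumns,
            boolean passColumnsThrough) {
        List<String> out = new ArrayList<>(passColumnsThrough ? tableColumns : partitionKeys);
        out.addAll(functionColumns);
        return out;
    }

    public static void main(String[] args) {
        List<String> table = List.of("k", "v");
        List<String> keys = List.of("k");
        List<String> function = List.of("c1", "c2");
        System.out.println(outputColumns(table, keys, function, false));
        System.out.println(outputColumns(table, keys, function, true));
    }
}
```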
-
SUPPORT_UPDATES
public static final ArgumentTrait SUPPORT_UPDATES
Defines that updates are allowed as input to the given table argument. By default, a table argument is insert-only and updates will be rejected.
Input tables become updating when subqueries such as aggregations or outer joins force an incremental computation. For example, the following query only works if the function is able to digest retraction messages:
    // The change +I[1] followed by -U[1], +U[2], -U[2], +U[3] will enter the function
    // if `table_arg` is declared with SUPPORT_UPDATES
    WITH UpdatingTable AS (
      SELECT COUNT(*) FROM (VALUES 1, 2, 3)
    )
    SELECT * FROM f(table_arg => TABLE UpdatingTable)
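To see where the sequence +I[1], -U[1], +U[2], -U[2], +U[3] in the query above comes from, the following sketch maintains a count incrementally and records each change it would emit. It is a plain-Java simulation, not Flink code; `CountChangelogDemo` and `countChangelog` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class CountChangelogDemo {
    /** Emits the changelog of an incrementally maintained COUNT(*) over the given number of rows. */
    static List<String> countChangelog(int inputRows) {
        List<String> changes = new ArrayList<>();
        long count = 0;
        for (int i = 0; i < inputRows; i++) {
            if (count > 0) {
                changes.add("-U[" + count + "]"); // retract the previously emitted result
            }
            count++;
            // The very first result is an insert, every later one is an update.
            changes.add((count == 1 ? "+I[" : "+U[") + count + "]");
        }
        return changes;
    }

    public static void main(String[] args) {
        // Three input rows, as in VALUES 1, 2, 3.
        System.out.println(countChangelog(3));
    }
}
```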
If updates should be supported, ensure that the data type of the table argument is chosen in a way that it can encode changes. In other words: choose a row type that exposes the RowKind change flag.
The changelog of the backing input table decides which kinds of changes enter the function. The function receives {+I} when the input table is append-only. The function receives {+I,+U,-D} if the input table is upserting using the same upsert key as the partition key. Otherwise, retractions {+I,-U,+U,-D} (i.e. including RowKind.UPDATE_BEFORE) enter the function. Use REQUIRE_UPDATE_BEFORE to enforce retractions for all updating cases.
For upserting tables, if the changelog contains key-only deletions (also known as partial deletions), only upsert key fields are set when a row enters the function. Non-key fields are set to null, regardless of NOT NULL constraints. Use REQUIRE_FULL_DELETE to enforce that only full deletes enter the function.
This trait is intended for advanced use cases. Please note that inputs are always insert-only in batch mode. Thus, if the PTF should produce the same results in both batch and streaming mode, results should be emitted based on watermarks and event-time.
The trait PASS_COLUMNS_THROUGH is not supported if this trait is declared.
The `on_time` argument is not supported if the PTF receives updates.
Note: This trait is valid for ROW_SEMANTIC_TABLE and SET_SEMANTIC_TABLE arguments.
See Also:
REQUIRE_UPDATE_BEFORE, REQUIRE_FULL_DELETE
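The rules for which kinds of changes enter a function that declares SUPPORT_UPDATES can be written down as a small decision function. This is a sketch of the prose above, not of Flink's planner: `IncomingChanges`, `Kind`, and `InputChangelog` are hypothetical stand-ins (not Flink's RowKind or ChangelogMode), and the three-way classification of input changelogs is an assumption made for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

final class IncomingChanges {
    // Hypothetical stand-ins for illustration only.
    enum Kind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }
    enum InputChangelog { APPEND_ONLY, UPSERT, RETRACT }

    /** Returns the kinds of changes that enter the function, following the rules stated above. */
    static Set<Kind> incomingKinds(
            InputChangelog input,
            boolean upsertKeyEqualsPartitionKey,
            boolean requireUpdateBefore) {
        if (input == InputChangelog.APPEND_ONLY) {
            return EnumSet.of(Kind.INSERT); // {+I}
        }
        if (input == InputChangelog.UPSERT
                && upsertKeyEqualsPartitionKey
                && !requireUpdateBefore) {
            return EnumSet.of(Kind.INSERT, Kind.UPDATE_AFTER, Kind.DELETE); // {+I,+U,-D}
        }
        return EnumSet.allOf(Kind.class); // retractions {+I,-U,+U,-D}
    }

    public static void main(String[] args) {
        System.out.println(incomingKinds(InputChangelog.APPEND_ONLY, false, false));
        System.out.println(incomingKinds(InputChangelog.UPSERT, true, false));
        System.out.println(incomingKinds(InputChangelog.UPSERT, true, true));
    }
}
```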
-
REQUIRE_UPDATE_BEFORE
public static final ArgumentTrait REQUIRE_UPDATE_BEFORE
Defines that a table argument which SUPPORT_UPDATES should include a RowKind.UPDATE_BEFORE message when encoding updates. In other words: it enforces presenting the updating table in retract changelog mode.
This trait is intended for advanced use cases. By default, updates are encoded as emitted by the input operation. Thus, the updating table might be encoded in upsert changelog mode and deletes might only contain keys.
The following example shows how the input changelog encodes updates differently:
    // Given a table UpdatingTable(name STRING PRIMARY KEY, score INT)
    // backed by upsert changelog with changes
    // +I[Alice, 42], +I[Bob, 0], +U[Bob, 2], +U[Bob, 100], -D[Bob, NULL].
    // Given a function `f` that declares `table_arg` with REQUIRE_UPDATE_BEFORE.
    SELECT * FROM f(table_arg => TABLE UpdatingTable PARTITION BY name)
    // The following changes will enter the function:
    // +I[Alice, 42], +I[Bob, 0], -U[Bob, 0], +U[Bob, 2], -U[Bob, 2], +U[Bob, 100], -D[Bob, 100]
    // In both encodings, a materialized table would only contain a row for Alice.
Note: This trait is valid for SET_SEMANTIC_TABLE arguments that SUPPORT_UPDATES.
See Also:
SUPPORT_UPDATES
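How an upsert-encoded update turns into a retract-encoded one can be sketched by remembering the last row per key and emitting it as -U before each +U. The code below is a plain-Java illustration under stated assumptions: `RetractEncoding`, `Change`, and `addUpdateBefore` are hypothetical, only the update path is modeled (all other kinds pass through unchanged), and nothing here reflects how Flink performs the conversion internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RetractEncoding {
    /** A change such as +U[Bob, 2]; kind is one of "+I", "-U", "+U", "-D". Hypothetical type. */
    static final class Change {
        final String kind;
        final String name;
        final Integer score;

        Change(String kind, String name, Integer score) {
            this.kind = kind;
            this.name = name;
            this.score = score;
        }

        @Override
        public String toString() {
            return kind + "[" + name + ", " + score + "]";
        }
    }

    /** Inserts a -U carrying the previous row in front of every +U; other kinds pass through. */
    static List<Change> addUpdateBefore(List<Change> upserts) {
        Map<String, Integer> lastScore = new HashMap<>(); // last known score per key
        List<Change> out = new ArrayList<>();
        for (Change c : upserts) {
            if (c.kind.equals("+U") && lastScore.containsKey(c.name)) {
                out.add(new Change("-U", c.name, lastScore.get(c.name)));
            }
            out.add(c);
            if (c.kind.equals("+I") || c.kind.equals("+U")) {
                lastScore.put(c.name, c.score);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Change> upserts = List.of(
                new Change("+I", "Alice", 42),
                new Change("+I", "Bob", 0),
                new Change("+U", "Bob", 2),
                new Change("+U", "Bob", 100));
        System.out.println(addUpdateBefore(upserts));
    }
}
```

The sketch keeps one remembered row per key, which hints at why enforcing retractions on an upsert source is not free: the previous row has to be available from somewhere.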
-
REQUIRE_FULL_DELETE
public static final ArgumentTrait REQUIRE_FULL_DELETE
Defines that a table argument which SUPPORT_UPDATES should include all fields in the RowKind.DELETE message if the updating table is backed by an upsert changelog.
This trait is intended for advanced use cases. For upserting tables, if the changelog contains key-only deletes (also known as partial deletes), only upsert key fields are set when a row enters the function. Non-key fields are set to null, regardless of NOT NULL constraints.
The following example shows how the input changelog encodes deletes differently:
    // Given a table UpdatingTable(name STRING PRIMARY KEY, score INT)
    // backed by upsert changelog with changes
    // +I[Alice, 42], +I[Bob, 0], +U[Bob, 2], +U[Bob, 100], -D[Bob, NULL].
    // Given a function `f` that declares `table_arg` with REQUIRE_FULL_DELETE.
    SELECT * FROM f(table_arg => TABLE UpdatingTable PARTITION BY name)
    // The following changes will enter the function:
    // +I[Alice, 42], +I[Bob, 0], +U[Bob, 2], +U[Bob, 100], -D[Bob, 100].
    // In both encodings, a materialized table would only contain a row for Alice.
Note: This trait is valid for SET_SEMANTIC_TABLE arguments that SUPPORT_UPDATES.
See Also:
SUPPORT_UPDATES
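The difference between a key-only delete and a full delete can be illustrated by completing -D[Bob, NULL] from the last row seen for that key. This is a plain-Java sketch, not Flink's mechanism: `FullDeleteEncoding`, `Change`, and `completeDeletes` are hypothetical names, and rows are reduced to a name and a nullable score.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FullDeleteEncoding {
    /** A change such as -D[Bob, null]; kind is one of "+I", "+U", "-D". Hypothetical type. */
    static final class Change {
        final String kind;
        final String name;
        final Integer score; // null models the unset non-key field of a key-only delete

        Change(String kind, String name, Integer score) {
            this.kind = kind;
            this.name = name;
            this.score = score;
        }

        @Override
        public String toString() {
            return kind + "[" + name + ", " + score + "]";
        }
    }

    /** Replaces key-only deletes with full deletes using the last known row per key. */
    static List<Change> completeDeletes(List<Change> upserts) {
        Map<String, Integer> lastScore = new HashMap<>();
        List<Change> out = new ArrayList<>();
        for (Change c : upserts) {
            if (c.kind.equals("-D")) {
                Integer known = lastScore.remove(c.name); // the key is gone after the delete
                out.add(new Change("-D", c.name, c.score != null ? c.score : known));
            } else {
                lastScore.put(c.name, c.score);
                out.add(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Change> upserts = List.of(
                new Change("+I", "Alice", 42),
                new Change("+I", "Bob", 0),
                new Change("+U", "Bob", 2),
                new Change("+U", "Bob", 100),
                new Change("-D", "Bob", null));
        System.out.println(completeDeletes(upserts));
    }
}
```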
-
REQUIRE_ON_TIME
public static final ArgumentTrait REQUIRE_ON_TIME
Defines that an on_time argument must be provided, referencing a watermarked timestamp column in the given table.
The on_time argument indicates which column provides the event-time timestamp. In other words, it specifies the column that defines the timestamp for when a row was generated. This timestamp is used within the PTF for timers and time-based operations when the watermark progresses the logical clock.
By default, the on_time argument is optional. If no timestamp column is set for the PTF, ProcessTableFunction.TimeContext.time() will return null. If the on_time argument is provided, ProcessTableFunction.TimeContext.time() will return it and the PTF will return a rowtime column in the output, allowing subsequent operations to access and propagate the resulting event-time timestamp.
For example:
    CREATE TABLE t (v STRING, ts TIMESTAMP_LTZ(3), WATERMARK FOR ts AS ts - INTERVAL '2' SECONDS);
    SELECT v, rowtime FROM f(table_arg => TABLE t, on_time => DESCRIPTOR(ts));
Note: This trait is valid for ROW_SEMANTIC_TABLE and SET_SEMANTIC_TABLE arguments.
-
Method Detail
-
values
public static ArgumentTrait[] values()
Returns an array containing the constants of this enum type, in the order they are declared. This method may be used to iterate over the constants as follows:
    for (ArgumentTrait c : ArgumentTrait.values())
        System.out.println(c);
Returns:
an array containing the constants of this enum type, in the order they are declared
-
valueOf
public static ArgumentTrait valueOf(String name)
Returns the enum constant of this type with the specified name. The string must match exactly an identifier used to declare an enum constant in this type. (Extraneous whitespace characters are not permitted.)
Parameters:
name - the name of the enum constant to be returned.
Returns:
the enum constant with the specified name
Throws:
IllegalArgumentException - if this enum type has no constant with the specified name
NullPointerException - if the argument is null
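Because valueOf throws for unknown names and for null, callers that parse untrusted input often wrap it. The sketch below shows one way to do that with a local enum so that it runs without Flink on the classpath; `EnumLookupDemo`, `DemoTrait`, and `lookup` are hypothetical, and the same pattern would apply to `ArgumentTrait.valueOf`.

```java
import java.util.Optional;

final class EnumLookupDemo {
    // Hypothetical stand-in with two constant names borrowed from this page.
    enum DemoTrait { SCALAR, SET_SEMANTIC_TABLE }

    /** Returns the constant with exactly the given name, or empty if there is none. */
    static Optional<DemoTrait> lookup(String name) {
        if (name == null) {
            return Optional.empty(); // valueOf(null) would throw NullPointerException
        }
        try {
            return Optional.of(DemoTrait.valueOf(name));
        } catch (IllegalArgumentException e) {
            // Thrown for unknown names, wrong case, and surrounding whitespace.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup("SCALAR"));
        System.out.println(lookup("scalar"));
        System.out.println(lookup(" SCALAR"));
        System.out.println(lookup(null));
    }
}
```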
-
isRoot
public boolean isRoot()
-
toStaticTrait
public StaticArgumentTrait toStaticTrait()
-