Enum ArgumentTrait

    • Enum Constant Detail

      • SCALAR

        public static final ArgumentTrait SCALAR
        An argument that accepts a scalar value. For example: f(1), f(true), f('Some string').

        It's the default if no ArgumentHint is provided.

      • ROW_SEMANTIC_TABLE

        public static final ArgumentTrait ROW_SEMANTIC_TABLE
        An argument that accepts a table with row semantics. This trait only applies to ProcessTableFunction (PTF).

        For scalability, input tables are distributed across so-called "virtual processors". A virtual processor, as defined by the SQL standard, executes a PTF instance and has access only to a portion of the entire table. The argument declaration determines the size of that portion and the co-location of data. Conceptually, tables can be processed either "per row" (i.e. with row semantics) or "per set" (i.e. with set semantics).

        A table with row semantics assumes that there is no correlation between rows, so each row can be processed independently. The framework is free to distribute rows across virtual processors in any way, and each virtual processor has access only to the currently processed row.

      • SET_SEMANTIC_TABLE

        public static final ArgumentTrait SET_SEMANTIC_TABLE
        An argument that accepts a table with set semantics. This trait only applies to ProcessTableFunction (PTF).

        For scalability, input tables are distributed across so-called "virtual processors". A virtual processor, as defined by the SQL standard, executes a PTF instance and has access only to a portion of the entire table. The argument declaration determines the size of that portion and the co-location of data. Conceptually, tables can be processed either "per row" (i.e. with row semantics) or "per set" (i.e. with set semantics).

        A table with set semantics assumes that there is a correlation between rows. When calling the function, the PARTITION BY clause defines the columns for correlation. The framework ensures that all rows belonging to the same set are co-located. A PTF instance is able to access all rows belonging to the same set. In other words: the virtual processor is scoped by a key context.

        It is also possible not to provide a key (OPTIONAL_PARTITION_BY), in which case only one virtual processor handles the entire table, thereby losing scalability benefits.
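
        The co-location described above can be pictured with a small plain-Java sketch. It does not use any Flink API: the class SetSemanticsSketch and its methods are hypothetical stand-ins that only mimic how rows might be grouped into sets by a PARTITION BY column, and how a missing key collapses everything into a single set.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the framework's distribution step; not Flink API. */
final class SetSemanticsSketch {

    /** Groups rows by the value at keyIndex, mimicking PARTITION BY on that column. */
    static Map<Object, List<Object[]>> partitionBy(List<Object[]> rows, int keyIndex) {
        Map<Object, List<Object[]>> sets = new LinkedHashMap<>();
        for (Object[] row : rows) {
            // All rows sharing a key end up in the same set (one "virtual processor").
            sets.computeIfAbsent(row[keyIndex], k -> new ArrayList<>()).add(row);
        }
        return sets;
    }

    /** Without a key, a single set (one virtual processor) holds the entire table. */
    static Map<Object, List<Object[]>> noPartition(List<Object[]> rows) {
        Map<Object, List<Object[]>> sets = new LinkedHashMap<>();
        sets.put("<all>", new ArrayList<>(rows));
        return sets;
    }

    public static void main(String[] args) {
        List<Object[]> table = List.of(
                new Object[] {"Alice", 1},
                new Object[] {"Bob", 2},
                new Object[] {"Alice", 3});
        partitionBy(table, 0).forEach((key, set) ->
                System.out.println(key + " -> " + set.size() + " row(s)"));
    }
}
```

        In this sketch each map entry plays the role of one key context; how Flink actually assigns sets to parallel instances is not modeled here.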

      • OPTIONAL_PARTITION_BY

        public static final ArgumentTrait OPTIONAL_PARTITION_BY
        Defines that a PARTITION BY clause is optional for SET_SEMANTIC_TABLE. By default, the clause is mandatory so that the table is distributed by key, which improves parallel execution.

        Note: This trait is only valid for SET_SEMANTIC_TABLE arguments.

      • PASS_COLUMNS_THROUGH

        public static final ArgumentTrait PASS_COLUMNS_THROUGH
        Defines that all columns of a table argument (i.e. ROW_SEMANTIC_TABLE or SET_SEMANTIC_TABLE) are included in the output of the PTF. By default, only columns of the PARTITION BY clause are passed through.

        Given a table t (containing columns k and v), and a PTF f() (producing columns c1 and c2), the output of a SELECT * FROM f(table_arg => TABLE t PARTITION BY k) uses the following order:

         Default: | k | c1 | c2 |
         With pass-through columns: | k | v | c1 | c2 |
         

        Pass-through columns are only available for append-only PTFs that take a single table argument and do not use timers.

        Note: This trait is valid for ROW_SEMANTIC_TABLE and SET_SEMANTIC_TABLE arguments.
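
        The column order shown above can be reproduced with a short plain-Java sketch. PassThroughSketch and outputColumns are hypothetical names, and the sketch assumes that passed-through columns keep the order of the input table, as in the example with k and v.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that derives the output column order; not Flink API. */
final class PassThroughSketch {

    static List<String> outputColumns(
            List<String> tableColumns,
            List<String> partitionKeys,
            List<String> functionColumns,
            boolean passColumnsThrough) {
        // Default: only PARTITION BY columns are forwarded; otherwise all table columns.
        List<String> out = new ArrayList<>(passColumnsThrough ? tableColumns : partitionKeys);
        // Columns produced by the PTF itself always come last.
        out.addAll(functionColumns);
        return out;
    }

    public static void main(String[] args) {
        List<String> table = List.of("k", "v");
        List<String> keys = List.of("k");
        List<String> produced = List.of("c1", "c2");
        System.out.println(outputColumns(table, keys, produced, false));
        System.out.println(outputColumns(table, keys, produced, true));
    }
}
```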

      • SUPPORT_UPDATES

        public static final ArgumentTrait SUPPORT_UPDATES
        Defines that updates are allowed as input to the given table argument. By default, a table argument is insert-only and updates will be rejected.

        Input tables become updating when subqueries such as aggregations or outer joins force an incremental computation. For example, the following query only works if the function is able to digest retraction messages:

         // The change +I[1] followed by -U[1], +U[2], -U[2], +U[3] will enter the function
         // if `table_arg` is declared with SUPPORT_UPDATES
         WITH UpdatingTable AS (
           SELECT COUNT(*) FROM (VALUES 1, 2, 3)
         )
         SELECT * FROM f(table_arg => TABLE UpdatingTable)
         

        If updates should be supported, ensure that the data type of the table argument is chosen in a way that it can encode changes. In other words: choose a row type that exposes the RowKind change flag.

        The changelog of the backing input table decides which kinds of changes enter the function. The function receives {+I} when the input table is append-only. The function receives {+I,+U,-D} if the input table is upserting using the same upsert key as the partition key. Otherwise, retractions {+I,-U,+U,-D} (i.e. including RowKind.UPDATE_BEFORE) enter the function. Use REQUIRE_UPDATE_BEFORE to enforce retractions for all updating cases.
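
        The three cases in the paragraph above can be summarized as a small decision function. This is a plain-Java sketch under stated assumptions: ChangelogModeSketch, its Kind enum, and the boolean parameters are hypothetical and only restate the rules of this paragraph; they are not how Flink represents changelog modes internally.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical restatement of the changelog rules above; not Flink API. */
final class ChangelogModeSketch {

    /** Local stand-in for the change flags +I, -U, +U, -D. */
    enum Kind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    static Set<Kind> kindsEnteringFunction(
            boolean appendOnly,
            boolean upsertKeyEqualsPartitionKey,
            boolean requireUpdateBefore) {
        if (appendOnly) {
            // Append-only input: {+I}
            return EnumSet.of(Kind.INSERT);
        }
        if (upsertKeyEqualsPartitionKey && !requireUpdateBefore) {
            // Upserting input on the partition key: {+I, +U, -D}
            return EnumSet.of(Kind.INSERT, Kind.UPDATE_AFTER, Kind.DELETE);
        }
        // All other updating cases: retractions {+I, -U, +U, -D}
        return EnumSet.allOf(Kind.class);
    }

    public static void main(String[] args) {
        System.out.println(kindsEnteringFunction(true, false, false));
        System.out.println(kindsEnteringFunction(false, true, false));
        System.out.println(kindsEnteringFunction(false, true, true));
    }
}
```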

        For upserting tables, if the changelog contains key-only deletions (also known as partial deletions), only upsert key fields are set when a row enters the function. Non-key fields are set to null, regardless of NOT NULL constraints. Use REQUIRE_FULL_DELETE to enforce that only full deletes enter the function.

        This trait is intended for advanced use cases. Please note that inputs are always insert-only in batch mode. Thus, if the PTF should produce the same results in both batch and streaming mode, results should be emitted based on watermarks and event-time.

        The trait PASS_COLUMNS_THROUGH is not supported if this trait is declared.

        The `on_time` argument is not supported if the PTF receives updates.

        Note: This trait is valid for ROW_SEMANTIC_TABLE and SET_SEMANTIC_TABLE arguments.

        See Also:
        REQUIRE_UPDATE_BEFORE, REQUIRE_FULL_DELETE
      • REQUIRE_UPDATE_BEFORE

        public static final ArgumentTrait REQUIRE_UPDATE_BEFORE
        Defines that a table argument declared with SUPPORT_UPDATES should include a RowKind.UPDATE_BEFORE message when encoding updates. In other words: it enforces presenting the updating table in retract changelog mode.

        This trait is intended for advanced use cases. By default, updates are encoded as emitted by the input operation. Thus, the updating table might be encoded in upsert changelog mode and deletes might only contain keys.

        The following example shows how the input changelog encodes updates differently:

         // Given a table UpdatingTable(name STRING PRIMARY KEY, score INT)
         // backed by upsert changelog with changes
         // +I[Alice, 42], +I[Bob, 0], +U[Bob, 2], +U[Bob, 100], -D[Bob, NULL].
        
         // Given a function `f` that declares `table_arg` with REQUIRE_UPDATE_BEFORE.
         SELECT * FROM f(table_arg => TABLE UpdatingTable PARTITION BY name)
        
         // The following changes will enter the function:
         // +I[Alice, 42], +I[Bob, 0], -U[Bob, 0], +U[Bob, 2], -U[Bob, 2], +U[Bob, 100], -D[Bob, 100]
        
         // In both encodings, a materialized table would only contain a row for Alice.
         

        Note: This trait is valid for SET_SEMANTIC_TABLE arguments that SUPPORT_UPDATES.

        See Also:
        SUPPORT_UPDATES
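
        To make the difference between the two encodings concrete, the following plain-Java sketch turns an upsert changelog into a retract changelog by remembering the last row per key. RetractSketch and Change are hypothetical helpers for illustration; Flink performs this normalization inside the runtime, and the sketch does not claim to match that implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical upsert-to-retract conversion for (name, score) rows; not Flink API. */
final class RetractSketch {

    static final class Change {
        final String kind;   // "+I", "+U" or "-D"
        final String name;   // upsert key
        final Integer score; // may be null for key-only deletes

        Change(String kind, String name, Integer score) {
            this.kind = kind;
            this.name = name;
            this.score = score;
        }

        @Override
        public String toString() {
            return kind + "[" + name + ", " + (score == null ? "NULL" : String.valueOf(score)) + "]";
        }
    }

    static List<String> toRetract(List<Change> upsertChanges) {
        Map<String, Integer> lastScore = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (Change c : upsertChanges) {
            switch (c.kind) {
                case "+I":
                    out.add(c.toString());
                    lastScore.put(c.name, c.score);
                    break;
                case "+U":
                    // Emit the previous row as -U before the new row, if one is known.
                    if (lastScore.containsKey(c.name)) {
                        out.add(new Change("-U", c.name, lastScore.get(c.name)).toString());
                    }
                    out.add(c.toString());
                    lastScore.put(c.name, c.score);
                    break;
                case "-D":
                    // Replace a key-only delete with the last known full row.
                    out.add(new Change("-D", c.name, lastScore.remove(c.name)).toString());
                    break;
                default:
                    throw new IllegalArgumentException("Unexpected change kind: " + c.kind);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Change> upsert = List.of(
                new Change("+I", "Alice", 42),
                new Change("+I", "Bob", 0),
                new Change("+U", "Bob", 2),
                new Change("+U", "Bob", 100),
                new Change("-D", "Bob", null));
        System.out.println(toRetract(upsert));
    }
}
```
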
      • REQUIRE_FULL_DELETE

        public static final ArgumentTrait REQUIRE_FULL_DELETE
        Defines that a table argument declared with SUPPORT_UPDATES should include all fields in the RowKind.DELETE message if the updating table is backed by an upsert changelog.

        This trait is intended for advanced use cases. For upserting tables, if the changelog contains key-only deletes (also known as partial deletes), only upsert key fields are set when a row enters the function. Non-key fields are set to null, regardless of NOT NULL constraints.

        The following example shows how the input changelog encodes updates differently:

         // Given a table UpdatingTable(name STRING PRIMARY KEY, score INT)
         // backed by upsert changelog with changes
         // +I[Alice, 42], +I[Bob, 0], +U[Bob, 2], +U[Bob, 100], -D[Bob, NULL].
        
         // Given a function `f` that declares `table_arg` with REQUIRE_FULL_DELETE.
         SELECT * FROM f(table_arg => TABLE UpdatingTable PARTITION BY name)
        
         // The following changes will enter the function:
         // +I[Alice, 42], +I[Bob, 0], +U[Bob, 2], +U[Bob, 100], -D[Bob, 100].
        
         // In both encodings, a materialized table would only contain a row for Alice.
         

        Note: This trait is valid for SET_SEMANTIC_TABLE arguments that SUPPORT_UPDATES.

        See Also:
        SUPPORT_UPDATES
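
        The effect of full deletes can be sketched in plain Java: key-only deletes are completed with the last known row, and materializing either encoding leaves the same table. FullDeleteSketch is a hypothetical helper; each change is modeled as an Object[] of {kind, name, score}, which is an assumption made for brevity and not a Flink data structure.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical completion of key-only deletes; not Flink API. */
final class FullDeleteSketch {

    /** Replaces the null score of each -D change with the last score seen for that key. */
    static List<Object[]> fillDeletes(List<Object[]> changes) {
        Map<Object, Object> lastScore = new HashMap<>();
        List<Object[]> out = new ArrayList<>();
        for (Object[] c : changes) {
            if ("-D".equals(c[0])) {
                out.add(new Object[] {"-D", c[1], lastScore.remove(c[1])});
            } else {
                lastScore.put(c[1], c[2]);
                out.add(c);
            }
        }
        return out;
    }

    /** Applies a changelog to an empty table keyed by name. */
    static Map<Object, Object> materialize(List<Object[]> changes) {
        Map<Object, Object> table = new LinkedHashMap<>();
        for (Object[] c : changes) {
            if ("-D".equals(c[0]) || "-U".equals(c[0])) {
                table.remove(c[1]);
            } else {
                table.put(c[1], c[2]);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        List<Object[]> upsert = List.of(
                new Object[] {"+I", "Alice", 42},
                new Object[] {"+I", "Bob", 0},
                new Object[] {"+U", "Bob", 2},
                new Object[] {"+U", "Bob", 100},
                new Object[] {"-D", "Bob", null});
        System.out.println(materialize(upsert));
        System.out.println(materialize(fillDeletes(upsert)));
    }
}
```
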
      • REQUIRE_ON_TIME

        public static final ArgumentTrait REQUIRE_ON_TIME
        Defines that an on_time argument must be provided, referencing a watermarked timestamp column in the given table.

        The on_time argument indicates which column provides the event-time timestamp. In other words, it specifies the column that defines the timestamp for when a row was generated. This timestamp is used within the PTF for timers and time-based operations when the watermark progresses the logical clock.

        By default, the on_time argument is optional. If no timestamp column is set for the PTF, ProcessTableFunction.TimeContext.time() returns null. If the on_time argument is provided, ProcessTableFunction.TimeContext.time() returns the timestamp taken from that column, and the PTF emits a rowtime column in the output, allowing subsequent operations to access and propagate the resulting event-time timestamp.

        For example:

             CREATE TABLE t (v STRING, ts TIMESTAMP_LTZ(3), WATERMARK FOR ts AS ts - INTERVAL '2' SECONDS);
        
             SELECT v, rowtime FROM f(table_arg => TABLE t, on_time => DESCRIPTOR(ts));
         

        Note: This trait is valid for ROW_SEMANTIC_TABLE and SET_SEMANTIC_TABLE arguments.
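
        The "Note:" lines of the constants above, together with the restriction that PASS_COLUMNS_THROUGH cannot be combined with SUPPORT_UPDATES, can be collected into one validity check. The following is a plain-Java sketch: TraitRulesSketch and its local Trait enum are hypothetical, copy only the constant names from this page, and do not reproduce Flink's own validation logic, which may check more than this.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical validity check mirroring the "Note:" lines; not Flink's own validation. */
final class TraitRulesSketch {

    /** Local stand-in enum that copies the constant names from this page. */
    enum Trait {
        SCALAR, ROW_SEMANTIC_TABLE, SET_SEMANTIC_TABLE, OPTIONAL_PARTITION_BY,
        PASS_COLUMNS_THROUGH, SUPPORT_UPDATES, REQUIRE_UPDATE_BEFORE,
        REQUIRE_FULL_DELETE, REQUIRE_ON_TIME
    }

    static boolean isValid(Set<Trait> traits) {
        boolean set = traits.contains(Trait.SET_SEMANTIC_TABLE);
        boolean table = set || traits.contains(Trait.ROW_SEMANTIC_TABLE);
        boolean updates = traits.contains(Trait.SUPPORT_UPDATES);

        // OPTIONAL_PARTITION_BY is only valid for SET_SEMANTIC_TABLE.
        if (traits.contains(Trait.OPTIONAL_PARTITION_BY) && !set) return false;
        // PASS_COLUMNS_THROUGH needs a table argument and excludes SUPPORT_UPDATES.
        if (traits.contains(Trait.PASS_COLUMNS_THROUGH) && (!table || updates)) return false;
        // SUPPORT_UPDATES and REQUIRE_ON_TIME need a table argument.
        if (updates && !table) return false;
        if (traits.contains(Trait.REQUIRE_ON_TIME) && !table) return false;
        // The REQUIRE_* update traits need SET_SEMANTIC_TABLE plus SUPPORT_UPDATES.
        if (traits.contains(Trait.REQUIRE_UPDATE_BEFORE) && !(set && updates)) return false;
        if (traits.contains(Trait.REQUIRE_FULL_DELETE) && !(set && updates)) return false;
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid(EnumSet.of(Trait.SET_SEMANTIC_TABLE, Trait.OPTIONAL_PARTITION_BY)));
        System.out.println(isValid(EnumSet.of(Trait.ROW_SEMANTIC_TABLE, Trait.OPTIONAL_PARTITION_BY)));
    }
}
```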

    • Method Detail

      • values

        public static ArgumentTrait[] values()
        Returns an array containing the constants of this enum type, in the order they are declared. This method may be used to iterate over the constants as follows:
        for (ArgumentTrait c : ArgumentTrait.values())
            System.out.println(c);
        
        Returns:
        an array containing the constants of this enum type, in the order they are declared
      • valueOf

        public static ArgumentTrait valueOf(String name)
        Returns the enum constant of this type with the specified name. The string must match exactly an identifier used to declare an enum constant in this type. (Extraneous whitespace characters are not permitted.)
        Parameters:
        name - the name of the enum constant to be returned.
        Returns:
        the enum constant with the specified name
        Throws:
        IllegalArgumentException - if this enum type has no constant with the specified name
        NullPointerException - if the argument is null
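
        Because valueOf throws for unknown names and for null, callers that parse user input may want a tolerant wrapper. The sketch below shows one possible shape in plain Java. It uses a local stand-in enum named Trait with a subset of the constant names, so it compiles without Flink; ValueOfSketch and parse are hypothetical names.

```java
import java.util.Optional;

/** Hypothetical tolerant lookup around Enum.valueOf; uses a local stand-in enum. */
final class ValueOfSketch {

    enum Trait { SCALAR, ROW_SEMANTIC_TABLE, SET_SEMANTIC_TABLE }

    static Optional<Trait> parse(String name) {
        if (name == null) {
            // valueOf(null) would throw NullPointerException.
            return Optional.empty();
        }
        try {
            // valueOf does not permit extraneous whitespace, so trim first.
            return Optional.of(Trait.valueOf(name.trim()));
        } catch (IllegalArgumentException e) {
            // No constant with that exact (case-sensitive) name.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse(" SCALAR "));
        System.out.println(parse("scalar"));
    }
}
```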
      • isRoot

        public boolean isRoot()