A function that gets the absolute value of a numeric value.
Adds an item to a set.
A specific implementation of an aggregate function.
Used to assign a new name to a computation.
A reference to an attribute produced by another operator in the tree.
A Set designed to hold AttributeReference objects that performs equality checking using expression id instead of standard Java equality.
A function that calculates bitwise and(&) of two numbers.
A function that calculates bitwise not(~) of a number.
A function that calculates bitwise or(|) of two numbers.
A function that calculates bitwise xor(^) of two numbers.
A bound reference points to a specific slot in the input tuple, allowing the actual value to be retrieved more efficiently.
Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
Cast the child expression to the target data type.
Combines the elements of two sets.
A function that returns true if the string left contains the string right.
Returns the number of elements in the input set.
Returns an Array containing the evaluation of all children expressions.
DynamicRows use Scala's Dynamic trait to emulate an ORM in a dynamically typed language.
A function that returns true if the string left ends with the string right.
Given an input array, produces a sequence of rows, one for each value in the array.
A globally unique (within this JVM) id for a given named expression.
An expression that produces zero or more rows given a single input row.
A row implementation that uses an array of objects as the underlying storage.
Returns the value of fields in the Struct child.
Returns the item at ordinal in the Array child, or the value for the key ordinal in the Map child.
Evaluates to true if list contains value.
Optimized version of In clause, when all filter values of In clause are static.
A MutableProjection that is calculated by calling eval on each of the specified expressions.
A Projection that is calculated by calling eval on each of the specified expressions.
A mutable wrapper that makes two rows appear as a single concatenated row.
JIT HACK: Replace with macros. The JoinedRow class is used in many performance-critical situations.
JIT HACK: Replace with macros
JIT HACK: Replace with macros
JIT HACK: Replace with macros
Simple RegEx pattern matching function
A function that converts the characters of a string to lowercase.
Create a Decimal from an unscaled Long value
Converts a Row to another Row given a sequence of expressions that define each column of the new row.
An extended interface to Row that allows the values for each column to be updated.
A parent class for mutable container objects that are reused when the values are changed, resulting in less garbage.
Creates a new set of the specified type
An AggregateExpression that can be partially computed without seeing all relevant tuples.
Converts a Row to another Row given a sequence of expressions that define each column of the new row.
Represents one row of output from a relational operator.
User-defined function.
An expression that can be used to sort a tuple.
A row type that holds an array of specialized container objects, of type MutableValue, chosen based on the dataTypes of each column.
Represents an aggregation that has been rewritten to be performed in two steps.
A function that returns true if the string left starts with the string right.
A base trait for functions that compare two strings, returning a boolean.
A function that takes a substring of its first argument starting at a given position.
Return the unscaled Long value of a Decimal, assuming it fits in a Long
A function that converts the characters of a string to uppercase.
Wrap a Row as a DynamicRow.
Builds a map that is keyed by an Attribute's expression id.
The data type representing DynamicRow values.
A row with no data.
Extractor for retrieving Int literals.
A collection of generators that build custom bytecode at runtime for evaluating Catalyst expressions.
A set of classes that can be used to represent trees of relational expressions. A key goal of the expression library is to hide the details of naming and scoping from developers who want to manipulate trees of relational operators. As such, the library defines a special type of expression, a NamedExpression, in addition to the standard collection of expressions.
Standard Expressions
A library of standard expressions (e.g., Add, EqualTo), aggregates (e.g., SUM, COUNT), and other computations (e.g. UDFs). Each expression type is capable of determining its output schema as a function of its children's output schema.
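To make the idea of deriving an output type from the children concrete, here is a minimal sketch. It does not use the Catalyst classes: DataType, Column, Add, and EqualTo below are hypothetical stand-ins, and the Int-to-Double widening rule is an assumption made for illustration.

```scala
// Hypothetical miniature; type and class names are illustrative, not Catalyst's.
object SchemaSketch {
  sealed trait DataType
  case object IntType extends DataType
  case object DoubleType extends DataType
  case object BoolType extends DataType

  sealed trait Expr { def dataType: DataType }

  // A leaf: its type is given, not derived.
  final case class Column(name: String, dataType: DataType) extends Expr

  // The output type is computed from the types of the two children.
  final case class Add(left: Expr, right: Expr) extends Expr {
    // Assumed widening rule: mixing Int and Double yields Double.
    def dataType: DataType = (left.dataType, right.dataType) match {
      case (IntType, IntType) => IntType
      case (IntType | DoubleType, IntType | DoubleType) => DoubleType
      case other => throw new IllegalArgumentException(s"cannot add $other")
    }
  }

  // A comparison always produces a boolean, whatever its children produce.
  final case class EqualTo(left: Expr, right: Expr) extends Expr {
    def dataType: DataType = BoolType
  }
}
```

Because every node answers dataType by asking its children, the type of an arbitrarily deep tree can be read off the root without evaluating any rows.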
Named Expressions
Some expressions are named and thus can be referenced by later operators in the dataflow graph. The two types of named expressions are AttributeReferences and Aliases. AttributeReferences refer to attributes of the input tuple for a given operator and form the leaves of some expression trees. Aliases assign a name to intermediate computations. For example, in the SQL statement SELECT a+b AS c FROM ..., the expressions a and b would be represented by AttributeReferences and c would be represented by an Alias.
During analysis, all named expressions are assigned a globally unique expression id, which can be used for equality comparisons. While the original names are kept around for debugging purposes, they should never be used to check if two attributes refer to the same value, as plan transformations can result in the introduction of naming ambiguity. For example, consider a plan that contains subqueries, both of which are reading from the same table. If an optimization removes the subqueries, scoping information would be destroyed, eliminating the ability to reason about which subquery produced a given attribute.
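The difference between comparing by name and comparing by expression id can be sketched as follows. This is an illustration only: ExprId, IdGen, and Attr are hypothetical stand-ins written for this example, not the library's own classes.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-ins for illustration; not the Catalyst classes.
final case class ExprId(id: Long)

// Hands out ids that are unique within this JVM.
object IdGen {
  private val counter = new AtomicLong(0L)
  def fresh(): ExprId = ExprId(counter.getAndIncrement())
}

final case class Attr(name: String, exprId: ExprId) {
  // Two attributes denote the same value only when their ids match;
  // the name is kept for debugging and plays no part in the check.
  def sameValueAs(other: Attr): Boolean = exprId == other.exprId
}

object ExprIdDemo {
  def main(args: Array[String]): Unit = {
    // Two subqueries read column "a" from the same table; each gets its own id.
    val fromLeft = Attr("a", IdGen.fresh())
    val fromRight = Attr("a", IdGen.fresh())

    println(fromLeft.name == fromRight.name)   // the names collide
    println(fromLeft.sameValueAs(fromRight))   // the ids keep them apart

    // Renaming does not change identity.
    println(fromLeft.sameValueAs(fromLeft.copy(name = "renamed")))
  }
}
```

A set or map keyed by the id rather than by the whole object, as the AttributeSet and AttributeMap summaries describe, follows the same principle.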
Evaluation
The result of expressions can be evaluated using the Expression.apply(Row) method.
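A rough sketch of what evaluating an expression tree against a row looks like, assuming a much simplified model: Row here is just an indexed sequence, and Expr, BoundRef, Lit, and Add are hypothetical classes written for this example. Only the shape of the apply(Row) call mirrors the text above.

```scala
// Hypothetical miniature of an expression tree; names are illustrative only.
object EvalSketch {
  type Row = IndexedSeq[Any]

  sealed trait Expr { def apply(row: Row): Any }

  // Reads a fixed slot of the input row, in the spirit of a bound reference.
  final case class BoundRef(ordinal: Int) extends Expr {
    def apply(row: Row): Any = row(ordinal)
  }

  // A constant that ignores the input row.
  final case class Lit(value: Any) extends Expr {
    def apply(row: Row): Any = value
  }

  // Evaluates both children against the same row, then combines the results.
  final case class Add(left: Expr, right: Expr) extends Expr {
    def apply(row: Row): Any = (left(row), right(row)) match {
      case (l: Int, r: Int) => l + r
      case other => throw new IllegalArgumentException(s"unsupported operands: $other")
    }
  }

  def main(args: Array[String]): Unit = {
    // Represents: slot0 + (slot1 + 10)
    val expr = Add(BoundRef(0), Add(BoundRef(1), Lit(10)))
    println(expr(Vector(1, 2)))
  }
}
```

Evaluation is a single recursive walk: each node evaluates its children against the same input row and combines their results, so one tree can be reused for every row of the input.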